7 min

Sql.js-httpvfs (Static SQLite), 5 months later
James' Audiolog: Indie.am

    • Personal Journals

So I've been playing with sql.js-httpvfs for a few months now. Basically, what this is: it's a SQLite worker compiled to WebAssembly, and it runs against a remote database. The interesting part is that it implements a virtual file system on top of HTTP range requests. Say you have a gigabyte SQLite database somewhere in the cloud. As long as your file server supports range requests, this just grabs data one kilobyte at a time, similar to how SQLite would read a local database from the file system if it were configured to page with file reads of 1024 bytes. Anyway, that's the interesting part. What makes it interesting is that if you index and organize your data so that a query would be fast against a local SQLite database, then in theory you could run the same queries remotely, from the browser, against that SQLite file on a CDN, and your backend server wouldn't be running a database instance at all. There's some overhead on every request, but in theory it's not horrible: maybe your hundred-millisecond query becomes a 900-millisecond query, something like that.

So I was thinking about it, and in August I wrote a Plausible analytics clone in a couple of weekends that used sql.js-httpvfs. It was interesting, but not that interesting, because I implemented it very lazily. It queried so much data it might as well have downloaded the entire database, so it didn't really test the idea out too thoroughly, but it gave me a chance to play around with it, and it was a nice proof of concept.

Fast forward to December, and I rewrote it. To start off with, I wrote an nginx log parser so that I would have a pretty large data set. I ended up with a 300 MB log file, which parses into a roughly 300 MB SQLite database, and that gave me a large enough database to play around with. It was large enough that I had to optimize the log parsing and inserts so that it didn't take an incredibly long time to parse, I think, 1.3 million rows. I got it down to about 57 seconds, and that includes a bunch of extras like IP-to-country lookup, parsing URLs, parsing user agents, things like that: nothing that requires a web request, just things that can be done locally. And now it's all stored in SQLite, which brought me to actually querying the database. Running it locally, unindexed, at that size, these were roughly 13-second queries just to group by pathname and get something like the top 25 requested pages with the count of requests for each page. Not ideal, and for the server I was looking at I had only loaded about eight hours of data, so it would be a much larger data set in production. Obviously that's not going to work, even run locally. But of course it's just a matter of adding a few indexes, and immediately that clears up the problem; those queries drop to something like 100 to 300 milliseconds.

Then I get to the end of all this, get it working, and I'm just about to wire up the WebAssembly module and do it all remotely when it occurs to me: why? What does this actually get you? A little bit of scaling. Even if there were no overhead, it just means you can scale the number of reads more or less infinitely, and that doesn't really make a lot of sense here. Now that I've reduced this to the eight or so columns I want to index, I've basically written out the only eight SQL queries I'd ever run against it.
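For reference, the moving pieces look roughly like this. First, the remote-query side with sql.js-httpvfs, following the shape of the library's documented createDbWorker usage; the database URL, chunk size, table, and column names here are stand-ins, not my actual schema.

```typescript
import { createDbWorker } from "sql.js-httpvfs";

// Worker and wasm files shipped with sql.js-httpvfs; these URL
// constructions assume a bundler that understands import.meta.url.
const workerUrl = new URL(
  "sql.js-httpvfs/dist/sqlite.worker.js",
  import.meta.url
);
const wasmUrl = new URL("sql.js-httpvfs/dist/sql-wasm.wasm", import.meta.url);

async function topPages() {
  // Open the remote SQLite file. The virtual file system fetches it in
  // small chunks via HTTP range requests instead of downloading it whole.
  const worker = await createDbWorker(
    [
      {
        from: "inline",
        config: {
          serverMode: "full",     // one plain .sqlite3 file on the server/CDN
          url: "/logs.sqlite3",   // hypothetical URL to the ~300 MB database
          requestChunkSize: 4096, // should match the database page size
        },
      },
    ],
    workerUrl.toString(),
    wasmUrl.toString()
  );

  // The same "top 25 requested pages" query I was running locally. With
  // no index this scans the whole table (over HTTP, that means a lot of
  // range requests); with an index on pathname, SQLite can walk the much
  // smaller, already-ordered index instead.
  return worker.db.query(
    `SELECT pathname, COUNT(*) AS hits
       FROM requests
      GROUP BY pathname
      ORDER BY hits DESC
      LIMIT 25`
  );
}

topPages().then((rows) => console.log(rows));
```

The index is just something built before the file ever ships, along the lines of `CREATE INDEX idx_requests_pathname ON requests (pathname);`, which is the kind of change that takes the 13-second group-by down into the 100-to-300-millisecond range.

On the loading side, here's a sketch of the kind of bulk-insert optimization that gets a million-plus rows loaded in under a minute: a prepared insert wrapped in a single transaction. The driver (better-sqlite3), schema, and field names are assumptions for illustration.

```typescript
import Database from "better-sqlite3";

interface LogRow {
  ts: number;
  ip: string;
  country: string;
  pathname: string;
  status: number;
  user_agent: string;
}

const db = new Database("logs.sqlite3");
db.pragma("journal_mode = WAL"); // cheaper commits while bulk loading

db.exec(`
  CREATE TABLE IF NOT EXISTS requests (
    ts         INTEGER,
    ip         TEXT,
    country    TEXT,
    pathname   TEXT,
    status     INTEGER,
    user_agent TEXT
  )
`);

const insert = db.prepare(`
  INSERT INTO requests (ts, ip, country, pathname, status, user_agent)
  VALUES (@ts, @ip, @country, @pathname, @status, @user_agent)
`);

// Wrapping the loop in one transaction is the main speedup: the whole
// batch commits together instead of paying a commit per row.
const insertMany = db.transaction((rows: LogRow[]) => {
  for (const row of rows) insert.run(row);
});

// In the real pipeline these rows come out of the nginx log parser, with
// the country, URL, and user-agent fields already derived locally.
insertMany([
  { ts: 1670000000, ip: "203.0.113.7", country: "US",
    pathname: "/", status: 200, user_agent: "Mozilla/5.0" },
]);
```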
At that point I might as well just pre-compute it all server-side, because it's kind of pointless to do this over the HTTP VFS unless you're talking about sticking the database on a CDN and then, like, millions of people view those charts and you just can't be bothered to pre-compute the data. This is the most interesting little thing I've played around with in a while, and I got to the end of it and it's just: wait, as fun as this is, it's not a compelling use case for a virtual file system that works over HTTP.
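For what it's worth, that pre-compute option is barely any code: a build step runs the handful of queries once and writes static JSON for the charts page to fetch, with no database, local or remote, involved at view time. A minimal sketch, with made-up report names and queries standing in for the real eight:

```typescript
import Database from "better-sqlite3";
import { mkdirSync, writeFileSync } from "node:fs";

const db = new Database("logs.sqlite3", { readonly: true });

// The handful of aggregates the dashboard actually shows. Names and SQL
// here are placeholders, not the real list of eight queries.
const reports: Record<string, string> = {
  top_pages: `SELECT pathname, COUNT(*) AS hits
                FROM requests GROUP BY pathname
                ORDER BY hits DESC LIMIT 25`,
  by_country: `SELECT country, COUNT(*) AS hits
                 FROM requests GROUP BY country
                 ORDER BY hits DESC`,
};

// Run each query once at build time and write static JSON; the charts
// page then fetches plain JSON from the CDN.
mkdirSync("public/reports", { recursive: true });
for (const [name, sql] of Object.entries(reports)) {
  writeFileSync(
    `public/reports/${name}.json`,
    JSON.stringify(db.prepare(sql).all())
  );
}
```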
