CaffeinatedT@reddit

> So the question is not whether to use files. You're always using files.

I mean, this is incorrect straight off the bat: most industrial DBs built since the '90s use some flavour of block store rather than file-based abstractions.

> In-memory map is the ceiling. 97k req/s with sub-millisecond latency at every scale. If your dataset fits in RAM, nothing on disk will match it.

Databases know this too, hence the obsession with serving as much as possible from the L1-L3 caches. The issue is knowing for certain that you will never exceed it or put too much pressure on the system while a lot is happening, or that you definitely don't care if you do hit that problem.

> None of these constraints apply to a lot of applications. Plenty of internal tools, side projects, and early-stage products will never have a dataset that doesn't fit in a single server's RAM, never need to join across tables under heavy load, and never run more than one instance. For those applications, this approach works.

So basically, for a technical problem with hardly any constraints, it doesn't really matter how you solve it. OK, great insight. To take the code, for instance:

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader};
use std::sync::{Arc, Mutex, RwLock};

// assumes: #[derive(Clone, serde::Deserialize)] struct User { id: String, /* ... */ }
struct Store {
    users: RwLock<HashMap<String, User>>,
    file: Mutex<File>,
}

impl Store {
    fn load(path: &str) -> Arc<Self> {
        let mut map = HashMap::new();
        if let Ok(f) = File::open(path) {
            for line in BufReader::new(f).lines().flatten() {
                if let Ok(u) = serde_json::from_str::<User>(&line) {
                    map.insert(u.id.clone(), u);
                }
            }
        }
        let file = OpenOptions::new()
            .create(true)
            .append(true)
            .open(path)
            .unwrap();
        Arc::new(Store {
            users: RwLock::new(map),
            file: Mutex::new(file),
        })
    }

    fn get(&self, id: &str) -> Option<User> {
        self.users.read().unwrap().get(id).cloned()
    }
}
```

This implementation is a moronic way to try to replicate a database. If your whole point is that you don't care about concurrency, why even bother with a RwLock? If you do care to the extent a database does, then you'd reach for a spin lock rather than jumping straight to a RwLock first time round, and you'd still have problems, as any reader blocks writes over a Mutex, which is what we have on the file. So now we can have the fun situation of the users map being locked while your file is not.
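The inconsistency described above comes from guarding two pieces of state that must agree with two independent locks. A minimal sketch of the hazard, using only the standard library (the `insert` method is hypothetical, and a `Vec<String>` stands in for the append-only file so the example runs without I/O):

```rust
use std::collections::HashMap;
use std::sync::{Mutex, RwLock};

// Two independent locks guard state that must stay in sync.
struct Store {
    users: RwLock<HashMap<String, String>>,
    log: Mutex<Vec<String>>, // stand-in for Mutex<File>
}

impl Store {
    fn insert(&self, id: &str, name: &str) {
        // Lock 1: the new user becomes visible to readers here...
        self.users.write().unwrap().insert(id.into(), name.into());
        // ...and lock 1 is already released. A crash or panic before the
        // next line leaves the map and the "file" permanently out of sync,
        // and a concurrent reader can observe exactly that torn state.
        // Lock 2: only now is the record made "durable".
        self.log.lock().unwrap().push(format!("{id}:{name}"));
    }
}

fn main() {
    let store = Store {
        users: RwLock::new(HashMap::new()),
        log: Mutex::new(Vec::new()),
    };
    store.insert("1", "alice");
    // The two structures agree only because nothing failed in between.
    assert_eq!(store.users.read().unwrap().len(), 1);
    assert_eq!(store.log.lock().unwrap().len(), 1);
}
```

Putting both the map and the file behind a single lock removes the window, at the cost of serializing all access, which is the trade-off a real database engine spends most of its code negotiating.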
Active-Struggle-1937@reddit
In my opinion, a database is a must if you want to scale and get real-time responses; it will drastically improve user experience.
klekpl@reddit
That's so much code to implement something that any SQL database gives you in a couple of lines. Just go to Supabase and create a single table. Or, if you are more ambitious, install the PostgreSQL + PostgREST combo yourself.
In reality the question is the complete opposite: do you even need an application?
dangerbird2@reddit
Yeah, it’s not like it takes a rocket scientist to set up Postgres or MariaDB
spotter@reddit
A rite of passage.
AdQuirky3186@reddit
I install Excel on all my web servers so I can have a reliable and usable database whenever I need it.
Huge_Leader_6605@reddit
That would be better than whatever is described in this article lol
Separate_Expert9096@reddit
So… What they advise is to write your own NoSQL document database from scratch instead of using established ones?
Steveadoo@reddit
I highly doubt most applications that are choosing between a database and reading/writing to json files “rarely look up records by more than one column, use joins, or write to multiple tables atomically”. Static sites I guess?
Just use SQLite. This is dumb.
PhatClowns@reddit
Even if that WERE the case that your application has extremely simple read/write throughput… You definitely want it there for when — not IF — it becomes more complicated.
“I don’t need a database” is genuinely a rookie mistake.
Huge_Leader_6605@reddit
The weird thing is, the article does not look like it was written by a "rookie". Is he trying to mislead people on purpose lol
mikenikles@reddit
It's a marketing play, he and his buddy want to promote their DB client. I'd do the same for mine if I wasn't busy building features for customers :).
reivblaze@reddit
This is just the mongodb thing all over again.
MONGODB IS WEB SCALE
NewPhoneNewSubs@reddit
In a second year course, I still didn't know much about databases. So I built dicts of dicts and so on and serialized them. Then tied that to load and save buttons. So a mistake a genuine rookie made, for sure.
boiledbarnacle@reddit
I haven't tested it, but SQLite claims it's about 35% faster than the filesystem for reading many small blobs, probably due to avoiding per-file open/close overhead and to indexes for fast seeks.
thatikey@reddit
Agree completely. SQLite's speed is competitive with directly reading files (as the article itself reflects), and an incredible amount of effort has gone into making it durable, reliable, fault-tolerant, and easy to use. Just use SQLite
Ha_Deal_5079@reddit
you always end up needing atomic writes or queries eventually. sqlite is right there and it's just a file lol
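For the single-file case, the usual way to get an atomic write without a database is write-temp-then-rename. A minimal sketch (file names here are made up for the demo):

```rust
use std::fs;
use std::io::Write;

// Write the new contents to a temporary file, flush it to disk, then
// rename over the target. On POSIX, rename within one filesystem is
// atomic, so readers see either the old file or the new one, never a
// half-written mix.
fn atomic_write(path: &str, data: &[u8]) -> std::io::Result<()> {
    let tmp = format!("{path}.tmp");
    let mut f = fs::File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?; // make the data durable before it becomes visible
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    atomic_write("users.json", br#"{"id":"1","name":"alice"}"#)?;
    println!("{}", fs::read_to_string("users.json")?);
    fs::remove_file("users.json")?; // clean up the demo file
    Ok(())
}
```

This protects a whole-file replace; it does nothing for multi-record queries or partial updates, which is where SQLite starts paying for itself.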
jeenajeena@reddit
When I design an application, I very rarely start with a DB. I often go with a file, recently using Option 2 from the article (loading everything in memory).
To my great surprise, the number of times I have actually needed to migrate to a real DB is really, really small.
jeenajeena@reddit
In the benchmark table I would use the same unit of measure for all rows, to highlight the three-orders-of-magnitude difference versus in-memory.
sweetno@reddit
Filesystem is also a database of sorts!
imihnevich@reddit
I would still go with the database, I guess, but I think it's a very interesting study
mss-cyclist@reddit
Throw in some concurrency and have fun /s
apparently_DMA@reddit
there's way more to DBs than just storing data somewhere. this article is just dumb, clickbaity marketing.
ApyPulse@reddit
everything is a database though. csv file is a database, in-memory linked list is a database. you just can't avoid it. when project growths and doesn't fit a single machine anymore, that's time to have a dedicated database server, the distributed one.
Nicksaurus@reddit
This is still a database
boysitisover@reddit
If you need state then yes it ain't that deep buddy
recuriverighthook@reddit
I mean I just write all my data in linked lists to text files. /s (Fowler reference)