CaffeinatedT@reddit

> So the question is not whether to use files. You're always using files.

I mean, this is incorrect straight off the bat: most industrial DBs built since the '90s use some flavour of block store rather than file-based abstractions.

> In-memory map is the ceiling. 97k req/s with sub-millisecond latency at every scale. If your dataset fits in RAM, nothing on disk will match it.

Databases know this too, hence the obsession with serving as much as possible from the L1-L3 caches. The issue is knowing for certain that you will never exceed it or put too much pressure on the system while a lot is happening, or that you definitely don't care if you do hit that problem.

> None of these constraints apply to a lot of applications. Plenty of internal tools, side projects, and early-stage products will never have a dataset that doesn't fit in a single server's RAM, never need to join across tables under heavy load, and never run more than one instance. For those applications, this approach works.

So basically, for a technical problem with hardly any constraints, it doesn't really matter how you solve it. OK, great insight. To take the code, for instance:

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader};
use std::sync::{Arc, Mutex, RwLock};

// assumes: #[derive(Clone, serde::Deserialize)] struct User { id: String, /* ... */ }
struct Store {
    users: RwLock<HashMap<String, User>>,
    file: Mutex<File>,
}

impl Store {
    fn load(path: &str) -> Arc<Self> {
        let mut map = HashMap::new();
        if let Ok(f) = File::open(path) {
            for line in BufReader::new(f).lines().flatten() {
                if let Ok(u) = serde_json::from_str::<User>(&line) {
                    map.insert(u.id.clone(), u);
                }
            }
        }
        let file = OpenOptions::new()
            .create(true)
            .append(true)
            .open(path)
            .unwrap();
        Arc::new(Store {
            users: RwLock::new(map),
            file: Mutex::new(file),
        })
    }

    fn get(&self, id: &str) -> Option<User> {
        self.users.read().unwrap().get(id).cloned()
    }
}
```

This implementation is a moronic way to try to replicate a database. If your whole point is that you don't care about concurrency, why even bother with a RwLock? If you do care to the extent a database does, then you'd reach for a spin lock rather than jumping straight to a RwLock first time round, and you'd still have problems, as any reader blocks writes over a Mutex, which is what we have on the file. So now we can have the fun situation of the users map being locked while your file is not.
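The inconsistency described above comes from guarding two pieces of state that must agree with two independent locks. A minimal sketch of the hazard, using only the standard library (the `insert` method is hypothetical, and a `Vec<String>` stands in for the append-only file so the example runs without I/O):

```rust
use std::collections::HashMap;
use std::sync::{Mutex, RwLock};

// Two independent locks guard state that must stay in sync.
struct Store {
    users: RwLock<HashMap<String, String>>,
    log: Mutex<Vec<String>>, // stand-in for Mutex<File>
}

impl Store {
    fn insert(&self, id: &str, name: &str) {
        // Lock 1: the new user becomes visible to readers here...
        self.users.write().unwrap().insert(id.into(), name.into());
        // ...and lock 1 is already released. A crash or panic before the
        // next line leaves the map and the "file" permanently out of sync,
        // and a concurrent reader can observe exactly that torn state.
        // Lock 2: only now is the record made "durable".
        self.log.lock().unwrap().push(format!("{id}:{name}"));
    }
}

fn main() {
    let store = Store {
        users: RwLock::new(HashMap::new()),
        log: Mutex::new(Vec::new()),
    };
    store.insert("1", "alice");
    // The two structures agree only because nothing failed in between.
    assert_eq!(store.users.read().unwrap().len(), 1);
    assert_eq!(store.log.lock().unwrap().len(), 1);
}
```

Putting both the map and the file behind a single lock removes the window, at the cost of serializing all access, which is the trade-off a real database engine spends most of its code negotiating.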
Active-Struggle-1937@reddit
In my opinion, a database is a must if you want to scale and get real-time responses; it will drastically improve user experience.
klekpl@reddit
That's so much code to implement something that any SQL database gives you in a couple of lines. Just go to Supabase and create a single table. Or, if you are more ambitious, install the PostgreSQL + PostgREST combo yourself.
In reality the question is the complete opposite: do you even need an application?
dangerbird2@reddit
Yeah, it’s not like it takes a rocket scientist to set up Postgres or MariaDB
spotter@reddit
A rite of passage.
AdQuirky3186@reddit
I install Excel on all my web servers so I can have a reliable and usable database whenever I need it.
Huge_Leader_6605@reddit
That would be better than whatever is described in this article lol
Separate_Expert9096@reddit
So… What they advise is to write your own NoSQL document database from scratch instead of using established ones?
Steveadoo@reddit
I highly doubt most applications that are choosing between a database and reading/writing to json files “rarely look up records by more than one column, use joins, or write to multiple tables atomically”. Static sites I guess?
Just use SQLite. This is dumb.
PhatClowns@reddit
Even if that WERE the case that your application has extremely simple read/write throughput… You definitely want it there for when — not IF — it becomes more complicated.
“I don’t need a database” is genuinely a rookie mistake.
Huge_Leader_6605@reddit
The weird thing is, the article does not look like it was written by a "rookie". Is he trying to mislead people on purpose lol
mikenikles@reddit
It's a marketing play, he and his buddy want to promote their DB client. I'd do the same for mine if I wasn't busy building features for customers :).
reivblaze@reddit
This is just the mongodb thing all over again.
MONGODB IS WEB SCALE
NewPhoneNewSubs@reddit
In a second year course, I still didn't know much about databases. So I built dicts of dicts and so on and serialized them. Then tied that to load and save buttons. So a mistake a genuine rookie made, for sure.
boiledbarnacle@reddit
I haven't tested it, but SQLite claims it's about 35% faster than the filesystem for reading many small blobs, probably due to avoiding per-file open/close overhead and to indexes for fast seeks.
thatikey@reddit
Agree completely. SQLite's speed is competitive with directly reading files (as the article itself reflects), and an incredible amount of effort has gone into making it durable, reliable, fault-tolerant, and easy to use. Just use SQLite
Ha_Deal_5079@reddit
you always end up needing atomic writes or queries eventually. sqlite is right there and it's just a file lol
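For the single-file case, the usual way to get an atomic write without a database is write-temp-then-rename. A minimal sketch (file names here are made up for the demo):

```rust
use std::fs;
use std::io::Write;

// Write the new contents to a temporary file, flush it to disk, then
// rename over the target. On POSIX, rename within one filesystem is
// atomic, so readers see either the old file or the new one, never a
// half-written mix.
fn atomic_write(path: &str, data: &[u8]) -> std::io::Result<()> {
    let tmp = format!("{path}.tmp");
    let mut f = fs::File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?; // make the data durable before it becomes visible
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    atomic_write("users.json", br#"{"id":"1","name":"alice"}"#)?;
    println!("{}", fs::read_to_string("users.json")?);
    fs::remove_file("users.json")?; // clean up the demo file
    Ok(())
}
```

This protects a whole-file replace; it does nothing for multi-record queries or partial updates, which is where SQLite starts paying for itself.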
jeenajeena@reddit
When I design an application, I very rarely start with a DB. I often go with a file, recently using Option 2 from the article (loading everything in memory).
To my great surprise, the number of times I have actually needed to migrate to a real DB is really, really small.
jeenajeena@reddit
In the benchmark table I would use the same unit of measure for all rows, to highlight the three-orders-of-magnitude difference versus in-memory.
sweetno@reddit
Filesystem is also a database of sorts!
imihnevich@reddit
I would still go with the database, I guess, but I think it's a very interesting study
mss-cyclist@reddit
Throw in some concurrency and have fun /s
apparently_DMA@reddit
there's way more to DBs than just storing data somewhere. this article is just dumb, clickbaity marketing.
ApyPulse@reddit
everything is a database though. csv file is a database, in-memory linked list is a database. you just can't avoid it. when project growths and doesn't fit a single machine anymore, that's time to have a dedicated database server, the distributed one.
Nicksaurus@reddit
This is still a database
boysitisover@reddit
If you need state then yes it ain't that deep buddy
recuriverighthook@reddit
I mean I just write all my data in linked lists to text files. /s (Fowler reference)