Why You Should Never Use MongoDB

katafrakt@reddit

I used to be a huge fan of this article, but let's be honest - it was written 11 years ago. Some things have changed. I'm not saying that you should use MongoDB for everything, like the hype train commanded 11 years ago - fortunately that has passed and people know better now. But we, as tech in general, have found some very legit cases for document databases, for example read models in CQRS setups.

mikaball@reddit

MongoDB as an OLTP DB is a no-go for me. For instance, when you have a one-to-many relation like "Master 1-\* Item", adding an item is an insert on the Item table. If you have the items in an embedded document, you have to change the Master doc. Embedded relations have a much higher probability of optimistic concurrency exceptions, and with that a lot of fucking problems.

aregulardude@reddit

Not too hard to just retry after an etag violation imo, I’ll take that versus dealing with sql indexes and schema management.

mikaball@reddit

In order to retry you have to fetch the new version and re-apply the changes you performed, which also means you have to track the changes. A schema is the best thing to keep your data model in check, instead of having null pointer exceptions in your code. Indexes you still have in MongoDB. You have to deal with these problems to understand, and I don't think you have.

aregulardude@reddit

When updating a row/document, you’re always going to have to deal with concurrency if multiple users are touching the same data. The example you described was adding/removing items from a 1-many sub collection nested within the document. In this case, you should know what the operation is usually, unless you’re doing some crazy stuff like posting a whole collection instead of the operation to mutate it to the backend.
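To make the retry argument concrete, here is a minimal sketch of "retry by re-applying the operation". Everything in it is hypothetical: `Store`, `VersionConflict`, `add_item_with_retry` and the `before_write` hook are toy names for an in-memory stand-in, not MongoDB or any driver API.

```python
from dataclasses import dataclass, field


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the document."""


@dataclass
class Store:
    """Toy in-memory document store with a version check on every write."""
    doc: dict = field(default_factory=lambda: {"version": 1, "items": []})

    def read(self) -> dict:
        # Return a copy so callers cannot mutate the stored document directly.
        return {"version": self.doc["version"], "items": list(self.doc["items"])}

    def write(self, new_doc: dict, expected_version: int) -> None:
        # Reject the write if someone else saved a newer version in the meantime.
        if self.doc["version"] != expected_version:
            raise VersionConflict()
        self.doc = {"version": expected_version + 1, "items": list(new_doc["items"])}


def add_item_with_retry(store, item, max_attempts=3, before_write=None):
    """Re-read the document and re-apply 'add item' until the write succeeds.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        current = store.read()
        if before_write is not None:
            before_write(attempt)  # test hook used to simulate a competing writer
        current["items"].append(item)
        try:
            store.write(current, expected_version=current["version"])
            return attempt
        except VersionConflict:
            continue  # stale read: loop, fetch the new version, apply again
    raise VersionConflict()


store = Store()


def competing_writer(attempt):
    # Hypothetical scenario: another user saves a change during our first attempt only.
    if attempt == 1:
        other = store.read()
        other["items"].append("from-other-user")
        store.write(other, expected_version=other["version"])


attempts = add_item_with_retry(store, "mine", before_write=competing_writer)
```

Because the caller holds the operation ("add this item") rather than a whole edited collection, the retry is just a re-read plus the same append. As far as I know, MongoDB's `$push` update operator lets you express the same append on the server side.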

mikaball@reddit

You didn't get it. When adding items as embedded items in the document, you need to perform an UPDATE. In SQL this is a different table and you only need to INSERT. You have no concurrency conflicts on INSERTs. The probability of optimistic concurrency errors on documents is much higher. You can say this can be avoided by using different collections. But the common approach in MongoDB is to have deeply nested JSON structures, otherwise you would just use SQL tables. Many people are not aware of this and they get problems later on in production. It may be uncommon for different users to change the same item. The problem is that they are not hitting the same item, just changing different parts of the same document.
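A rough sketch of the difference, using plain Python dicts and lists as stand-ins for a document and a table. `save_embedded`, `insert_item` and `StaleVersion` are made-up names for illustration, and the version check is an assumed whole-document optimistic lock, not something a database does for you by default.

```python
import copy


class StaleVersion(Exception):
    """The saved document is newer than the copy the caller edited."""


def save_embedded(stored, edited):
    """Whole-document save guarded by an optimistic version check."""
    if edited["version"] != stored["version"]:
        raise StaleVersion()
    return {**edited, "version": stored["version"] + 1}


# Embedded model: both users load the same Master document (version 1).
stored = {"id": 1, "version": 1, "items": []}
alice = copy.deepcopy(stored)
bob = copy.deepcopy(stored)
alice["items"].append("A")  # Alice adds item A
bob["items"].append("B")    # Bob adds a *different* item B

stored = save_embedded(stored, alice)  # succeeds, document is now version 2
try:
    stored = save_embedded(stored, bob)
    bob_conflicted = False
except StaleVersion:
    bob_conflicted = True  # Bob touched a different item but still loses


# Separate-table model: each new item is its own row, so the two inserts
# never compete for the same record.
def insert_item(rows, master_id, name):
    rows.append({"master_id": master_id, "name": name})


item_rows = []
insert_item(item_rows, 1, "A")
insert_item(item_rows, 1, "B")
```

Bob's save is rejected even though he never touched Alice's item, while the two inserts into the item list both go through.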

aregulardude@reddit

You need to read what I wrote more closely. It’s you who don’t get it clearly.

mikaball@reddit

"You need to read what I wrote more closely." - Yes, read again. "But the common approach in MongoDB is to have deeply nested JSON structures, otherwise you would just use SQL tables." This conversation is useless. You ignore the practical difficulties. When you inherit a MongoDB project with a lot of deeply nested structures, you will curse MongoDB.

aregulardude@reddit

Ok I agree, can’t get too deep with the nesting or it gets stupid. I was more referring to your original example.

codingstuffonly@reddit

> 11 years ago

When I saw 2013 in the title I thought, huh, 5 years ago, I wonder if it's still relevant. Where did the time go?

syklemil@reddit

Pandemic lockdowns fucked with our perception of time a lot I think. I mean, thinking 2013 is five years ago is a bit more than I'd ascribe to that phenomenon, but it's still kind of hard to intuit time calculations across the lockdown; I find myself needing to use math more.

rsclient@reddit

And thank goodness "some very legit cases" is something that's explicitly mentioned in the blog:

> A document may have internal structure — headings and subheadings and paragraphs and footers — but it doesn’t link to other documents. It’s a self-contained piece of semi-structured data. *If your data looks like that, you’ve got documents. Congratulations! It’s a good use case for Mongo*

The rest of her site is chock full of awesome writing, too!

bittlelum@reddit

So when she says "never" she means..."sometimes"?

chicknfly@reddit

Let’s prefer *not always*

bittlelum@reddit

So it's a shitty clickbait title.

chicknfly@reddit

Pretty much

yrubooingmeimryte@reddit

I’m fairly certain people already weren’t using it “always”.

Atulin@reddit

It's a good case for Mongo... or for Postgres and a JSONB column. That way, if some relation *does* appear in the future (say, the document's author, tags, category, etc.), you don't need to migrate.

sisyphus@reddit

Nowadays I'd be wondering why I don't just throw a JSONB column into the Postgres I already have instead of adding a dedicated document database, though.

katafrakt@reddit

Every time you update something in JSONB, you effectively write a new value and save a new version of the row. Do this often and you will end up with a huge number of dead rows, killing your DB performance until you do a vacuum. Been there, done that. Document databases tend to be much more efficient with frequent small updates of just a part of the document.
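A toy model of that effect, for anyone who wants to see the arithmetic. This is not Postgres code and says nothing about its real storage format: `ToyMvccTable` is a made-up, single-row, in-memory class that only assumes "an update writes a full new version and leaves the old one dead until vacuum".

```python
import json


class ToyMvccTable:
    """Single-row toy table: every update appends a full new row version."""

    def __init__(self):
        # Each entry is one stored version of the row.
        self.versions = []

    def insert(self, doc):
        self.versions.append({"payload": json.dumps(doc), "live": True})

    def update_key(self, key, value):
        # Find the live version, mark it dead, and write a complete new copy,
        # even though only one small key changed.
        live = next(v for v in self.versions if v["live"])
        doc = json.loads(live["payload"])
        live["live"] = False
        doc[key] = value
        self.versions.append({"payload": json.dumps(doc), "live": True})

    def dead_count(self):
        return sum(1 for v in self.versions if not v["live"])

    def dead_bytes(self):
        return sum(len(v["payload"]) for v in self.versions if not v["live"])

    def vacuum(self):
        # Reclaim the space held by dead versions.
        self.versions = [v for v in self.versions if v["live"]]


table = ToyMvccTable()
table.insert({"views": 0, "body": "x" * 2000})  # roughly 2 kB document
for n in range(1, 101):
    table.update_key("views", n)  # 100 tiny counter updates

dead_before = table.dead_count()
dead_bytes_before = table.dead_bytes()
table.vacuum()
```

In this model, a hundred increments of a small counter leave a hundred dead copies of the whole document behind until the vacuum runs.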

zellyman@reddit

Holy shit this is awful. I'm not saying I don't believe you but can you provide a link so I can read more about this?

LuckyHedgehog@reddit

This goes into it a bit: https://pganalyze.com/blog/5mins-postgres-jsonb-toast

It talks about how, when the JSON is larger than the inline size limit (about two kilobytes, per the quote below), Postgres uses an "oversized attribute storage technique" called TOAST, so I'm not sure whether this applies to values in normal page storage or not.

> The one slide that I wanted to highlight is where they show the way that TOAST storage works: if you update something, if you update a value in a row that exceeds that two kilobyte limit and is stored in TOAST, TOAST was not designed for updating values and particularly it wasn't designed for updating a small part of a large value. It was designed for atomic data type, it knows nothing about the internal structure of these types. When you update a value that's in TOAST, it always duplicates the whole value.

jmaN-@reddit

TLDR: “I chose a non-relational database for my relational data”. MongoDB is fine for some cases, not all. Use the correct tool for the job.

hou32hou@reddit

In what kind of cases would we need non-relational data?

CyAScott@reddit

It’s a paradigm shift: you put your relational data in the document, so by reading a document you also get all the related data included in the returned document. In the classic example of relational data, if you have a set of books and there is a relationship between books and authors, then you would create a document that represents the book and put a copy of the author’s info into the book document. So when you read the book document you get a copy of the author too. You would also have a set of authors in another collection in the DB.

There are limits. If the related data is highly volatile (i.e. you do more writes than reads), then this is not a good paradigm, because writes are expensive: updating an author means updating the copy in all those book documents as well as the author document. Another limit is a one-to-many relationship where the many could be in the millions. These documents would be too large to reasonably manage. In addition, a many-to-many relationship does not work well.

As someone who uses both relational and document DBs, I find that the edge cases where document DBs are not a good fit are not frequent, so there are lots of services that would do well with document DBs. However, when those edge cases are a realistic scenario, stick with a relational DB. If you design a system around services, then mixing DBs is typical and you don’t have to think about future-proofing your DB choice when you start your project.
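Here is a small sketch of the book/author layout and the write fan-out it causes, with plain Python dicts standing in for the two collections. The field names and the `make_book` and `update_author` helpers are invented for illustration only.

```python
# "authors" collection: the canonical author documents, keyed by id.
authors = {"a1": {"name": "Ann Author", "country": "CA"}}


def make_book(title, author_id):
    # Each book embeds its own copy of the author's info.
    return {"title": title, "author": {"id": author_id, **authors[author_id]}}


# "books" collection.
books = [make_book("Book One", "a1"), make_book("Book Two", "a1")]


def update_author(author_id, changes):
    """Apply changes to the author and to every embedded copy.

    Returns the number of documents that had to be written.
    """
    writes = 0
    authors[author_id].update(changes)
    writes += 1
    for book in books:
        if book["author"]["id"] == author_id:
            book["author"].update(changes)
            writes += 1
    return writes


# Reading a book needs no second lookup: the author is already in the document.
first_author_name = books[0]["author"]["name"]

# Changing the author costs one write per book plus one for the author itself.
writes = update_author("a1", {"name": "Ann A. Author"})
```

Reads stay a single document fetch, while one author change here turns into three writes, and that number grows with the author's book count.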

hou32hou@reddit

How do you do analytics without first normalizing them?

CyAScott@reddit

Mongo does have aggregation pipelines that allow you to do joins, projections, and grouping. Except for joins, those operations perform very well. In the years I’ve used Mongo I have never needed a join query for a service, likely because if I needed to make a report query about publishers or authors, I would run an aggregate against books, which contains all that data, so no join is needed. However, I have used a join to do some analytics on data after an incident to gather information for a post-mortem report.
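For a rough idea of what such a report looks like, here is a sketch of a "books per author" pipeline written as plain Python data, next to an in-memory equivalent so it runs without a database. The sample documents and field names are made up; `$group`, `$sum` and `$sort` are real pipeline operators, and with a driver such as PyMongo the list would typically be passed to `collection.aggregate(pipeline)`.

```python
from collections import Counter

# Hypothetical book documents with the author embedded, as described above.
books = [
    {"title": "T1", "publisher": "P1", "author": {"name": "Ann"}},
    {"title": "T2", "publisher": "P1", "author": {"name": "Bob"}},
    {"title": "T3", "publisher": "P2", "author": {"name": "Ann"}},
]

# Aggregation pipeline: group by the embedded author name and count the books,
# then sort by count (descending) and name. No join stage is needed because
# the author data already lives inside each book document.
pipeline = [
    {"$group": {"_id": "$author.name", "book_count": {"$sum": 1}}},
    {"$sort": {"book_count": -1, "_id": 1}},
]


def books_per_author(docs):
    """Pure-Python equivalent of the pipeline above, for illustration only."""
    counts = Counter(doc["author"]["name"] for doc in docs)
    rows = [{"_id": name, "book_count": n} for name, n in counts.items()]
    return sorted(rows, key=lambda row: (-row["book_count"], row["_id"]))


report = books_per_author(books)
```

The in-memory function only mirrors what the pipeline asks the server to do; it says nothing about how fast either one is on real data.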

hou32hou@reddit

It’s slow as hell. I used to use MongoDB for analytics; a query took 1 minute and the job got killed, and the database only consisted of a few thousand rows. Did the same in Postgres by normalizing the data and got the response in less than a second.

Gloomy_Anywhere_5490@reddit

The correct tool for the job comment. I’ve worked on dozens of projects across multiple companies that have used mongo/document db and I don’t think you can find a case where mongo is the right solution. It might look fine for a trivial blog article on mongo, for mongo’s sake, but invariably it’s a terrible mess of eventually necessary pseudo joins, and almost impenetrable business logic where answering basic questions like “Is this thing unique?” is a deep dive into the code base. You’re far better off with a relational model at the start - at least if you care about your data. Honestly mongo is a fucking trap.

aregulardude@reddit

Short of many-to-many relationships, I don’t think data is by nature relational or document. You can model almost any problem with documents. Until you need many-to-many, then yeah, go relational.

bighi@reddit

As a Canadian poet once said: never say never.

ThatInternetGuy@reddit

Tabular data with a uniform set of fields is best with a relational DB, whereas hierarchical JSON-like data with a fairly unpredictable set of fields is best with something like a NoSQL DB. MongoDB is notorious for data corruption and fragility of its storage, so if you're building a mission-critical system, another NoSQL DB should be considered.

WWJewMediaConspiracy@reddit

Absolutes tend to be silly. I'd recommend [Stripe's overview of their MongoDB as a service](https://stripe.com/blog/how-stripes-document-databases-supported-99.999-uptime-with-zero-downtime-data-migrations) as an example of why "never" in the title here is silly - even ignoring a decade+ of development. On the other hand - if someone said loosely structured data can't be stored in a relational DB, Reddit's [ThingDB](https://github.com/reddit-archive/reddit/wiki/Architecture-Overview#thingdb) is a reasonable counterpoint. Though it wouldn't be surprising if they've since had to migrate to a different DB.

pythosynthesis@reddit

We are building a new DB at my place, and we chose MongoDB after thinking a fair bit about the best choice. The reason? In short, the documents we need to store are highly likely to change over time, and migrating old data to new tables, with all the associated headaches, just wasn't anyone's idea of fun, or efficiency either. So maybe we shouldn't have used MongoDB, but we did, and we're quite positive about the choice.

Void_mgn@reddit

I find SQL DBs to be very mutable in terms of structure, and the nice thing is you can do zero-downtime migrations with transaction protection.

Unusual_Flounder2073@reddit

I wrote an admittedly poor article criticizing JRuby about 15 years ago. It was my highest-viewed article, like, it got 100 views. Big, lol. But it aged like a dirty diaper as the technology evolved and use cases were refined.
