how would you approach reading Designing Data-Intensive Applications as a software engineer?
Posted by DifficultSecretary22@reddit | ExperiencedDevs | 29 comments
i recently picked up Designing Data-Intensive Applications by Martin Kleppmann. i’ve heard it's one of those must-read books for backend engineers, but honestly, it's pretty dense and a bit overwhelming at first glance.
i'm a software engineer and i want to actually understand the ideas behind it, not just skim it for buzzwords. but i also don’t want to burn out trying to read it like a novel front to back.
so here’s my question to fellow engineers who’ve read or are reading it: how would you approach this book to actually retain and apply what it teaches?
do you read it cover to cover or jump around based on interest or job relevance?
do you take notes, build mental models, try to apply stuff immediately?
are there chapters you found more useful than others for real-world work?
any tips or battle-tested approaches are welcome. i’d rather read it slowly and well than fast and forget everything.
DeterminedQuokka@reddit
I haven't read this one, but I've read a couple other very dense books. What I usually do is that I actually start building something in chapter 1 and after every chapter I modify that thing with what I learned from the new chapter.
So, like, I'm reading a really dense book on neural networks right now, so I built a neural network to actually attempt the stuff that was happening in the book.
I also agree that if it's good, rereading helps. I've read SICP a couple of times: read it, wait a year, do some of it, then come back.
Humxnsco_at_220416@reddit
I think this is a good approach in general but I struggle to see how that would work with this book. The different approaches/tradeoffs are so fundamental and big that I often couldn't relate because I haven't been in projects of that size. Like you don't experiment with map/reduce on terabytes of data. And if you just boot up a minimal example I fear you will miss the point.
dlm2137@reddit
I read it in a book club at work with my engineering team, we read and discussed a chapter a week. Massively helped my understanding and comprehension.
Humxnsco_at_220416@reddit
Same. And while I can't recommend book clubs enough, I think this is a book that is so fundamental that you just need to power through, go back to work, and keep revisiting when you brush up against the challenges discussed. I recommended it to a colleague on the same assignment who wasn't in the book club and he said he would read it over the summer. I'm looking forward to hearing what he thought about it and being able to discuss it with a teammate.
servermeta_net@reddit
It's like the Bible: you gotta read it multiple times until you can quote it from memory
dash_bro@reddit
I've had some degree of success doing this:
You need to "understand" how to do it yourself, and see the pitfalls or advantages compared to the book; only then does it start making sense. If you still have questions or ideas, Gemini Live is actually a fairly decent resource for talking through your conceptual questions and understanding them better.
Don't read the entire thing. Focus on small things you actually need or are curious about, try to do them yourself, and refer to the "textbook" way of doing it. Also note that most designs are tradeoffs and there's no silver bullet, so while you should read the book, don't take it as gospel!
ravenclau13@reddit
My 2c: this book is pretty much tailored as an advanced intro to making your own distributed storage/db. Pretty great, as it covers a wide range of topics, but it's not a book for devs building client-facing apps, or for data engineering in general.
With that in mind, if it's the book for you: when it comes to high-level tech books, I usually start a summary on each topic (say, reconciliation algos) and save that in GitBook. It's great for keeping long-term track of what is useful for me. It might be useful for your colleagues as well, or when you're applying for new roles as a way to refresh your cache.
noonemustknowmysecre@reddit
Probably page 1 if I really wanted to read a book about it. Skip the bullshit parts that don't interest me.
Otherwise, I'd probably just go with a database. If you've got needs beyond that, sure, read up. White papers, SO, Wikipedia. Do whatever the industry standard is doing. I'm no PhD post-doc and I'm not working at a startup trying to get a step ahead of the pack.
I don't really read these sorts of books unless I'm paid to.
travishummel@reddit
I just read this cover to cover as I’m soon to come back from a career break so I had a lot of time. I wish I had read it sooner.
I’d suggest doing it one chapter at a time. Like over 8 or so weeks, maybe 2 chapters a week if you’re making good progress. I would read the chapter, then bookmark the summary. I’d ask ChatGPT to summarize the chapter and then I’d ask questions. Then I’d write a small paragraph in a notebook as my own summary. After a few chapters, I would read the previous 2 or 3 summaries + my notes to help retain the ideas.
Setting a goal of 1 chapter a week is pretty solid. I think it will change the way I come up with solutions. Idk about applying it immediately because it’s not like I have a message queue set up right now, but if I was working at a big tech company I’d probably jump around the codebase to look for the message queues, RPC calls, caches, replication, and ETL pipelines. I got through 10yoe relying only on these things already being set up, so I don’t think it was absolutely necessary to know the in-depth setup.
ninseicowboy@reddit
Start at the beginning and end at the end
shifty_lifty_doodah@reddit
IMO the goals of reading a book like this are:
1) internalize the data structures and approaches so you have some intuition for them
2) Become aware of how people frame and think about systems problems.
So I read the index, then study the structures in an interesting/novel looking section, and try to internalize that structure into a “mental shortcut”.
Everything is arrays, maps, graphs, and trees, but you’re learning slightly new ways to approach and model those structures in a distributed setting.
rahul91105@reddit
Why don’t you try reading Understanding Distributed Systems by Roberto Vitillo first? It's easier to understand, and then you can move on to DDIA.
There is also a YouTube channel: Jordan has no life (if you would prefer a video approach)
kevinossia@reddit
The way I approach books is I read something useful out of them and try to apply it to a task I’m working on at work.
If I can’t do that then the book isn’t useful.
DDIA is great but as I’m not a web guy it was mostly a novelty.
In general take the practical approach. You’re not a student and this isn’t school. Find the useful bits, and apply them somehow, somewhere, or no learning will take place.
pxpxy@reddit
Just read it like a novel, it's fine. Skip stuff you don't care about. Come back for details when you actually need them. Don't let it sit on the shelf forever because "you can't do it justice right now"
sawpsawp@reddit
go chapter by chapter and ask ChatGPT to write you Anki cards as you go, review nightly
my_coding_account@reddit
Instead of reading it you could watch the YouTube channel https://www.youtube.com/@jordanhasnolife5163
Independent_Grab_242@reddit
I remember back in 2022, I said I'd read 5 pages a day, max 20 mins.
1 page took half an hour!
KarmaIssues@reddit
I jumped around it, starting with what I was most interested in and then the next most interesting bit and on and on.
In the end, I had like 50 pages left to read, so I just decided to finish it.
I retained a lot, but I didn't really make a point to try and retain it since I can just reopen the book.
thehumblestbean@reddit
It's a silly title, but How to Read a Book is pretty much the gold standard for learning how to read for the purposes of education or learning.
https://www.amazon.com/How-Read-Book-Classic-Intelligent/dp/0671212095
tdifen@reddit
Study it, don't read it.
When I went through it I took notes on each page and worked hard to understand the concepts. I just acted like I was in university again.
monvictor3@reddit
When I read it the first time, I read it from cover to cover. However, it took me more than a month to finish it. Take it slow. Let the ideas sink in slowly. The first 3 chapters are not as dense as the rest of the book. Your progress will likely get slower from chapter 4. That's completely normal. I took notes when I read it. I learn better that way. I don't reference them anymore, as YouTube/LLMs can provide better notes.
Which chapters you find useful depends on your day-to-day work. For me, chapter 6 and chapter 8 were most relevant. However, I have applied principles from almost all the chapters over the last few years. Surprisingly, a lot of the time from chapter 1.
IMO, the best way to read it is to take it slow. Try to apply what you have read to some theoretical system and think about the pros and cons.
mx_code@reddit
IMO DDIA makes most sense when you have experience with the topics mentioned in the book.
I wouldn't read it cover to cover; rather, I would go understand the high-level concepts and then skip the deeper low-level concepts (there's a lot in the transaction chapter that is dense and not something that you will immediately apply).
So:
I would get the high-level concepts and understand how you would apply them to a project you've done in the past.
And based on this, identify where your shortcomings are in terms of the low-level implementation and then dive into that.
compute_fail_24@reddit
> IMO DDIA makes most sense when you have experience with the topics mentioned in the book.
I agree with this but my suggestion is always (1) read it once before you have the experience (2) go into many battles (3) read again (4) redirect to #2
mx_code@reddit
Yes, that's also applicable, but I've seen a lot of people go through their career without encountering those kinds of challenges.
So to rephrase: do take a look at the book, but make sure to place yourself in a work environment that makes you tackle these kinds of challenges (otherwise, reading the book won't be fruitful)
amaroq137@reddit
There's this method which makes a lot of sense although I've never tried it:
https://www.youtube.com/watch?v=nqYmmZKY4sA
julz_yo@reddit
Thank you, kind stranger: that is actually a genius way to read a book. I predict I'll wish I'd come across this years ago!
DecisiveVictory@reddit
I just read it chapter by chapter. I wanted to invest some time to build some relevant apps to better understand the concepts after each chapter, but that's postponed to an indeterminate future when I have more time & energy.
rebuilt@reddit
It might be easier if you first get a summary of each chapter before trying to read it yourself. This is a good task for AI.
confuseddork24@reddit
It is very dense but very informative. I personally got through it by just being thorough and taking my time. I would do additional research on some of the topics being covered chapter by chapter, which I think helped me digest everything as I went. I also read Fundamentals of Data Engineering prior to reading this one, which, it turns out, gave a decent intro to some of the topics in Designing Data-Intensive Applications.
I will say it's been about a year since I read it and I'm thinking about going through it again after a couple of other books on my list.