deadbeef1a4@reddit
Things that will never happen, never do… until they happen
antico5@reddit
Design skill issue ;p
Jokes aside, it's these kinds of stories that take a junior dev into senior-land. You only develop these thinking patterns from experience.
sagarpatel1244@reddit
Eternal tension, and the honest answer is "which mistake is cheaper to fix later." Both over- and under-engineering are real failure modes, the skill is knowing which you're risking.
My rule after enough years: optimize for change you can see coming, not change you can imagine. An enum vs a string isn't predicting the future, it's making invalid states unrepresentable today, which pays off immediately in fewer bugs. That's not speculative future-proofing, it's tightening the current model.
Where I push back on "design for everything" is speculative abstraction: the plugin system for the one plugin that exists, config for the option nobody asked for. That complexity is real today, the benefit hypothetical.
Clean line: make the current model correct and hard to misuse (cheap, do it), don't build machinery for requirements that don't exist yet (expensive, resist it). "It might happen" is not a spec. "It's invalid and I want the compiler to say so" is.
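To make the enum-vs-string point concrete, here is a minimal TypeScript sketch. `BookingStatus`, `LooseBooking`, `StrictBooking` and `describeStatus` are made-up names for illustration, not anything from the post.

```typescript
// Hypothetical example: all names here are illustrative.

// With a plain string, every typo is a valid value as far as the compiler knows.
type LooseBooking = { status: string };

// With a closed union, only these three values type-check.
type BookingStatus = "confirmed" | "cancelled" | "no_show";
type StrictBooking = { status: BookingStatus };

function describeStatus(booking: StrictBooking): string {
  switch (booking.status) {
    case "confirmed":
      return "Guest is expected";
    case "cancelled":
      return "Booking was cancelled";
    case "no_show":
      return "Guest never arrived";
    default: {
      // If a new status is added to the union but not handled above,
      // this assignment stops compiling.
      const unreachable: never = booking.status;
      return unreachable;
    }
  }
}

const loose: LooseBooking = { status: "cancled" }; // compiles, the typo ships
// const broken: StrictBooking = { status: "cancled" }; // compile error
const strict: StrictBooking = { status: "cancelled" };
const description = describeStatus(strict);
```

Nothing here predicts a future requirement; the union only describes the states that exist today.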
programming-ModTeam@reddit
No content written mostly by an LLM. If you don't want to write it, we don't want to read it.
thisisjustascreename@reddit
I tell all junior developers that whatever the product team says, you need to assume the users are bloodlusted monkeys on cocaine who will do literally everything you give them access to do, in every combination.
If there's some problematic scenario, instead of first thinking "what should happen when the user does xyz?" try to think "If the business says this won't happen, how can I make it impossible?"
EntroperZero@reddit
Exactly. This is known as enforcing your assumptions.
radekmie@reddit (OP)
Yeah, making the problematic states impossible is a great practice. It's not always an option when integrating with someone else's software; especially if it's still being developed and they change the data model midway (been there).
lelanthran@reddit
Gotta be honest - some scenarios are so rare as to not be worth the effort of preventing. Best to log it and move on...
A recent-ish example I had is updating a row in a backend DB. An LLM review (SOTA from mid-2025, maybe) suggested that the update is a race condition: if the row is modified by two users, only one of those modifications would persist.
Well, no shit, Sherlock, but I don't see the entire world redesigning their systems to ensure that once a user loads a page to edit a record, it locks the row in the DB until the user hits submit.
fnordstar@reddit
The ticket system we use does check that. I believe it just uses version numbers on the record - if a user submits an edited version, it includes the version it was based on and rejects the submission if the version has increased in the meantime. That doesn't seem too complicated and I suppose you could implement the mechanism once and reuse it. Probably libraries exist for this.
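Roughly what I imagine that version check looks like, as an in-memory TypeScript sketch. I don't know how the actual ticket system does it; `Ticket`, `TicketStore` and `submitEdit` are hypothetical names.

```typescript
// Minimal in-memory sketch of a version check on submit.
// All names (Ticket, TicketStore, submitEdit) are hypothetical.

type Ticket = { id: string; title: string; version: number };

type SubmitResult =
  | { ok: true; ticket: Ticket }
  | { ok: false; reason: "not_found" | "stale"; currentVersion?: number };

class TicketStore {
  private tickets = new Map<string, Ticket>();

  create(id: string, title: string): Ticket {
    const ticket: Ticket = { id, title, version: 1 };
    this.tickets.set(id, ticket);
    return { ...ticket };
  }

  // The caller sends back the version its edit form was loaded with.
  submitEdit(id: string, baseVersion: number, title: string): SubmitResult {
    const current = this.tickets.get(id);
    if (!current) return { ok: false, reason: "not_found" };
    if (current.version !== baseVersion) {
      // Someone else saved in the meantime: reject instead of overwriting.
      return { ok: false, reason: "stale", currentVersion: current.version };
    }
    // In a real database the comparison and the write must be one atomic step.
    const updated: Ticket = { id, title, version: current.version + 1 };
    this.tickets.set(id, updated);
    return { ok: true, ticket: { ...updated } };
  }
}

const store = new TicketStore();
const loaded = store.create("T-1", "Printer on fire");
// Two users load version 1, then both submit.
const first = store.submitEdit("T-1", loaded.version, "Printer on fire (3rd floor)");
const second = store.submitEdit("T-1", loaded.version, "Printer smoking");
// first.ok is true; second.ok is false with reason "stale".
```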
lelanthran@reddit
Does it do it for every single table? Because the problem can occur for every single table.
nevon@reddit
We do this not for the sake of form-based updates, but for regular writes, because DynamoDB doesn't natively have something like read-for-update, so keeping a version field and doing a conditional update is the closest equivalent. The exact same mechanism can also be used for the scenario you're referring to.
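A sketch of what such a conditional update can look like, written as the request shape for a DynamoDB `UpdateItem` call. The table name, the attribute names (`id`, `displayName`, `version`) and the helper `buildVersionedUpdate` are assumptions for illustration; actually sending the request through an SDK is left out.

```typescript
// Sketch only: builds the request for a DynamoDB UpdateItem call with a
// version check. Table and attribute names are assumptions.

type AttributeValue = { S: string } | { N: string };

interface UpdateItemInput {
  TableName: string;
  Key: Record<string, AttributeValue>;
  UpdateExpression: string;
  ConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, AttributeValue>;
}

function buildVersionedUpdate(
  tableName: string,
  id: string,
  expectedVersion: number,
  displayName: string,
): UpdateItemInput {
  return {
    TableName: tableName,
    Key: { id: { S: id } },
    // Write the new value and bump the version in the same request.
    UpdateExpression: "SET #name = :name, #version = :next",
    // The write only applies if the stored version is still the one we read.
    ConditionExpression: "#version = :expected",
    ExpressionAttributeNames: { "#name": "displayName", "#version": "version" },
    ExpressionAttributeValues: {
      ":name": { S: displayName },
      // Numbers travel as strings in the low-level attribute format.
      ":expected": { N: String(expectedVersion) },
      ":next": { N: String(expectedVersion + 1) },
    },
  };
}

const request = buildVersionedUpdate("profiles", "user-42", 7, "Ada");
```

If the stored version no longer matches, DynamoDB rejects the write with a `ConditionalCheckFailedException`, and the caller decides whether to re-read and retry or to surface a conflict.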
lelanthran@reddit
It can, but the juice is not worth the squeeze. My point is that every update to a record in a DB should do this, but we don't, we do it only for a certain specific set of records.
Even something as simple as a user updating their profile - we don't perform a check-before-update when they change their display name, even though two instances of the form getting submitted would result in only the final instance taking effect.
fnordstar@reddit
I'm not a database guy but isn't this something you could implement once to abstract it away and re-use it everywhere where you edit existing objects interactively?
lelanthran@reddit
I've done that. For some systems in the far past, I also implemented a checkout/commit process, also abstracted away. Here are the problems I ran into with read-before-updates.
The first problem you run into is that this has to be done in the application layer, and I can't think of many databases in production that only a single application accesses. Your application may apply read-before-write and show the user an error if the underlying record changed, but the other applications talking to the database don't do that.
The second is that it's not an easy thing to abstract in a general sense - sometimes you only need to keep the record of a row in a single table, but that's rare. More often, you need to track any number of different tables, because an update (for example) during a client returns process will update multiple tables.
IOW, the application is updating an existing object, but that object is more likely than not to be composed of multiple tables, so not only does the application need to be schema-aware (bypassing the ORM, if any), your "read-before-update" abstraction also needs to handle any number of tables, at which point it stops being an abstraction anyway.
The third problem is performance: doing a read before every write absolutely kills performance - even worse if you're doing it in the application, that's double the network roundtrips and double the latency for every update.
The fourth problem is that it doesn't work if you try to do it from the application side (some other application may have updated the record between your read and your write, even if you do it inside a transaction). Even inside a transaction, you've got to lock the specific rows for a long time (at least one order of magnitude longer, maybe two) between the read and the write, which cascades latency and poor performance throughout the DB.
The fifth problem, if you decide "Well, no problem - I'll just do it in a single one-trip transaction inside the single query, maybe using a CTE or something", is that that approach is not abstractable - it's a specific CTE for a specific update to a specific set of tables.
Since that's the only sane way to do it, you aren't going to do it for every single table, just the ones that are most at risk.
f3xjc@reddit
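Yeah, on .NET, EF Core supports this via RowVersion. On SQL Server it's a rowversion column (formerly called timestamp) that the database itself bumps on every write; despite the old name, it's a counter, not a clock reading.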
michiganalt@reddit
It’s insane to model everything that could be related to a hotel stay as one type, and have that one type be a string. This has to be a crazily unmaintainable codebase.
radekmie@reddit (OP)
We only integrate with the hotel software, so there's no need for us to differentiate between these. But based on the APIs, it's clear that the old systems treat almost everything as "metadata", just like the old e-commerce platforms.
fnordstar@reddit
Why would you use strings instead of enums? And why not use date intervals for each category, so the ones that are valid for 365 days are not duplicated? You could also have an enum ValiditySpan with variants ParentBooking to inherit the span of the booking and CustomIntervals for a list of custom intervals. Anyways, having both categories and categoriesByDate is a bad code smell!
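Something like this, as a rough TypeScript sketch. Only `ValiditySpan`, `ParentBooking` and `CustomIntervals` are the names I proposed above; `Booking`, `DateInterval`, `isValidOn` and the inclusive-interval convention are assumptions made up for the example.

```typescript
// Hypothetical modelling sketch; everything except the ValiditySpan variant
// names is assumed for illustration.

type IsoDate = string; // "YYYY-MM-DD", which compares correctly as a plain string

type DateInterval = { from: IsoDate; to: IsoDate }; // assumed: both ends inclusive

type ValiditySpan =
  | { kind: "ParentBooking" }
  | { kind: "CustomIntervals"; intervals: DateInterval[] };

type Booking = { arrival: IsoDate; departure: IsoDate };

function isValidOn(span: ValiditySpan, booking: Booking, date: IsoDate): boolean {
  switch (span.kind) {
    case "ParentBooking":
      // Inherit the span of the booking itself.
      return booking.arrival <= date && date <= booking.departure;
    case "CustomIntervals":
      return span.intervals.some((i) => i.from <= date && date <= i.to);
  }
}

const booking: Booking = { arrival: "2025-03-01", departure: "2025-03-05" };
const wholeStay: ValiditySpan = { kind: "ParentBooking" };
const firstDayOnly: ValiditySpan = {
  kind: "CustomIntervals",
  intervals: [{ from: "2025-03-01", to: "2025-03-01" }],
};
```

A whole-stay category costs one small object regardless of stay length, while a single-date one carries a one-element interval list.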
radekmie@reddit (OP)
The categories are free text inputs in the hotel management software (PMS), so we can't know them all in advance, let alone put them into one enum. Some hotels have thousands of them.
We considered that too. It'd allow us to cover whole-stay packages with a constant overhead, but would add some for the single-date ones (and these are more common).
Having said that, it is the next natural step to compress the categoriesByDate further (if needed).
Yeah, we could also have "every other day" and similar. However, making it the most compact representation is not our goal -- the ease of use across multiple tools (including reports) while staying performant is.
edgmnt_net@reddit
The hotels need to decide if they want automation and/or just a free text input and manage them manually. My understanding is you're doing automation and that's dangerous with a poorly-conditioned dataset. And it requires code anyway. In any case, there's no free lunch here and maybe they need to think twice before adding yet another corner case to the other thousand.
By the way, I'm not saying this cannot be automated partially, it's more like the database schema should be strict enough for the exact use cases that are covered for automation. So you can probably do enums for those specific cases and use some other field to track manually-managed cases (such as a comments field), no reason to lump them all up together and overload meaning.
radekmie@reddit (OP)
If I were the one working on the hotel software, I'd probably do so. But we're only integrating with them (seven and counting), so we have to play with what we're given.
Kind of -- we let the hotels do automation in our software (restaurant management) based on their hotel bookings data (e.g., "create a breakfast reservation on each day that has the Breakfast Included category"). In that manner, the quality of automation depends exclusively on the quality of the categories they operate on.
Luckily, most hotel staff is able to understand it, and we see they develop good patterns, e.g., decide on a standard casing convention.
We also synchronize comments for different purposes, and there's no automation based on that. Though some users are wild and would love to do things like "use AI to detect whether they will come with a dog"... Yeah, we're not doing that.
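To illustrate the breakfast rule from above, a simplified TypeScript sketch. The data shape (`CategoriesByDate` keyed by ISO date) and the helper `datesWithCategory` are assumptions for this example, not our actual code.

```typescript
// Hypothetical sketch: the data shape and all names are assumptions.

type IsoDate = string; // "YYYY-MM-DD"
type CategoriesByDate = Record<IsoDate, string[]>;

// Returns the dates on which the given category is present, in date order.
// Matching is exact, so casing differences count as different categories.
function datesWithCategory(byDate: CategoriesByDate, category: string): IsoDate[] {
  return Object.keys(byDate)
    .filter((date) => (byDate[date] ?? []).includes(category))
    .sort();
}

const stay: CategoriesByDate = {
  "2025-03-01": ["Breakfast Included", "Late Checkout"],
  "2025-03-02": ["Breakfast Included"],
  "2025-03-03": ["breakfast included"], // different casing: no match
};

// One breakfast reservation would be created per returned date.
const breakfastDates = datesWithCategory(stay, "Breakfast Included");
// → ["2025-03-01", "2025-03-02"]
```

The third day is skipped purely because of casing, which is why a consistent convention on the hotel side matters so much.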
fnordstar@reddit
It's also not necessarily a matter of making it most compact but retaining information like "every weekday" etc. which might be valuable for a UI.
radekmie@reddit (OP)
The problem is we cannot retain something that isn't there. The APIs of the PMS software do expose various date-based formats (e.g., a list of dates with a list of packages, date ranges for each package, or even a list of (date, package) pairs), so there's no way to know why it looks like this.
fnordstar@reddit
Anyways, you should then never make categories and categoriesByDate accessible from outside and instead provide accessor methods that encapsulate that implementation detail.
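A minimal sketch of what such an accessor could look like in TypeScript. I don't know the real field shapes, so treat them as guesses: here `categories` is assumed to apply to the whole stay and `categoriesByDate` is assumed to be keyed by ISO date. The `getCategories(HotelBooking, Date): string[]` signature is the one mentioned elsewhere in this thread.

```typescript
// Sketch under assumptions: the field shapes below are guesses for illustration.

interface HotelBooking {
  categories: string[]; // assumed: apply to the whole stay
  categoriesByDate: Record<string, string[]>; // assumed: keyed by "YYYY-MM-DD"
}

function getCategories(booking: HotelBooking, date: Date): string[] {
  const key = date.toISOString().slice(0, 10); // UTC calendar date
  const forDate = booking.categoriesByDate[key] ?? [];
  // Merge both sources and drop duplicates, keeping first-seen order.
  return Array.from(new Set(booking.categories.concat(forDate)));
}

const example: HotelBooking = {
  categories: ["Parking"],
  categoriesByDate: { "2025-03-01": ["Breakfast Included", "Parking"] },
};

const onFirstDay = getCategories(example, new Date("2025-03-01"));
// → ["Parking", "Breakfast Included"]
const onSecondDay = getCategories(example, new Date("2025-03-02"));
// → ["Parking"]
```

Callers then never need to know that two fields exist, so the storage format can change without touching them.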
radekmie@reddit (OP)
You think about the code, and that's fine here -- we do have a getCategories(HotelBooking, Date): string[] method. But these two fields are exposed via the API, and you can use the includeCategories to achieve the exact thing you mentioned.
HighRelevancy@reddit
Did you finish reading the post?
nocondo4me@reddit
366 days on leap years. Divisible by 4, but not by 100, with a 400-year exception. Time stuff sux.
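For reference, the rule spelled out as a small TypeScript sketch (the function names are mine):

```typescript
// Gregorian leap-year rule: divisible by 4, except century years,
// unless the century year is also divisible by 400.
function isLeapYear(year: number): boolean {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

function daysInYear(year: number): number {
  return isLeapYear(year) ? 366 : 365;
}

const checks = [2023, 2024, 1900, 2000].map((y) => [y, daysInYear(y)]);
// → [[2023, 365], [2024, 366], [1900, 365], [2000, 366]]
```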
edgmnt_net@reddit
I would probably model different bookings as separate tables, not just tagging them with a category, as it seems you need more type-specific information than just the category itself. You may keep a common table for common metadata, however be very careful that it fits all foreseeable use cases, so be conservative.
radekmie@reddit (OP)
The thing is that we're not building hotel software but only integrate with it. And as there are plenty with completely different formats, we went with the only common denominator.