To Enum or Not to Enum
Posted by Mortimer452@reddit | ExperiencedDevs | View on Reddit | 203 comments
Something I always struggle with in architecture/design is the proper use of Enums for object members that have a distinct set of possible values. Stack is C#/MSSQL/Blazor if that matters.
A simple example of this would be an Customer object with a property MembershipStatus. There's only four possible values: Active, Trial, Expired, Cancelled.
There's two choices here:
Define MembershipStatus as an integer enum: - (pro) Normalized, in the back-end the DB column is an integer - (pro) MembershipStatus is strongly typed in code and is therefore constrained to those four values, they pop-up in autocomplete and accidental assignment of invalid values is impossible without a runtime error - (pro) I can just use .ToString in the UI to show a "friendlier" name instead of the int values - (con) On the DB side, it's a meaningless int value. Anyone doing stuff in the DB layer (stored procs, reporting, custom queries, exports, etc.) have to keep track of these and roll their own logic for display purposes (replacing "1" with "Active", etc.) They could also assign an invalid int value and nothing would break. - (pro/con) I could create a MembershipStatus table with an FK to Customers.MembershipStatus to eliminate the above issue (SQL people can JOIN to this table for "friendly" names, FK constraint prevents invalid values) but now every time I add another value to my Enum I have to remember to add it in the lookup table as well.
Define MembershipStatus as a string: - (pro) Non-ambiguous and easy to read everywhere - (pro) I can define the possible values as Consts in code to make sure they are kept consistent in code - (con) For the DBA in me this just "feels wrong" to have a freeform text field containing what really should be a lookup table to maintain integrity - (con) Uses more storage on the DB side (varchar versus 4-byte int) - (con) Anything using this on the C# side is just a string value, not strongly typed, so it's possible to assign invalid values without generating any errors
Anyway, sorry for the long post, hopefully at least a few here have dealt with this dilemma. Are you always one or the other? Do you have some criteria to decide which is best?
OvisInteritus@reddit
This is not JavaScript, enums in C# are the way to go.
SansSariph@reddit
Enums in the app layer are unambiguously correct for closed sets, in any language that enables build-time exhaustiveness checks for switches, type projections, etc.
The ability to break the build for forgetting to map a new value is most of the reason that modeling the set as an enum has value in the first place imo
Schmittfried@reddit
There are exceptions to this rule. For example you might have an enum that is technically a closed set, but some downstream services only care about a closed subset. Intentionally not using a proper enum for those allows non-breaking additions of new constants in the upstream service.
link23@reddit
Downstream services can handle enum values that they don't care about by just having a
default:branch in a switch statement. Nothing about enums prevents clients from only paying attention to a subset of the allowable values.Schmittfried@reddit
No they can’t. An unknown value cannot be deserialized into an enum. Some serializers support it, but even then you need a sentinel/default enum value, so you lose the original value.
SansSariph@reddit
Whenever I've needed to model forward compat I've reached for this pattern and found it useful. Often it's when one of the clients of my service needs a point-in-time reference of the valid values and the contract is that unknown values either:
I've usually just logged the mismatch ("Handling unknown {X}; mapping to {Y} per contract") at the data boundary and then I have a clean type internally from then on out.
Ultimately to me this is where it stops becoming an enum question and more an external dependency + data versioning + breaking change question. It's fair to say that additions to an enum are a breaking change and should be versioned as such, in which case your consumers know there will never be unknown values. Otherwise it's also fair to say that the enum is designed to be forward-compatible, and client expectations for unexpected values are well-defined.
Without either of those documented, then you don't actually have an enum in the contract, it's inherently not a closed set at all from the consumer's perspective because it can change at any time.
Schmittfried@reddit
That’s my entire point. Concerns like these make it hard to apply a fixed rule dogmatically. I‘ve used the pattern you described, too. I’ve also used freeform text columns in cases where values were technically closed sets but didn’t matter in the context of the consuming service (yet) and were just kept for future reference. I‘ve also used strict enums when I explicitly wanted breaking changes to fail fast. It all depends on the situation.
SansSariph@reddit
The original post is around internal enums - a contract with yourself that you control. A persistence layer complicates it and sure, you need a migration story if you ever change your contract, but the point you raised is really about public APIs and how to reason about data you don't control.
At any API boundary I am going to tend towards defensive even if I own the full stack. Once the "enum" has made it through the data boundary and into my app layer, I'm going to be working on a closed set, and I'm going to model it as an app-layer enum :) Hell if the service and client live long enough there is going to be drift in both directions where the client has actually built an abstraction over the service value and they are truly decoupled.
There is nuance to the dogma!
Schmittfried@reddit
The post didn‘t explicitly exclude interactions with the outside world, so even if it doesn’t apply here, I don’t see why it shouldn’t be mentioned there can be exceptions to the rule.
SansSariph@reddit
It reads as though you feel I am arguing with you or saying you're wrong or something. I don't feel that way.
Schmittfried@reddit
Ok, same here. :D I thought you were arguing against me adding a possible exception to the rule, which genuinely confused me. Glad we could clear this up, have a nice day!
link23@reddit
If there are unknown values, then it isn't a closed set, which invalidates the assumption we started at in your previous comment.
Schmittfried@reddit
It may be in the domain context and the owning domain service would likely model it with an enum. For a downstream service it might be open or closed but forward-compatible.
Individual-Brief1116@reddit
Exactly this. The exhaustiveness check alone makes enums worth it. I've seen too many bugs from forgotten string constants when requirements changed months later.
Ysilla@reddit
Also not limited to enums, Union types are fantastic for that exact same reason in the languages that support them. When used well, you can just have the compiler tell you if you missed to update something when adding a new value to them, and even just point to all the places you need to update in many cases.
Dependent-Birthday29@reddit
You are completely conflating two distinct concepts.
Use enums in your code obviously for a fixed set of things. This is non-negotiable best practice.
What you describe in your post is how that value gets represented in a database. That doesn't matter. Use entity or whatever to deserialize into an enum in your code. That's a parsing question - which your code should always be flexible with respect to.
Dyledion@reddit
Enum. Always.
Shookfr@reddit
Except in Typescript then it's never
captain_obvious_here@reddit
As a not too serious user of Typescript, I have read that several times but never got a clear explanation why.
Do you have a clear explanation why Enums are a bad thing in TS?
Infiniteh@reddit
I think this explains it pretty well
https://www.totaltypescript.com/why-i-dont-like-typescript-enums
captain_obvious_here@reddit
This is indeed a good explanation, thank you!
But then, what is a correct way to handle the need for Enums in TS?
Infiniteh@reddit
From a string union type
or an object literal
or from an array
Sometimes, if you are working with libraries or frameworks that use decorators to perform codegen or the like, to expose an API spec for instance, you might be forced to use enums. TypeGraphQL comes to mind.
Then, stick to string enums, as the article an video reocmmend.
Zeragamba@reddit
There's also the fact that enums are currently working their way into EcmaScript, and their implementation might differ from TypeScript's
ConsiderationSea1347@reddit
Almost every part of typescript is buttery perfection. Almost.
Wonderful-Habit-139@reddit
Typescript made me fall in love with type systems.
hegelsforehead@reddit
{ } as const, always
Limietaru@reddit
And then satisfies. Actually satisfying.
Empty Object though? Eslint on my back about this this week for some old interface {}
type Record ?
thekwoka@reddit
They meant defining you keyvalues as an object with const.
thekwoka@reddit
I wouldn't say never.
It can still be useful.
for example, a lot of bundlers will inline literal values for enums, so the "enum object" doesn't even get bundled into the code.
Which is dope.
so long as you never use
x in Enumor some shit.Mortimer452@reddit (OP)
This is usually my answer as well, but I keep second-guessing myself.
Do you create a lookup table on the DB side as well and just deal with having to keep both the enum and FK lookup table in sync?
kani_kani_katoa@reddit
You could create a lookup table in the db automatically from the enum values. I've seen that used in databases that don't have native enum types. You could also store the string value in the db with a check constraint on it limiting it to valid values.
rredpoppy@reddit
check constraint modification is a migration concern, which you would do anyway, via your orm or via regular migrations
Mortimer452@reddit (OP)
Yeah both are decent approaches.
Hadn't thought about the check constraint idea, problem is, it's kinda the same problem as just having an FK to a lookup table, end up with two places to maintain that list of possible values (C# enum, check constraint on the DB side) unless you can automate changing that constraint in the CD pipeline
PartBanyanTree@reddit
no you can do it really smart. generate the check constraint sql dynamically based on enum values.
a lifetime of working with databases makes me want strings in the database- make it human understandable. status value of 7 is who know what but status value of 'orderCancelled' you've got a hope in hell. self documenting. but whether it is numbers or strings, yeah, what about out-of-range/invalid values? if its possible it definitely will happen. so definitely check constraints
but yes you can have the sql definition of the check constraints be derived from the enumerated values of the enumerated itself.
then if the enums ever change you can generate a new migration and it will automatically update the constraints!
warning: if you ever modify the enumerated names you will need to write a data migration because those values are in the database. I think this is a reasonable tradeoff and ultimately the enums help you manage/maintain/force this complexity
I'm on mobile right now but if you want I can give you the exact code snippets I use to do all this when im back at my desk, just reply if you're interested in a howto example
kani_kani_katoa@reddit
Yeah I was going to say, I have this exact script set up as part of a project I work on. It builds the check constraints and validates them against the current schema, and makes a migration if they change. The integration tests also validate that the enums match the db schema. Its not particularly hard.
PartBanyanTree@reddit
yes I have a CI test too! it adds a migration and confirms i5 generates empty and that it didnt cause the snapshot file to be git modified. if so then it fails the CI test.
so if I (or some future dev) rename an enum without realizing it has downstream effects they know it immediately
SolarNachoes@reddit
Strings for enums doesn’t scale as well for millions and billions of records.
deux3xmachina@reddit
Design your tables correctly and it's only one copy of each string that actually gets stored.
DaRadioman@reddit
That's no longer what they are describing and is not a string but a FK. No different from enum at that point especially if you have a descriptions table.
PartBanyanTree@reddit
yeah for sure I only use string/enums when there is no need for a foreign table. as soon as there is then it's getting converted to ids. one table per entity
in my travels I've run across the architecture a few times of a "LookupTable" that has many rows each of which has a "LookupType" as a way to reuse what's basically a bunch of simple key/value pairs (id & value) and in my experience it always goes sideways as the structure gets abused, more columns get added.
I figured if its worth having a foreign key then it should have its own specific table.
obviously if you have some type of performance or throughput or io situation then strings are maybe bad. but most times people just think they need to be super performant but they're just overthinking their 35 user crud system. like if your processing billions if rows every second then, obviously, worry about that.
just far far far more likely your wading Into an unfamiliar database on a forgotten server trying to make heads or tails of a db structure and boy am I always glad to human readable strings. it's the same reason json became such a ubiquitous format. binary streams might be more opaque but you can sniff network traffic and visually inspect the json
half_man_half_cat@reddit
This is the way ^^
MadeWithPat@reddit
I’ve used EF fluent configurations in the past to generate the seed data via some reflection stuff. You’re still keeping track of the table I guess, but it’s more “oh yeah, I added a value, I need a migration”, which sounds like maybe that’s lighter than what you’re experiencing?
Dyledion@reddit
What DB are you using that doesn't have enums?
Mortimer452@reddit (OP)
MSSQL. Postgres and MySql are the only DB engines that natively support enums
immersiveGamer@reddit
I used enum column type in MySQL and it was a pain long term. Adding a new enum was a schema change and if you did the alter column with the enums in a different order oops! you just re-coded the enum value for every row in the table.
Also keeping the enum in sync was a pain point that our team experienced as well.
I don't know yet the long term impact but we recently migrated all enums to use a pure int column and auto generate an enum table on service boot.
k2718@reddit
I don’t know about MSSQL but in Postgres you creat an enumerated type and it works just the same as in application code. Your db client will even convert for you if you keep the exact names the same (I usually go SHOUTING_SNAKE)
Zulakki@reddit
had an external contractor try and convince otherwise, didn't work. Im an Enum man, and I've always tied that together with a table in the DB to ensure data integrity, but for some reason, this contractor was convinced this was overkill and he's never done it that way with no issue. claimed we can support integrity via the code. none of it sat right with me, but I let him make the call
sharpcoder29@reddit
What's the difference between enforcing via code or db? Both are "code" done by an engineer. And both equally have an opportunity to screw it up (and need to add a value if needed)
ings0c@reddit
Enforcing it via the DB is the strongest guarantee.
If your check is in the application code, there’s nothing stopping someone coming along and doing
UPDATE [MyTable] SET MyEnumColumn = ‘NotAnEnumMember’In general, doing stuff in the DB is harder to debug and maintain than it would be in the application code. Here though, I don’t think there’s much advantage to having the constraint in code - I’d do it in the DB.
Big check constraints can also slow down writes, but not usually enough to matter.
sharpcoder29@reddit
The db is just a plugin. The main thing is business logic, like you said, that can be unit tested. There's nothing stopping an app dev or a db dev from doing something stupid
HairyIce@reddit
Definitely. I would have an enum with a backing lookup table in the db. It keeps everything normalized and gives you the foreign key constraints as extra protection against inserting bad data when a junior dev comes along and doesn't use your enum or const.
In addition it covers your concern about somebody wanting to display the text 'Active' when they're working on the database side...a simple join takes care of that concern.
max123246@reddit
It's more complex than that. When you're writing a library that cares about backwards compatibility, you might need to document or annotate that an enum is non-exhaustive in languages where it's a breaking change to not exhaustively match all enum variants
Also, you can often get the benefits of a known closed set plus an extensible custom user set by making the last variant of an enum a Custom variant that accepts some interface or trait object or whatever your language allows
supercargo@reddit
What kind of enumerations are we talking about here?
If they’re the kind that find themselves the determinant in a case statement or any branching logic you should use constrained enums wherever you can (not really an int vs string thing, it’s a closed vs open thing). Expanding the valid set of enums is a breaking change.
If they’re really just another piece of data where the system only ever needs to validate/lookup, you have some flexibility.
max123246@reddit
Oh for sure, what I was talking about is orthogonal to whether to use enums or not. I just wanted to open up the discussion because the answer is often it depends and I didn't want people to take such a reductive stance without really thinking about it all
The_Northern_Light@reddit
yeah why is this a question???
BusEquivalent9605@reddit
thank you. only sane answer
Exirel@reddit
There is nothing else to say.
hachface@reddit
If you’re working in a type system that supports Literals that’s just as good if not better.
WheresTheSauce@reddit
Nothing is this simple.
Dyledion@reddit
My Turing-complete ADT enums beg to differ.
Intelligent-Day-4059@reddit
I've gone back and forth on this so many times. For a closed set that rarely changes, enum with a lookup table is my sweet spot. Yeah you have to maintain both but the FK constraint keeps the DB honest and you get compile time safety in code. Strings feel nice until someone types Activve with two v's at 2am. Then you remember why enums exist.
No_Quit_5301@reddit
The real anti pattern is persisting a “status” value to DB instead of deriving it at runtime based on other props of the object
HeyExcuseMeMister@reddit
Enum plus automated ser/deser to and from string.
Aggressive_Ad_5454@reddit
Either one is gonna work just fine. Neither one gets you off the hook for writing really clear documentation for your table definitions. Most of the pros and cons you mention relate to explaining what you are doing for the benefit of the next poor schlubb who works on this code.
Hint. That poor schlubb is your future self. Write good docs for that poor schlubb.
sethkills@reddit
It’s a shame the ORM does not support enumeration types defined in the Db schema, because that would be the ideal…
VRT303@reddit
Database columns can have / be enums too
instadit@reddit
I can't count the times I regretted not using enum from the get go
I have never regretted using it.
ben_bliksem@reddit
We use enum and the json serializers are set to use the string values. Ditto if I'm going to store it in the db.
Is it best practice to? I'm sure somebody somewhere is going to say no but then we've never hit any sort of performance (or any) problems doing this.
Cell-i-Zenit@reddit
This here.
Its so much work to communicate the enum values to externals if they are not human readable.
And then you can ask yourself the question: Why do you not want to have a human readable enum value in the DB? There is basically 0 performance gain to use ints. Most ORMs can map from String to Enum.
Stephonovich@reddit
Someone’s never done the math on how big billions of rows of strings are compounded across many, many tables. The DB only has so much RAM; don’t make its job harder.
Cell-i-Zenit@reddit
if you have such a big dataset that it has an effect, then you wouldnt ask here
hooahest@reddit
What if you end up renaming an internal enum? if you saved it as string in the DB then it starts to be an issue to rename it
bence_p@reddit
Migrate it or add an unknown, handle the unknown case however is most reasonable for the context.
Ecstatic-Passenger55@reddit
I do a similar thing with custom attributes on the enum fields to define how to map the values to the database and back again.
matthedev@reddit
Enumerated types are best when:
hipsterdad_sf@reddit
My default on this exact case: use a string enum in the DB with a CHECK constraint (or a lookup table with an FK), and keep the type-safe enum on the C# side. You get most of the wins from both columns and only one real downside.
Concretely for MembershipStatus:
DB column: varchar(16) with CHECK (MembershipStatus IN ('Active','Trial','Expired','Cancelled')) or an FK to a MembershipStatus lookup table. Readable in queries, enforced at the DB level, no orphaned rows, reports just work.
C# side: enum MembershipStatus { Active, Trial, Expired, Cancelled } with your ORM configured to map to/from the string value. EF Core has HasConversion() for this, one line. Dapper has a type handler. You keep strong typing, autocomplete, and exhaustive switch.
The varchar storage cost is not a real concern in 99 percent of schemas. 16 bytes vs 4 bytes on a status column is a rounding error next to the actual row size. Index performance on a short varchar with a small cardinality is fine, especially with a filtered index if you mostly query active rows.
The one real tradeoff: when you add a value in code, you also have to change the CHECK constraint or lookup table. That's a feature, not a bug. It forces you to actually think about the migration and what happens to rows with the new value vs. old readers. If you make this a one line schema change it's much safer than "forgot to update the MembershipStatusLookup table and now everything is a magic int."
What I'd avoid: integer enums stored in the DB with no lookup table. That's the worst of both worlds. Every SQL query needs tribal knowledge to interpret, every report has a CASE statement, and one day someone renumbers the enum in code and silently changes the meaning of historical data. Seen this exact bug destroy a week of analytics.
Rule of thumb: if the value is ever going to show up in a SQL query a human reads, store it as a string. If it's a pure internal implementation detail that never leaves the service, ints are fine.
CodelinesNL@reddit
Doesn't MSSQL support enums? Because that's the 3rd option. I know for a fact Postgres does.
ActiveTrader007@reddit
Use Enums Transactional systems are designed for efficient operations and not reporting
Db layer should be light weight and just for data persistence. Stored procs and business logic in them should be avoided and instead be in the orm/application layer Stored procs are not used anymore because of tight coupling and not easily unit testable. Stored procs also do not scale well
Inline sql is fine to use in application layer if parameterized and enums can be used with their string values so they are readable
Stephonovich@reddit
I’m a DBRE. I want correctness, and the DB is the only thing that can truly enforce that. I do not want strings, because they have no meaning unless you add CHECK constraints, at which point you’re recreating foreign key constraints, poorly. Normalize your data.
This is why at my work, there is easily at least one incident spawning from a data integrity error per day: because a generation of devs has come to believe that the DB is a dumb KV store, instead of the single source of truth.
Bullshit; you can write stored procedures in Rust if you want (Postgres, PL/Rust), among others. That said, even if they were pure SQL, you’re almost certainly going to hit some other bottleneck before you hit that.
As to testing, I mean… it’s a function. Write tests that give it inputs, and expect outputs. This isn’t hard.
03263@reddit
Maybe later you'll have inactive for accounts that are not yet verified?
It depends how much control you have over the db. If you can easily write and run migrations to update enums as needed, go for it. If you have to submit every schema change request to a DBA and wait, prefer a more flexible schema. There's no single best use case.
And I would store it as a string. I prefer stuff in the db to be readable over performant, especially in younger apps.
Brave-Kitchen-832@reddit
Is the extra storage and bandwidth associated with string representations genuinely a meaningful performance bottleneck here?
I don't know your situation but am biased: in my 15 year career I have never encountered an edge case where someone is simultaneously using a database over a network connection but high level business concepts encoded with a few extra bytes make or break performance.
You probably want to tell your DBA to fuck off to be honest
Stephonovich@reddit
It’s not just the performance (it in fact does matter at the billions of rows scale, I assure you), it’s the data integrity. There are so, so many issues that can crop up from storing dumb strings that the DB has no frame of reference for. I stopped trusting devs a long time ago to handle anything in code; the DB is the only arbiter of truth. They are WAY better-tested than anything most people are ever writing.
Ignisami@reddit
Personally, as not a DBA, I favour clarity (in this case, storing as a string) over properness (store as int and an extra mapping table) every day. If you're worried about free-form text being abused you can always create a constraint on the column that only allows the four statuses you defined.
That said, can't C# store string enums in whatever persistence stuff it has?
Like, in Java I'd do this:
which neatly writes the variant as a nicely legible string into the database. Add a trigger to the column that checks for whether your status is one of the four allowed values and done.
If you want to store it as an int, just use
@Enumerated(EnumType.ORDINAL)and it'll do so. And since the field is defined as a MembershipStatus you still get the autocomplete because the enum maps 1 to MembershipStatus.ACTIVE and back.gjionergqwebrlkbjg@reddit
Constraints are a massive pain in the ass in postgres, if you want to add a new value to the column you will need to re-create that constraint, and this requires either dance with
not valid, or obtaining an exclusive lock on the table until all rows are scanned to validate it.Ignisami@reddit
True. Would depend on whether there ever will be new statuses for memberships.
thekwoka@reddit
string enum.
AdreKiseque@reddit
This is exactly the use case enums exist for
blackarea@reddit
Just use postres enum type to reflect it in persistence as well. Go full typed, keep your constraints and types as tight as possible, welcome defensive coding. There's gonna be some exceptions in prod but gradually it's gonna be smoother and smoother and SRE or your own pager duty will thank you for it.
IndividualIssue1967@reddit
Is rafi a bad guy
Helpful_Fly2878@reddit
enum. AI respects them much better than strings
itix@reddit
Using an integer enum is cleaner.
My honest answer is that, in our system, we use both. This is due to the system's age and the fact that we have been flipping back and forth between which one is desirable.
My rough rule is currently:
When coordinating efforts with the frontend, it is easier to use a string enum.
nullbyte420@reddit
It's a performance concern??
itix@reddit
Performance in SQL queries.
nullbyte420@reddit
What changes?
Mortimer452@reddit (OP)
SQL is not compiled it's interpreted at runtime.
When it comes to indexing and JOINs, lookups in WHERE clauses, integers just perform better in the DB engine. Much easier to brute force compared to a string value of variable length. They also take up less storage space. String fields are always one byte per character, integers are always 4 bytes regardless of the size of the number (up to 32-bit maximum).
nullbyte420@reddit
But wouldn't you have an index on membership status anyway? And don't you cache it? I'm not convinced it's an optimization that ever really is worth it.
Mortimer452@reddit (OP)
Sure, you can definitely index string (varchar) fields.
But lookups on an int field are always going to take less space and out-perform a varchar(20) 100% of the time.
I'll admit, in this particular case, it probably wouldn't really matter. A Customer table is never going to be hundreds of millions of records long and I'm not going to be dealing with thousands of hits per second on that table. But my background is from systems that are that size so I tend to default towards what I know to be the more performant and scalable option.
nullbyte420@reddit
Yeah for sure it'll take up a bit less space and perform minimally better, but do you really think the db doesn't do any optimizations for enum fields? Seems like such an obvious thing to do. Thanks for admitting it might not be worth it. I'm going to stick with enums until it becomes a problem 🙂
Mortimer452@reddit (OP)
Well it's a lot more than microscopically better for very large systems, but yeah in this case like I said probably not noticeable.
Enums don't exist in MSSQL, hence the whole reason of this post. If a field should only be one of N distinct values the proper way is a lookup table with a Foreign Key or CHECK constraint
nullbyte420@reddit
Ah okay. I didn't know that! Thanks for explaining
itix@reddit
Looking up rows with a string field is presumably slower than with an integer.
nullbyte420@reddit
Presumably or actually?
itix@reddit
It is actually, but with small datasets it is meaningless.
I didnt get what you were trying to say until I saw your other message:
You are right on that. I am wired to use integers because it is optimal. String can be fast enough anyway, and space considerations are meaningless for small datasets.
This is something we are currently testing in our DB design. We have a table that is ever growing with a string id.
Cell-i-Zenit@reddit
and by how much lool.
This is just a micro optimization and if you have to ask the question here on reddit the answer is 100000000% of the time: It doesnt matter for you
nullbyte420@reddit
Right? I'd rather have the readability and lose that kilobyte of storage.
Manic5PA@reddit
I no longer make engineering decisions around scaling concerns that will either never be a real problem, or will only be a real problem in a future where the project has evolved to the point where this sort of concern will be handled by a more specialized professional.
In other words, the difference between an unsigned int and a varchar probably isn't all that important.
For me it's rather simple. If a value can be one of several compile-time constants, that's an enum.
thatyousername@reddit
I do have concerns about scaling. Service I work on is ~100M TPS. Always use enums. String enums are godly. The sooner you use them, the better. It’s a huge QOL improvement for the engineer and AI loves it. At my scale, I’m not worried about how the data performs in a relational DB. I don’t use a relational DB.
MmmmmmJava@reddit
I’m curious which industry this service supports?
thatyousername@reddit
Ads
Mortimer452@reddit (OP)
Yeah, this is kind of my hang up. This particular system is likely never to see that kind of traffic, but I come from systems that do see that type of traffic, so I tend to lean towards the more performant option even if it's not really necessary in the current context.
jedfrouga@reddit
did you ask claude?! /s
Glove_Witty@reddit
Postgres has enumerated support.
Dimencia@reddit
This is a tooling issue if you're having those problems with enums. If you're struggling to understand your DB values in your support queries, use Linqpad with a typed context so it just automatically prints the enum names instead of the values. And use the EFCore configuration option that creates the enum mapping table automatically
Thonk_Thickly@reddit
Json converters will say as a string in the db for easier querying and readability, but the code stays strongly typed with an enum. [JsonConverter(typeof(JsonStringEnumConverter))] public MembershipStatus Status { get; set; } }
grappleshot@reddit
Enum in your code (model your domain correctly). Persist to db as string and have front end send string. Front end devs prefer strings to ints. data engineers and others who spend time looking at the raw data prefer to strings to ints. Tranmission size across the wire and storage space are not concerns.
Karuza1@reddit
The con doesn't have to be a meaningless int value, I do recommend have a Lookup table with id where you can join when needed to display the value of that column
Mortimer452@reddit (OP)
Right, but as stated in the post, I no longer have a single place to manage those. Anytime I add a new value to the Enum I have to remember to add it to the lookup table as well. This is actually a huge PITA in CI/CD pipelines, I can't deploy new code without checking and potentially modifying data in the production DB
aj0413@reddit
It really shouldn’t be.
10yoe and I recently migrated to platform engineering to focus on the CI/CD side explicitly. Coming from dotnet app dev world
Adding stuff via pipelines should be simple. Just use EF Core migrations
If you dont use ef core migrations, than that’s a design problem of fragmented code that lives together but doesn’t ship together
Mortimer452@reddit (OP)
Yeah but EF is just 🤢🤮 in my opinion
aj0413@reddit
There are many different ways to solve your problem, but complaining that the pipeline enforced deployments makes life hard while actively deriding the language native way of solving it….seems like firing a gun at your foot and then complaining that running a jog is an unreasonable ask lol
Mortimer452@reddit (OP)
Maybe your experience has been different, but in many environments I've worked access to the production DB environment is gated pretty hard. Getting read-only access is tough enough in some situations, but actual INSERT/UPDATE/DELETE privileges, even scoped down to a single object, are sometimes a pretty big ask.
As for the ORM piece Dapper is my preferred tool by a very large margin, I don't even touch EF these days, it just causes too many problems in larger environments.
aj0413@reddit
No. You’re misconstruing my responses
I work on the security stuff too in my comp; no way you’d get prod access
Everything is handled via pipelines; I’m saying that there’s zero issues with pipeline db deployments cause there are many paths forwards
EF migrations is just the easiest way for dotnet
remy_porter@reddit
You only touch production during the deployment step, after you've already deployed to a series of staging environments. As I said elsewhere, you've changed a data validation rule, of course your database needs to change. And you should have a database upgrade script that migrates from the last release to this release, and that script should be well tested by the time you get it to production.
cinnamonjune@reddit
I'm not very familiar with your stack, but in C++ my answer to this is usually just to have the lookup table be a function with a switch statement inside of it and no default case.
```
enum Color {
COLOR_RED,
COLOR_GREEN
};
const char* color_to_str(Color value) {
switch (value) {
case COLOR_RED:
return "RED";
case COLOR_GREEN:
return "GREEN";
}
}
```
This way if I add a new enum value, the compiler warnings will yell at me because my switch statement doesn't have an execution path for all cases, and it will also yell at me because my function doesn't return a value in all cases.
NortySpock@reddit
Well you can
"have cicd deploy the change"
"have the app auto update the lookup table"
"have the app loudly refuse to start if the lookup table does not match the enum"
I'm sure I'm forgetting an option
GoodishCoder@reddit
Fill your enum with the values in the lookup table and only update it in the lookup table or reconcile it in the code so it inserts new values whenever a new one is created.
Karuza1@reddit
I mean it largely depends on your schema of the lookup table and what those values mean to you.
Do you really need the enum? As in, can you add a flag to the lookup value that indicates something specific? Eg if Membership Status allows a user to do X, then can possible have Lookup Value of "Active" with flag "CanDoX"
This removes the need for explicitly checking enum in code and instead grab lookup value from cache and see what its allowed to do. This allows you to also change behavior of different status without a deployment.
In any case, you absolutely should have a lookup table just for data integrity.
gjionergqwebrlkbjg@reddit
It's not really a pain, just run the full set of migrations (flyway or something) before you deploy the app. Any automated integration test will catch a regression of that sort.
spastical-mackerel@reddit
This is the only way. Without the FK you’ve got “magic numbers” you have to resolve everywhere. Imagine you someday want to change “Active” to current. With the FK in the DB it’s a simple single operation
RelevantJackWhite@reddit
isn't that just the pro/con he lists right after that?
Karuza1@reddit
In either case it should be a lookup table. You can easily generate your enums from values within the table using something like old t4 templates.
To go a step beyond a lookup table allows you to define behavior to a lookup value, giving yourself flexibility in the event your application is multi tenant and those lookup values have different behavior based on the tenant.
klimaheizung@reddit
always enum. use a proper database and language that supports string-enums.
CreepyNewspaper8103@reddit
Finally, a real engineering question. This question has spanned decades. I'm happy to hear it's still a question.
haskell_rules@reddit
If I need to serialize, the enum serialization and checking syntax is so ugly in C# that I make it a static class with named member variables of an appropriately efficient type along with built in conversion properties.
AnnoyedVelociraptor@reddit
What about enum in DB? https://www.postgresql.org/docs/current/datatype-enum.html
gjionergqwebrlkbjg@reddit
Tbh I would not recommend, you get some extra limitations - you can never remove a value, so if you want to ex. retire a status, you have to do it via code anyway.
ThatSituation9908@reddit
That’s the only limitation AFAIC.
What’s so bad about it? You will still have to handle rows that reference the deprecated enum value and code still writing that value. How is that any different than strings?
Cell-i-Zenit@reddit
we used something similiar at work, but i think we did a weird "enum lookup table" thing with foreign keys to that lookup table and it was always extremly annoying to migrate
AmosIsFamous@reddit
I’ve done this and then written tests that hit the database to ensure the code enum and DB enum stay in sync.
ThatSituation9908@reddit
I’m confused about the db representation limitations. Doesn’t your DB allow you to to store things as enum type but reference them as string later?
This is always how I did it with Postgres https://www.postgresql.org/docs/current/datatype-enum.html
jpfed@reddit
public class AlmostEnum : IShouldProbablyBeEquatable, ICouldAlsoBeParseable {
public Name {get; private set;} private AlmostEnum(string readable) { Name = readable; }
public static readonly AlmostEnum Hearts = new AlmostEnum(“Hearts”);
public static readonly AlmostEnum Diamonds = new AlmostEnum(“Diamonds”);
public static readonly AlmostEnum Clubs = new AlmostEnum(“Clubs”);
public static readonly AlmostEnum Spades = new AlmostEnum(“Spades”);
public static readonly Values = new[] { Hearts, Diamonds, Clubs, Spades }
}
aj0413@reddit
Just convert enum to string when saving to and pulling from DB
HasConversion()
Also PostgresSQL has native enums
Basically, you don’t have to compromise here with smart upfront design
AdUnlucky9870@reddit
enum until you need to add metadata, then you end up with a lookup table anyway. ive gone through this cycle like 3 times now and always land on starting with enums and migrating when the requirements inevitably get weird
sweetnsourgrapes@reddit
I tend to have a lookup table in the db to enforce data consistency. It's a relationship like any other.
However for short lookup tables like that, I also introduce an enum or a Value Object. It's quick and handy in code, and makes for very self-explanatory if statements etc.
But in every case where I do that, I make a test case to make sure the enums or whatever accurately and completely reflect the lookup table. Then you have one source of truth - the lookup table. Add a row and the test fails so you fix the code, simple.
Your argument about "having to update a table" seems nitpicky since, as a small lookup table, it's not going to change very often. Better IMO to have a) easier time coding, b) more readable code, c) safe code using the test against the table.
CatolicQuotes@reddit
Enum in domain. For UI map enum to whatever word you wanna use in UI, for DB map it however you wanna save in database. UI and DB don't have to have same values. It's very simple
CommunistElf@reddit
Enum is the way
Sea-Quail-5296@reddit
If you are using a DB without native enum types you have bigger issues
failsafe-author@reddit
Enum. Also, I don’t ToString() Enums for display. Also, I prefer Postgres :)
No-Vast-6340@reddit
What about when your product team comes next year and says we want a "suspended" status
remy_porter@reddit
No it isn't: it's a foreign key to a table which contains the enum mapping.
Yes? This can just be part of your database upgrade script; you're changing the schema; it's not a DDL schema change, but it's a schema change in that the definition of your data shape has changed. By changing the enum, you've changed a validation rule!
Remember: objects and relations are not the same thing, you cannot and should not try and directly map objects to tables and back, no matter what your ORM tells you. You can make a set of relations which are isomorphic to an object graph, but that takes planning and ORMs are at best mediocre at this. They can handle the trivial cases well, but fail at more complex normalization scenarios.
northrupthebandgeek@reddit
There's always the cursed option of defining MembershipStatus as an abstract class and then defining the values as singleton subclasses.
More serious answer: I would let the database be the record of truth, with a table of membership statuses and something on the application side to query that table and derive them dynamically. Hell, that “cursed” answer above might even get you halfway there.
marshallandy83@reddit
Why do you say this is cursed? This is the approach I'd always take once I'd read the value from the DB and needed to apply business logic. Which, admittedly, is not the scenario the OP is asking about.
Mortimer452@reddit (OP)
(shudder) tried that approach once, never again
ApprehensiveNewt3049@reddit
i'd lean towards enums for compile-time safety, but yeah, db clarity is tricky.
Dangerous-Sale3243@reddit
Use strings. Hate having to debug old code and every lookup is like “status is 2”, wtf does that mean? Enums can map to strings anyway if you code it right.
WanderingGoodNews@reddit
I tought you where going to compare it to an extra table Status but string?? Nuh uh
yobagoya@reddit
Since you mentioned you're using C# and presumably .NET, are you using EF Core? It has value converters that allow storing enum values in the database as strings.
lunacraz@reddit
just to be clear, MembershipStatus is it's own table, with an id and a readable name? and Customer references that table?
in my experience, this is always the way to go. sure it's an extra table, but if and when (and it's only when) you need to add another membership status, migrations are trivial, whereas when it's a hardcoded string, doing db wide updates become insanely hard to do
Cell-i-Zenit@reddit
Then comes the requirement to display the human readable enum value on a website and then you need to add a join to every single query which returns this value.
Its really not that hard to update enum values in the DB. How often do you really need to touch the existing values? Only when you rename the enum really, but that is a trivial query.
lunacraz@reddit
eh in this circumstance maybe but i’ve worked on state machines where adding a state in a workflow caused a ton of database locks that almost made migrations impossible
Kind-Armadillo-2340@reddit
Why can’t you just define enums with string values?
EternalBefuddlement@reddit
Been on both sides of this in my short-ish career.
For a place that was super high throughput with structured and statically typed code, we used char representations.
Two benefits: - Reduced storage requirements, which was essential as everything we did was going to store this value - Easier to read than pure int, so you can mentally map chars to their meaning
Mortimer452@reddit (OP)
Not sure I'm following, you mean something like a fixed-length CHAR(1) on the DB side?
Enum would be something like 1="A" (Active), 2="T" (trial), etc.?
Not a bad approach, human-readable enough on the DB side, but still lacks a referential constraint unless you add a lookup table and FK
roodammy44@reddit
It is a bad approach when you want to add more and end up with tons of letters with no-one having any idea what it means.
What I want to know is what the fuck sort of ancient DB is OP using that doesn’t support enum values?
EternalBefuddlement@reddit
It's talking about account states. There's not really that many states that an account can exist in, and if you have more than 26 states then you should've evaluated that beforehand.
You can figure out what each character means through: - Documentation - An enum, where the character resolves to a specific value
Mortimer452@reddit (OP)
Not everyone is using MySQL or Postgres, which are the only DB engines support enum data types
roodammy44@reddit
Glad you’re still here OP. I guess I have only really seriously used MySQL and Postgres and it’s obvious.
The reason I advise against the 1 char thing is that I have seen many “status” columns in my time filled with random letters. And even some with letters and numbers. People were not strict about what “status” meant.
Mortimer452@reddit (OP)
I agree, it has drawbacks, bloat/mistakes being one of them. It sounds great at first but over time can get unwieldly. What's "T" again, is that Trial? Or Terminated? Oh right Trial was first so we made Terminated "M"
Teh_Original@reddit
Don't forget than in C# enum's can be implemented as any integral type, so you can make them bytes in code.
EternalBefuddlement@reddit
Correct, yeah. It worked nicely for our case - enum in a shared library (with other DB accessors, common logic etc) and then applications interacting with the DB knew how to read/write data.
You can also add that constraint, sure, but then you're just changing the SOT from code -> database. There's always tradeoffs, it's up to you to decide what is needed.
Doub1eVision@reddit
If you can enumerate it, Enum it.
Schmittfried@reddit
Enum with a string mapping on the DB side (based on the enum constant name or an attribute)
BoBoBearDev@reddit
No enum in JSON or DB because it is not human readable.
Enum for anything in processing logic to reduce bad values.
Enum in Typescript no matter all those "purists" told you not to.
gwydionthewz@reddit
I’d advocate for neither enum nor freeform string here.
This is one of those cases where I’d model "MembershipStatus" as a small type hierarchy instead: a base class with concrete types like "ActiveStatus", "TrialStatus", "ExpiredStatus", etc.
The reason is simple: statuses almost never stay “just values.” They accumulate behavior (validity rules, expiration logic, permissions, transitions…). Enums and strings push you toward scattered "switch"/"if" logic across handlers. A type hierarchy keeps that logic in one place.
VIII8@reddit
This was also my first intuitive solution because the value of status is not important but whether you can check if status.is_active() or do status.expire(). And you may choose different representation of values in database.
juan_furia@reddit
Enum always, you can serialize from the string in DB to the enum behind the scenes and use the enum everywhere else.
I do not recommend using the string directly for your customers, rather use it as an i18n key
admin_password@reddit
Honestly enum in code, int in DB. Avoids so many issues when you end up expanding the enum later it’s just a code change the DB doesn’t care about. There are some exceptions but in your example this is what I’d do, 15 yoe
DueAbbreviations8292@reddit
enum today, string tomorrow, existential crisis forever
spoonraker@reddit
There truly is no one-size-fits-all answer, but after 20 years in the industry, I'd say my opinion is that the vast majority of the time when people think they want an enum what they really want is a string union.
Why do I say that?
Most of the time people end up wishing the actual underlying values they pipe through the system (especially when they start querying the database directly) were human readable. That's a big piece of why I say this.
Also, most of the time when people think that they definitely won't ever add another option to the set they're wrong. It comes up, and when it does, you can really get yourself into some trouble with database migrations if you're using true enums, depending of course on the specific database in question, but there's a LOT of really scary foot-guns in a lot of popular database choices when you start modifying true enum columns.
Depending on the exact programming language or database you pick, you can often having this quite literally be a string union and life is good. Sometimes you can get away with using an enum in the programming language but being careful to convert it string-ily when you pipe it into and out of your DB, and some stacks make this easier than others. But in general, I just think when people say enums what they really imagine in their head is "string with a constrained but flexible set of allowed values" which isn't exactly when an enum is.
chadder06@reddit
This. Also 20 years of experience.
Replicating string <> int tuples into the database to preserve readability is more work
Dealing with the eventuality that your closed set is not actually a closed set is more work
Dealing with dependencies around Enum versions and validating enum entries, where unknown values cause an Exception is more work
Just go with a string union.
Lumen_Co@reddit
I agree completely. In languages with string unions, I very rarely want or use an enum, and in languages without them, enums are usually a bit annoying to work with (because you usually actually want them to function like a string unions).
Naive_Freedom_9808@reddit
Maybe not the best approach, but for the project I'm on, the standard is that the API and client use an enum while the database uses a string. The conversion from string to enum or vice versa is done at runtime. Of course, this has drawbacks, though none of those drawbacks have surfaced yet due to discipline on our part
ok_computer@reddit
We use NHibernate ORM and a class property attribute
With the property mapped to type
On enum types that handles the string representation to front end and database, but is in fact locked to the few enum values.
Temporary_Pie2733@reddit
This language-dependent. In Haskell, I’ll write
and not really care what each of the four values “really” is. If there needs to be an underlying integer value as well, I’ll throw in an
Enuminstance.The database is a separate problem from the code. It’s not strongly typed, so make a choice and define an appropriate mapping into the language.
GoodishCoder@reddit
I would probably just store it as a string unless I had a reason to believe it's actually going to cause performance issues or the cost difference would be substantial with my dataset. If it was going to cause performance issues or a substantial cost difference, I would store it as an integer and create a lookup table.
Shazvox@reddit
Does there exist specific code for specific values (like, if (status == whatever) {})? If yes, then enum.
If the code is the same regardless of values and you expect new values will need to be added (possibly via some kind of admin UI) then use a simple id-name object in your repository.
If you use an enum and need to be able to have a human readable repository (like if you need to pull reports or something from it, or if some other application needs access to the repo) then you might want to keep id-name objects tied to your enum in your repo (pretty simple to set up if using EF).
Isofruit@reddit
In the scenario you posted, I'd want be fine with either of these two things (Note: I have not used MSSQL and am only vaguely aware of its limitations):
I'd likely prefer your first option. I'd be also fine with the second option, but only if it means you're still using an enum on the C# side and converting to string just before writing to the DB, and parsing the enum from string when reading from the DB. In that scenario I'd also heavily recommend using a constraint (as per google that should be possible) to guarantee only correct values can exist, but at that point you could just as well have your enum-table from your first option.
thebig77@reddit
Eh I store it as a string. It's readable, makes more sense for people who don't have access to the definition through the code or a lookup table, and storage is cheap. If there were a memory or storage constraint I'd consider a proper enum.
kagato87@reddit
String is unfortunately subject to typos and bloat. I've seen it...
We use ints in rhe database. For the analyst? We materialize the enums to the database.
For something as mutable as status, which is prone to evolving over time, it can even get its own definition table in the database that carries flags to tell the code what can happen with with each status. (For your membership example, flags might include "cards can open gates" or "needs customer followup" or "is staff" and so on.)
You still get the misspellings and bloat, but you an also customize it on the tenant level without code changes because you're selling your product to a gym, a country club, and a grocery chain, the code paths don't need to be changed if funny combinations are added (like if you allow an account access to entry but not to complete a transaction, like Costco, something a gym usually won't need). It also makes the misspellings and bloat easier to clean up.
Defyz89@reddit
Third option people keep missing: store as string in the DB with a CHECK constraint, map to enum at the application boundary.
DB stays readable (no magic numbers, no lookup sync). CHECK enforces valid values. Enum in code gives type safety. Adding a value = one migration + one enum update, no drift.
The int-backed enum is almost always premature optimization. Storage savings are trivial for a 4-value column, and every DBA cursing your joins is real cost.
Fair_Local_588@reddit
Just store the enum. My company went through a whole phase where we updated our stack to expect a normalized int value for enums when stored in the DB, but eventually even the top tier of engineers began advising against it.
It’s only really worth it if you need to really minimize how much data you’re storing. Like 20M+ records.
belavv@reddit
Enum in c# stored as a string in the DB.
jerryk414@reddit
I think int is superior.
Lookup tables is too much normalization, in my opinion.
nocondo4me@reddit
The clang api lets you iterate over all AST tree and see all of the value/ name mappings. You can then auto generate those lookups via cmake tool chains.
Manic5PA@reddit
_f0CUS_@reddit
You can easily store the enum as string values in the database with a constraint to ensure that you can only add values allowed in the enum.
_A_Good_Cunt_@reddit
the best of both worlds enum + save to DB as string
there's a few graceful ways to handle this on .net so they just serialialise as string, and you don't need lookup table. the code ensures the data integrity
Early-Pie-4765@reddit
I mean, one could technically still create undefined enum values since something like this still compiles and prints 4.
get_MEAN_yall@reddit
This example is unambiguous and should be an enum.
gjionergqwebrlkbjg@reddit
Is the con of having to add a new lookup value really that much of a con? The very first integration test will catch that. And you likely will at some point change the name of at least one of those statuses (although you should really just generate a new value and migrate at that point).
tim128@reddit
There's no best solution here.
My preference would go to an enum-like type and a lookup table in the database. Yes you need to add a migration but this seems like a minor issue any LLM can handle for you.
Definitely do not use type string in your codebase. If you do end up storing a string in the db use a wrapper type in code.
At my last project we had both. All enums used int in the db except for ISO 639 language codes.
I also recommend you don't expose your enum names in the API directly. You should be able to rename them without breaking the API.
RelevantJackWhite@reddit
I think the right way to approach this is to think about the likely user groups of such a value. Is it a web UI maker? Is it someone looking at the DB directly? Is it a service that creates an object out of it, with a toString available? Is it someone reading console logs?
I see your list as conflicting concerns that come from different use-cases. You've got to identify which ones will be most affected by this decision, either through the sheer number of users doing that, or the impact to each one
Unlikely_Secret_5018@reddit
I use enum field ONLY IF the library/framework has the ability to auto-generate migrations based on enum changed.
eg. Adding / removing enum values.
It's nice not to allow an entire class of errors (illegal / meaningless values in the DB)