Variable names do not travel with values. When should domain meaning live in types?

Posted by ResponseSeveral6678@reddit | Python | View on Reddit | 26 comments

A variable name can carry a lot of meaning:

price_in_usd_cents: int

But the value itself is still just int.

Once it is passed to another function, stored in a model, serialized, sent to a queue, or returned from a repository, the original variable name may be gone.

So the domain meaning was attached to a local name, not to the data.

It gets even more visible when working with AI coding agents.

They are very good at following local patterns, but if everything is just int and str, the "density of meaning" is low.

I suspect this may be one reason TS works well with AI-assisted workflows:
type information becomes part of the code context.

Humans see it. IDEs see it. Type checkers see it. AI coding agents see it.

Python has type hints too, but domain meaning often still collapses into primitives.

If the type does not carry the meaning, something else will fill that gap:
names, comments, local conventions, copied patterns, or guesses/assumptions.

A few examples where the IDE is happy, but the semantics are wrong:

# Accidental swap
delay_seconds = 5
timeout_seconds = 30
def schedule_retry(timeout: int, delay: int) -> None: ...
schedule_retry(delay_seconds, timeout_seconds)


# Different units
created_at_microseconds = 1_777_961_207_000_000
retry_delay_seconds = 30
retry_deadline = created_at_microseconds + retry_delay_seconds


# In this example, different developers may imagine different units or precision:
class AuditRecord:
    created_at: int
    updated_at: int

Type lacks meaning and strictness. So, we all tried to solve the problem partially.

- typing.NewType
- small wrapper classes
- dataclasses around one value
- Pydantic custom validators
- plain inheritance from str / int
- UUID-specific helpers

I have also been experimenting, mostly to understand the trade-offs.
The principles I ended up caring about were:
- Strictness:
- no implicit coercion
- invalid input → fail fast
- Runtime type preservation:
- value keeps its domain type, not downgraded to str / int
- Pydantic and pickle preserve the subtype in model/container boundaries
- Static type preservation:
- works correctly with type checkers (mypy / pyright)
- type checkers can distinguish UserInputRaw from UserInputValidated
- Transparency:
- behaves like underlying primitive
- no extra API surface
- Semantic stability:
- arithmetic should downgrade to a primitive
- I would rather create a new domain value explicitly than keep compromised meaning
- Inheritance:
- children can add more meaning
- Minimal API / hot-path friendly:
- no .value or extra attributes

from base_typed_int import BaseTypedInt
from base_typed_string import BaseTypedString
from base_typed_id import BaseTypedId



class UserInputRaw(BaseTypedString):
    """Raw user input before validation."""



class UserInputValidated(BaseTypedString):
    """Validated user input."""



class UnixTimestampSeconds(BaseTypedInt):
    """Wall-clock UNIX timestamp expressed in seconds."""



class DurationSeconds(BaseTypedInt):
    """Duration expressed in seconds."""



class MessageId(BaseTypedId):
    """UUID-based message identifier."""

This approach is not free. It adds more types, more names, and another convention the team has to understand.
So I am trying to understand where people draw the line.

I do not think every primitive should become a domain type.

But some values cross boundaries. How do you handle it in practice?
- typing.NewType
- primitive subclasses
- wrapper value objects
- Pydantic models
- something else?

Where do you draw the line between "this should just be an int / str" and "this deserves a domain type"?

[-]

TheseTradition3191@reddit

the newtype approach scales better than it looks on paper. once youve been burned by microsecond/second confusion in a codebase thats 6 months old, you never want plain int for time again

from typing import NewType

Microseconds = NewType("Microseconds", int)
Seconds = NewType("Seconds", int)

def wait(duration: Seconds) -> None:
    time.sleep(duration)

created\_at = Microseconds(1\_777\_961\_207\_000\_000)
wait(created\_at)  # mypy catches this

zero runtime cost, pure type-checker info. mypy/pyright catch the swap at the call site. only real downside is you cant define methods on them - thats where a small dataclass with `__slots__` earns its place

[-]

ResponseSeveral6678@reddit (OP)

Yep, I agree. NewType is a very good baseline for this.

The part I wanted to cover is the runtime / boundary side.

With NewType, this is great:

Microseconds = NewType("Microseconds", int)
Seconds = NewType("Seconds", int)

def wait(duration: Seconds) -> None:
    ...

created_at = Microseconds(1_777_961_207_000_000)
wait(created_at)  # type checker catches it

But after validation / serialization / deserialization, the value is just int again.

The pattern I'm experimenting with keeps the subtype at runtime:

from base_typed_int import BaseTypedInt
from pydantic import BaseModel


class Microseconds(BaseTypedInt):
    pass


class Seconds(BaseTypedInt):
    pass


class Event(BaseModel):
    created_at: Microseconds


event = Event.model_validate_json(
    '{"created_at": 1777961207000000}'
)

dumped = event.model_dump_json()
restored = Event.model_validate_json(dumped)

assert type(restored.created_at) is Microseconds
assert isinstance(restored.created_at, int)

So for me the difference is:

NewType: excellent static-only signal, zero runtime cost
BaseTypedInt: static signal + near-zero runtime cost + runtime subtype survives Pydantic/container boundaries
dataclass/value object: better when the value needs behavior or richer invariants

I would not use this everywhere. Mostly for boundary values like timestamps, durations, IDs, raw/validated input, etc.

[-]

snugar_i@reddit

What does BaseTypedInt do?

As for the question - for time-based stuff, timedelta is usually enough (the team already got used to the .seconds/.total_seconds() footgun), for the rest usually dataclasses.

[-]

ResponseSeveral6678@reddit (OP)

BaseTypedInt is intentionally boring.

from pydantic import BaseModel
from base_typed_string import BaseTypedString


class UserInputRaw(BaseTypedString):
    """Raw user input before validation."""


class UserInputValidated(BaseTypedString):
    """Validated user input."""


class IncomingRequest(BaseModel):
    user_input: UserInputRaw


class StoredCommand(BaseModel):
    user_input: UserInputValidated


class QueueMessage(BaseModel):
    user_input: UserInputValidated


def validate_user_input(user_input: UserInputRaw) -> UserInputValidated:
    return UserInputValidated(user_input.strip())


incoming_request = IncomingRequest.model_validate_json(
    '{"user_input": " hello "}'
)

validated_user_input = validate_user_input(incoming_request.user_input)

stored_command = StoredCommand(user_input=validated_user_input)

stored_json = stored_command.model_dump_json()

queue_message = QueueMessage.model_validate_json(stored_json)

type(incoming_request.user_input) is UserInputRaw type(queue_message.user_input) is UserInputValidated

So the meaning survives: request JSON -> Pydantic model -> validation -> storage DTO -> JSON -> queue DTO

With plain str, both values are just str.

With field names alone, the meaning is attached to the container.

With this pattern, the meaning is also attached to the value

But timedelta does not say whether a value is a retry delay, timeout, monotonic elapsed time, or wall-clock timestamp. It gives behavior. It does not necessarily carry your domain meaning.

[-]

shadowdance55@reddit

Or you can always create a custom class. Relying on primitives to transfer semantics is generally a bad idea.

Except from interpreter optimization, there is no real difference between using a primitive or a custom class instances. Like everything else in Python, they're both objects.

[-]

ResponseSeveral6678@reddit (OP)

Do your custom classes usually stay primitive-like, or do they grow their own API?

And do you use the same pattern across codebases, or change it per project?

[-]

BrannyBee@reddit

Overuse of primitives is one of those things that isnt technically wrong, but gives bad vibes to more experienced devs because we've seen how it can bite you in the ass down the road, and its kinda hard to get it without having seen it yourself.

If you havent heard of the term "code smell", its basically dev speak for "this works i guess but it gives bad vibes", often they're surface level things that indicates a deeper problem may exist, but to someone with less experience it may not even register as someshing bad, especially if the code works..

Over reliance on primitives is definitely one of those things, Refactoring Guru is a cool resource and their section on code smells is great, it explains why it may be a problem, how to recognize it, ways to address it, and the benefits of doing so

Check it out, kinda goes into exactly what you have and other better ways to handle this https://refactoring.guru/smells/primitive-obsession

[-]

shadowdance55@reddit

They follow the domain requirements, of course.

[-]

ResponseSeveral6678@reddit (OP)

Do you usually model those as simple wrappers like:

class Price: def init(self, value_in_cents: int) -> None: self.value_in_cents = value_in_cents

or do you sometimes go for subclassing primitives (int/str) to keep behavior more transparent?

[-]

shadowdance55@reddit

Neither. Primitives are an implementation detail, and hard-coding currency in attribute names is not domain language. What does "cents" even mean?

[-]

ResponseSeveral6678@reddit (OP)

Do you mean that things like "cents" are implementation details and shouldn't appear in the domain model at all?

I might have misunderstood your point.

[-]

shadowdance55@reddit

It should, as a separate attribute, e.g. "currency". Alternatively, the domain might not care about currencies, in which case "cents"are superfluous.

Additionally, in this particular examples, monetary amounts are not decimal values.

[-]

CzyDePL@reddit

NewType for parsed values with semantics (e.g. ClientId, which I only care is non-empty string), Value Objects for richer behaviour (validations, comparisons, lifecycles). In general I also try to leverage typing system and object-oriented design as much as possible to carry logic

[-]

ResponseSeveral6678@reddit (OP)

NewType for parsed values with semantics (e.g. ClientId, which I only care is non-empty string),

That is a funny, but actually effective way to keep things clear.

Also in my example, ClientId is a stable contract for an argument typing - today it can be string, tomorrow an int or uuid, but it still carries the same meaning, which is arguably more important quality for the programmer (maybe not so much for the interpreter)

But I wouldn't use inheritance for this purpose in any shape or form - I try to limit to strict subtyping / specialization (IS-A relation) where super class is carries some meaning - which BaseTypeInt doesn't really

My concern with value objects is API surface growth. Once each semantic value becomes its own object, you often need conversions, adapters, serializers, ORM/Pydantic handling, test factories, etc.

How do you usually limit the boundary of that pattern?

Do you keep value objects only at domain boundaries, or do they flow through most layers/ports?

[-]

CzyDePL@reddit

Domain model is module's private concern and doesn't cross its boundary. I strive to expose only module's facade with input / output DTOs. Yeah sometimes I had to write custom mapping for sqlalchemy or serializer/deserializer. But not for every VO, as not every is persisted. It's just my "custom type". With tests, I don't see it being a problem, actually it makes testing easier - I know which behaviour I need to test on value object vs which is tested in module tests.

[-]

Severe-Atmosphere790@reddit

Good question. Imho it depends on a proper architecture on all layers. So let's say I need client firstname. So I use "firstname" as column name in database, later in django model, later as key in json response, and as react variable on frontend side. Everywhere firstname is just string, but the name of variable travels through the project.

Of course there are situations when variable losts context for example in the function like to_uppercase(text: str). But then I aim to the point where firstname was some string before call and becomes uppercased after the call. But before and after the operation it was and is firstname.

The good practice is when we think and write names related to what data we operate, not how it works.

[-]

ResponseSeveral6678@reddit (OP)

Yeah, I do the same.

My main question is enforcement: how do you make sure other developers, or AI-generated code, keep that naming discipline across 1k+ lines?

[-]

Severe-Atmosphere790@reddit

Then of course I don't keep 100 variables, 100 parameters from one layer to another, but I collect them in pydatic models/ dataclasses/ forms. And then I can introduce validators/ sanity checks / property methods etc. And then combine all these things with SOLID principles makes code clean.

And if we talk about other programmers and AI - code review is mandatory. If I don't understand someone else code then I put comments. And in good team we polish code to the point when everyone understand context and way how this code works.

Worth noting- I appreciate comments after CR in my pull requst - it means my partners have read the code and they want to keep it clean and readable for everyone.

[-]

Ok_Tap7102@reddit

Have you ever actually encountered a genuine issue with this other than "my LLM needs more hints"?

[-]

ResponseSeveral6678@reddit (OP)

Mostly around boundaries.

Swapped args, mixed units, indistinguishable IDs, raw vs validated input.

And oftent it's me a few months later, not an LLM.

[-]

messedupwindows123@reddit

newtype

[-]

ResponseSeveral6678@reddit (OP)

Thanks, yeah - NewType is one of the common options.

[-]

messedupwindows123@reddit

sometimes, instead of a namedtuple, i will actually just make a tuple of newtypes, so that i get safe destructuring

[-]

ResponseSeveral6678@reddit (OP)

The main limitation for me is runtime - it doesn't really survive things like Pydantic or serialization/deserialization. Was that ever an issue for you?

[-]

messedupwindows123@reddit

i feel like i'm typically serializing into structs with named fields. and at least i can write the serializer to know: "I'm putting a 'cents' amount into the 'charge' field"

[-]

aloobhujiyaay@reddit

Wrapper/value objects feel more robust especially when you need validation + invariants