Everything Should Be Typed: Scalar Types Are Not Enough

[-]

hrvbrs@reddit

> type ShopId = string & { readonly __brand: "ShopId" };

> In TypeScript, branded types are just structural, so all string or number operations work without any extra code. The brand only exists at compile time.

One quick note: given the above types in TS, you would need to “cast” every relevant primitive value to the type you want. So I wouldn’t say “without any extra code” at all.

const shop_id: ShopId = "shop__1234"; // Error: __brand is missing
const shop_id: ShopId = "shop__1234" as ShopId; // fixed

[-]

Specialist-Owl2603@reddit (OP)

Yeah, you need to cast at the boundary where the value enters your system. What I meant is that once you have a ShopId, all the normal string operations work on it without unwrapping or extra methods. The brand is invisible at runtime. But yeah, the initial cast is required. Thanks :p

[-]

Dreadgoat@reddit

There's been a recent scad of "OOP/WebDev discovers functional programming" articles here lately.

I'm amused that this seems to be the opposite: rust/go devs discovering OOP.

fn process_order_payout(
  shop_id: String,
  customer_id: String,
  order_id: String,
  amount: i64,
  platform_fee: i64,
  tx_fee: i64,
  net_amount: i64
)

Sure, you can wrap all these primitives up to be their own types, but why are you passing unwrapped references to clear logical objects around like this to begin with? Choose the entity that is "acting" and give it the entities it requires to make decisions.

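struct Order {
  shop: Shop,
  customer: Customer,
  platform: Platform,
  amount: i64,
}
impl Order {
  // Bundle the entities the order needs to make its decisions.
  fn new(shop: Shop, customer: Customer, platform: Platform, amount: i64) -> Self {
    Order { shop, customer, platform, amount }
  }
  fn process(&self) { /* ... */ }
}
let order = Order::new(shop, customer, platform, amount);
order.process();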

You could also wrap amount in an Amount struct if you really wanted to I guess, but there's no way for confusion to happen at this point because the only scalars we ever pass around for an entity are not doing anything except being a true representation of a scalar value; that is, they don't logically link to anything except their own value.

Scalar types are good when they represent a logical scalar value. When you're not representing a logical scalar value, you should represent the entire logical entity (so long as performance allows)

[-]

Tubthumper8@reddit

> so long as performance allows

I think this is kinda the central philosophical difference. Rust programmers tend not to think this way ("I'll fetch all the data even if I don't need it"); they tend to think about fetching only what is necessary to do the job.

In your example, how many bytes are Shop, Customer, and Platform? The way that Order is written in the example, that could be potentially a large struct to pass on the stack because it's all allocated inline. I assume what you meant was this to allocate each entity separately:

struct Order { shop: Box<Shop>, customer: Box<Customer>, platform: Box<Platform>, amount: i64 }

(or & pointer for borrowed data rather than owned. Or one of the several other pointer types depending on the use case)

Why make heap allocations if you don't need to? And do Shop, Customer, and Platform also have their own unnecessary heap allocations inside them as well?

The other thing is, assuming these entities come from a database, is it doing SELECT * FROM shop (and Customer, Platform) to get every column of every table? The Rust programmer would ask - why? Do I actually need every column?

[-]

Dreadgoat@reddit

The argument 20 years ago was "why make your code harder to maintain and understand before you even know if performance is an issue?"

I think the argument stands even stronger today. The odds that whatever you're doing will matter after compilation are slim. The odds that the extra millisecond you incur for writing "heavy" code is relevant are incredibly slim.

After you've written easily maintained code, you can profile it, identify unacceptable performance bottlenecks, and turn them into assembly if you really have to.

[-]

Matthew94@reddit

> Hardware and compilers have gotten very powerful.

The eternal “sufficiently smart compiler”. Meanwhile in the real world vscode uses more RAM to idle than it took to run all of windows 98.

This idea that we should stop caring because we can come back and fix it is a fiction. People often don’t come back and fix things and the shit programmers just take this as permission to not even try.

[-]

Tubthumper8@reddit

Sure, and you could argue that having a bunch of data and variables that you don't need doesn't help understandability.

I get that performance argument, and I'm sympathetic to the "avoid premature optimization" philosophy. I'm saying that many Rust programmers are also sympathetic to the "avoid premature pessimization" philosophy.

It's situational of course, but in this exact situation (fetching joined entities that you don't need) I've personally seen plenty of cartesian explosions that cause real performance issues. Some of them were profiled, found to be a bottleneck, and fixed, which I think we'd both agree is a proper way to deal with it. Many should never have required that effort in the first place, had the code been written to use only the data it needed.

It's all situational, obviously. I just think the characterization of "Rust programmer discovers OOP" for an article talking about struct design doesn't really make sense here. People are well aware of what OOP is (well, to the extent that OOP has an actual definition that people agree on). There are just plenty of times where you don't want to fetch/join all foreign keys to the entity you're working with, and where using IDs instead of the full entity is perfectly acceptable, and possibly even preferred.

[-]

Valmar33@reddit

Speaking of OOP... Rust-style OOP is a trillion times cleaner than C++'s OOP nightmare.

Does Rust allow for dynamic dispatch? Was watching a talk on some of Casey Muratori's criticisms of C++, and a major thing he bemoans is the total lack of dynamic dispatch in C++ found in pre-C++ OOP languages, which he perceived as the real strength of OOP that C++ threw away. He speculated that Stroustrup perhaps didn't understand the sheer power it provided, so didn't implement it, losing so much flexibility in the process.

[-]

Full-Spectral@reddit

Rust supports dynamic dispatch through traits. Traits in Rust act similarly to both pure virtual interfaces and concepts in C++. But the big difference is that implementing a trait doesn't give your type a v-table pointer.

So you can have a method that takes a 'dyn Foo', and you can dynamically pass it anything that implements Foo. That parameter will actually be a fat pointer that includes the instance pointer and the v-table pointer.

You can also accept an 'impl Foo', which allows it to accept anything that implements Foo, but that call will be generic and get instantiated for each actual type it's invoked with.

Otherwise, when you are just calling those trait methods via the actual struct that implements them, they are just called normally. And this is important in Rust, where huge amounts of standard functionality are based on traits. Every type would be bloated up like crazy if they used the C++ scheme.
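
A minimal sketch of the two dispatch styles described above. The trait `Greeter` and the types `English` and `Spanish` are made up for illustration; nothing here comes from the article.

```rust
// Hypothetical trait and types, for illustration only.
trait Greeter {
    fn greet(&self) -> String;
}

struct English;
struct Spanish;

impl Greeter for English {
    fn greet(&self) -> String {
        "hello".to_string()
    }
}

impl Greeter for Spanish {
    fn greet(&self) -> String {
        "hola".to_string()
    }
}

// Dynamic dispatch: `g` is a fat pointer (data pointer + v-table pointer),
// and the method is looked up through the v-table at runtime.
fn greet_dyn(g: &dyn Greeter) -> String {
    g.greet()
}

// Static dispatch: a generic function, instantiated per concrete type.
fn greet_impl(g: &impl Greeter) -> String {
    g.greet()
}

// Returns (size of a plain reference, size of a trait-object reference).
fn pointer_sizes() -> (usize, usize) {
    (
        std::mem::size_of::<&English>(),
        std::mem::size_of::<&dyn Greeter>(),
    )
}
```

The v-table pointer travels with the `&dyn Greeter` reference rather than living inside `English` or `Spanish`, which is why the structs themselves stay as small as their fields (here, zero bytes).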

[-]

serviscope_minor@reddit

> a major thing he bemoans is the total lack of dynamic dispatch in C++

I mean... C++ has dynamic dispatch with the virtual keyword. Dynamic as in it's not a statically determined function call. So, I'm assuming you're referring to something more advanced than dynamic dispatch here...

> He speculated that Stroustrup perhaps didn't understand the sheer power it provided, so didn't implement it, losing so much flexibility in the process.

This seems wildly unlikely. C++ was from the get-go meant to be basically as fast as C. Even with 44 years of improvement in JIT compilers, none of the more flexible languages (in this regard) are as fast as C.

[-]

Kok_Nikol@reddit

Hah nice!

> There's been a recent scad of "OOP/WebDev discovers functional programming" articles here lately.

> I'm amused that this seems to be the opposite: rust/go devs discovering OOP.

From experience, I first learned C, then OOP, but after going back to C my code was sooo much nicer, even though it wasn't strictly OO code.

Dogma is bad, learning is always beneficial, and other obvious stuff :)

[-]

WuPaulTangClan@reddit

Great observation and great point!

[-]

abraxasnl@reddit

Amen! This is great, and would be even better if there was first class language support (that wouldn’t require a struct and deref). But you gotta start somewhere.

Next up, typing should start to include additional arbitrary constraints. A number type should be constrainable to a range, not just by virtue of how many bytes the int is (to give one example).

[-]

granadesnhorseshoes@reddit

Validation in the constructor gives you exactly that. Just as easy to test for a specific range (or anything else) as his shop_ prefix example.
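
A rough sketch of what a range check in the constructor could look like in Rust. `Percent` and `OutOfRange` are hypothetical names, and the 0..=100 range is just an example constraint.

```rust
// Hypothetical type: an integer restricted to 0..=100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Percent(u8);

// Error carrying the rejected value.
#[derive(Debug, PartialEq, Eq)]
struct OutOfRange(u8);

impl Percent {
    // The only intended way to build a Percent: the range check runs once, here.
    fn new(value: u8) -> Result<Self, OutOfRange> {
        if value <= 100 {
            Ok(Percent(value))
        } else {
            Err(OutOfRange(value))
        }
    }

    // Read-only access to the validated value.
    fn get(self) -> u8 {
        self.0
    }
}
```

In a real codebase the type would live in its own module so the inner field stays private and `Percent::new` is the only way in.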

[-]

LegendaryMauricius@reddit

But that is required to be checked at runtime. It simplifies things if it's a language feature enforced outside of the object's lifetime because that's where we have the most information. Any counter-arguments?

[-]

UdPropheticCatgirl@reddit

I mean you want dependent type system at that point…

C++ has a pseudo one if you are willing to go far enough into template land, not sure if it can reliably do what you want.
Scala is technically path dependent, so it can’t do what you want.
Zig is pseudo dependent, I am pretty sure it’s not expressive enough to do what you want reliably.
Haskell has a pseudo one as well with enough compiler extensions, could probably do some of what you want.
Idris and Agda can definitely do what you want reliably.
Ada can technically do what you want? Albeit I don’t think in the way you envision.

[-]

LegendaryMauricius@reddit

I should certainly dig more into them.

Although I didn't exactly target dependent typing, the topic was more about some logical checks.

[-]

UdPropheticCatgirl@reddit

well how do you ever statically analyze and type-check these “logical checks” without dependent types… SPARK Ada can do some magic in that regard but still only extremely limited…

[-]

DrShocker@reddit

This is what I always point to https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

[-]

LegendaryMauricius@reddit

Yeah. For the counter argument of 'types' being the memory structure and semantics up to the developer, float and int32 take the same amount of bytes but differ only in semantics.

[-]

lelanthran@reddit

> Next up, typing should start to include additional arbitrary constraints. A number type should be constrainable to a range, not just by virtue of how many bytes the int is (to give one example).

Pascal had it since the 80s. You can use Delphi or Lazarus today with that sort of typing if you want.

[-]

gamunu@reddit

I feel like this should be solved with proper unit testing

[-]

waterkip@reddit

Don't state obvious things, it breaks logics

[-]

Old_County5271@reddit

So you want Ada? Because I kinda want Ada too. I'm learning it and it's like "yessss this is how it should work!"

[-]

Full-Spectral@reddit

For most folks though, it needs to be a language they can use professionally in their area.

[-]

Old_County5271@reddit

This is the problem with most languages: you gotta build an ecosystem, but nobody uses the language enough to build an ecosystem, so it's forever stuck until it breaks through.

[-]

Full-Spectral@reddit

Well, to be fair, Ada had issues back then that made that worse. It was seen as being pushed by the military, not a bottom up deal, and most implementations (if not all initially) were targeting that military usage and were very expensive. From what I was just reading $10K and up per seat, and that's in 80s/90s dollars.

[-]

beders@reddit

Be careful with premature "concretions". If you get it wrong, you will be in pain. Especially with content provided by end users or external systems.

Concrete example I stumbled upon recently: We did an integration with a badly documented third party service. They sent us JSON webhooks with an "id" field. All examples in their docs and the sample webhooks we received contained IDs that looked like regular UUIDs. So one of our engineers picked the UUID column type when storing those webhooks.

Of course a week later we received IDs that looked like some weird prefix+UUID. INSERTs were failing, dashboards lighting up.

Due to the lack of specification, the engineer should have picked a more permissive data type like TEXT in the first place.

Luckily we didn't attempt to define static types for those webhooks and stuck with spec checks and immutable data, so the harm was limited to database inserts.

[-]

Valmar33@reddit

That appears like a scenario that unions would be handy for?

[-]

beders@reddit

Wouldn't help with PostgreSQL though.

And we prefer to treat data as data when it comes in and not encapsulate it behind magic things like statically typed objects that go beyond primitives. It's an immutable value.

A single data item stands on its own. Ideally it has a spec associated with it. We don't shoehorn it into a specific parent type. Its interpretation is subject to the use case at hand and can change. For example an integer representing an age. It can't be negative but depending on the use case it needs to be < 18, or > 18 or whatever else is required in the business logic.

We then don't try to invent new types based on that. That's nice and cool for toy examples. It gets gnarly very quickly for real-world scenarios.

On the platform level, it is great if you can define things like IDs that are strings on the surface level, but if and only if you have full control over them. The moment they end up in a database, you don't have that anymore.

[-]

useablelobster2@reddit

I get the idea, but you've just moved the weak link. What's stopping you instantiating the CustomerId type with shop_id, and what makes it different from putting shop_id in the customer_id field on the struct? Exactly the same issue, exactly the same lack of care needed when writing it.

I like more expressive and strict types, but this looks more like type fetishism to me, because you haven't actually fixed the core issue.

And the claim that a unit test wouldn't catch such an issue just means your tests aren't very good. No amount of strict typing will fix that.

[-]

rooktakesqueen@reddit

The only time you would ever have to instantiate a ShopId with a number is at your API boundary, with a web service or with your database for example. Everywhere else in your codebase, it's strongly typed. Minimizes the cross section of where that kind of error is possible, and ideally could fail fast if you made it (for example, if you prefix all customer IDs with c and all shop IDs with s, this could have a runtime check to only allow the correct ID pattern, but that only has to be checked once upon instantiation).
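
Something like the following is what I have in mind, as a sketch only. The `s_` and `c_` prefixes, the `parse` functions, and the `InvalidId` error are all assumptions made up for this example.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct ShopId(String);

#[derive(Debug, Clone, PartialEq, Eq)]
struct CustomerId(String);

// Error carrying the raw value that failed the pattern check.
#[derive(Debug, PartialEq, Eq)]
struct InvalidId(String);

impl ShopId {
    // Called once at the API or database boundary (assumed "s_" prefix).
    fn parse(raw: &str) -> Result<Self, InvalidId> {
        if raw.starts_with("s_") {
            Ok(ShopId(raw.to_string()))
        } else {
            Err(InvalidId(raw.to_string()))
        }
    }
}

impl CustomerId {
    // Same idea for customers (assumed "c_" prefix).
    fn parse(raw: &str) -> Result<Self, InvalidId> {
        if raw.starts_with("c_") {
            Ok(CustomerId(raw.to_string()))
        } else {
            Err(InvalidId(raw.to_string()))
        }
    }
}
```

Feeding a shop ID into `CustomerId::parse` then fails fast at the boundary instead of propagating through the codebase.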

[-]

useablelobster2@reddit

This is probably the detail which can actually swing it in favour. It's effectively parsing the raw value into something explicit, and could even be done automagically.

[-]

rooktakesqueen@reddit

This feels like a specific case of the more general idea of runtime constraints on values. Like Java has the @NotNull annotation where the compiler can require you to make a null check to get that trait, and then everywhere else can skip the null check because it already happened.

Or if your methods regularly require positive integers, you could have a PositiveInteger type that can only be constructed by going through a runtime check that rejects values ≤ 0.

I've never used a language that took fullest advantage of this, though. Like, how do you keep track of a value that's been checked for multiple traits? If I have a PositiveInteger type and an EvenInteger type, and my method requires an input that's both even and positive, do I need a separate PositiveEvenInteger type? Or can these traits be composed, like public int splitInHalf(@Positive @Even int operand) {...}

And how can a compiler or static analysis tool keep track of when a trait is satisfied, like knowing that if (value % 4 == 0) {...} means value has the Even trait within that block? Or foo = bar * 2 means foo is even if bar is an integer. Java's @NotNull annotation works because it's a special case known ahead of time by the toolchain, but ideally we should be able to write our own traits.

[-]

General_Session_4450@reddit

> The only time you would ever have to instantiate a ShopId with a string is at your API boundary,

This is usually also the largest surface where these types of assignments are done in the first place though. After that you have DTOs and other typed objects being passed around.

[-]

rooktakesqueen@reddit

In my experience, even systems with a data transfer layer still often use unboxed IDs as method arguments. The DTO collects associated data, but the ID is a single data value that isn't inherently associated with anything.

Like, studentRepository.getStudent(id) will return you a DTO, but its argument is still a raw string (unless following the advice in this article).

Of course this brings up a flaw with this approach, because even strict type safety can only get you so far. I could have a method assignStudent(teacher: TeacherID, student: StudentID) and this protects me from accidentally swapping the order of those arguments. But if I have a method assignTutor(tutor: StudentID, learner: StudentID) then the positions matter but they're the same type.

[-]

mpinnegar@reddit

You are collapsing the weak link from being in literally every function signature that cares about the primary key of the type to being in the signature of the constructor, ostensibly one that will be used when marshaling and unmarshaling a class and not throughout the codebase. That's a huge win.

[-]

davidalayachew@reddit

> You are collapsing the weak link from being in literally every function signature that cares about the primary key of the type to being in the signature of the constructor, ostensibly one that will be used when marshaling and unmarshaling a class and not throughout the codebase. That's a huge win.

To put it more simply, you have turned 100's and 100's of weak links into 1 singular weak link.

Rather than checking on every function call that you did things right (100's of weak links), the only check needs to happen in the constructor (1 singular weak link), and you just use the type that that constructor is attached to.

This is what people mean when they say that modeling your data correctly and in detail can solve a lot of problems before you ever even reach them. Here, by richly typing your data, a whole class of validation errors doesn't have to occur.

Granted, it has tradeoffs, but the point remains.

[-]

TheCritFisher@reddit

It's not a weak link if it's only one and well tested. Then it becomes a constraint, yeah?

[-]

fghjconner@reddit

It doesn't fully stop it, but this

new CustomerId(shop_id);

is much more obviously wrong than this

process_order_payout(customer_id, shop_id, order_id, amount, platform_fee, tx_fee, net_amount)

[-]

Pamasich@reddit

OP is asking what's the difference there to this example from the article:

let params = OrderPayoutParams {
    shop_id: customer_id,       // oops, customer ID assigned to shop field
    customer_id: shop_id,       // oops, shop ID assigned to customer field
    order_id: order_id,
    amount: net_amount,         // oops, net amount assigned to gross amount field
    platform_fee: tx_fee,       // oops, fees are swapped
    tx_fee: platform_fee,
    net_amount: amount,         // oops, gross amount assigned to net field
};

Not to the one you're bringing up there.

> I get the idea, but you've just moved the weak link. What's stopping you instantiating the CustomerId type with shop_id, and what makes it different from putting shop_id in the customer_id field on the struct?

[-]

Dreamtrain@reddit

> What's stopping you instantiating the CustomerId type with shop_id, and what makes it different from putting shop_id in the customer_id field on the struct?

A PR approval

[-]

lelanthran@reddit

> What's stopping you instantiating the CustomerId type with shop_id, and what makes it different from putting shop_id in the customer_id field on the struct?

Well, it's better to have the bug in a single place, at the construction of the type, rather than spread out throughout the system!

[-]

spline_reticulator@reddit

The difference is one or two weak links vs possibly hundreds. If you have CustomerId and ShopId types, your only weak links are where you instantiate those two types. If you just represent both as strings, then every time you define a function with a customer_id: str arg or a shop_id: str arg, that's another weak link in your system.

> And the claim that a unit test wouldn't catch such an issue just means your tests aren't very good, no amount of strict typing will fix that.

You shouldn't need tests for these kinds of bugs. The whole point is the compiler can detect them. I've had this debate with a lot of people. Many people have trouble wrapping their heads around this way of programming, but over a long enough timeline they all invariably hit an instance where they pass the wrong primitive value to the wrong primitive argument.
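
To make that concrete, here's a small Rust sketch (the thread's main language). `payout_label` and `demo` are hypothetical functions invented for this example; the point is only that the swapped call is rejected by the type checker.

```rust
struct ShopId(String);
struct CustomerId(String);

// Hypothetical function: the parameter types document and enforce the order.
fn payout_label(shop: &ShopId, customer: &CustomerId) -> String {
    format!("shop={} customer={}", shop.0, customer.0)
}

fn demo() -> String {
    let shop = ShopId("shop_1".to_string());
    let customer = CustomerId("cust_9".to_string());

    // Swapping the arguments is a compile error (mismatched types),
    // so no unit test is needed to catch it:
    // payout_label(&customer, &shop);

    payout_label(&shop, &customer)
}
```

With two `String` parameters instead, the swapped call would compile and only show up at runtime, if at all.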

[-]

meowsqueak@reddit

As others have said, it's arguably better to have one weak link at the surface than an unbounded number of weak links throughout the code. It's clearly better.

However, the other important aspect is that anywhere else in the code you're now guaranteed to have your type invariants upheld, so it's now significantly easier to reason about how functions behave, and your error handling burden/exceptional flow is significantly reduced.

Dedicated types also make it easier to understand some APIs, because function signatures provide a roadmap of A to B to C, whereas it can be a lot harder to see where i32 -> i32 -> i32 when there are a lot of different i32 types. Aliases help, but types enforce.

In Rust there are plenty of helper crates to add useful functions to wrapper/newtypes, it's really not much effort to create a new type. The only bits that I find annoying are combining different types, e.g. dividing Metre by Second. There are crates for common types that help with this, but it's still annoying with custom types, and you often have to explicitly handle references to them as well.
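
For the Metre-by-Second case, a hand-rolled version might look like this. It is only a sketch: the three unit types are hypothetical, and a real project would more likely reach for a units crate.

```rust
use std::ops::Div;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Metre(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Second(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct MetrePerSecond(f64);

// Metre / Second yields a distinct speed type instead of a bare f64.
impl Div<Second> for Metre {
    type Output = MetrePerSecond;

    fn div(self, rhs: Second) -> MetrePerSecond {
        MetrePerSecond(self.0 / rhs.0)
    }
}

// References need their own impl, which is the extra boilerplate mentioned above.
impl Div<&Second> for &Metre {
    type Output = MetrePerSecond;

    fn div(self, rhs: &Second) -> MetrePerSecond {
        MetrePerSecond(self.0 / rhs.0)
    }
}
```

Every pair of types that can be combined needs an impl like this, which is where the annoyance comes from once you have many custom units.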

[-]

manifoldjava@reddit

> but you've just moved the weak link

No. ID types in the API eliminate the weak link. In most data models IDs are typically generated and obtained through the API exclusively as ID types - API consumers don't/can't create them directly.

[-]

rooktakesqueen@reddit

I would say that depends on the context.

REST APIs will typically use strings or numbers as IDs, via path params, query params, or JSON. And at some point you'll need to store your data somewhere, and most databases will require you to flatten that CustomerID down into a varchar or whatever. Which your callers wouldn't see, but it's still an API boundary you have to cross, and the goal here is also to protect you from yourself.

On the other hand a Protobuf API might offer its own specific ID types, as might the API of a library you're linking to rather than a network service. Wherever practical, make your API type-rich!

[-]

markehammons@reddit

You can reduce the chance of a customer ID being instantiated from a shop ID by limiting instantiation to regions of code where knowledge of which is which is more common.

[-]

TheTomato2@reddit

I feel like in this situation packing everything onto a struct is enough. Too many programmers care too much about designing idiot proof systems instead of just simply reducing cognitive load. Yes I might make an understandable mistake if there are 18 parameters but assigning wrong values to variables is not a mistake I should be making and you can't design a system where I don't make that mistake. At some point you have to trust the programmer.

[-]

Absolute_Enema@reddit

The "weak link" is always going to exist, this is fundamental complexity.

[-]

Blecki@reddit

The weak link is now containable. If you're making a library you can choose to make your function return a string or a UserName - if the latter, your client code can't do anything with that that's not allowed for usernames.

[-]

backfire10z@reddit

> you’ve just moved the weak link

I had the same thought almost immediately. Maybe it is still better. At the very least, the problem has been pushed to the surface: there’s only one place that needs fixing and that’s the initialization/initial parsing of the variable.

[-]

Specialist-Owl2603@reddit (OP)

https://www.reddit.com/r/rust/comments/1skg83f/comment/ofz4dip/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

[-]

purbub@reddit

Yeay code design article!

Honestly I agree with the article. This is a cure to the classic anti pattern: primitive obsession.

[-]

Full-Spectral@reddit

I agree as well, though at some point it becomes so burdensome that it's hard to justify. If you look at some of the fully wrought units and measures type libraries, they can really put a lot of burden on the developer. On the whole, I don't have a problem with that, we signed up for this and we have responsibilities to our users, but there's a point of diminishing returns.

Rust has pretty good support for this kind of stuff (though no numeric range restriction capabilities beyond what the wrapper type does at runtime). But in a large system with thousands of small types that are technically unique and should not be mixed, that's a lot of work. I do it for the most important stuff (for my needs) like time related stuff, socket ports, lengths, etc...

[-]

Blecki@reddit

You just need tagging. Cumbersome the first time, but then easy as pie. Write a wrapper around string that takes a generic parameter - then your new string type just derives from that class, with itself as the generic parameter. Implement this generic class once per underlying type and you can make as many mutually exclusive "string types" as you want.

[-]

Full-Spectral@reddit

Rust doesn't support inheritance. And of course Rust also supports a lot of standard functionality via traits. Some of that you can auto-derive for a wrapper type, but not all of it. In a large system with hundreds of such types, it would be a LOT of work to make these types fully integrate into the system like their wrapped standard counterparts.

And there are always gotchas. Such strings couldn't be passed to any standard library or third party code that needs a mutable string, since they would require access to the underlying string type and would not honor the limitations.

[-]

cosmic-parsley@reddit

The Rust equivalent would be similar to what they describe, something like:

struct Stringish<T: StringWrapped>(String, PhantomData<T>);

trait StringWrapped {}

// Whatever helper methods you always need
impl<T: StringWrapped> Stringish<T> { … }

// Make it read-only usable as a str
impl<T: StringWrapped> Deref for Stringish<T> { type Target = str; … }

Then for each string-like thing it’s:

struct UsernameWrap;
impl StringWrapped for UsernameWrap {}
type Username = Stringish<UsernameWrap>;
impl Username {
    // Validate and create
    fn new(s: &str) -> Option<Self>  { … }
}

…
type Password = Stringish<PasswordWrap>;
type Phone = Stringish<PhoneWrap>;

So all the core functionality is in the generic implementation. Each new kind needs a dummy struct, an empty impl on that struct, an alias for convenience, and a new method or anything specific. So like 3 more lines to get a typesafe wrapper vs. having a standalone “validate_username” somewhere.
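
Here is one way the pieces above could be filled in so that it compiles. It's a sketch, not the only layout: the `is_valid` hook on the trait and the concrete validation rules are my own assumptions, and unlike the outline above the constructor lives in the generic impl rather than on each alias.

```rust
use std::marker::PhantomData;
use std::ops::Deref;

trait StringWrapped {
    // Hypothetical per-kind validation hook (an assumption for this sketch).
    fn is_valid(s: &str) -> bool;
}

// One generic wrapper; the tag type T keeps the kinds mutually exclusive.
struct Stringish<T: StringWrapped>(String, PhantomData<T>);

impl<T: StringWrapped> Stringish<T> {
    // Validate and create.
    fn new(s: &str) -> Option<Self> {
        if T::is_valid(s) {
            Some(Stringish(s.to_string(), PhantomData))
        } else {
            None
        }
    }
}

// Make it read-only usable as a str.
impl<T: StringWrapped> Deref for Stringish<T> {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

struct UsernameWrap;
impl StringWrapped for UsernameWrap {
    fn is_valid(s: &str) -> bool {
        !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric())
    }
}
type Username = Stringish<UsernameWrap>;

struct PhoneWrap;
impl StringWrapped for PhoneWrap {
    fn is_valid(s: &str) -> bool {
        !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
    }
}
type Phone = Stringish<PhoneWrap>;

// Accepts only usernames; passing a &Phone here would not compile.
fn shout(name: &Username) -> String {
    name.to_uppercase()
}
```

All `str` methods (`len`, `to_uppercase`, and so on) are available through `Deref`, but nothing hands out a mutable `String`, so the validated contents can't be changed behind the wrapper's back.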

[-]

DivideSensitive@reddit

I think you can get it easier:

#[derive(
    Clone,
    derive_more::Into,
    derive_more::Debug,
    derive_more::Display,
    derive_more::Deref,
    PartialEq,
    Eq,
)]
#[debug("Client#{_0}")]
#[display("{_0}")]
struct Username(String);
impl Username {
    fn new(s: &str) -> Result<Self, Whatever> {
        [...]
    }
}

[...]
i_want_a_string(*username);
i_want_a_string(username.into());

[-]

jl2352@reddit

Yes, `Deref` is the closest Rust has to inheritance.

However whilst your approach works, tbh I'd recommend people just make separate structs for `Username`, `Password`, and `Phone`, and use macros to decrease the boilerplate.

`type` causes issues as you will often see `Stringish` show up in the type system rather than `Username`, and so things get confusing.

You also end up with bespoke needs coming up. For example with `Password`, you may want to write empty data over the string's contents before it is dropped. You may also want the constructors to return a `Result` of `Self` and `PasswordError` or `UsernameError`, which would each give information about why the validation failed. It ends up being simpler if they are just structs.

[-]

cosmic-parsley@reddit

This was meant to be a mapping of the C++ version described above, I’d probably use macros in many cases too. It’s a clean enough design though: type aliases aren’t much of a concern (usually errors will point to a Username somewhere in code even if it prints the alias), and you can return a Result or add whatever type-specific functions you want (impls are on the concrete filled in type, not the generic).

[-]

Blecki@reddit

Exclusive type aliases are a feature I'd very much like. My example actually had C# in mind but it should work in any language with generics.

But oh how I'd like to just write "class Username exclusive alias string;" and have a new type with all of strings features except that it can't be converted to a string.

[-]

Full-Spectral@reddit

But, as I said, you aren't really creating a new string type. It's a very limited, immutable string type. That has its place but it's not really the answer for this problem. To make it actually useful, you'd have to implement a lot of traits, and you could still never pass it mutably to anything that expects an actual string, which will be a lot of stuff.

[-]

ShinyHappyREM@reddit

> standard library or third party code

That's when you give them the underlying data (possibly in a copy) and then check what they return.

[-]

Full-Spectral@reddit

Again, you could do it, but in practice it will be full of gotchas.

[-]

New_Enthusiasm9053@reddit

Rust does have const fn, so you could validate the size of a type at compile time if it's a known value, but obviously that's not super useful.
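
A sketch of what that could look like. The `Port` type and the "at least 1024" rule are assumptions picked for the example; the mechanism is just an `assert!` inside a `const fn` that gets evaluated when initializing a `const`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Port(u16);

impl Port {
    // Hypothetical rule: only non-privileged ports (>= 1024) are allowed.
    const fn new(value: u16) -> Port {
        assert!(value >= 1024);
        Port(value)
    }

    const fn get(self) -> u16 {
        self.0
    }
}

// Evaluated at compile time; the check has already passed by the time this builds.
const HTTP_ALT: Port = Port::new(8080);

// Uncommenting this fails the build, because the assert panics during
// constant evaluation:
// const BAD: Port = Port::new(80);
```

This only helps for values known at compile time; anything read at runtime still goes through the same check as a normal runtime panic.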

[-]

GhostPilotdev@reddit

Primitive obsession is one of those things you don't notice until you've debugged a function that takes four different strings and two ints as arguments.

[-]

CandidateNo2580@reddit

I feel the same way. I haven't taken it this far yet but this is how I've been pushing my codebases lately. It just makes more sense to pass in domain objects defined this way imo. I get it for non-domain performant code you do what you need to do, but that's not the majority of changes to large codebases.

[-]

tubbstosterone@reddit

I think this could work for many cases, but be disastrous in my field. Type bloat and boxing are fine in a LOT of circumstances, but specialized container types + copious DTOs become a counterproductive guardrail when you're processing terabytes of data a day. At a certain point you'd have to use specialized language features like some of those introduced in C++20 and later to tell the compiler how to work with your types, and at that point you've added too much cleverness.

Neat idea, but I'd probably avoid it unless I'm feeling fancy and frisky.

[-]

garnet420@reddit

How does the use of these wrapper types interfere with throughput?

[-]

tubbstosterone@reddit

Usually has to do with things like SIMD operations, passing data between language layers, and byte packing requiring super strict type constraints. Gets weirder once the very few third party libraries we're allowed to use require their own typing constraints. We also often use python to bridge scientific code, so types there are really just mountains of dicts rather than individual values. Haven't worked on it in a while, but Java USED to be super eager to box and unbox unless you were right up on it with the typing. Starts to really hurt when you're bouncing millions of floating point values around and doing things like rolling slices.

Granted it's not as painful as I first thought with C and C++, where it just adds more cognitive overhead.

No idea on the Rust-front. New blood may bring that into the org, but hiring freezes and turbulent times aren't helping there.

[-]

MEaster@reddit

In Rust, simple wrapper types like these:

struct OrderId(String);
struct Amount(i64);

have the same runtime representation as the inner type. You could still run into issues with some optimizations. For example, the standard library has some optimizations for creating a Vec filled with zero-values of certain types (that is, values where all the bits are zero) which wouldn't apply to the wrapper type.
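The layout claim can be checked directly. A small sketch, with one addition that is not in the comment above: `#[repr(transparent)]`, which makes the "same layout as the inner field" property a guarantee rather than something that merely holds in practice. The helper name `wrappers_are_free` is made up for this example.

```rust
use std::mem::{align_of, size_of};

// `#[repr(transparent)]` is an addition for this sketch: it guarantees that a
// single-field wrapper has the same layout as its field.
#[repr(transparent)]
pub struct OrderId(String);

#[repr(transparent)]
pub struct Amount(i64);

/// Returns true when both wrappers have the size and alignment of their inner type.
pub fn wrappers_are_free() -> bool {
    size_of::<OrderId>() == size_of::<String>()
        && align_of::<OrderId>() == align_of::<String>()
        && size_of::<Amount>() == size_of::<i64>()
        && align_of::<Amount>() == align_of::<i64>()
}
```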

[-]

krutsik@reddit

I agree with the general premise, but the example is bad.

The application runs, pays out the wrong entity, credits the wrong amount, and nobody notices until a seller asks why they received ₦350 instead of ₦54,000.

The odds of there being a customer with an id identical to a shop and a shop with an id identical to a customer (assuming random UUIDs) are soooo astronomically low that you'll notice a ton of failed payments before a single one actually goes through.

[-]

hl_lost@reddit

the last comment kinda misses it imo — sure you can still misuse it at the construction site, but you've reduced the surface area to one place instead of every function signature that passes ids around. compilers catching 90% of id mixups is still a huge win ngl

[-]

codeconscious@reddit

I very briefly had a bug in my F# application due to this sort of thing recently. I carelessly swapped the string parameters to a writeFile function so that I was passing JSON as the filename and vice versa. I decided to add an enclosing Json type to help prevent such silliness in the future.

let parseToSettings (Json json) : Result<Settings, CommandError> =

That syntax auto-extracts the inner string so that the function can access it directly. I consider this better than just passing a plain string, and I'm aiming to reduce primitive obsession where appropriate moving forward.
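Rust allows the same kind of destructuring in a parameter list. A rough sketch of the idea in Rust rather than F#, where `FileName`, `Json`, and `write_file` are hypothetical stand-ins for the commenter's code, and the function returns a description instead of touching the disk:

```rust
/// Hypothetical wrappers; the names are illustrative.
pub struct FileName(pub String);
pub struct Json(pub String);

/// The patterns in the parameter list unwrap both values, so the body works
/// with plain `String`s while callers must still pass the right wrapper types.
pub fn write_file(FileName(name): FileName, Json(body): Json) -> String {
    format!("{} <- {} bytes", name, body.len())
}
```

Calling `write_file(Json(..), FileName(..))` with the arguments swapped fails to compile, which is the class of mistake described above.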

[-]

nsn@reddit

Did this in a codebase years ago - everything was a type, FirstName, LastName etc. Led to a fuckton of boilerplate and still didn't prevent mistakes when receiving data via REST or when fetching data from external data sources.

Wasn't worth it after all and we ditched it for the next project.

I think the underlying issue is having functions that require seven strings as input in the first place.

[-]

Absolute_Enema@reddit

That's a more general issue of using static typing and a compile-modify dev cycle in a context where most of the complexity is interacting with external systems outside of your control.

[-]

Valmar33@reddit

A good old struct is the best middle ground ~ along with adding checks and asserts for what the valid values of major fields should be.

[-]

o5mfiHTNsH748KVq@reddit

When I read the title I was in 100% agreement, then I glanced at the article.

Yes everything should be typed. String is a type.

[-]

Specialist-Owl2603@reddit (OP)

String is a type, but it's not your type. It tells the compiler "this is text" but not "this is a shop ID." When two fields are both String, the compiler can't stop you from swapping them. That's the point.
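To make the swapping point concrete, here is a minimal sketch. `describe_payout` and `example` are hypothetical names, and the function only formats a message; what matters is the signature:

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ShopId(pub String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CustomerId(pub String);

/// Hypothetical payout routine; only the parameter types matter here.
pub fn describe_payout(shop: &ShopId, customer: &CustomerId, amount: i64) -> String {
    format!("pay {} to shop {} for customer {}", amount, shop.0, customer.0)
}

pub fn example() -> String {
    let shop = ShopId("shop_1".to_string());
    let customer = CustomerId("cust_9".to_string());

    // Swapping the arguments is rejected by the compiler (mismatched types):
    // describe_payout(&customer, &shop, 54_000);

    describe_payout(&shop, &customer, 54_000)
}
```

With two `&str` parameters instead, the commented-out call would compile and run.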

[-]

jean_dudey@reddit

I think this pattern would be more used if Rust had a notion of refined types.

[-]

MaleficentCaptain114@reddit

Have you heard of Flux?

[-]

PropagandaOfTheDude@reddit

Similar vibe: identifying exceptions by pattern matching on their human-readable message.

[-]

o5mfiHTNsH748KVq@reddit

To me, it looks over-engineered and wouldn’t pass code review on my teams, but perhaps there’s a time and a place where this sort of diligence is necessary.

You’re not technically incorrect, but I’m not confident the juice is worth the squeeze to wrap literally everything.

Maybe on things where there’s dire consequences like avionics or medical stuff.

[-]

lelanthran@reddit

You’re not technically incorrect, but I’m not confident the juice is worth the squeeze to wrap literally everything.

I'm here to tell you that it is worth the squeeze.

Maybe on things where mistakes have dire consequences like avionics or medical stuff.

Yeah, I wrote munitions control software.

[-]

o5mfiHTNsH748KVq@reddit

See, that's an example where I thought maybe I should walk my take back a bit.

[-]

lelanthran@reddit

See, that's an example where I thought maybe I should walk my take back a bit.

Yeah, but to be fair to you; I should have seen you hedge and left it alone :-) You are not wrong!

[-]

c-digs@reddit

To me, it looks over-engineered and wouldn’t pass code review on my teams

Do you think it makes sense to have a Uri type that represents URLs?

I think that's quite nice because there are a number of operations that make sense on a URI. You could just pass around a string url parameter, but isn't it kind of nice to pass around the Uri uri and get all of the nice URI specific operations?

We do this a lot in C# like:

// Type
public record EmailAddress(string Email);

// Usage
public async Task SendMail(EmailAddress recipient) { ... }

We can have some base behaviors on the EmailAddress class, but specific verticals can add behaviors using extension members/extension methods specific to their use cases.

We don't do this for every string, but for ones where we know that we have behaviors related to the string (in this case, the domain)

[-]

o5mfiHTNsH748KVq@reddit

I’m not sure Uri is a great example because that’s a class and has quite a lot of additional properties and methods on it, as you mentioned.

If encapsulating functionality is the objective, yes it makes a ton of sense. That’s the foundation of OOP.

[-]

c-digs@reddit

Of course it is a class, but the class encapsulates behaviors on the underlying string; that's the point. A Uri type never enters a system as a URI; it always enters the system as a primitive type and has to be converted to an instance of Uri (in C#).

The point of the article is to write a class that encapsulates the underlying primitive because it is often useful to do so, even if just for safety of internal API calls.

An API that says:

public void DoSomething(Uri uri)

gives stronger clues and compile time safety versus:

public void DoSomething(string uri)

(Still not perfect because there are still runtime errors possible when creating the Uri)

[-]

o5mfiHTNsH748KVq@reddit

It’s only better on 99% of codebases because you want that Validate method. Just wrapping a string in a named type to prevent someone from accidentally assigning a different string to it is not, in my opinion, adding much value relative to the additional complexity.

The fact that you keep falling back to explaining basic OOP principles illustrates my point. Uri has utility methods on it. Your AmazonAsin has a utility method on it.

[-]

lelanthran@reddit

It’s only better on 99% of codebases because you want that Validate method.

You don't want to validate it, you want to parse it.

Parse. Don't Validate.

And once it's parsed into a Uri type, then no more validation is necessary, and all the methods work without needing to perform null-checks, and all the return values from functions that retrieve parts of the Uri all return something sensible.

Parse, Don't Validate.
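A sketch of that idea in Rust, using an email address rather than a Uri. The `EmailAddress` type, its error enum, and the deliberately simplistic "split at the first @" rule are all assumptions made for illustration, not a real email grammar:

```rust
use std::str::FromStr;

/// Hypothetical domain type: it can only be built through `from_str`,
/// so holding one proves the checks already ran.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmailAddress {
    local: String,
    domain: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum EmailError {
    MissingAt,
    EmptyLocal,
    EmptyDomain,
}

impl FromStr for EmailAddress {
    type Err = EmailError;

    // Simplistic rule for the sketch: "local@domain", split at the first '@'.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (local, domain) = s.split_once('@').ok_or(EmailError::MissingAt)?;
        if local.is_empty() {
            return Err(EmailError::EmptyLocal);
        }
        if domain.is_empty() {
            return Err(EmailError::EmptyDomain);
        }
        Ok(EmailAddress {
            local: local.to_string(),
            domain: domain.to_string(),
        })
    }
}

impl EmailAddress {
    /// No Option and no re-check: a parsed address always has a domain.
    pub fn domain(&self) -> &str {
        &self.domain
    }
}
```

The raw string is parsed once at the boundary with `"ada@example.com".parse::<EmailAddress>()`; everything downstream takes `EmailAddress` and never repeats the check.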

[-]

c-digs@reddit

The fact that you keep falling back to explaining basic OOP principles illustrates my point

What point? That we should use classes and types to represent important domain structures instead of primitives? Yes. We should. That is the point of the article. It follows that once you have a type or a class, you can attach behavior to it.

[-]

o5mfiHTNsH748KVq@reddit

I’m not sure you can attach behavior to records or structs in c#

[-]

Absolute_Enema@reddit

If you have five UrlUtils classes over a single codebase where everything is under your control, either your build system is so shit that extracting a library is not worth the effort and thus you have far greater problems, or it's a fundamental process issue.

[-]

c-digs@reddit

This is common when teams are working in isolated verticals on a monorepo and teams are writing what they think are one-offs for working with common strings (e.g domains, emails, phone numbers) with static classes in their own namespaces/modules. You're just not going to know that another team wrote something similar and it should be promoted to a common lib.

It becomes especially common now when agents are writing a lot of the code.

The agent is not very good at always following instructions to follow DRY.

[-]

lelanthran@reddit

String is a type, but it's not your type. It tells the compiler "this is text" but not "this is a shop ID." When two fields are both String, the compiler can't stop you from swapping them. That's the point.

I'm glad my point is being spread far and wide; from my blog post using this example:

And finally, the last upside: with different type names for different types, there will never be a situation where a caller might accidentally switch around the parameters in a call.

[-]

frankster@reddit

in C, (void*) is a type.

[-]

ProfessionalLimp3089@reddit

The funny thing is this argument just got way more relevant with AI coding tools. When I write code myself, a UserId vs int mixup is something I catch before I finish the line. When AI generates a function, that class of bug is invisible in the output until it explodes at runtime. Strong opaque types are a machine-readable contract the model can actually follow. Without them you're hoping the LLM infers your intent correctly from variable names, which works fine until it very much doesn't.

[-]

shayan_el@reddit

I love it every time we rediscover "Parse, don't validate" (https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate)

[-]

Plank_With_A_Nail_In@reddit

It's still possible for someone to put the wrong value into shop_id. They can't assign it incorrectly using other methods, but the root problem still exists: someone put the wrong value in shop_id.

[-]

Kanegou@reddit

I disagree. It's too verbose and leads to boilerplate code.

[-]

Specialist-Owl2603@reddit (OP)

A few lines per type vs a 2am debugging session I had. I'll take the boilerplate lmao

[-]

Valmar33@reddit

A few lines per type vs a 2am debugging session I had. I'll take the boilerplate lmao

At some point, you become buried in a mountain of boilerplate, when throwing relevant stuff into a carefully checked struct is better.

[-]

Kanegou@reddit

That's what I thought too at first. But it all starts to add up quickly when you start to use it in a real codebase. E.g. methods that work on these scalar types need to be designed or templated for it. Code reuse becomes harder. Or you'll end up defining common traits for these types with implicit type conversions, which ends up defeating the purpose. Everything has a cost. And some of it is not visible up front.

That's why this pattern stayed in a greenfield project and never made it into the "bread and butter" codebase. Ofc it's a question of type of project, team size and programming language too. So your conclusion might differ. But for me, this experiment was not worth it at all. So go on reddit, downvote me for having a different opinion.

[-]

Hot-Employ-3399@reddit

Code reuse becomes harder

If anything it becomes easier because if you use function foo(shop_id: ShopId, customer_id:CustomerId) and try to call bar(customer_id:CustomerId, shop_id:ShopId) using the same argument order bar(shop_id, customer_id) as 10 other functions written across several years do, it will be caught.

Everything has a cost

Yeah, Specialist-Owl2603 will not get several bucks for overworking at night.

[-]

Kanegou@reddit

You don't understand my point. You can't have a single func foo(int id) that works for all types of ids. You need to define overloads, implicit conversions or templates.

[-]

lelanthran@reddit

You can't have a single func foo(int id) that works for all types of ids. You need to define overloads, implicit conversions or templates.

Option 1: I suppose if you're deep into OO and are okay with inheritance, you can create type hierarchies (not necessarily object hierarchies, depending on the OO language you are using).

Option 2: It's a nothing-burger. Have a look at how Pascal does range types - assigning to a wider type is allowed, so foo (i: int) can be called with a value that is of type 1..10. For obvious reasons, if foo returns an Int that return value cannot be assigned to a range type of 1..10 unless you explicitly cast, but then you broke the typing system anyway.
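A rough Rust analog of the range-type behaviour described in Option 2, not Pascal itself. `OneToTen` and `double` are hypothetical, and unlike Pascal the widening here is explicit (`.into()`) rather than implicit: going from the range type to `i32` always succeeds, while going back has to be checked.

```rust
use std::convert::TryFrom;

/// Hypothetical analog of a Pascal `1..10` subrange.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OneToTen(i32);

// Widening: every OneToTen is a valid i32, so this cannot fail.
impl From<OneToTen> for i32 {
    fn from(v: OneToTen) -> i32 {
        v.0
    }
}

// Narrowing: an arbitrary i32 may be out of range, so this returns a Result.
impl TryFrom<i32> for OneToTen {
    type Error = String;

    fn try_from(v: i32) -> Result<Self, Self::Error> {
        if (1..=10).contains(&v) {
            Ok(OneToTen(v))
        } else {
            Err(format!("{} is outside 1..=10", v))
        }
    }
}

/// A function written once against the wide type.
pub fn double(i: i32) -> i32 {
    i * 2
}
```

`double(seven.into())` works for any range type that converts to `i32`, and the result only becomes a `OneToTen` again through `try_from`.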

[-]

UdPropheticCatgirl@reddit

You don't understand my point. You can't have a single func foo(int id) that works for all types of ids. You need to define overloads, implicit conversions or templates.

But that’s a language issue, isn’t it? Rust actually isn’t even the worst offender here because it has the Deref trait, and you can solve a lot of it with generics… Similarly, something like Haskell or Scala gives you much better tools to tackle it.

What you are complaining about is essentially the type checker actually doing its job…
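One way the generics route could look, as a sketch: the `Id` trait and `log_key` function are hypothetical names, and the IDs are assumed to wrap a `u64`. A single function then serves every ID type without overloads or implicit conversions, while `ShopId` and `CustomerId` stay distinct everywhere else.

```rust
/// Hypothetical trait shared by every ID wrapper.
pub trait Id {
    fn raw(&self) -> u64;
}

pub struct ShopId(pub u64);
pub struct CustomerId(pub u64);

impl Id for ShopId {
    fn raw(&self) -> u64 {
        self.0
    }
}

impl Id for CustomerId {
    fn raw(&self) -> u64 {
        self.0
    }
}

/// Written once, usable with any type that implements `Id`.
pub fn log_key<T: Id>(prefix: &str, id: &T) -> String {
    format!("{}:{}", prefix, id.raw())
}
```

The cost raised earlier in the thread is still visible: each new ID type needs its own `impl Id`, which a macro or derive could generate.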

[-]

UdPropheticCatgirl@reddit

Only if the language has bad support for distinct types… It leads to some boilerplate in Rust, as the article shows, but that can be worked around with the Deref trait… Similarly, you can sort of live with it in TypeScript. Go, Haskell and Scala do this pretty well. C#, C++ and Java are very boilerplate-heavy, but that’s kinda expected. Kotlin and Swift handle it well enough and fall into a similar space as Rust.
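A minimal sketch of the Deref workaround, assuming a `ShopId` that wraps a `String` (`takes_str` and `example` are made-up helpers). Dereferencing to `str` gives read-only access to every `&str` method and lets `&ShopId` coerce to `&str`:

```rust
use std::ops::Deref;

pub struct ShopId(String);

impl ShopId {
    pub fn new(raw: &str) -> Self {
        ShopId(raw.to_string())
    }
}

impl Deref for ShopId {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

/// Stand-in for any existing function that expects a plain &str.
pub fn takes_str(s: &str) -> usize {
    s.len()
}

pub fn example() -> (bool, usize) {
    let id = ShopId::new("shop__1234");
    // `starts_with` is a str method reached through auto-deref;
    // `&id` coerces from &ShopId to &str at the call site.
    (id.starts_with("shop__"), takes_str(&id))
}
```

The Rust API guidelines recommend reserving Deref for smart pointers, so this is a convenience trade-off rather than settled practice. It also does nothing for APIs that need an owned or mutable `String`.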

[-]

joemwangi@reddit

How is java verbose? Yet the article actually describes all the features java records have with validation at constructor level.

[-]

UdPropheticCatgirl@reddit

You have no way to actually destructure those types concisely… Yes, records mean you can do it in 2 lines instead of like 6, but you also have to name the inner element something because Java has no structural products, and you can’t just pass the wrapper type into places where you can typically pass the primitive to begin with.

[-]

joemwangi@reddit

Java can actually do nested deconstruction through pattern matching. Java records are quite structural by design. Probably you're referring to deconstruction assignment. That is planned soon. Also, thinking much about your argument, do you mean the scenario where type is inferred in deconstruction/destructuring such as Order(ShopId(id), amount) = order;?

[-]

UdPropheticCatgirl@reddit

I didn’t say there is no way to do destructuring in Java, just that there is no concise way to do it, and pattern matching in switch expressions is not particularly concise. That’s a minor issue, though. The bigger issue is that you end up destructuring and reconstructing back and forth, or wrapping functions of the inner type, to get anything done, because you lack some form of subtyping on primitives or something like the Deref trait. That’s actually why you need the destructuring to begin with.

More expressive generics would probably also ease the pain here.

Also, nothing about the Java record is particularly structural: it is type-checked entirely nominally and everything still has to be bound to a name, which again just forces you to add identifiers for the inner members.

[-]

davidalayachew@reddit

I understand how Java chose nominal over structural deconstruction. I don't see how that equates to verbose or wordy. All the verbosity is at the class declaration, but everything after that is 1-2 lines. Can you give an example of where Java forces you to be verbose?

[-]

UdPropheticCatgirl@reddit

Hey, we actually spoke on java subreddit couple of times. I don’t disagree with the choice of nominal products in context of java, but I just think that having tools to just do wrapper type without having to name the inner components can be useful and communicates intent better…

And it’s inherently more verbose because you have to name the inner component, and the name there is completely inconsequential yet it will keep popping up.

[-]

davidalayachew@reddit

Hey, we actually spoke on java subreddit couple of times.

Yes, I remember you. I don't remember the occasions, but your tag has +6 next to it, so I certainly agreed with you lol.

I don’t disagree with the choice of nominal products in context of java, but I just think that having tools to just do wrapper type without having to name the inner components can be useful and communicates intent better…

Ok, so it is at the class declaration point that you are saying Java is more verbose, which is true. I don't make enough ad-hoc types to pay the price often, but maybe being structural encourages more ad-hoc types, making this problem more noticeable.

And it’s inherently more verbose because you have to name the inner component, and the name there is completely inconsequential yet it will keep popping up.

Assuming you are talking about class declaration, yes, though mine are rarely inconsequential.

But for class deconstruction, you can usually just use a catch-all (_) to completely elide the name. Thus, even a 10 argument record Blah becomes if (someObject instanceof Blah(_, _, _, _, _, _, _, _, _, var someField)).

[-]

UdPropheticCatgirl@reddit

Ok, so it is at the class declaration point that you are saying Java is more verbose, which is true. I don't make enough ad-hoc types to pay the price often, but maybe being structural encourages more ad-hoc types, making this problem more noticeable.

Scala does this really well imo:

object PositiveInt {
  opaque type PositiveInt = Int
  def from(i: Int): Option[PositiveInt] =
    if i > 0 then Some(i) else None
}

The entirety of the scope where the opaque alias is defined will treat it as the underlying type, outside of it it’s completely distinct, when you need to thread them as Ints you just do:

 given positiveAsInt: Conversion[PositiveInt, Int] = identity

And then you can just import the positiveAsInt when you need to be able to treat it as int at the callsite:

 def foo(a:PositiveInt,b:PositiveInt) = {
    import PositiveInt.positiveAsInt
    a+b
 }

or better yet:

import PositiveInt.given

def add[A](a: A, b: A)(using Conversion[A, Int]): Int = {
    val conv = summon[Conversion[A, Int]]
    conv(a) + conv(b) 
}

Assuming you are talking about class declaration, yes, though mine are rarely inconsequential.

 record Integer(int inner){}

the inner here is typically completely inconsequential yet you can’t just leave it empty.

But for class deconstruction, you can usually just use a catch-all (_) to completely elide the name. Thus, even a 10 argument record Blah becomes if (someObject instanceof Blah(_, _, _, _, _, _, _, _, _, var someField)).

sure, that’s mostly useful for actual products, not type aliases…

[-]

davidalayachew@reddit

The entirety of the scope where the opaque alias is defined will treat it as the underlying type, outside of it it’s completely distinct, when you need to thread them as Ints you just do:

I guess I've never really needed to thread them as ints. Usually, if I am making a whole wrapper, I am doing it with a purpose. Usually constrained access to some state or something like it. That might be why I never ran into this -- my code never really needed or wanted it.

the inner here is typically completely inconsequential yet you can’t just leave it empty.

Same point here -- I wrap with intent. For example, if I am making Float16, one of my fields is called mantissa. Even a uint8 is called something like positive. And worst case scenario, I can just call it i. For me, it's just kind of a non-issue.

sure, that’s mostly useful for actual products, not type aliases…

I use actual Product Types as my Type Aliases. Granted, a poor man's version, as I don't get to call the methods of the wrapped type, but that's just java not having easy means to delegate.

[-]

UltraPoci@reddit

I'll never understand complaints like this one. Types are information, not boilerplate. Code should be readable and solid; no one gets points for having fewer lines of code. Sure, types can be introduced badly and make a mess of a code base, but that's true of everything: you can write an algorithm badly involving only integers.

[-]

Vectorial1024@reddit

Depending on the language, there may be user defined implicit casting of types.

[-]

BenchEmbarrassed7316@reddit

Oh yes, types are just noise. Ave dynamic typing! Ave JavaScript! /s

[-]

freecodeio@reddit

some of us are working on drone software, some of us are working on strait of hormuz mine sweeping algorithms from sonar data, some of us are building todo lists via chatgpt

whoever you are, types are an investment, saying types lead to boilerplate code is like saying investment leaves you with less money

[-]

Blue_Moon_Lake@reddit

"Too verbose", it's 1 line.

[-]

sards3@reddit

I think this should be done sparingly, and only when confusion between primitive-typed values is likely.

[-]

oatmealcraving@reddit

AKA the reason everyone abandoned Pascal and moved to C++

If Pascal had a less frustrating typing system there would have been far less buggy C++ floating around and far better software for everyone.

[-]

ShinyHappyREM@reddit

How is Pascal lacking in this regard?

Note that there's more than one standard; the 2 current ones are Free Pascal/Lazarus and Delphi.

[-]

oatmealcraving@reddit

I didn't say lacking, I said too much type checking causing user frustration.

Pragmatically, if you can actually program, the type system is there to improve compiler effectiveness. And only incidentally to reduce bugs.

Pascal is a nice language though, with clear layout of code. It is a pity about the unbearable type system.

[-]

KyNorthstar@reddit

This is why I made SpecialString. It makes creating these specialty compile-time types much easier.

[-]

Supuhstar@reddit

also a great argument for Swift’s argument labels

[-]

garnet420@reddit

I like your article and its points but the "interchangeable_params" flag idea is just not great. There are far too many operations on multiple things of the same type. For example, basically every interesting string operation involves two or more strings!

[-]

WarEagleGo@reddit

Ada is strongly typed

Are we circling back to the 80s?

[-]

nelmaven@reddit

In the context of a medium/large project, this makes perfect sense.

It might be hard to enforce though.

[-]