Tests Don’t Prove Code Is Correct… They Just Agree With It

[-]

drnullpointer@reddit

The idea of test is that the behavior is expressed in an easy to understand and easy to verify way.

For example, when my test says "given parameters X and Y the function F is expected to return value Z", this is easy for me to understand and verify.

So the tests are *MEANT* to agree with the code. It is a feature, not the bug.

But for the tests to be valuable, it is important that it is easy to see and understand what exactly we are agreeing with.

[-]

Tests are a sieve - the first bulwark. Bugs are particles. They'll get through. Tightening the sieve to 100% is impractical. You need to add layers to the quality approach, and in this manner less bugs make it to production with impact.

Selenium automation can be layer but is hard to maintain.

Hypothesis testing can help for very sensitive domains.

Good CI CD with linters, formatters and perhaps type checkers on PR, appropriate staging envs to promote and test, synthetic endpoint testing, and API dog folding are great.

Having your PMs or dedicated analysts perform post release checks from data and user perspective is great.

Tracking user analytics for quality issues either with select session recording or other event tracking is also beneficial.

Conways law - your org stucture and it's communication is what you sell. Aka, you get what you pay for. If you build for layers of quality you get it :)

[-]

Embarrassed-Lion735@reddit

The real win is cheap, layered feedback from pre-merge to production, not chasing 100% coverage.

Three moves that cut our false confidence the most: mutation testing (Stryker or PIT) to expose weak assertions; consumer-driven contract tests (Pact) to catch shared wrong assumptions between services; and SLO-based canaries with auto rollback so prod tells you when intent is wrong. Keep UI tests thin (Playwright) and focus on API tests with a bit of schema fuzzing. Spin up ephemeral envs per PR with masked seed data so feedback is fast and realistic. Add shadow traffic or replay to validate changes against real patterns before full rollout.

We used Postman for contract collections and Pact for CDC; DreamFactory helped keep REST endpoints consistent across databases so tests weren’t fighting schema drift.

Instrument with tracing and error budgets so tests map to user impact, not just green checks. Optimize for fast, layered signals that align with real behavior, not green checks.

[-]

la-kumma@reddit

Agree, tests are for the future. Good luck refactoring / extending an untested piece of code

[-]

crap-with-feet@reddit

In my experience they’re largely to raise alarms if some new code unexpectedly breaks the older code.

[-]

MehYam@reddit

That's the 80%. The other 20% is the bump in code quality when tests are there from inception, because it starts with a second use case, and the extra scrutiny shapes its design.

[-]

Crimson_Raven@reddit

Good for testing edge cases. "What if the input to this function is undefined?"

[-]

ClownPFart@reddit

The problem is that you might overlook an edge case both when writing the code and when writing the tests. Then the code passes the tests, the tests have 100% coverage, but that edge case will still break it.

[-]

jcoleman10@reddit

Then you write a test for the edge case that you have newly recognized as an edge case. No one is expected to see all the edge cases on first pass.

[-]

TimelessTrance@reddit

First pass just get code coverage. On every fixed bug there should be a test proving that the cause of the bug is fixed. Bugs no longer regress and broken tests means that something broke or they need to be re-evaluated with new functionality. Tests are not magic they just raise confidence.

[-]

kyune@reddit

This is what I ended up doing for a project after introducing a framework for testing our code. Whenever a defect came in that escaped our existing tests (I think we had around 2k for our particular repo by the time I left?), I was able to just input the failing scenario as a new test, quickly debug, and then leave a comment with the relevant JIRA ticket and a base description of what the business expects to happen for the customer.

Being empowered to lay out the groundwork for that and cement working code in conjunction with the existing use of pull requests as a quality gate made a terrible job much more bearable despite aggressive management.

[-]

cstopher89@reddit

Yep, can't be expected to catch everything but when anything new happens you first write a test demonstrating the failure then fix the code to make the test pass.

[-]

mjec@reddit

Sure! But more than zero tests are more likely to cover an edge case than zero tests.

I also find the act of writing tests, usually holding all but one condition constant, helps me to identify where edge cases might live. Manually testing the flow which runs the code is rarely as effective for me.

[-]

mattl33@reddit

This is where fuzz testing enters the chat. Test all the edges.

[-]

hotel2oscar@reddit

A guy walks into a bar and orders 1 beer

2 beers

A lizard

-1 beers

NaN beers

...

[-]

Billy_Twillig@reddit

Wasn’t he with a priest and a llama?

[-]

Crimson_Raven@reddit

User walks into the bar and asks where the restaurant is.

The bar explodes.

[-]

DuckDatum@reddit

Person says ”over there.” User walks into a bar.

[-]

Chii@reddit

That's why you cant two jokes at the same hear time

[-]

Crimson_Raven@reddit

Here's a joke, I promise there will be a punchline.

...

I changed my mind, cancel the joke.

[-]

HMikeeU@reddit

Testing edge cases is the only thing unit tests are bad at because you need to come up with these edge cases yourself in the first place

[-]

camaris1234@reddit

That's actually a bad example. Undefined input should be prevented by a compiler or static analysis, not tests.

[-]

-Redstoneboi-@reddit

grumbles non-nullably

[-]

masklinn@reddit

If you have an expressive type system and leverage it, then that’s where the compatibility stuff lives, checked by the compiler, and you only need to test the rest.

[-]

KevinCarbonara@reddit

The other 20% is the bump in code quality when tests are there from inception, because it all starts with a second use case, whose extra scrutiny shapes design.

I've heard a lot of people make some form of this argument. I've never seen these benefits materialize.

[-]

harbourwall@reddit

I can understand why it might make some people think more accurately about the problem they're trying to solve before they try to solve it. But it's annoying when some they insist that everyone needs that and must write shitty code if they don't.

[-]

KevinCarbonara@reddit

I've said this elsewhere, but I think it's far more common for Python devs. A lot of them just aren't very used to thinking through things like what types they're using, and writing unit tests is necessary to ensure type safety in that language.

[-]

harbourwall@reddit

Modelling, we used to call that!

[-]

Rattle22@reddit

I developed an algorithm for work once, and manually determining and implementing test cases in the middle of it helped me figure out the algorithm because they told me when I was missing edge cases in the actual code.

[-]

MehYam@reddit

They've materialized for me. Sometimes you don't realize when you're building a candy machine interface into your API and libraries. Writing tests can weed them out sooner.

All that said, I still don't write as many as I should.

[-]

Magneon@reddit

It depends a lot on what you're implementing.

If I'm porting a math library to Rust, or writing an implementation of an algorithm described in an academic paper... That's a great candidate for TDD since the overall shape of the code is relatively deterministic, and the correct output is fairly easy to predict.

If I'm writing a configuration class for a tool to organize photos on disk by exif and image metadata, and I'm not sure exactly what the configuration needs to contain yet since the part that uses it isn't fully complete... I'd maybe skip tests initially or keep them super basic (to lower friction at adding later tests), since the codebase is in a large state of flux, and I'm still exploring requirements and implementation details.

It is often useful to keep unit test-ability in mind during design and implement though, since the kinds of constraints it imposes are generally useful (for example it'll push you away from giant run-on-sentence functions doing 4 different things, and pushes towards more natural encapsulation of functionality and state).

[-]

Mikelius@reddit

It also helps surface bad abstraction/overloading. If you are hampered with initializing mocks because they are too cumbersome chance are you should refactor your code to simplify it.

[-]

KevinCarbonara@reddit

This is heavily dependent on the technology being used. What you say probably holds true for python, but it absolutely does not for languages like Java.

[-]

experimental1212@reddit

For the tool config class thingy, what would be time spent writing tests for TDD could be spent writing the test framework / tools to make the test cases, even if you don't know what a "correct" test case is yet. You'll want low friction to add a test case or you'll never get around to it when the tool is working.

[-]

KevinCarbonara@reddit

You'll want low friction to add a test case or you'll never get around to it when the tool is working.

Why would that be the case? I don't write tests when they aren't meaningful. I do write tests when they are meaningful, because they're much easier to write.

[-]

KevinCarbonara@reddit

If I'm porting a math library to Rust, or writing an implementation of an algorithm described in an academic paper... That's a great candidate for TDD since the overall shape of the code is relatively deterministic, and the correct output is fairly easy to predict.

There are certainly use cases where TDD is a solid "not bad", and it may even align with business interests. But even in those situations, I find TDD to take a lot more work than simply requiring all PRs to come with unit tests.

[-]

frymaster@reddit

certainly the time I was made to do test-driven design (I'm a sysadmin by occupation, not a programmer) I found I saved at least a modest amount of rewrite time because the initial design iteration happened before I'd even written the actual code

[-]

KevinCarbonara@reddit

But the initial design iteration didn't just "happen". You had to write it. And probably multiple times.

[-]

nixgang@reddit

I don't get what they're trying to say. But I know for a fact that I don't refactor code without tests, so it stays shitty.

[-]

Mental_Scientist1662@reddit

I rate that part higher than 20%. If you don’t write tests alongside the code, and don’t allow the tests to inform the way you write the code, you can’t tell if your code is good or just unmaintainable spaghetti.

[-]

booch@reddit

(With the caveat that I disagree these are the only uses for tests...)

the bump in code quality

Writing tests, especially as or before you write your code, helps you understand your requirements better. It makes you take the time to think things through. And (I'm agreeing with you here) that helps bump code quality overall.

[-]

FrewdWoad@reddit

I'd say:

10% detecting breaking changes

10% making me think about code better

80% catching the first 6 months of bugs in 30 minutes.

[-]

Snape_Grass@reddit

Yup, this is why there are integration tests

[-]

Horror_Jicama_2441@reddit

Or, another way to put it, they are a way of removing fear of breaking things while refactoring. Without fear, you actually refactor, and by refactoring the code doesn't slowly but surely become an unintelligible mess.

[-]

yoomiii@reddit

How do unit tests help with refactoring as they usually need to change as well in my experience. Integration tests survive a lot longer in that regard. But maybe I'm doing it wrong.

[-]

Horror_Jicama_2441@reddit

The problem with "unit" testing is defining what an "unit" is, different people refer to different things. Plus, you may be programming in a language/ecosystem with its own challenges.

But no, I don't usually need to change the unit tests. Difficult to say why the difference in experiences but, probably stating the obvious:

I split the logic into components with a well defined interface and only unit test that interface, no internal details.
Those interfaces rarely need to change. Most refactoring is inside the components (which are "units", but not necessarily trivially small, there is stuff to refactor in there), not in the interaction between components.
Once you get into refactoring that requires changing the interfaces sure, you will need to change the unit tests. And sure, a higher level test that depends on less interfaces (going up to the extreme of testing the whole thing together, where the only interface is the one with the user) will need to change less. But since it becomes difficult to test more specific situations the more the higher level you go (again, your specific ecosystem may be better or worse at this), you have no option other than also have lower level test, and trust you thought your component interfaces well enough for them to last.

[-]

Mental-Net-953@reddit

Can't talk too much about this because I might dox myself accidentally lol, but yeah I've worked on a project where (basically any) refactoring causes almosy every unit test related to that part of the codebase to be rendered entirely useless and failing.

[-]

Boom9001@reddit

Yeah new tests don't prove your new code works. It does however insure people who make changes that unknowingly break things.

For example of you wrote a function that receives a list and sorts it. You may care that you don't actually modify the passed object. You should write a test that verifies the passed object isn't modified.

This ensures that if later someone changes that your test breaks. They have to change that test too, which should at least make some people question if it's correct in the code review.

[-]

RedditRage@reddit

My experience is they mostly raise alarms that I need to update the tests, but the new code didn't break anything.

[-]

cs_office@reddit

You're probably testing internals, not the more stable public APIs/contracts

[-]

yoomiii@reddit

Well yes, how else are you gonna reach 100% coverage? ;D

[-]

cs_office@reddit

You don't, your code coverage is an indication not a goal. You know the saying any metric that becomes a benchmark ceases to be useful at either? Yeah. That.

[-]

booch@reddit

Testing the internals can be useful, too. It just serves a different purpose. It's possible for an "implementation detail" to require the implementation of an algorithm to achieve it's goal; and testing that that algorithm does what you want it to helps you write better code.

[-]

cs_office@reddit

Yeah, that's a valid case of testing internals, because you're testing publicly visible properties you want to enforce

[-]

SaxAppeal@reddit

There’s two instances where tests will break, changing core functionality or changing shared utility/side effect behavior. In the former, which you’re referring to, tests are expected to break. In the latter, the code did genuinely break something. But that’s why tests are still important. With zero tests you may not know that you’re breaking something seemingly unrelated to your changes.

[-]

booch@reddit

There's a 3rd case, where the tests are testing the wrong thing. For example, if you have a sort function that doesn't need to be stable and your test requires that it is. Your test can fail because the new output doesn't match the old; but the new output does meet the stated requirements.

[-]

_dogzilla@reddit

9/10 times yes. But that 1/10 times may still be worth it.

But for me it is nice actually to see a code change in a PR also changing the behaviour in the tests.

You review the code change, and also get a sanity check on the affected changes to the business domain

So for me the REAL value is in reviewing the PR and the peace of mind to merge and deploy it

[-]

Perfect-Campaign9551@reddit

Ever hear of the boy who cried wolf? If it's only an issue 1/10 times, it's gonna get ignored anyway

[-]

Mr_s3rius@reddit

You can't ignore it if you have to fix the test to get it through CICD. And then the code reviewer sees the diff as well. Now you have two pairs of eyeballs looking at the changed test. That's two chances to catch that 1/10 change that you otherwise wouldn't have.

[-]

Perfect-Campaign9551@reddit

If it fails 9 times and those 9 times don't find a bug, but rather they were just the test flipping out, someone is gonna disable that test.

[-]

_dogzilla@reddit

What? That’s completely different than my experience. The test reflects a contract of what you expect something to do. Also sometimes it reflects what you expect to happen in the business domain.

Imo every PR that causes a change to the way the service operates should have a change to a test.

A refactor should ideally not affect a test (unless you have performance tests)

[-]

andynormancx@reddit

They also help you understand what the code actually does and how the original writer of the test expected it to be used. Which is very helpful when working on someone else’s code, in the real world, where code doesn’t have any documentation 🥲

[-]

FrewdWoad@reddit

>they mostly raise alarms that I need to update the tests, but the new code didn't break anything

This is a sign you're writing tests wrong: they are too specific to the old implementation, not the purpose (of the code you are testing).

[-]

flukus@reddit

That's a sign of poorly written tests.

[-]

max123246@reddit

That tells me that you broke your interface. Which is an incredibly important signal to have since that might mean needing a version bump or redesigning to not break your interface

[-]

ricky_clarkson@reddit

My guess is this happens when you overuse mocks. Tests that verify the overall results or effects are less fragile when refactoring than tests to verify interactions.

[-]

Vesuvius079@reddit

Your test is dependent on your interface just like the consuming code. If you change the interface , of course the tests need to be updated because everything that takes your interface as a dependency needs to be updated.

That hurdle (updating dependents) is why it’s often cleaner to avoid modifying the existing interface. Instead you can extend the interface, migrate your dependents, and eventually remove the now obsolete parts of the interface. With this approach, your tests never break and you can (usually) migrate dependents one at a time.

[-]

fiskfisk@reddit

How many mocks are in your test suite?

[-]

Solonotix@reddit

In simple systems, definitely. But if you are the glue between multiple different systems, that breaking test signifies a larger problem with design.

I own a library that is used by a ton of different teams at my job. If a unit test breaks, it can be a major problem. I even have a unit test to check that the exports are present as expected (JavaScript). Funny enough, even that didn't stop the problem, because there was a separate issue with the different module systems (ESM vs CJS) that I missed.

I wish there was a way for me to test that the TypeScript declarations were also correct, but that's yesterday's problem. I'm currently in the final stages of rewriting the entire thing in TypeScript because hand-rolling ESM and CJS exports with compliant TypeScript declarations was a nightmare to manage manually.

[-]

jonhanson@reddit

Unit tests really come into their own when you have to refactor a codebase. Without unit tests you will spend more time repeatedly testing the code than you will refactoring it.

[-]

ZorbaTHut@reddit

Yeah, I have an extremely persnickety and fragile library that I've done a few major refactors of, and this would have been impossible without extensive tests. I've gone something like four layers deep on nested changes to fix obscure edge cases that I dealt with multiple years ago and then completely forgotten about.

This way, I don't have to remember them, the test suite just has them.

It's not perfect - I have introduced regressions despite the test suite - but depending on how you look at it, this has prevented probably a 10x increase in the number of bugs, or has allowed me to do major changes that would have otherwise been completely impossible.

[-]

yoomiii@reddit

4 layers deep? making it an integration test?

[-]

ZorbaTHut@reddit

"Okay, yeah, I remember that weird case, and I can see why I'm failing on it. Lemme just fix that, and . . . there we go. That should work! Wait, hold on. What's this new failing test? Oh right, yeah, that's a weird exception. Sigh. Okay. Yeah, I can fix it. Aaaaand . . . done! Except there's a new failure. Right. That thing. Okaaaay . . . I can deal with this. There we go, there we go. Wait. Another one? I don't even remember writing this! Did I document it? Oh. I did. Because I knew it was weird. . . . Yes, alright, this is valid too, fuck, how am I going to handle this"

[-]

mtutty@reddit

I've been saying this for 20 years. Test driven development is fine if it helps you code, but tests are for controlling behavior against future changes.

[-]

mv1527@reddit

Lately I feel that the new code in this equation is often dependency updates. It seems python libraries love to make breaking changes on minor version updates that you would otherwise only find out about at runtime.

[-]

ReflectionEquals@reddit

That’s the standard dev interpretation. If you ever practice TDD, the point of writing tests is to help you think and design your code. Every new test is an opportunity to think about that and define what you think should happen.

[-]

Trang0ul@reddit

TDD also helps you handle edge cases. If you have tests prepared for such cases, you can run then and check whether your code handles them correctly.

And, if you write the tests before the code, when doing so, you'll be thinking about the original requirements and possible edge cases to handle. OTOH, if you already have the code written and you write tests after that, you'll likely focus on that code and write tests testing the code only (i.e. coverage and happy paths).

[-]

post-death_wave_core@reddit

Yeah, if you write tests cleanly they are basically a requirements definition which is useful to work out in granular test cases rather than trying to define the whole system at once.

[-]

BleLLL@reddit

also tests show the bare minimum that the code executes

[-]

Abangranga@reddit

Damn it is almost like you have experience and didn't AI-generate this

[-]

WhosYoPokeDaddy@reddit

yeppp... #1 use for me is providing confidence that all my code works when someone else updates a library that I'm using. Among other things that smarter people than me have written about.

[-]

UnidentifiedBlobject@reddit

Or a bug is found and the test can be changed to ensure it doesn’t happen again

[-]

sickofthisshit@reddit

Well, that just pushes back the problem to "what does 'break' mean?"

It's trivial to write "change detector tests" that are just the implementation written a second time but inside out.

[-]

Harha@reddit

If you have to change the tests after changing the implementation, your API design has failed. You abstract the implementation behind an API and write the tests against this API, the test code should not care what the implementation actually is.

[-]

sickofthisshit@reddit

I think I am in agreement. Software developers have been known to change APIs, though.

[-]

davvblack@reddit

honestly i fully agree with you, and eschew unit tests in most cases for this same reason. i unit tests small pure functions that have some “significant logic”, but beyond that strongly prefer integration tests. “breaks the api” is a meaningful breakage.

the other type of breakage is “breaks some specific assumptions the original team made s long time ago”, and while that’s worth capturing, most code isn’t bound by that kind of thing.

[-]

sickofthisshit@reddit

There's a fuzzy spectrum between "unit" and "integration" tests: in a clean, modular architecture, you can draw a whole hierarchy of units encompassing smaller units.

[-]

davvblack@reddit

yeah i think the test im most against is:

a business logic function that makes a few queries, does something trivial with the data and makes another query. very common pattern, reasonable design. a unit test defending this implementation adds nothing at all to the codebase, except like you were saying, a separate inside out tautology of the same code.

generally speaking, im not interested in unit tests that mock db calls. if the “something trivial” in the above example actually does get tricky (like changing from a map to an aggregation with a filter), then it should be extracted into a pure function and unit tested, but still no mock based unit tests of the next hierarchy up.

[-]

sickofthisshit@reddit

There's definitely some art to it. Sometimes the value of tests is that they can prove someone was able to code a call to your library and it actually compiled and ran: the ultimate "smoke test".

[-]

standard-and-boars@reddit

I think thats kinda the point. The flip side of the test is that it’s a representation of expected behavior. If you extend something, the existing tests may pass, or not—but if they don’t, then that flags for more review or adjusting your changes.

If any change results in a change in output given constant inputs, then I’d argue that your tests were poorly written. If I write a function that takes in a series of args and executes logic, then we need another permutation of behavior, the existing tests should still pass, unless the entire pre-existing logic is invalidated by the new feature.

[-]

sickofthisshit@reddit

That's a very concrete view, in a more abstract sense, the tests can encode a wide variety of expectations: if the expectations are too closely bound to the specific implementation, their breaking only tells you the implementation changed.

The value comes from tests that clearly set expectations that can be validated independently: if requirements change, then perhaps they need to change, and you can even change them first (e.g., when an external bug report can be turned into a failing test first, to verify the defect is present).

And if your change causes unrelated tests to fail, you might have learned about external expectations that you didn't realize were expected. (Of course, those tests could have been hardcoding expectations without good reason.)

[-]

aboothe726@reddit

IMO, it’s about enforcing assumptions.

I agree that the value of these “inside-out” (sounds like “white box” tests to me) is dubious If all they’re doing is change detection. These kinds of tests really only make code twice as expensive to write, since you’re effectively doing it twice, in exchange for not much.

However, if code correctness depends on certain assumptions about the structure of other code (as opposed to the content of inputs, for example), then white-box tests (frequently of upstream or downstream code) have real value when they ensure these assumptions are honored by requiring that breaking the assumptions involves changing code in two different places: the implementation and the test.

Without these tests, it’s too easy to break assumptions in a way that are only (a) caught in seemingly-unrelated tests that depended on that assumption implicitly, which can be hard to debug; or (b) in production, which is much worse.

The idea of these assumption-enforcing tests occupies the same area of my brain as a Lock-Out-Tag-Out system, but maybe that’s just me.

[-]

sickofthisshit@reddit

"White-box" is related: the more you know about how it works inside, the more likely you are just to test the inside workings.

Ideally, your tests encode expectations based on external requirements and definitions, independent of implementation choices. For example, in a code review, someone can look at the tests, see that they are what the new thing should do, then the implementation kind of doesn't matter, as long as the tests pass.

(Of course, you do review the implementation, because software developers will sometimes try to ship complete crap just because it passed the test suite once.)

[-]

Welp_BackOnRedit23@reddit

It's a first line of defense, and in some cases also useful for testing NFR. We throw everything through E2E behavioral testing for quality control.

[-]

WinonasChainsaw@reddit

In my experience, they weren’t added until the intern was assigned a code coverage improvement ticket 5 years since initial development

[-]

remy_porter@reddit

Tests imply testable code, and the biggest benefit tests give you is that they make you construct your code in such a way that you can pick off any individual feature and run it in isolation from the rest of your system. The second benefit is that they document your expectations for your system’s behavior.

[-]

ender89@reddit

It's also a decent tool for addressing bugs, especially in big projects that can take a while to build and longer to actually reproduce the issues. You still have to actually check your work at the end, but you save a lot of time checking your work as you go.

[-]

light24bulbs@reddit

100% that's the point of them. And that's why I mostly only write medium to high level integration tests

[-]

thecoode@reddit

Exactly! They’re like smoke alarms, not proof, just early warnings.

[-]

deadwisdom@reddit

If you're coding something where the intent is wrong, what are you even doing?

[-]

huuaaang@reddit

Who thought tests proved code to be correct in the first place??

This is why we devs should never tell QA how the code should behave. That's the only way you can get close to testing correctness. Unit tests are just for the developer to feel confident that the code behaves the way the developer thinks it should.

[-]

actinium226@reddit

Having worked with code that was written without testing, it's surprising how deeply buried the inputs and output become. You'll see things like

void myfunc(char * fileroot, int date, char * ext) {
    // construct a filename from the inputs and read it
    // do some computations
    // store those computations in the same file
}

And it becomes really hard to parse out where the code starts and ends. When you write tests, it forces input and output to come to the surface, and suddenly the code is much more readable. Even if the tests just call functions and throw away the outputs, this feature of forcing the inputs/outputs to the surface is extremely useful.

Also regression testing.

[-]

FortuneIIIPick@reddit

Tests can be written to prove the result is correct. Often tests do not do this.

[-]

thomas_michaud@reddit

It's about locking down intent and raising awareness if something changes.

TBH: if you're chasing 100% code coverage, someone needs to be slapped. (That final 5% coverage is going to exception cases -- did the hard drive suddenly fail (or fill)) It misses the intent.

Furthermore, if you've coding for yourself you don't need TDD and these huge regression tests. (It's good to have - but who cares?)

If you're a company though....say with millions of dollars flowing through your contracting software every day -- yes, you want to make sure Billy Joe, the recent undergrad developer you hired, didn't cause you to loose millions because you were measuring in meters instead of feet. (Or using floats for currency)

And if that change DOES get approved to your development branch - you want it stopped before it goes to production.

[-]

BridgeFourArmy@reddit

Tests are useful for breaking down code into smaller testable functions without needing to functionally test.

Does this function correctly generate the string, with 20 different inputs so you don’t have to key the application 20 times. It’s way faster and more consistent. Which is 10x more useful in modifying existing code, the tests should help show the current expectation of the code. So that when a test fails you can understand it’s a breaking change or find a way to leave that path for current consumers.

Tests can be great but having 1000x unintuitive tests is dumb. Having an extensive test bed that is also minimal is the goal. Using tests that really stretch the limits of the code yo mark the edge cases.

[-]

cstopher89@reddit

Without tests it becomes impossible to refactor and know that the code you touched works the same as it did before. The more complex the app the more impossible manually testing everything becomes. The tests are really useful in being confident in the future when needing to refactor the codebase. You can do radical refactoring's and as long as the tests pass you can be sure that the new code at least works the same as the previous. Without this I would say refactoring and taking on and solving TD becomes next to impossible.

You are right that the only thing a test tells you is that your assumptions about how it should work are correct or not. Its still up to the engineer to define the assumptions so there will always be a degree of uncertainty.

[-]

Iojpoutn@reddit

Yeah, I’ve been trying out test driven development on a personal project and I’m not sure I’m a fan. So far it’s just been a huge time sink. I write the tests how I think they should be, write the component, the tests fail, I manually test the component to see that it’s working as intended, then spend a bunch of time fixing the test.

It seems like the main benefit will come later, when I go to modify existing code or add new features. I can see the tests saving time if they flag early on that I broke something. But man, right now it’s just a huge drag on the initial build time.

[-]

RupertMaddenAbbott@reddit

Tests are documentation. They should help the reader understand how a part of the system behaves and why. They are better than documentation because they prove that the system does actually behave in that way.

[-]

killerstorm@reddit

Well, that's why tests involving mock objects are often useless: they only confirm that code is written the way it's written, it's just busy work.

Integration tests have real value, though: they demonstrate that different parts work together, and can detect regressions in future.

[-]

NonnoBomba@reddit

I just fixed an issue caused by a developer not understanding specifications that was successfully producing records in the DB and then linking them in the wrong way through FKs... it had a unittest/integration test too, both successfully passing with mocked and real connection to DB, who were proving with 100% certainty that the wrong assumption still behaved the way the original developer understood it should: still wrong, in the same way, dozens released versions after it was first written.

At least we're not wrong *randomly*: we're wrong *consistently* and that's something.

[-]

Thick-Protection-458@reddit

Hm... Obviously?

I mean anything short of formal theorem proof is not a proof, it is just checking that in some subset of possibly inputs (theoretically infinite, really finite due to computers discrete nature, but enormous) it behave in expected way?

[-]

metaglot@reddit

Not even that. A test is at best testing whoever wrote the tests assumptions about what the code should do. Still better than nothing.

[-]

Revolutionary_Dog_63@reddit

The primary purpose of most automated tests is regression testing: the inadvertent loss of functionality due to a code change.

[-]

Lightor36@reddit

I'd look into TDD my friend, it proves the exact opposite of this.

[-]

mad_pony@reddit

This is why we have math.

[-]

Virtual-Chemist-7384@reddit

r/im14andthisisdeep

[-]

GAELICATSOUL@reddit

I regularly use tests to show a readable example of how to use our interface to other teams, future developers or even future me.

The tests document other teams requirements, if they report a bug on our failing code we work together to prove it with a failing test first.

Your tests should match assumptions made during the design process. And yes, they should remind you that certain expectations are there when you break them, so you don't have to remember them all.

[-]

BelsnickelBurner@reddit

Took this long to realize huh?

[-]

ScaredScorpion@reddit

just confirming that my assumptions in two places matched

Yes, that's the entire point. If you're writing code that doesn't produce what you expect that is an obvious problem. And having a way to ensure your expected behaviour persists even when someone else changes the code is the entire point of automated testing.

If the assumptions are wrong that's something that needs to be debugged further up.

[-]

adhd6345@reddit

You’re right about it being a few points where the code is determined to behave as expected; however, “when both your implementation and your test share the same wrong assumptions” this shouldn’t be happening.

You test to ensure the implementation behaves as you expect. How you expect to behave is not related to the implementation. The testing verifies that your code meets at least that testing behavioral requirement.

[-]

peepeedog@reddit

Nothing proves code is correct. So what is your point here?

[-]

CrowSodaGaming@reddit

Nice GPT write up.

[-]

lepetitmousse@reddit

I guess I've always viewed this as obvious? I write tests to document the current behavior of the system. They don't dictate whether something is right or wrong, they simply capture how the system behaves today. Tests should be written to alert you if that behavior changes and should be updated as the expected behavior adapts over time.

[-]

sickofthisshit@reddit

The problem with this approach is that the "current behavior of the system" is already in the code. The system executes the code, after all.

What you need is to test that the behavior of the system matches expectations from some other source than "what it does." If you know what the system should do, then you can use testing to ensure you keep doing it even after changes.

[-]

lepetitmousse@reddit

I guess what I'm saying is, while the code is self-explanatory of its behavior, the tests confirm that behavior matches an expected outcome. They are just a shortcut for providing a variety of inputs to a system and confirming the outputs. They do not assert the correctness of the outputs themselves, they just assert the consistency of the result. I want the tests to alert me when that behavior changes (either intended or unintended) so I can make the determination of what it correct.

[-]

sickofthisshit@reddit

Right, but the key thing is to have some independent basis for what the expectations are.

Sometimes, particularly in legacy code, people like to lock in the current behavior before doing things like "refactoring"; this can have value if you can identify the behavior that external entities rely on, but it can easily degrade (especially with AI assistants vibe-coding tests) to "test that the implementation does what the implementation does" which is kind of silly. (I'm also skeptical of the value of refactoring when it means "a bunch of rearranging code to end up with an exact functional equivalent").

[-]

eddiewould_nz@reddit

It's the whole "double checking" thing. Kent Beck said something along the lines of

"imagine you have a list of numbers on a piece of paper and want to add them up by hand (without a calculator). If you do it once, starting from the top, then do it again, starting from the bottom, and get the same result, you can be a bit more confident".

Agree with everything you've said in other threads in this topic - which is why it was surprising to hear you're skeptical of refactoring code to end up with something functionally equivalent. The whole point is that code is read much more often than it is written, so a big reason is to make it easier to understand for the next person. Another reason is to facilitate a subsequent (to the refactoring) behavior change.

If you haven't read it, recommend his book "Tidy First?"

[-]

sickofthisshit@reddit

The problem I have with refactoring is that it often is just "I don't like this code, it's not how I would have written it, I want to change it, and, oh yeah, it will be much better after I am done."

You know what? The guy who left behind all that "tech debt"? The code is his imagination was also clean and easy to maintain, but see how that turned out?

The truth is we all hate the code that exists and wish it was replaced with code that exists in our imagination.

[-]

optimal_random@reddit

Tests lock-in your functionality, and the assumptions and requirements around it.

If later on, "Junior Jimmy" comes along and breaks the functionality, the failing tests are its first alarm that something is going wrong. Obviously, if "Jimmy" changes the tests to align to its new "reality distortion field" and the code review passes with a "LGTM" approval, then you have multiple problems in your Team, and the test are the least of your problems.

The goal of having tests is not to make an overarching mathematical demonstration that your code is flawless written by the likes of John Carmack and blessed by Donald Knuth, but to provide another alarm barrier when anyone breaks a requirement while developing a new feature.

Again, if you have a bad review process, no integrations tests, and your QA team just presses buttons until they go green, then no test will save you.

[-]

Perfect-Campaign9551@reddit

Having useless tests IS worse than having none.

[-]

optimal_random@reddit

That logic can be applied to anything... even Reddit comments.

[-]

anubus72@reddit

and having useful tests is better than having none. So write useful tests

[-]

Zombie_Bait_56@reddit

Every paper I've ever read focused on proving a small function (< 10 LOC). Expanding this to 100,000 LOC is left to the reader.

[-]

bog2k3@reddit

Solid point

[-]

Gabe_Isko@reddit

The best comment I have seen about tests is that they test for the absence of bugs, not the presence of bugs. There can always be bugs present beyond the cases you test for.

Really, the purpose of test in TDD is to prevent regression as you continue developing the application. It is a lot less about making sure the code is working properly than checking that you aren't adding undesirable behavior to software during ongoing development.

[-]

vato20071@reddit

I’ve found a good balance by writing initial tests for the happy paths first. Then, whenever something breaks or a bug gets fixed, I add a test for that specific case so it never happen again. This keeps time writing tests to the minimum while ensuring there are no major bugs for most users.

That approach has worked well for me so far. Though to be fair, I’m not dealing with anything super important, which probably helps

[-]

hoxxii@reddit

I remind people (cause we all deep down know this) that passing those ten tests doesn't mean that we a) shipped without bugs or b) actually made anybody happy. But people get depressed when I say this.

We can just have high/low confidence in our system with the goal of achieving a fast feedback system. We know bugs or faulty interpretations will pop up - how can we identify those as fast as possible?

[-]

imihnevich@reddit

You write your test to prove the implementation is wrong, then you pass it, when you can no longer prove it's wrong, it doesn't mean it isn't, you just haven't found it yet, and it means it's probably good enough

[-]

FlyingRhenquest@reddit

The test is there to inform me that I'm breaking an established API to the point where I have to stop and consider whether I'm fixing something to provide a more correct value than the one you used to get or if, whether I should deprecate the current API call in favor of a correct one so as not to break existing code using my code or whether another course of action is required.

When I get a bug report, I write a test to replicate the result. That way I prove to myself that I fixed it when the test passed and I don't get regressions in the future.

Test output only matters if people pay attention to it. For me, they are an essential part of the design process. In my most recent parser project I was writing one parser rule at a time and using tests I wrote against them verify that I was getting the tokens I expected and to try to find edge cases.

In regulatory situations, every requirement should be in a database and I should be able to use that database to view every version control commit related to that requirement, every test that tests the code related to that requirement and the output of all those tests every time a production build is run. That information is usually provided to some government agency before permission is granted to take the product to market. There are additional regulations around tracking customer feedback and incidents related to the product, which should result in tests being changed or added over time.

[-]

tangoshukudai@reddit

right because it is a contract.

[-]

goomyman@reddit

“proof that one piece of code behaves the way another piece of code thinks it should behave.”

This isn’t correct. Tests prove that a piece of code behaves the way that you think it should. That it meets your requirements.

The key word is you. Are you letting AI write your tests or something?

[-]

Milyardo@reddit

This isn’t correct. Tests prove that a piece of code behaves the way that you think it should. That it meets your requirements.

I think this is the problem here. Test don't prove anything. Tests are the proposition. Your program is proof the proposition. Getting this relationship backwards is the reason so many testing strategies fail.

[-]

goomyman@reddit

So many people are saying this and I think it’s just over the definition of “proof”.

Obviously nothing is 100%. Tests can have bugs or miss scenarios. It’s not a scientific proof.

But it’s the closest thing you can do.

Maybe I don’t understand what the issue is.

I don’t think anyone is saying that tests guarantee something is bug free.

[-]

ward2k@reddit

I think this is exactly it, tests show that your code still works even when a piece of code is changed. They are like alarms or guardrails that help alert you when something isn't working the way it should

It's impossible to manually test every single part of a service, every single time you make a minor change. It becomes too time consuming and too easy to miss tiny things. For example you decide to refactor a method and run your existing tests afterwards to check the functionality (inputs/outputs) still work the same

Personally the people I find that tend to complain about tests being useless haven't ever worked on larger services/applications or have never had to work on anything after the service has gone live

[-]

Otis_Inf@reddit

yes exactly. My tests prove the code works at least for these use cases and does what I think should be done in these use cases. that's it. Obviously if the code changes and therefore the tests fail, I get a notification, but that's not why they're there. I have them to know the code at least works for a given fixed set of cases.

[-]

muntaxitome@reddit

Tests prove that a piece of code behaves the way that you think it should

No they generally don't. They test if a certain input gives a certain outcome. If you want to prove your code, that's a completely different ballgame. How do you prove your code will work for every given string input? There logical verification comes into play.

Tests are not really related to proving your code.

[-]

max123246@reddit

The thing is, code changes often. As long as we aren't using software that allows us to prove constraints, automated testing is the best we have

[-]

muntaxitome@reddit

That doesn't make it proof though. I am nit saying you should provide formal proof of your crud webapp, I am just saying that it's not proof to have unit tests.

[-]

goomyman@reddit

You don’t prove your code works for every given input - this is a horrible test but I see it done all the time.

I saw someone write 1000 tests to test every combination of input when the code itself only had a couple of conditions. I deleted all the tests and wrote a handful of them.

You look at the code and you look at the conditions it runs. And then you verify that you hit each condition.

You aren’t verifying input. Youre verifying logic.

If I have say an input that takes a string and and writes it to a db. Splits it or something in like first and last name and writes it to a db. You mock the db pass in any string that has a split and one that doesn’t and then verify the mock got called correctly.

And maybe you have something like - some people don’t have last names. So you verify that.

You don’t need to verify an infinite amounts of inputs.

And while your writing tests maybe you think about and go oh wait someone might input first middle and last name. Or maybe it’s a first name with a space. And you rethink your design.

You’re testing every scenario. The code isn’t going to magically change on you.

When someone else updates it, if they are adding functionality they need to add a new test for that functionality.

You’re not testing input. There are no SDEs blind testing your code who don’t know how it works. Look at the scenarios, test the scenarios, and verify the code coverage hit the scenarios, and usually have false and positive test to verify asserts.

[-]

muntaxitome@reddit

Man can you just look up logical verification please.

[-]

KerPop42@reddit

I guess it varies by discipline, but when I write code that, for example, solves Lambert's problem, or does other intensive calculations, I usually find that my tests are just verifying that my code produces the results I expect it to. Oftentimes it's effectively reversing entropy, so I can take the solution and derive the problem easier and verify that the code works backwards.

But I still have to trust that I'm not running some false baseline assumption as I play catch with myself.

[-]

goomyman@reddit

Yes that’s what tests should do :). You’re doing it right.

Later if you don’t like your code you can refactor it and youll know it still works.

[-]

LordAmras@reddit

This only work In an environment where requirements are golden and that's the only thing that drives the code.

Unfortunately I've seen very few of those environment, in my experience requirement are usually barely formed ideas, and ideas on a whiteboard don't always corrispond to reality so the code is usually always changing.

Tests are still very useful, some small key core part of the system usually work this way so it's useful to do a tests first approach, and tests are also the only solid way to know if things in one part of the code break things in another part.

But tests should focus on that, the core part of your system with the main logic that should never change, and when one part of the code interact with another part so that you avoid unintended consequence in a part of th codebase you weren't looking for.

The main issue when you have too many tests is that they break too easily and people ignore them, you can only look at a finite number of warnings before you stop looking.

[-]

goomyman@reddit

I use the rule of thumb that all tests should always be passing. If a test is failing it needs to be fixed. If it’s failing transiently - it’s a bad test and just delete it - it’s providing negative value and affecting the trust of all tests.

[-]

untypedfuture@reddit (OP)

That’s exactly my point though. Tests prove your code matches YOUR expectations. But if your expectations are wrong, the tests are just theater. They’re not proving correctness - they’re proving ‘I implemented what I thought I should implement.’ Which is useful for regression, but let’s not pretend it’s validation.

[-]

wherewereat@reddit

But it's proving that it matches your expectations.. That's the point. If the expectation is wrong that's another issue. Now if the code in production breaks, would you wanna guess if it's because it doesn't match your expectations or if you expectations are wrong? With tests you eliminate one side of the equation at least.

[-]

goomyman@reddit

What do you mean expectations. Are not you looking at the code when you write the test?

It’s not an expectation- it’s how the code actually works.

If your requirements has gaps and someone uses it some weird way that can happen.

If your tests have bugs and don’t actually verify what you think you’re verifying that can happen.

This is why people should spend more time reviewing tests - so you don’t have false confidence. Code coverage doesn’t review asserts.

Tests don’t prevent 100% of bugs nothing do.

[-]

wherewereat@reddit

Code gets updated over time, libraries change, optimizations are done, extra stuff are added or removed. The idea is to know "hmm this sign up endpoint should have a new user added with these random strings i just put in, a new user should be there in db, and notification http request sent to this url" regardless of what we did inside it. If we do breaking updatese whether that's the library or the framework/language, then expectations would stay the same.

The code is deterministic, but programmers ain't, or if you're one of those, then llms also ain't. Tests can have bugs, bugs can slip through them too, they ain't perfect, but they help reduce the amount of bugs.

this mindset of "if it doesn't solve a problem 100% then it's not worth it" would have had us stuck in the stone-age. Computers don't make us have 100% of our day back. Railroads didn't 100% solve the time and hassle of traveling. Whatever project you're working on won't solve 100% of the client's problem. Tests don't reach 100%, but they help a lot. I wonder if you've ever done a codebase-wide migrations/changes, these are never good, but when they're needed, tests catch hell of a lot more bugs than guesswork.

[-]

goomyman@reddit

Code gets updated over time… and so should tests. If someone changes some functionality that breaks your logic and it doesn’t break your tests your either missing tests or missing asserts.

If someone adds new functionality they should be adding tests.

[-]

wherewereat@reddit

Yes but the old tests still make sure the old functionality didn't break because of the new one.

[-]

NewPhoneNewSubs@reddit

I'm upvoting because your content is mostly valuable and accurate. I think OP and any other testing naysayers should read it.

But your opening paragraph is still incorrect. It's not proof the code works how you think it should, it's only proof that the test and the code agree. Regardless of who writes the test, AI, or you, or a junior who's trying to make everything green, there's a chance the test is wrong. It is our best proxy for proving intent, but it is still only a proxy.

[-]

Inaccurate-@reddit

"program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence"
-Dijkstra

[-]

Paragonswift@reddit

Starting to think this Dijkstra guy was pretty smart

[-]

munchbunny@reddit

He definitely was, but this quote is the reality we all live… the hard question is: if not testing, then what else do we have?

Formal proofs don’t work for a lot of stuff, especially when the software has to interface with humans and human processes, because the complexity and the arbitrariness come from the humans.

[-]

FrewdWoad@reddit

Yeah the headline sounds like a criticism of tests, and is getting upvoted by all the students and juniors who don't get automated testing yet. But the article is actually about the author starting to understand how to write proper/useful tests.

[-]

aiij@reddit

Really? From his papers I've cited I thought he was a pretty big fan of formal verification.

[-]

Haunting_Swimming_62@reddit

Encoding as much information in types as you can is a great way :)

[-]

Kuinox@reddit

Like making invalid state unrepresentable thanks to clever types.

[-]

Awesan@reddit

There's fuzzing which is pretty effective at getting really high code coverage for low effort. But it requires a certain coding style that crashes the program when there are bugs to be effective (e.g. a lot of asserts).

[-]

masklinn@reddit

Fuzzing also only shows the presence of bugs. Though it has the advantage of exploring the potential-bug-space broadly.

[-]

ultranoobian@reddit

But how do I get to where he is, at his level?

[-]

jjkramok@reddit

Math, lots of math. Oh and do everything on paper with a pen so that you must commit to something being correct before you write it down.

[-]

ultranoobian@reddit

But I mean, what's the shortest path?

[-]

AnomalousUnderdog@reddit

I know right? He should make some algorithms or something and name it after himself.

[-]

Full-Hyena4414@reddit

Well I mean as long as you find them you can get closer to the absence of it

[-]

Inaccurate-@reddit

You missed the point of the quote, which is saying that testing itself will never be enough to show that a program is free of bugs. The quote can be found here, among all of Dijkstra's other essays.

Right before that quote, he basically said the opposite of your comment:

testing by random sampling is hopelessly inadequate as well, because even the most vigorous exercising possible will only cover a truly negligible fraction of the possible number of cases, and whole classes of in some sense critical cases can —and will!— be missed: only the most obvious blunders will show up.

Instead, he believed the only way to truly remove the possibility of bugs was by writing code in such a way that it was provably correct.

this approach can be so effective that the number of test cases needed is eventually reduced to zero, i.e. that the correctness can be shown a priori.

[-]

T_D_K@reddit

Instead, he believed the only way to truly remove the possibility of bugs was by writing code in such a way that it was provably correct.

I think this comment chain is a really good explanation of why so many people love functional programming once it clicks. FP can add a ton of confidence to your code. If it compiles, then much of the core logic must be correct. Add a couple unit tests for complex transforms, and a couple integration tests to check the overall system, and thats it. No need to write a bunch of mid level tests that tediously check the internal data flow

[-]

booch@reddit

Functional Programming reduces the need for unit tests, in much the same way that a static type system does. It removes the need to test certain things that are now enforced by the system. But there's still a lot of ways in which automated tests are beneficial, beyond "just the complex transforms".

[-]

mot_hmry@reddit

Adding on to this:

If the types align you're in the right ballpark. A ballpark can be refined pretty narrowly if you want to (effects can be tracked so you know only the things you want done are, though often representing this is more trouble than it's worth.)
You can still have errors like + versus * but these are usually best spotted with property tests.
At the end of the day, tests don't fill me with confidence while code that obviously does what it says does.

To quote Tony Hoare:

There are two ways of constructing a software design: one way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.

[-]

koreth@reddit

Instead, he believed the only way to truly remove the possibility of bugs was by writing code in such a way that it was provably correct.

And even that isn't enough, because you can't actually prove that code is correct in the sense your end users think of the concept of "correct." At best you can prove that your code behaves according to a specification, but the specification can be wrong or incomplete. From the end user's perspective, a specification bug is still a bug.

[-]

JoelMahon@reddit

find 100 blacks crows and zero of any other colour doesn't prove all crows are black

finding 100000 black crows... still doesn't prove all crows are black

1 in 1 billion crows might be albino

honestly this whole discussion, including OP's post, is a waste of time, idk anyone who actually believes 100% code coverage or even 200% code coverage (every line is tested in at least two unique ways) means a codebase is bug free.

[-]

Full-Hyena4414@reddit

Where did I ever state it would eventually lead to the absence of it?Just saying automatic testing in any capacity helps reducing bugs and ensuring they don't pop up again, that's good enough in my opinion. I agree arguing whether it reaches provable bug free or not it's a waste of time

[-]

JoelMahon@reddit

Yeah I wasn't clear but I wasn't disagreeing with you

[-]

pythosynthesis@reddit

It's an iterative process that proceeds by discovering bugs, one at a time.

[-]

jeenajeena@reddit

Correct.

You just discovered a very important parallelism between TDD anr the Double Book Keeping:

http://blog.unhandled-exceptions.com/index.php/2009/02/15/uncle-bob-tdd-as-double-entry-bookkeeping/

https://blog.cleancoder.com/uncle-bob/2017/12/18/Excuses.html

[-]

marsten@reddit

That's an interesting analogy between testing and double entry accounting.

Relatedly, when I do algorithms work I sometimes find it useful to have multiple independent algorithms for key steps, that I can swap in/out at test time to verify they work identically. Each one serves as a check against the others, and if they are logically distinct algorithms it's unlikely they will all fail in an identical way.

[-]

jeenajeena@reddit

Brilliant idea, indeed.

[-]

Neocrasher@reddit

That's why there are two Vs in V&V. Verficiation and validation. With your tests you've verified your solution, but now you've realized that the solution itself doesn't do what you/your customer wants. It didn't pass validation.

[-]

ZZartin@reddit

Well if it passes based on the requirements but the requirements are wrong that's not the code's or the coder's fault.

[-]

andynormancx@reddit

In my world the coder was almost certainly also involved in clarifying if not also gathering the requirements. So not quite that clear-cut.

Not all us work in a magical world of unicorns, rainbows, software architects and competent business analysts 😉

[-]

nj_tech_guy@reddit

I work QA and deal with this from time to time (less so now that we're adding safeguards so that only POs and business can change requirements on a Jira story). if the dev says the requirements should be changed (and I believe them), they have to be the one to change the text of the requirement, and I make note of it in my test when my tests diverge from the original requirements (and why).

If we deploy and someone complains about the changed requirement, I point them to the dev who said the requirement should change. Not my circus, not my monkeys.

But now that we have tightened who should be updating stories, I only accept requirement changes from a PO or similar, or I run things by the PO before continuing. I have one dev who will look at the acceptance criteria on the story, do what he thinks the story meant (but not what the story says) and then when I go "hey, it's not doing scenario X" "ah, well, I only did Y" "well you only did not the story."

[-]

ZZartin@reddit

Yes in that scenario that is entirely fair, it really depends on how isolated the developer is from the actual use case.

[-]

pydry@reddit

The issue is that most coders write low level "unit" tests which match the implementation because they think that's what theyre supposed to do rather than high level tests which match the requirements.

Then they get grumpy because the tests only test the implementation.

[-]

booch@reddit

Both of those are useful, they're just different kinds of tests.

[-]

ZZartin@reddit

That's not their job.

The QA's job.

[-]

szank@reddit

I have never worked in a job where the qa did actually catch problems. Either I caught them before it got to qa or it was caught on prod.

[-]

KerPop42@reddit

I once worked on a team that had as a QA someone that mostly worked contracts, with about the minimum technical skill we were targeting. She was very good at finding ways to break our code

[-]

key_lime_pie@reddit

Many SQA organizations aren't run properly. Management thinks that because software isn't a tangible product that QA can be confined to inspection tasks, and then when schedules slip as they typically do, the amount of time allowed for inspection is reduced. It sounds like you've had the unfortunate experience of working for multiple orgs that are run this way.

[-]

Jump-Zero@reddit

Might just be my experience, but I would encourage all devs to exercise a level of ownership end-to-end. Saying “that’s QA’s job” or “that’s the PM’s job” is usually not a good sign of a reliable colleague.

[-]

ZZartin@reddit

Those people in QA and PM are also our colleague's :)

[-]

vom-IT-coffin@reddit

I have a dev on my team what has implemented something three times that crashes the application. Clearly indicating he never ran the project locally thus never confirming the code worked.

[-]

vom-IT-coffin@reddit

Are you confusing integration/automated tests with unit tests?

[-]

Kraigius@reddit

It sounds like these devs should learn a thing or two about composability and abstractions.

[-]

JoelMahon@reddit

indeed, when I worked for an aerospace and defence company we'd get given requirements in IBM DOORS and have to link each one to at least one test (massive ball ache) and as long as the tests weren't bugged we were pretty likely to meet all the requirements. especially since two different people than the coder verified the tests and that the tests matched the requirements that they said they did.

if a requirement wasn't tested the ticket wasn't complete.

if the requirements were wrong then that's that!

[-]

zephyrtr@reddit

That's why you always have to ask "When will this test fail, and will I care?"

[-]

dominjaniec@reddit

you start form red test

[-]

SirClueless@reddit

Starting from a red test only proves the test, at one point in the past, had at least one assertion that was meaningful.

It's a good practice as a matter of self-discipline, but it's neither necessary nor sufficient for a good test.

[-]

ZZartin@reddit

Yep yep the existential question of the developer, do I do as told or argue with management :P

[-]

disappointedinitall@reddit

Requirements?

I’ve always been in-house, in one way or another. And actually trying to get the customer to describe what they need seems to be very difficult.

Getting it in writing requires acts of the unspeakable.

It does my head in.

[-]

TotalBismuth@reddit

Most of the time it’s the edge cases that fail, and it’s debatable whose fault that is.

[-]

WinonasChainsaw@reddit

Yall got documented requirements??

[-]

ZZartin@reddit

Better sign off from the LOB that it works as expected.

[-]

AegisToast@reddit

But it could also be a poorly written test

[-]

ZZartin@reddit

And that would be the devs fault.

[-]

AvidCoco@reddit

Same goes for mathematical proofs - a theorem is only as correct as its proof.

There’s countless examples of “proofs” that had existed for years until someone proved them wrong.

[-]

captainAwesomePants@reddit

This is why integration/acceptance/requirements tests are the gold standard. If you use it like a customer would and evaluate success like a customer would, then you will know that it is doing the right thing unless the requirements are wrong.

[-]

booch@reddit

unless the requirements are wrong

The requirements are always wrong, because there are always cases that weren't considered ahead of time.

Plus, integration/acceptance/requirements tests must test things "in combination", which can vastly expand the coverage area. They're useful, no doubt; but lower level unit tests are also useful, because they can be more targeted.

[-]

ACoderGirl@reddit

The problem with integration (and E2E, etc) tests, however, is that they can't cover very much complexity. Like, my product has a ton of independent levers and knobs. You can't integration test all the combinations of features. Unit tests can generally cover all the values of individual functions (or whatever the "unit" is), so still play a critical role for preventing regressions, ensuring your edge cases mostly do what you expect, etc.

It's imperfect either way, but I view it as a testing pyramid. You typically want a ton of unit tests, a smaller number of integration tests, and then an even smaller number of E2E tests. You need em all, or else you'll have gaps of some kind (either in feature coverage or integration coverage). And some lucky programs might be able to combine integration and E2E tests (namely those with few dependencies or that are cheap to run).

[-]

m_cardoso@reddit

We use this approach here and it works pretty well. Isolated business logic, unit tests + integration tests that only make sure we are passing through every component (not expecting every kind of result) has been really good.

At least from my experience, testing is always good and I don't agree much with OP's post.

[-]

sickofthisshit@reddit

Until you discover nobody really knows what the customer wants.

[-]

andynormancx@reddit

Nobody ever really knows what the customer wants, especially not them…

[-]

MadCervantes@reddit

Hence the point of ux people.

[-]

andynormancx@reddit

When you have UX people and they are actually good at what they do, that is great.

However I’ve rarely actually worked on a project that has dedicated UX people. And about 50% of the ones that did the customer rejected the good UX ideas anyway.

The other 50% though were great.

[-]

stravant@reddit

That's completely beside the point of the thread.

If there's a workflow that's been working for years you can be pretty sure the customer wants it to keep working.

It's not hard to make useful integration tests.

[-]

pydry@reddit

It's a skill figuring it out that not many people have.

This is one aspect that differentiates a great PM from a mediocre one.

Also sometimes you just have to build it and see what happens.

[-]

ThisIsMyCouchAccount@reddit

Where I work is small and we do not have enough people to fill the roles they want rather than looking at different ways to do things.

One thing we have a person that has been filing the role of a BA but they are actually just an analog for our customers. They are experienced in the field we are entering. And they have been really great at that. If they ever want to do a career change they could absolutely be a BA.

Leadership has now basically given them the keys to the product. They are no longer looking at high level features and users stories. They are now dictating the solution as well.

And they are not good at it. But why would they? The last month they have ran the dev team ragged. They don't have a concept of the product as whole let alone good product design. They are heading down a the common path of "it should be able to do anything any time regardless of data or process". We've had to redo so many things because code isn't magic. They will define something for one part and then say it needs to be over there and there but also nowhere and also slightly different too. But should generate a perfect standardized report every single time.

And beyond that we aren't really solving problems. At this point we are - via them - just remaking a slapped together version of the other tools they have used and a spreadsheet.

[-]

MadCervantes@reddit

It's why people need ux people.

[-]

boobsbr@reddit

Not even the customer!

[-]

Nekadim@reddit

What do you build then and why?

[-]

sickofthisshit@reddit

I mean, you do the best you can, but customers can be very confused. Like Henry Ford said, if he asked customers what they wanted, they would have said they wanted a better horse.

[-]

przemo_li@reddit

Cheaper horse, and Ford delivered on that. Cars were overall cheaper.

[-]

ArtOfWarfare@reddit

Interesting - pretty sure horses are way cheaper than cars these days. I think if you go to an animal shelter you can get a horse for around $100. Not sure you’ll ever find a car anywhere near that price - the scrap material is worth way more than that.

I’ve not looked into how much horses cost beyond that initial purchase.

[-]

Nekadim@reddit

So you build that you or your manager thinks customers want. Better if you build exactly what you think you build (here tests are shine). Then you are testing your hypothesis and if it is not what customers want you rebuild worst parts to the better (again tests will serve you here) trong to not break things that actually coined (these tests you were built earlier).

[-]

captainAwesomePants@reddit

Henry Ford never said that. I only know of one Henry Ford quote about horses, and it's this one: "As betting at the race ring adds neither strength nor speed to the horse, so the exchange of shares in the stock market adds no capital to business, no increase in the production and no purchasing power to the market."

[-]

sickofthisshit@reddit

Fair enough: it's kind of funny that it it is newer than I thought and one of the sources is Ford's grandson, and it would be funny if he had fallen for an apocryphal quote. https://www.snopes.com/news/2025/02/23/horses-quote-henry-ford/

[-]

andynormancx@reddit

The “best” cases of this are when:

- you spend time with them in a workshop or two, understanding what they say they need and what you think they really need

- you try and explain why what you think you should deliver is slight different to what they asked for

- they push back and insist that they want what they asked for, despite trying to talk them round

- you deliver what they asked for

- they tell you what you’ve delivered isn’t what they need and what they now say they need is what you tried to convince them they needed

Thankfully not all clients all of the time, but it is equally infuriating and entertaining when this happens.

[-]

wslagoon@reddit

Unit tests confirm your module under test conforms to a contract. Integration tests confirm the contract is correct.

[-]

TheSkiGeek@reddit

Yep. And also not having those tests written by the same people doing the feature implementation.

[-]

YeshilPasha@reddit

We are supposed to write the tests first, then the code. Then your code will have to agree with your tests instead. But I don't know anyone who does it that way.

[-]

Danyboii@reddit

The program I write isn’t based on requirements, it’s based on my interpretation of the requirements.

See how stupid that sounds.

[-]

snowsayer@reddit

Tests are to prevent regressions.

Try refactoring code without tests. Have fun with all the new bugs you introduce.

[-]

zerothehero0@reddit

That's a good thing to realize. Fundamentaly there are two different categories of tests, Verification and Validation. Verification tests verify component correctness and internal requirements, and validation tests validate whether it meets end user needs. The common example for this being a house. Verification will tell you that your lumber is within requirements (unit testing), that your windows are correctly manufactured (component testing), that they have been installed in the wall correctly (integration testing), and that they are providing light to the interior (functional testing); all valuable and required pieces of information. But in the end you still need to validate that all the other walls and roof are installed and we have an actual house not just a wall (system testing) and that the end result is useful to the user with an accessible entrance and all rooms in the right orientation (acceptance testing).

Verification tests verify the quality of a valid system.

Validation tests validate the usability of a quality system.

[-]

booch@reddit

I disagree with the main points of this post.

A test isn’t proof that something is correct, it’s proof that one piece of code behaves the way another piece of code thinks it should behave.

Well, yes. That's kind of the point. You're not testing that that the requirement is correct, you're testing that the code doesn't fail to meet the requirement in specific ways. And the fact that your test doesn't use the same algorithms to determine the answer, that's perfectly useful. Presumably you have a way to validate the answer that isn't the same as the calculation of it; otherwise yes, you're just writing regression tests (the answer hasn't changed).

When it comes to the general idea here, I like the phrase "Automated tests don't prove that the code is correct; they prove it isn't wrong in specific ways (the ways that are tested)". The same is true of things like type systems; they don't prove the code is correct, just that it isn't wrong in as many ways as the type system is able to control it.

Double-entry bookkeeping doesn't prove your finances are "correct" either; but it's still very valuable.

[-]

thewormbird@reddit

Unit tests were never meant to be a litmus test of how correct your code is. It only freezes your assumptions in time so that you know sooner when they begin change.

Unit tests are no more a measure of correctness that a beaker and Bunsen burner are for determining if two chemicals will react like you expect. They just give our expectations and assumptions boundaries and that’s about it.

[-]

fumei_tokumei@reddit

I disagree. If your code fails a test you have two cases. One where your assumption is wrong, but another more important one where your code is wrong. Saying tests only freezes assumptions does not account for the second case. But I agree that for non-trivial complexity they do not provide a good measure of correctness.

[-]

thewormbird@reddit

It’s a cycle. Red, green, refactor.

[-]

dirtywaterbowl@reddit

As a former tester, testing is for breaking your code in ways you never imagined! 😈

[-]

Aggravating_Moment78@reddit

Well a test is what you write it to be, it proves or not what you wanted it to prove. It will prove a function returns this if given that, but if this should even be returned or something lije that. “Correct” is a vague term anyway

[-]

YouNeedThesaurus@reddit

This is like that moronic thing that people sometimes say: guns don't kill people, people kill people.

Well, actually, for gun deaths, it's the people with guns who kill people.

If you need to prove that your, hopefully single-purpose, method returns a value when you pass it a parameter, and a nil when you don't, then tests very much prove whether that is happening or not.

[-]

BarryMcCoghener@reddit

Imo unit tests are about 99% worthless. I will die on that hill.

[-]

LPitkin@reddit

Test so that test results gives you value. Only then your tests are worth something. I’m not sure 100% coverage gives you that. Then there’s regression reasons where 100% coverage can help.

[-]

Prestigious_Boat_386@reddit

Youre supposed to try more inputs than the one you used while writing the code the first time bro

[-]

oep4@reddit

tests are a benchmark

[-]

4444444vr@reddit

like, agreed, but why did you have to say this

[-]

omnichroma@reddit

Post riddled with em-dashes. Discarded as AI slop.

[-]

QuotheFan@reddit

It made me rethink what tests are even for. They’re not really about proving truth — more about locking down intent. A way to say, “If I ever change this behavior, I want to know.”

Tests are for guarantees. Without tests, our code has no guarantees.

[-]

bwainfweeze@reddit

There are no guarantees. Tests are about confidence intervals, and speed. How stupid is it to deploy the code right now? How much more work do we want to do before deploying to feel good about deploying?

[-]

bwainfweeze@reddit

Tests are brainstorming and then their primary value is in keeping your parts of it still working when someone else changes it. Agreeing with the code isn’t automatically a bad thing.

[-]

FloydATC@reddit

Generally speaking, good tests should use a different, well proven source of "truth" rather than duplicating the code being tested. For instance, if checking the output of a hashing algorithm, you would precompute the correct hashes using some reference implementation and verify your own implementation yields the same. Also, hard-coding "magic values" is allowed in tedts; for example if testing string input/output matches expectations.

The real objective, as others have pointed out, is to detect inadvertent changes that introduce bugs at some point in the future. Someone adds a new feature, then runs the tedts to verify they didn't break any of the existing ones. This is how good test coverage makes the code base easier to work with.

[-]

shinobushinobu@reddit

i use tests to make sure things dont break when i inevitably change them. surely no one thinks tests are a proof of correctness? I thought this was commonly understood?

[-]

davidwhitney@reddit

(Good) tests are:

encoded specifications
verification of outcomes (when done correctly), not implementations
a programmers entry point to the code

They don't verify anything you don't assert. Mutation testing can be used to verify your tests successfully do the above, but it's a rare discipline.

[-]

Narvak@reddit

For me they have 2 main purposes:

No regression: assert the new code don't break the old one - Help me execute some rare condition to check I got my expected result, that would be a pain to reproduce otherwise

[-]

commandersaki@reddit

Verification: did I build it correct? Validation: did I build it right?

[-]

Puzzleheaded-Lab-635@reddit

Types can’t tell you that a sorting function actually returns a sorted list , only that it returns a list.

That’s why testing alone can’t tell you if your system is correct. it only tells you if it’s coherent.

[-]

com2kid@reddit

just confirming that my assumptions in two places matched.

This is why historically tech companies had rules in place saying two different engineers had to write the application code VS the tests.

Then cost cutting came along. :(

[-]

obsidianih@reddit

Yes, we have multiple repos. At least one has the coverage rules that it fails the build of it's below 80% (a cto or something decided that was a good target).

Anyway there's one set of tests that check the controllers, all they do is use a log wrapper to set some meta data, and psss the function to the logger so it can log the success or failure. I looked into those tests and ask they do is expect the logger to be called one time. Doesn't check the meta data (many are duplicated cause of copy and paste), also doesn't check the correct method in the services layer is passed in to the logger.

[-]

centurijon@reddit

Prefer “integration” testing to unit testing.

Mock your external dependencies (DB, APIs you call, distributed cache, etc) and then set up your tests to hit the macro paths - input to a controller or endpoint, validate the output.

At the end of the day that’s all the really matters anyway. Nothing cares about the internal workings of your app or if widget A called wocket B which sent a message to the woozle factory. The only thing that matters is “for this set of imports, did the data I expect reach the database, and/or return from the controller, and/or trigger a message to an external system”

Integration testing can often mean reducing the number of tests while also making them less fragile - changing the inner design of your code isn’t harmful, as long as the outputs match your expected inputs

[-]

PurpleYoshiEgg@reddit

What about testing all approximately 4 billion floats (proof by exhaustion)?

[-]

_some_asshole@reddit

If you write modular code and test each piece for edge cases it should work well

[-]

vscoderCopilot@reddit

If you can code clean and can memorize the one million lines of code of course forget about tests

[-]

amestrianphilosopher@reddit

Chatgpt ass description

[-]

UnmaintainedDonkey@reddit

The venerable master Foo was walking with a novice. Hoping to prompt the master into a discussion, the novice said: “Master, I have heard that property-based tests are a very good thing - is this true?”

Master Foo replied: “Foolish pupil - tests are merely a poor man’s types!”

Chastised, the novice returned to his room, intent on studying types.

He carefully watched “Propositions as Types”, read the entirety of Oleg’s website and gathered an impressive collection of PDFs. He learned much, and looked forward to informing his master of his progress.

On his next walk with Master Foo, the novice attempted to impress his master by saying:

“Master, I have diligently studied the matter, and now understand that tests are truly a poor man’s types.”

Master Foo responded by hitting the novice with a stick, saying “When will you learn?

Types are a poor man’s tests!”

At that moment, the novice became enlightened.

[-]

TiredNomad-LDR@reddit

The lesson being, you do well with static types than dynamic ?

[-]

UnmaintainedDonkey@reddit

The lesson is you want both types and tests. They complement each other, and when done right both makes the other part have an smaller surface area.

[-]

fire_in_the_theater@reddit

we haven't built a system proving the semantics of programs because of the halting problem

[-]

0x0c0d0@reddit

This is the entire argument for TESTS FIRST, and why people who ponder this shit eternally, prefer the word SPEC over TEST.

I'll wait while this is ignored.

[-]

Ok_Editor_5090@reddit

Man, I really hate code coverage requirements. If people who review PR are not careful, your test can easily become dumb tests that are there just to increase code coverage and do not have any valuable testing or assertion.

Now, whether tests are useful or not. 1- tests should be written beforehand according to requirements -TDD. this way, you ensure that your tests are not just affirmations of your code and assumptions. 2- I believe the reason that cucumber and gherkin syntax came to be was that we wanted business people to write or at least see the test scripts. This way, the test would be mapped to business requirements and not technical assumptions.

[-]

Sammy81@reddit

Code coverage is extremely useful if used to ask yourself “Why isn’t part of my code running?”. For example if you use TDD and your code passes all tests with only 75% coverage, why not just delete the 25% that didn’t run? If it has a useful purpose that is not covered by your tests (like a failure condition) it will let you know you have incomplete requirements or incomplete tests.

[-]

pizzathief1@reddit

If my mock object doesn't act like the real thing... then the real thing is wrong.

[-]

srikanthksr@reddit

You're writing the tests wrongly. They shouldn't match what your code does, they should be verifying the requirements. When you write tests, you start from the requirement, define the system behaviours, and try to verify those (hello, TDD).

[-]

Gazz1016@reddit

A test passing doesn't prove that code is correct, but a test failing does prove that something is incorrect: either the code or the test. And having that signal is generally better than having no signal.

[-]

BenE@reddit

A test that just confirms implementation instead of testing that something is up to spec and desired by users is an over-fitting test that should be deleted as it will get in the way of code improvements. A good test will test the requirements and allow flexibility to improve the implementation and abstractions while still verifying that you're following the requirements. https://benoitessiambre.com/integration.html

[-]

seweso@reddit

TDD helps increase the quality and agility of your code.

But if you were chasing 100% coverage with after the fact tests which locks in the implementations…. You were doing something wrong imho.

The point of tests is to prevent regressions when altering code.

That often means that for a given input you validate (and then automatically verify) the output. Which can include unintended side effects or metrics.

Don’t chase mere coverage with unit tests which don’t add much value.

Instead try to think how you can cover risks as efficiently with as few tests as possible.

Because I’m pretty sure you are writing way too many unit tests, and not enough integration and e2e tests.

Tl;dr: make the units bigger which you test and use the approvals pattern

[-]

disappointedinitall@reddit

Personally, I’ve mostly tended to work alone, so any tests I write tend to just check that outputs match a range of expectations. So that if anything changes, it hopefully fails the tests.

Some of its more like QA work, such as testing forms, etc.

The other use cases would obviously just be exercising stuff in isolation that’s normally buried deep.

Testing just proves expectations. It doesn’t necessarily catch bugs.

Doing random data tests can sometimes catch bugs. But they can be a fucker to reproduce unless your random data is pregenerated, but then it’s not really random anymore :-)

[-]

egometry@reddit

Tests are still code

And code can be bad

This is why you need to raise a critical eye to a failing test after a change. You must ask "did my change make this test invalid, or should I listen to this failure?"

[-]

i_should_be_coding@reddit

Tests are your past self being a bro and letting you know you broke something.

Either that or random fails that break your build pipelines sometimes but you can never seem to reproduce.

[-]

websnarf@reddit

Very often this comes down to the quality of your tests. When I was at one of the FAANG companies while looking through some source one day I was shocked to see a test that tested a re-implemented fundamental math function that neglected to check negative numbers. Sure enough, when I tried negative numbers, the function failed right away.

Indeed bad tests prove very little. But a good test should make it as unlikely as possible that your function could possibly work it it passes. You have to be able to put your mind in two places at once. If your test is just modelled on your implementation, then yes, you are going to test code that you've already stepped through a debugger and already know is correct, which in practice reduces the value of your test to a regression test.

When I write tests, I put my mind in a state where I think "Oh yeah, and what are all the features and behaviors that I supposedly get for free after my superb implementation?" and I test for all of those. That's how you think of negative numbers for a fundamental math function, when your implementation was focused on positive numbers.

[-]

Breadinator@reddit

Oof. Don't fall for this thought process. Tests, written with the intent to examine behavior based on requirements, are essential if not necessary. Especially for more complex systems that have > 2 people working on it.

If you want to run fast and loose where any code push could break something, that's your call. Just don't go complaining if shit hits the fan in production.

[-]

OutsideDangerous6720@reddit

there is a lot of virtue signaling and superstition when people talk about tests

[-]

rashnull@reddit

Welcome to not following TDD

[-]

pat_trick@reddit

Tests prove that given specific input, the output matches what you tell the test it should be. That's it.

[-]

SocksOnHands@reddit

I've worked on projects with near 100% code coverage and they were riddled with bugs. This might sound controversial, but I think unit tests are nearly useless because of how many bugs come from integration and concurrency issues. A unit test will not tell you if a database constraint will be violated because the database is being mocked. Unit tests are typically not going to test what happens if more than one request modifies shared data - most programmers I've worked with pretend concurrency doesn't happen. So this creates a false sense of quality in testing by having 100% code coverage, but things still don't work.

[-]

MadKian@reddit

Imo the vast majority of unit tests are useless. The only kind of thing that is worth testing is very small, encapsulated functions, that do something very specific; like parsing a string and removing special chars or things like that.

[-]

anubus72@reddit

you think it's not worth testing your functionality end to end to make sure everything works together? Do you always manually test this after code changes? Or just yolo push to prod and find out?

[-]

dunkelziffer42@reddit

If you mock the database, it’s no longer a „unit“ test. It‘s a „subatomic particle“ test.

[-]

SKabanov@reddit

I've worked on projects with near 100% code coverage and they were riddled with bugs.

I guarantee you there were tests in there that were written just to get code coverage and weren't actually verifying much of anything at all - I've seen that in a couple of projects.

[-]

SocksOnHands@reddit

I know there were - I've written some of them. The CI/CD pipeline required a minimum amount of code coverage, but a lot of code is straightforward with very little to actually "test" - like testing if getters get and setters set..

[-]

portmapreduction@reddit

Having hard to test issues in a codebase like concurrency problems doesn't mean abandoning testing the simple thing that you can actually test.

[-]

kalmakka@reddit

No, such tests tend to drag the focus of the developers away from actually thinking about what they are doing to ensuring 100% code coverage.

Who cares if my SQL has syntax errors when my unit test mocks the database?

[-]

mastermrt@reddit

In that case, write proper integration tests too, and then have your SDETs write E2E automated tests.

[-]

optimal_random@reddit

Maybe a specific module/unit's tests are well written according to that functionality, thus giving you 100% coverage, while the real problems arise from integrations with other modules or services.

Maybe you are lacking better integration tests across modules and services.

[-]

eraserhd@reddit

I don’t like the whole unit/functional/integration split. We used to call everything unit because it was a scale invariant term. But since unit now means “object or method,” and somewhere we decided that every thing has to be tested at such a low level first.

The only measure of the goodness of any test is “confidence per second.” Don’t write tests with a low value, and delete tests with a low value.

Try to test the system through a smaller number of stable interfaces. Even if you need to write a bunch of harness code to do it. Tying tests to every interface in your system makes your code brittle and hard to change and your tests provide less confidence.

This advice is a bit different than integration testing, and better. For example, integration testing might say to test through the UI. Well the UI probably isn’t stable, so these tests are brittle.

[-]

HashBrownsOverEasy@reddit

Unit Test are for testing isolated, encapsulated logic. If the thing you are testing is not encapsulated, an integration test is probably a better fit.

Having said that, if Unit Tests are hard to write then it might be a sign that code needs to be better encapsulated.

[-]

Batman_AoD@reddit

Yes, this is absolutely my problem with the focus on unit tests as well. It's almost never the "units" that break; it's the cross-components interactions.

[-]

untypedfuture@reddit (OP)

I agree as that’s exactly what I’m experiencing rn 😂

[-]

AfraidMeringue6984@reddit

I don't know. Every time I write tests I go: "Boy those are some mighty fine tests, there's no way this could fail! Not like there are an infinite number of environmental configurations that aren't part of my very finite testing apparatuses that could cause unforeseen problems when launched on devices I will never see! Who needs external monitoring systems, QA testers, beta testers, penetration testers, redundant fallbacks? Fuck em. I got like 43% coverage. Ship it now" I mean that's, fairly standard right?

[-]

NegativeSemicolon@reddit

Well yeah that’s why designing your tests is so important. It should be the other way around, the code should agree with the tests.

[-]

renges@reddit

The moment you mention you're trying to get 100% coverage and the medium post talking about clean code means small methods and classes, I know this is a very beginners take. I'm surprised this post even get lots of upvotes considering how dumb it is. First of all, there are empirical data that shows that small functions doesn't help with maintaining a program. It's made it even harder to read because you have to hold a lot of contexts in your head. The correct methodology is to have deep modules and small surface. You don't unit test everything, only the interface that is exposed. And you write these tests by using TDD. Most who going for large coverage often write tests after, and that's why you think test is just agreeing with your code. TDD is also another approach that is empirically proven to be effective

[-]

GrinQuidam@reddit

Idk if this is your opinion maybe you just need to practice writing tests with less regard for the implementation.

[-]

knobbyknee@reddit

Tests show the presence of errors, not the absence of them.

Tests affect your code in other ways. Most important is that it forces your code to be testable in isolation.

That means not relying heavily on global state, not having too many dependencies and having a small number of code paths in each operation tested. Otherwise your code becomes much harder to test.

[-]

SirSooth@reddit

Imagine that every time someone fixes a bug and some tests stop passing in that codebase, it's because the tests were written exactly like that.

The plan was that tests were there to test what's actually worth testing such that when we do make changes, like fixing a bug, we know we haven't broken anything meaningful. But because of stupid ideas like "we need 99.9% code coverage", people test the shit out of every line of code. And then you end up with having to fix half your tests when you're only fixing a bug.

What confidence have they given you that your change is right if you had to "fix" so many of them? How do people even trust them?

[-]

max123246@reddit

To be fair, in an ideal case, code coverage should be an accurate measure for what functionality you're providing. Like, if a branch is never taken by a unit test, either this branch is not involved with any valid output/external representation that your code is creating and it can be removed, or you're missing a category of valid abstract data types in your tests

I do agree that in practice it encourages writing unit tests that test the implementation rather than the interface, but that's about sloppy work and the fact that we don't teach good software engineering practices.

[-]

post-death_wave_core@reddit

Tests are firstly just checking that the code does what I think it does. Which is not doing “nothing” imo. Secondly, they serve future me as example based documentation, requirements and refactor verification.

[-]

mrfredngo@reddit

This is solved by having different teams doing the implementation and writing tests, but working off the same spec.

This is how high-reliability software is written.

Of course if the spec has a problem, you’re screwed either way. That does occasionally happen.

[-]

Statharas@reddit

Stop thinking of tests as unit tests.

Tests exist to validate whether a system or part of a system can follows the expected behavioral pattern.

A unit test is code that is written to interface with your code to automate a test that makes an assumption under some given circumstances, when something happens to the system, then the system should do something. A unit test is an implementation of this that interfaces with a small endpoint of your code (e.g. a public method) to independently assert whether it does what it assumes it should be doing.

[-]

cdm014@reddit

If you do it right, the tests dictate the code being tested, because they test that your acceptance criteria were met.

It doesn't prove code is correct, but it does prove that the output of a piece of code is what we expect from a given set of inputs.

[-]

mmacvicarprett@reddit

In general, as with most math proofs this is true. However, you can brute force a proof in the same way you brute force and test all requirements.

[-]

5fd88f23a2695c2afb02@reddit

I guess you are talking about unit test, in which case you’re 100% correct. But don’t forget there are other abstractions of tests, and they are more important in terms of proof and acceptance.

[-]

PaintItPurple@reddit

Looking at "tests" as a monolith is not very useful. There are tests to ensure that some logic works the way you think it should under different circumstances, there are tests for conformance with project requirements, there are tests for conformance with internal requirements, there are tests to ensure that all code paths are exercised, there are tests to detect regressions, etc. Expecting a test designed for one of these things to accomplish all the others is a category error.

At any rate, you should probably take a look at the TDD methodology of writing tests that fail. TDD isn't practiced much anymore as it is quite onerous, but the general principle of ensuring that your tests fail is useful for most kinds of tests.

[-]

eikenberry@reddit

TDD is a great learning tool. Do it for a while and you'll see how you write code and tests changing. Looser coupling, black-box testing, tests and code written together, etc. Lots of good side effects even if you only do it for a short time.

[-]

SputnikCucumber@reddit

I try not to think too hard about my tests. They're there so that when I come back to my code later I can change things and if my tests throw a fit I'll know that it's "supposed" to do the thing.

[-]

Plank_With_A_Nail_In@reddit

Unit tests shouldn't be for one piece of code though so this is just you admitting you don't know what a unit test should be.

[-]

chucker23n@reddit

That’s not a unit test, then. That’s a system or integration test.

[-]

Tim-Sylvester@reddit

That's how I use them - to prove the code works how I expect it to, and to prove it still works how I expect it to after something else changes.

[-]

LessonStudio@reddit

The key to making a system, growing that system, and keeping it working, is to understand technical debt. If you allow tech debt to grow out of control, you will lose control of the project, and progress will slow to a crawl. Unit tests are one very important tool for managing tech debt. They increase the chances of older foundational code of doing what it should, including things which weren't manually tested as much as they will get beaten up by later functionality. This is where tech debt becomes a nightmare, when later code invariably requires going back to earlier code to fix it. Only for those fixes to then break something else.

Tests improve confidence that the code is correct. They often find dumbass bugs. Good tests are better at finding dumbass bugs.

They also find regression stupidities. Regression doesn't only come from desired changes to old code, but sometimes, someone is fixing a bug missed in the tests, and inadvertently breaks the old code in ways which they don't see. But other people are wondering why their code stopped working.

Done properly, they are great tutorials on how to use the codebase.

Integration tests along with unit tests, allow for rewrites where you know you are capturing the key functionality in the new code. Refactoring become way less scary. Replacing modules becomes possible.

As for regression, that can be subtle, and depends on the quality of the tests. For example, some tests could be about time. Making sure some part of the system runs at an acceptable speed.

Then, when some jackass comes along and adds a new performance killing index in the DB, the zillion insert test, or whatever will barf that it is taking way too long. Otherwise that performance issue might not be noticed for a long time.

Also, great tests explore the limits of your system. Right now you might have 3,000 users. But you can have tests beating it up at 1 million. This is also great to know how your system might respond to a DDOS.

There's a book on maintaining legacy systems which has a great opening statement something like this:

"It doesn't matter if you use the latest OOP, framework, architecture, language, PIMPL, SOLID, or anything; if you don't have unit tests it is bad code."

One other critical thing about unit tests is that they generally keep your code clean. Crap code is brutally hard to test. It is not that all hard to test code is crap, but it is a very bad sign.

Also, I find tests tend to give me a more mile high view of code. You start realizing that 8 different classes use inconsistent naming, or how functions parameters are structured. To me a clean API is one where you can guess a function name, and its parameter names, in their correct order, and get it right. When unit testing, you often see this sort of thing, and with great unit/integration tests, a refactoring where you sort this all out, doesn't blow up your system.

But, unit tests often make it clear what a system really does. When working with legacy systems, writing unit/integration tests is often step one to modernizing the system. One of the key reasons it became a scary legacy system was it's lack of unit/integration tests.

By cooking up the tests for a legacy system, you can start figuring out what all the different parts do. Even 10 year programmers often don't know their own system, just those parts which are the buggiest.

Now, with a high coverage set of tests, you can do things like write an actual description of the architecture, API, and requirements for the old system. The next step is to figure out where you want to go, and then start going there one module at a time, with the confidence that you aren't really trashing most of the system.

While tests won't prove perfection, they will prove that things are broken. And admitting you have a problem, is the first step to solving that problem.

To declare unit tests not all that valuable because they are basically not able to prove a negative is corner cutting looking for an excuse.

[-]

edave64@reddit

Another way tests can be used is for compatibility.

I once had to replicate an outdated, unintuitive, poorly documented, proprietary system. Without knowing what its code was actually doing, best I could do was collect known behavior into hundreds of test cases, then work step by step to pass all of them.

So, in a way, I used them to solidify someone else's assumptions, as free from my own as I could.

[-]

r2k-in-the-vortex@reddit

Yes, that is what the typical tests do.

But, you can go step further with formal verification and prove theorems like "this code can never enter infinite loop", "this code can never result in division by zero" and so on.

[-]

ImaginaryCoolName@reddit

Well yeah that's why we have testers outside of unit testing. Unit tests are just the appetizer

[-]

evangelism2@reddit

Tests in my experience are good for

1) confirming the code works as you expect

2) alerting people later if something they did broke an existing functionality

3) great for TDD or SDD

[-]

retechnic@reddit

I think it is also the question what is the best way to express the required behaviour, or some parts of the required behavior. Tests are often easy to understand and show well what is the expected program behavior. So it is easier to reason wheather the code is doing correct thing by doing the test, rather than untangling the algorithm. Type systems are good to express certian properties of the program, but it is hard or often impossible to articulate more dynamic properties with them. Really powerful proof system are complicated, so at some point it is easier to understand the imperative algorithm control flow compared to declarative annotations.

Each approach can cover certain parts of the specification well, and in addition redundancy is good to have.

[-]

Exnur0@reddit

There are two fantastic (and very short) articles from Google related to this topic:

TLDR: Test code is not like production code, it should be as dumb and plainly readable as possible, so that you can spot the assumptions being made as easily as possible.

[-]

Nerketur@reddit

There is no "correct" code, anyway. What does that even mean?

So yes, tests don't prove your code is correct. They just prove your code acts the way it was defined to act.

All code is an API to other code. This API is what is tested, not the code itself.

To put it another way, I want to prove that when a function is given this information (inputs), it produces some other information (specific outputs). Given input A, I get output S.

I'm not proving that the function is correct. I'm proving that it matches the "blueprint". The API. It could be wildly inaccurate on other inputs, or give outputs that are weird for incorrect inputs, even crash the program. But given the tested inputs, it gives outputs that match.

[-]

jewdai@reddit

Unit tests are to document what your assumptions about the systems are it's also a way to prove it does exactly what you expect it to under specific conditions.

If done well you can write all your tests before even testing the end to end manually. You'll likely have much less to debug and often can blaze through things.

[-]

Trygle@reddit

Well I mean, we write the tests first, so it better agree with it lol.

[-]

Novichok666@reddit

The statement is as true as it is pointless.

[-]

hoodieweather-@reddit

And equally AI generated! Why does this have so many upvotes?

[-]

bring_back_the_v10s@reddit

I realized they weren’t actually proving anything — just confirming that my assumptions in two places matched.

Maybe I'm just being pedantic but this is a self-contradictory statement. Tests prove that your assumptions about real world requirements are correct, so it's not like they "don't prove anything". Obviously, as long as your tests express your assumptions correctly.

This realization isn't supposed to be rocket science guys, come on.

They’re not really about proving truth

This sounds so odd to me. I've never personally encountered any programmer who thinks tests are supposed to prove that code correctly fulfills real world requirements. It seems to imply that software engineers in general get requirements right most of the time.

[-]

qruxxurq@reddit

Obviously. Plus, it's an infinitely recursive problem. How do you know the test "works"? What validates the test? Another test? Not to mention that it's an insane circular dependency. "Well, the code is supposed to do this, so the test tests that." Followed later by: "Well, the test tests for this, so this is what the code is supposed to do."

Not only is the test basically meaningless, but the code is also meaningless. The only that matters is what either (or both) programmers intended for that code to do, and we cannot know that from the code, b/c of the possibility of errors in the code.

Testing is important, but it's also a fucking cult. And people who do it badly and the people who are religious about it are just leaning into the cult-like bits.

Testing can be an early warning system for breakage, especially for really "fiddly" bits. But, most of the time, especially in your shops filled with religious nutbags, coverage is excessive, tests for almost nothing of value, doesn't tell you when something is going to go wrong, and is the programming equivalent of the IT Help Desk meme: "Hey, it doesn't work."

Like, if tests start failing, and it doesn't give you a big neon sign as to what's likely going wrong--but instead, requires tons of time to figure out--WTF was the point of the test? I could have just avoided the test and spent time debugging when someone says: "Hey, shit doesn't work."

As an example, if you're building a time-class that's conforming to ISO8601 or RFC3339, then tests are GREAT. But they're great if each test can be mapped to some part of the standard being tested. But we can see in that case that what matters is the SPECIFICATION. That tells us our intent, and the code aims for that intent, and so do the tests. When you test without a spec, even if it's not as formally or precisely or narrowly defined as 8601 or 3339, what are we even doing?

[-]

uniquelyavailable@reddit

Good tests require careful thought. But overall, agreeing that the code works correctly is a pretty important step.

[-]

vom-IT-coffin@reddit

Sounds like someone is writing tests incorrectly.

[-]

Supuhstar@reddit

A good testing suite passes if the code works as its documentation says it should work, and fails if someone changes the implementation in a way that breaks that contract.

If you want to say that this means that the tests aren’t technically proving that the code is correct and they’re just agreeing with it, go ahead. You might have a different definition of correctness than I do.

[-]

MakeoutPoint@reddit

I learned this from woodworking:

Measuring twice always yields the same result. It isn't until you "cut once" that you realize you forgot to account for the thickness of another piece when you measured.

[-]

Academic_East8298@reddit

I think integration tests are a pretty good measure of code correctness. There were quite a few situations, when integration tests allowed me to quickly handle a support ticket just by looking at the covered cases.

Unit tests on the other hand are a bit more flimsy, since they almost never survive a code refactoring and alnost never cover the connection points between individual units.

[-]

watabby@reddit

This post is AI btw, pretty obvious

[-]

Achereto@reddit

That's why you write the test first. Your test should define the feature you want to implement. If the test passes, you implemented the feature. If you can't think about another test defining the feature that would fail, then you are finished.

After that, the tests catch any changes that break the specified behaviour.

[-]

nnomae@reddit

That's why you write the tests first. Writing passing tests after the fact is pretty pointless.

[-]

siromega37@reddit

I guess in a domain-driven-only approach this would be true, but that is why we have TDD as an alternative. Not many shops practice TDD but it is there.

[-]

RICHUNCLEPENNYBAGS@reddit

Yes, Dijkstra memorably observed this decades ago.

[-]

rafuru@reddit

First, the author should specify that it's talking about unit tests.

Second: Unit tests are usually not used to validate business or user cases; they are there to validate that the method is working as expected in terms of its logic also, that it can properly handle edge cases or expected / unexpected errors.

If the method is meant to do a calculation given a set of values, you validate that it must do the specific calculation you intended to, not because it's tied to a business logic, because you need that specific calculation for something more complex which is part of the business logic.

Third: Unit tests are there to ensure that whenever someone makes a change, it shouldn't break existing code and if it does it must either be part of the whole update or re-think the solution.

For business or user cases, you have integration or functional tests which will help you to detect if the whole system is doing what is intended to.

[-]

codemuncher@reddit

Every bug in production passed all tests.

Think about that for a second. About what tests actually say, what they guarantee, and what that means for software quality.

Software quality is difficult and a lot more than writing unit or even integration tests.

[-]

qmunke@reddit

If you write your tests after you write your code (which judging by your "chasing 100% coverage" comment you are) then you're probably also writing tests that are tightly coupled to your implementation, which will exacerbate the scenario you're describing.

When your tests are asserting behaviour, it's much less likely you'll write code that's not correct.

[-]

robby_arctor@reddit

OP has precisely one AI-written post to promote their blog.

[-]

TyrannusX64@reddit

Pretty much. TDD, for example, only works if you're provided with solid requirements

[-]

rep_movsd@reddit

I hated unit tests until a few months ago

I still hate most of them, but as you said, they lock down intent if written correctly.

The sad thing is that in aiming for 90% coverage people write overly trivial tests that don't prove semantics of the code

[-]

scan@reddit

Green checkmarks, false confidence.

Shudder

[-]

Perfect-Campaign9551@reddit

100% coverage. Ugh . Unnecessary. Your tests should simply test the business requirements.

[-]

spergilkal@reddit

Well, yes, that is why you might write something like Assert(value, Is.Equal(toWhatIExpect)) in your unit test. You are verifying your assumptions or expectations. If I am about to change a complex function, I will often write tests to verify that my understanding is correct by writing unit tests. Once I have verified my understanding is correct I can more easily refactor the code, both with the confidence of having verified my mental model of it and by easily verifying that no regression has occurred.

[-]

CallMeKik@reddit

Tests are clamps. They hold things in place while you reshape other things. :)

[-]

skilless@reddit

Need that bell curve meme, first "we need tests", then "test are useless" in the middle, then "we need tests" again at the far end. Enjoy the journey.

[-]

the_0rly_factor@reddit

Unit tests are generally for regression yes. You write them to pass against the code you are sure works the way you want it to. Then if someone breaks the application code it will cause the tests to fail.

[-]

jacobb11@reddit

Tests ensure that the code is modular enough to be tested.

Tests ensure that the programmer spent some time thinking about edge cases.

Tests ensure that code changes did not break at least some existing functionality.

Tests reduce the chance of bugs, they don't eliminate it.

[-]

john16384@reddit

Tests should not test the code, as that's just the implementation, and there generally are an infinite number of ways to implement something.

Instead, tests should be written against the public API of the unit you are testing, and with that I mean the documentation of that API, as the documentation states the intent of the code and also what users of that code would expect. No docs? Writing tests is then pretty pointless.

For example, if the API states it accepts a number of children as a value, but doesn't say it can't be zero or negative or 1 million? I should then write tests for those edge cases to see that it doesn't blow up.

However, often when writing tests, I find myself fine-tuning the docs so it says something like "must be a positive value" and "throws xyz exception when zero or negative", and add a test that checks this accordingly.

[-]

ven_@reddit

One of the most valuable parts of tests for me is that it leads people to write testable code.

[-]

Leverkaas2516@reddit

This is why, in an ideal world, tests are written by someone other than the author of the code.

made me rethink what tests are even for.

Tests serve many purposes. They confirm behaviors and do validation, but writing them also forces me to understand the code better and also forces me to consider edge conditions. It's not at all uncommon for me to uncover coding errors when the tests are run initially - it's kind of like code review in terms of early bug detection.

[-]

ExiledHyruleKnight@reddit

I always say tests are my guarentee. If you put in X, you get Y. It shows people what to expect. It will never cover EVERY input, but a good test should show what will and won't work and usually a reason why.

Your documentation is your promise. Your tests are your proof of that promise.

But also tests are MUCH more for the future than your present. If you change X and it breaks your test, all your assumptions are no longer valid, which is why your tests are as important as your code.

There's also different level of testing. Cunit or Pytest should test singular functions. Integration testing is better, but also use of the device in a QA enviroment is always important because the customer is a moron who won't always type in a date as 06/08/2025... and might use any number of values there.

“If I ever change this behavior, I want to know.”

That's good, but it's more "If anyone else changes this behavior I want to know."

[-]

HCharlesB@reddit

it’s proof that one piece of code behaves the way another piece of code thinks it should behave.

IMO that's a pretty good start. And when fixing bugs, the first thing to do is to craft a test that reveals the bug (and of course it should remain part of the tests for future testing.)

My first tests just prove the code does what I expect. Then I strive to provide tests that validate expected trouble spots.

[-]

nameless_food@reddit

Tests are accurate only if their expectations of the system’s behavior matches the desired outcome. If the test is checking that a thing is blue, and the code gives out a blue thing, that is no good if the thing was supposed to be red. Perhaps at first the thing was supposed to be blue, but the spec changed, and the tests were not updated properly.

[-]

SCube18@reddit

Well mathematical proofs also work on some assumptions right? If I think the code should behave certain way and it works a certain way, then rigorous proof would also prove its correctness

[-]

Minute_Action@reddit

Tests should be named contracts, because that is what they are. It's a contract that one command should have an expected output. If that changes the contract is broken. I always say to whoever listens to me that tests are there to protect the code from devs, not to test anything.

[-]

TheHiveMindSpeaketh@reddit

It made me rethink what tests are even for. They’re not really about proving truth — more about locking down intent. A way to say, “If I ever change this behavior, I want to know.”

To me this is where the insight of TDD comes from. I don't usually find unit tests that I write after I've written code to be very useful because they end up being just a record of what I wrote. Because that flow is:

Have a spec (in English) --> Try to implement the spec (code) --> Record what I wrote in the tests

But a better flow is

Have a spec (in English) --> Translate the spec to code (by writing unit tests) --> Implement the spec (by writing code that causes the unit tests to pass)

[-]

Gleethos@reddit

Yeah, that is an important distinction to make because it always gets me into stupid arguments every time I say something along the lines of:

"We should also write a few unit tests for that new feature"

After, which I always get the same old pushback arguments which boil down to "ok but that does not proof that the feature works 100% in all possible cases".

After which, I always have to keep my composure because otherwise, I would burb out: "No shit sherlock! It is just a mechanism to prevent failure for the most critical usecase scenarios..."

I had this stupid discussion today, actually.

It's just a coalmine canary, that's it. Nothing special.

And, please don't let perfect be the enemy of good, dear people! Unit tests are super super valuable (*if they couple against a stable API, and NOT implementation details, DONT test implementation details directly)

[-]

-grok@reddit

ya just wait til co-pilot writes a huge pile of tests to really give management a giant pile of false confidence1

[-]

Gigio00@reddit

This is why, despite what many people think, classes on formal Software Engineering are actually useful.

[-]

pythosynthesis@reddit

What does "correct" even mean?

You write an alto, some code, which aims to achieve some goal. That's it. Now you know, or should know, how should your code respond under specific circumstances, edge cases. And you write tests to confirm that your code does indeed behave as expected under those circumstances.

Even strictly logically this doesn't prove that your he general case, when you run the code in prod, is properly implemented. It just doesn't. It simply tells you that what you built behaves as it should in those cases you can fully control. That's it.

This is very little indeed. It's also the most you can ask for, and so it's absolutely "perfect". Doing this gives you confidence in your code. But if something still goes wrong in prod, guess what, you just discovered a new edge case you didn't consider before! And so you increase your test coverage.

Incrementally you cover more and more odd situations, which increases your confidence that the code is indeed doing what it's supposed to.

Time tested libraries are not awesome because someone "proved them correct" but simply because no one was able to prove them wrong by catching a bug.

[-]

Careless-Childhood66@reddit

Usually, developers who write tests after implementation of domaon logic reach that conclusion.

In my experience, if you adapt a "pick a small piece of the solution, write some crude tests before, implement the component, refine the test, refactor, finish" Iterative approach to coding, you will end up with higher qualtiy assertions and code that aligns better with the specification.

[-]

StarInABottle@reddit

I've definitely seen codebases where no thought was put into "how could this piece of code go wrong?", only the happy path was tested, and your observations hold 100% true.

While one can never guarantee the code behaves exactly as intended, you'll get a lot of mileage out of writing test cases thinking like an attacker would - what happens when the input is unexpected? Can I break this?

[-]

Gwaptiva@reddit

They aldo document the intention, the purpose of the code under test

[-]

BlindTreeFrog@reddit

Tests were never suppose to prove that the code is "correct". Tests prove that the code executes in the described way. The problem is when people write the tests from the code instead of from the requirements and design.
If the requirements and design say that when giving inputs A, B, and C, the function does X, Y, and/or Z, then the test should verify that the the function does that and only that. Whether it is a unit, functional, integration, system, or regression test. that is the desired goal; make sure the function does what is described/documented and nothing else.

The main problem is when tests are written from the code without regard for the requirements/design.
The secondary problem is when code is written to the functions without regard for the requirements/design; with well written and complete tests this might not be an issue, but it can be an issue at times (I've worked with too many developers who just tweak the code to make the test past rather than figure out why it was failing)

[-]

bravopapa99@reddit

Why would you think any different?

[-]

jaggafoxy@reddit

Tests are documentation stating how code worked at the point in time the test was written.

[-]

SilverCats@reddit

Yes that's exactly what tests are. Checking that assumptions in one place match assumptions in another place. How you apply that is up to you.

You could write down a thousand page doc, and then manually sit down and go through all the steps by yourself and verify that you program matches what the spec says. Or you could have the computer check it. Or could have the computer check it but then manually change the output report to switch all fails to pass. Or you can just compile your code and ship it to a customer and see if you get any complaints.

[-]

Ahhmyface@reddit

After 20 years in this industry I'll go to war over this.

Something always bothered me about unit tests that I couldn't quite put my finger on.

If the unit test requires a complex input, then it's very easy to miss a case (and not all cases are enumerable). You can't test every input
Thanks to shitty coverage requirements you end up writing tests to validate variable assignment, string formatting and arithmetic. Largely a waste of time
Statistically the number of times you modify a unit of code without changing its behavior is massively less likely than intending to change its behaviour. And thus you must also change the test. So you've inflated your time on the average case to save time on your unlikely case
If the person writing the code misunderstands the requirements then they will also misunderstand the test. You might argue that a different dev could perform each role, but thanks to the nature of mocking, unit tests are tightly integrated to implementation
Drift: it's very easy for tests to pass even though the code (or test) is wrong. And nobody is inclined to examine a passing test.
False confidence: precisely because of the expectation of bug detection, despite the problems above, developers are less inclined to use human effort to validate logic
Duplicate mental effort: there are a large number of cases where the test is an order of magnitude more complex than the thing to be tested. Not only does this slow you down, but this mental effort could have easily been repurposed to making the actual code better.
The types of bugs unit tests catch are often trivial. Using a code analysis tool will automatically detect 90% of what your unit tests do

There are advantages of course. Namely forcing people to break down their code into modular units. Force people to think about edge cases. Hugely useful for refactors. But the payoffs come late and infrequently. A good integration test will deliver the same results without all of the useless busy work of keeping unit test coverage up.

Im hoping unit tests eventually become an artifact of history. AI code Analysis, official spec to code comparison, God knows. But something stinks with unit tests

[-]

tinbuddychrist@reddit

To some degree this does depend on how you write your tests. People often make tests that are a mirror of the code, and those are pretty pointless.

It is still possible to make more conceptual tests, and they tend to be less brittle.

E.g. if you're testing a sort method on some collection of numbers, you could validate that each step in the sorting process changes things in the exact way that you expect.

OR you could test that if you sort a collection of numbers: - the count of items doesn't change - if you loop through the sorted items that you don't find a pair out of order - all numbers in the input appear in the output the same number of times

This doesn't prove everything you could possibly want to be true about your sort method, but it does cover a lot, and it will keep working even if you change some of the implementation details.

[-]

andynormancx@reddit

That has always been one of the biggest benefits to me, it is mostly about knowing you’ve not changed behaviour in a way you didn’t expect.

Which then reduces the “rabbit in the headlights” feeling I get when diving into an existing code base with poor test coverage (or poor* tests). I remember all those times in the past when facing this and spending days half making changes before backing out and looking at another bug when I realised I had no idea what I was about to break.

Writing the test is also a good point to find out whether the design you had in mind actually makes sense. There is nothing quite like writing actual code to uncover a screwed up design…

* like that massive Salesforce project I got dropped into that had nearly 100% test coverage. But every single test just ran the code it was supposedly testing, without ever doing a single assertion to see if the code did the right thing (or indeed did anything).

[-]

highwind@reddit

Test is also check on interface. If you have to mock gazillion things and bring in ton of dependencies to just write a unit test, it tells me my design is wrong.

[-]

SeeTigerLearn@reddit

I had that realization the other day after an AI had near total code coverage. As I explored what it had done I soon discovered that some of the larger functions which had a bunch going on were being substituted with mock objects. Sure everything was passing and nearly the entire project was being accounted for…or was it.

After a couple nights of pondering what the implementation was really doing, I realized it was all sim. Sim calls by sim tests to sim functions resulting in sim results. I’m now collaborating with Claude on a different project and have been very explicit about how testing is to be implemented. It’s actually going pretty well—assuming I keep it from sneaking in code not approved prior. It did that late last night and when I discovered it it might as well been a dog taking a piss on the carpet. Ha.

[-]

BrianScottGregory@reddit

That's a common problem I've had in the past with peer programmers and unit testing. They test the way something works. But they don't test extraneous conditions.

But. This is when I got into a philosophical argument. When you're testing and trapping for extraneous conditions - you're increasing the bloat - both performance and lines of code of your software exponentially. A simple one line function can be transformed into 20 lines just to track for bad data.

So if looking for highly efficient, high performance code. You might just be shooting yourself in the foot by testing outside the boundaries of how something is supposed to work.

These discussions made me rethink unit testing entirely. Afterwards, I no longer focused on a singular approach to testing my (and other's) code out. I actually began focusing on entry points, uncontrolled areas where user/consumer interaction occurred, focusing on streamlining the functions called a million times, eliminating ALL checks there, and instead placing checks on the areas I couldn't strictly control the data.

Accordingly. I'm not a fan of strict procedurally based unit tests, and more a fan of integrating tests with code coverage tools and better debugging options. Yes, programmer code will break harder this way with things the programmer didn't expect, but it also makes you a better coder when you're not thorough about taking the time to better your integration.

That is. I personally find it's best to test that code works as expected given expected inputs and outputs, except consumer/user layers, which DOES have the checking there.

[-]

dmills_00@reddit

It is well known that you cannot test quality into a product.

Unit tests in my view largely exist as canaries for "Local behaviour has changed, this function doesn't do what it once did", and that has value in a large code base, but all such tests should fail when first written to prove they actually work.

Scary thought, "if sensor A and sensor B and not condition C then set output D" is easy to test, but sometimes you want "if and only if... ", and in a non trivial system that is a nightmare. When D is some sort of harm reduction system like an airbag or such, you really, really, want "If and only if", but a system having just 64 bits of state is way outside our ability to exhaustively test.

[-]

d4m45t4@reddit

Two things:

Tests are not the same thing as formal verification. They don't prove that your code is correct. But they don't need to do that to be valuable.
To get the most value out of tests, you need to write your tests before you write your code. That'll force you to think about your requirements and write the exact code you need to fulfil them. It'll also help you avoid false positives in your tests, which is an easy pitfall if you write tests afterwards.

[-]

LordAmras@reddit

It's the issue with 100% coverage.

Yes you want to know if you change something if something else somewhere else also changed it's a very important thing to do and what tests excel at, but if the code it's self contained and it's not used anywhere else then tests don't really do much other than making you work twice everytime you need to change behavior.

[-]

anengineerandacat@reddit

Tests verify your expectations (hence why a lot of testing libraries have .expects helpers).

That's all it does, nothing more.

Now as part of that we bolt on analyzers to run through the code, which then does quite a bit more useful things; verifies that you have code that's reachable, where you might have gaps in coverage, and checks for common foot-guns for your target language.

We can code in our expectations into our running code, but then you pay the overhead during execution.

At least that's my general take on it, and I have been comfortable with that to date.

[-]

tortridge@reddit

Not all tests are made equal.

QA policy vary between teams and company. I went as far as test a policy where tests and code where written by differente people with only the spec or documentation as input, it was great for documentation but very bad for productivity lol

Tests are also a big part of code review process (did the tests changed ? Why ? Is it legitimate ? Etc..) and should be tested with mutation testing for complete coverage

[-]

Dragon_yum@reddit

Tests check for a behavior and let you know if the behavior changed. Of course it doesn’t mean the code is correct, that’s is your job.

[-]

bwrca@reddit

Tests prove whatever you wanted the code to do, it's actually doing it. It can't test whether what you're actually thinking of doing is right .

[-]

ImOutWanderingAround@reddit

This basic algorithms course level knowledge. The trade off complexity to time. How many of our most difficult problems, such as signals processing using a DFT can be accomplished with a big O of On2, and then the same thing be accomplished with a FFT that is done in O n log n.

The test will pass, but what other impacts, such as performance be impacted.

[-]

hippydipster@reddit

Hopefully the assumptions made in the test are bare for all to see, and easy to change, and easy to change in ways that change none of the assumptions laid out in other tests.

Therefore, when you learn more about your system, and learn enough to correct an assumption in a test, you can do so and then fix your code, and all other tests and code remain with their assumptions unchanged and still "working", and in this way, you move your system safely forward in terms of correctness.

All tests might indeed be testing assumptions. This does not make all tests equal in usefulness.

[-]

Zomgnerfenigma@reddit

if the application is built in a modular fashion, with clean architecture and clean code, and is made mainly from small classes and methods glued together, then it’s fairly easy to prove the correctness of those individual classes and methods.

Many fine grained tests loose value and complex tests become more valuable over time. (call it unit vs integration we). The reason is side effects that fine grained tests can't catch.

The author rambles a lot about formal proof but doesn't even get into it. I don't know what he expects, but I expect that an average airhead can roughly test if his shit works.

[-]

Zazz2403@reddit

“A test isn’t proof that something is correct, it’s proof that one piece of code behaves the way another piece of code thinks it should behave.”

Yup.

That's exactly what I want it to do.

[-]

renegat0x0@reddit

Yeah, but try to ride without seatbelts. They might not be perfect, buy after coverage of 60% they can keep you sleeping in the night

[-]

BeansAndBelly@reddit

If the entire point is just locking down intent, then just let AI generate the tests. It assumes that what the code does is your intent.

But I think we can see how that’s wrong. At the least, you should tell the AI what you intend the result to be.

[-]

pyeri@reddit

That doesn't mean the tests are useless though. It's just like you can't test an Econ grad's knowledge and skills in every facet of life but only Economics - which is good enough given the context.

[-]

qualia-assurance@reddit

The way to think about testing is as a way to define expectations of how it should work in a way that immediately informs you when those expectations are broken.

That doesn't mean software that passes its tests is correct. Or even that software that fails tests is necessarily wrong. Its just a way of having feedback about when the expectations about your software have changed.

[-]

eewaaa@reddit

Good tests don't lock you in. It's guardrails that guide you in the right direction