Not once in 12 years have I found UI snapshot testing useful
Posted by SixFigs_BigDigs@reddit | ExperiencedDevs | View on Reddit | 129 comments
It's Cargo Cult behavior. Call me a terrible dev idc
fancy_panter@reddit
Do you mean visual image regression testing, where two .png files are compared and a diff generated? That is _incredibly_ useful. Humans are visual creatures, we can look at differences, even minor between two images, and figure out what is broken very quickly.
Now, snapshot files, where it's a diff of HTML, CSS, or data or whatever, those are almost always useless and amount to mere feel-good behavior. In my opinion, more harmful than useful because they provide a veneer of testing without actually testing specific scenarios. No human can look at a 300 or 3000 line diff and care what the difference is.
UltimateTrattles@reddit
You don’t read snapshot tests.
You go “why did this component change by several hundred lines when I thought I did something small”
They are alarm bells - that’s it.
padetn@reddit
Like any test it is as good as you write it. We make sure we cover complex states fully which is great regression testing for state to UI mapping and more palpable than just UI testing for the presence of a control.
andrei9669@reddit
I would beg to differ, I have caught many bugs purely based on taking a glance at html snapshots and seeing something either removed or added that was not supposed to be touched by the PR. but at the same time, if the change is intentional, then it's pure noise.
allknowinguser@reddit
Well the snapshot files being a large difference is the point. Some change you did broke the existing behavior/look. Hopefully not that many but it did its intended job.
Able_Resident_1291@reddit
- makes changes
- breaks snapshot tests
- updates snapshots with jest -u without looking at them
- commits
It's a great system.
Ecksters@reddit
Front-end snapshots tend to be very difficult to parse, since they end up in HTML while you're probably working in some nice component system that looks nothing like the final output.
However, I do like snapshots for API testing, because JSON output is really easy to read, and it basically sends up a red flag in PRs when some API response changes and you didn't expect it to.
14u2c@reddit
So a unit test?
Ecksters@reddit
I feel like the definition of "unit test" has been getting pretty seriously watered down in recent years. It was supposed to test a single unit of functionality.
If you're testing an entire API endpoint I do not consider that a unit test.
cantthinkofaname1029@reddit
The main issue is that a "unit of functionality" is pretty open to interpretation -- the waters feel pretty muddy when i try to define my larger unit tests
Wonderful-Habit-139@reddit
Uhhh... Use OpenAPI and stop this madness.
Ecksters@reddit
Unless I'm misunderstanding, OpenAPI can validate response shape, not the actual data.
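That's the crux: a schema check and a snapshot fail on different things. A self-contained sketch (a hand-rolled shape check stands in for OpenAPI/ajv-style validation here): the shape check passes as long as the types line up, while a snapshot also pins the actual values.

```javascript
// Toy shape validator: checks field types only, like a schema would.
function matchesShape(value, shape) {
  return Object.entries(shape).every(([key, type]) => typeof value[key] === type);
}

const stored = { id: 1, currency: "USD" };  // committed snapshot
const current = { id: 1, currency: "EUR" }; // response after a code change
const shape = { id: "number", currency: "string" };

const shapeOk = matchesShape(current, shape);                          // still true
const snapshotOk = JSON.stringify(current) === JSON.stringify(stored); // now false
```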
cd_to_homedir@reddit
An integration test that checks the JSON output for an API endpoint works just fine for this.
Ecksters@reddit
That is precisely what this is, but you use snapshot tooling to facilitate updates to the output, while also making those changes easily identifiable in PRs.
max123246@reddit
Why not a json schema in that case?
KatAsh_In@reddit
Did you forget a step known as "PR review", or are you working in a 3-person start-up?
mikkolukas@reddit
Did you know? By pair programming, you get better code quality and have already done the review while working 😌
moh_kohn@reddit
wigglywiggs@reddit
If you work in a team like this then there's no point in caring about any engineering practices anyway.
I mean seriously, if the author of the change doesn't care, and the reviewer doesn't care, and this happens consistently, your codebase can't be saved.
chmod777@reddit
Lgtm. Coauthored by claude? Ship it!
SearchAtlantis@reddit
I low-key roast anyone that puts lgtm on a PR review.
Teh_Original@reddit
Try being in a group where you get lambasted for worrying about edge cases. =(
hooahest@reddit
It really depends on the use case, and the amount of fallback if a bug does occur
pepejovi@reddit
Hardly the fault of the reviewer at that point if there are corner cases missed by both the ticket writer and the author of the code..
peripateticman2026@reddit
To be fair, that's hardly the point of most code reviews. That's on the PR author, not the reviewer.
Ok-Entertainer-1414@reddit
If more than a few lines in a generated file like a snapshot test changed, I'm just marking them as reviewed without reading them. Do you also expect people to review the changes to package-lock.json when you update your npm dependencies?
Polite_Jello_377@reddit
Now you just need to automate the snapshot updating and it will be perfect
wigglywiggs@reddit
It's pretty good if your committers are not asleep at the wheel.
Psycho_Syntax@reddit
That's still useful though, in case the snapshot breaks and you weren't expecting it to. I mean if in that case you just run them and commit the new snapshots, that's just a process issue.
bothunter@reddit
Half the time, you don't even need to do the first step.
lunacraz@reddit
i've actually caught issues when engineers submitted a snapshot file change that went from hundreds of lines to one - and that one line being null
i think if you want to make sure a very simple display component is tested without some inane "make sure this class renders on page" test, a snapshot test is nice to ensure any updates don't cause anything drastic
it's a nice review tool, and a very early sign if the snapshot is fucked up something went wrong
SixFigs_BigDigs@reddit (OP)
Updated Resume
VoxTM@reddit
We do visual regression testing and it works well.
siberian_huskies@reddit
What tool do you use?
VoxTM@reddit
Storybook + Playwright. Works very well because once you write your stories (which we find useful for development also) you get your visual tests for free.
Has the cost of having to write the initial setup and stories themselves.
Spiritual-Theory@reddit
Just did a huge refactor. They were helpful for a while.
soylentgraham@reddit
UI, no. libraries that generate graphics, or export files, yeah.
Who the heck does this for UIs? I've never seen a UI in any project sit still for more than a week (and that includes weeny embedded displays)
IndependentOpinion44@reddit
It’s an easy way to get the coverage up to meet the arbitrary 80% coverage requirements a lot of orgs have.
PersianMG@reddit
I've been a fan of visual regression testing to be honest. On most changes, it auto passes but occasionally it flags something, take a look at screenshot and a button is missing or something is out of place. Personally its caught many bugs for me before I've shipped front end code.
Routine_Internal_771@reddit
Why? [genuine question, I'll be looking into giving it a shot]
SixFigs_BigDigs@reddit (OP)
The return on investment for your entire dev team to maintain and "pay attention to the snapshots" (they won't) is terrible. You can catch these errors in other less brittle ways. If you're suggesting it, you just need a directive for promo, or you don't actually account for daily operations with a bunch of humans.
Matthew_Code@reddit
Are you aware that you can automatically check the diff for each snapshot and just have to take a look when the diff is > 5% for eg?
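Playwright's screenshot assertions support exactly this kind of thresholding. A config sketch, assuming @playwright/test (the numbers are illustrative; tune them to your app):

```javascript
// playwright.config.js - only diffs beyond the thresholds fail the build.
const { defineConfig } = require("@playwright/test");

module.exports = defineConfig({
  expect: {
    toHaveScreenshot: {
      maxDiffPixelRatio: 0.05, // ignore diffs touching under 5% of pixels
      threshold: 0.2,          // per-pixel color tolerance (helps with anti-aliasing)
    },
  },
});
```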
SixFigs_BigDigs@reddit (OP)
Yeah. I don't see that occurring outside of major fuckups in PRs or other issues on the main branch though personally
corny_horse@reddit
Seems like catching "major fuckups" might be reasonably high value.
seventeenninetytoo@reddit
Well yeah, that's the whole point. To catch major fuckups.
KitchenDir3ctor@reddit
A quick visit after release/deploy on a test env is probably less impactful than dealing with visual testing as a whole.
SixFigs_BigDigs@reddit (OP)
One look at the pr itself shows that tho.. juice still ain’t worth the squeeze
KitchenDir3ctor@reddit
What risk does it cover?
Matthew_Code@reddit
Risk of fucking up the view?
SixFigs_BigDigs@reddit (OP)
Developer QA
Another dev QA during review
Acceptance QA
Post Deploy QA
NewRelic for Monitoring other code issues that probably caused it
killersquirel11@reddit
Damn you get all that? Best I can get is a lgtm stamp after two seconds of code review
Business-Row-478@reddit
Automation testing reduces the need for QA. 4 steps of QA including another dev doing QA during code review is pretty crazy
SixFigs_BigDigs@reddit (OP)
Reduces, not eliminates. Depend on automated testing if you want to.. 😬
HenryJonesJunior@reddit
With good tests, you do rely on the automated tests.
Someone made a change that isn't protected by an experiment? Screenshot test caught the new thing in the UI before it was submitted.
Someone messed up an authentication check for one specific type of customer? Integration tests catch it because the developer didn't think to manually test that scenario.
Proper automation is both faster and more reliable than human testing.
ham_plane@reddit
What is the "QA" you speak of? 🤔
ImportantSignal2098@reddit
Take it
Routine_Internal_771@reddit
When a change is made to a component or a global theme, we can see the full impact
It automates the generation of evidence of what a new feature/modification looks like
Regressions from dependency updates are caught
Testing can occur for various screen sizes/themes/device configurations
jesseschalken@reddit
Do a massive refactor that isn't supposed to change behaviour and find your snapshot tests still pass. Then they're a godsend!
padetn@reddit
Bonkers take.
Sensitive-Ear-3896@reddit
What is UI snapshot testing?
kbielefe@reddit
TIL. Apparently you take a screenshot of the UI and compare it with a previous screenshot. If it differs too much, the test fails.
Sensitive-Ear-3896@reddit
Hmmm yeah I have to concur, I cannot possibly find this being useful, at least not any more useful than running cURL and looking for the word "exception"||"error"
Deranged40@reddit
I had a hard time imagining it before I got to use it first-hand, too, for what it's worth.
But when a test fails and it shows you two screenshots, and you see that there's a button on the entirely wrong side of the page, that's a useful and actionable failure.
Sensitive-Ear-3896@reddit
When I develop stuff I usually take the time to make sure it renders correctly, but ydy
techie2200@reddit
When modifying a design system component, it's handy to have realistic use-case tests or automated tests against the systems that use the design system to make sure nobody hacked something to work a certain way and you just blew it up.
Sensitive-Ear-3896@reddit
So now we’ve moved the goalposts to testing Gus, ok. Or you can read what I said. Up to you I guess
Deranged40@reddit
Well if you ever have to work on a team with others, maybe you'll understand then.
Ecksters@reddit
Most likely OP is actually talking about taking snapshot of the underlying HTML/CSS of the page, rather than comparing screenshots, but it's hard to be certain.
techie2200@reddit
Some tools do both (snapshot the html/css, then render it out for you with visual diff highlighting). Very handy.
aboothe726@reddit
UI Snapshot Testing is an automated testing approach where you screenshot your frontend application's UI screens during build and compare the screenshots to "known good" screenshots. If the new build's screenshots are "too different" from the known good screenshots, then the test fails. There's some art to defining "too different" so that it catches the right changes and doesn't trip on noise.
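The core of "too different" is usually a pixel-diff ratio. An illustrative sketch (real tools like pixelmatch or Playwright's comparator add anti-aliasing detection and perceptual color distance on top of this):

```javascript
// Compare two same-size RGBA buffers and report the fraction of pixels
// that differ beyond a per-channel tolerance.
function diffRatio(a, b, channelTolerance = 0) {
  if (a.length !== b.length) throw new Error("screenshot size mismatch");
  const pixels = a.length / 4; // RGBA: 4 bytes per pixel
  let differing = 0;
  for (let p = 0; p < pixels; p++) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[p * 4 + c] - b[p * 4 + c]) > channelTolerance) {
        differing++;
        break; // count each pixel at most once
      }
    }
  }
  return differing / pixels;
}
```

A build would then fail only when the ratio exceeds an agreed threshold, e.g. 0.05.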
recycled_ideas@reddit
The use case for a UI snapshot test is to tell you something changed that you didn't expect to change.
That's it. You take snapshots of significant parts of your app, not of individual components and then if you get a snapshot error someplace you didn't expect you look into it.
In short if you make a change that was only supposed to change a bit of text on your about page and all of a sudden snapshot tests are failing on your homescreen you have a problem.
jl2352@reddit
If you mean a visual regression; I’ve had times it’s been useful. Once on a Wordpress site we had a page with all of the components used on it at least once. That was invaluable.
For HTML diffs; I’ve found them worse than useless. I discovered at one place that about a quarter of the snapshots had bugs in them.
kondorb@reddit
It can be useful as a form of regression testing covering the parts of the codebase that aren’t being actively worked on.
Just gotta be mindful of how you’re applying it.
But I agree that it is a niche use case thing, I’d rather use normal assertions over the generated HTML or some other more abstract mechanism your framework provides.
DreamingOfLight@reddit
If you mean the unit tests that save the html into a snapshot file, then definitely agreed.
Visual snapshot testing is great though. We used to have a setup where every component would be visually compared in Storybook on every PR. The developer then had to approve or reject (and fix) the visual differences. It was a really great system that caught unintentional UI changes.
Ysilla@reddit
Meh, idk, we've used ui snapshot testing for around 7 years now on mobile, and over that time, I've had ONE single positive result that actually pointed out a pretty minor mistake (so minor even our designers barely cared about it).
But now if I had to sum up the time lost looking at false positives and regenerating all the screenshots when changes were expected... I'd probably be looking at multiple weeks. And it also makes every PR significantly slower on top of that. And also caused a few internal incidents related to the size of data they generate.
They have a place on very specific small components I think (like design library level), but for larger stuff they get into net negative territory very quickly imho.
KatAsh_In@reddit
I have found it useful. Multiple times. As with any tests, there is some effort required in maintaining and ensuring they are executed regularly.
Most of the time, what I have seen is people implement snapshot testing but don't value it. It is not hooked into CI and pretty soon people forget about it.
Can you elaborate more or give examples on why you feel they are useless?
GoodishCoder@reddit
In my experience, most people end up updating snapshots without looking at them because broken tests are expected when you make changes.
I think the concept is solid but the human behavior that often starts forming as a result devalues them.
Curious_Start_2546@reddit
If you have things set up correctly, the reviewer will see the before/after pictures in the MR. So anything wildly broken is quite easy to catch
GoodishCoder@reddit
Most places aren't set up correctly and honestly most reviewers aren't being thorough. Imo unit tests are far more effective because the baseline expectation is a passing test so when something fails it generally causes devs to pause and think. If the baseline expectation is for something to fail it ends up getting ignored.
redditnotmereddit@reddit
Team is convinced to use it as E2E on potentially dynamic content. You have ensured content is static, so it should be good, right?
But then you still face anti aliasing issue. Or device specific browser render. Then you spend time to put it in Docker.
Afterwards a PR with a docs change compares at a 5% diff on random components. It's mostly the anti-aliasing issue still. You decrease sensitivity by increasing the diff threshold.
Now your tests passed an incorrect styling due to sensitivity issue. And you have spent 1-2 weeks fine tuning the snapshot process.
KatAsh_In@reddit
There, that first line. Team is convinced to use it as E2E... No, snapshot tests are not to be used for/as e2e tests. They are a tool for component testing. If you have folks in your team that think taking a snapshot of a complex page and comparing it again and again is a good idea, you need to revisit whether they are actually bringing value or just implementing stuff because internet said so.
redditnotmereddit@reddit
Tell that to Playwright. You can use it on components locally or on a whole web page. The idea is that it should be (mostly) static pages/components.
In my mind either an internal UI framework make sense to test, or sanity check on e.g. a localized login page (but then you don't know functionality)
Curious_Start_2546@reddit
Depends on the size of the organisation. If you are making a change in a shared/common components library, UI tests give you more confidence your changes haven’t broken another teams UI
techie2200@reddit
Having snapshots being checked on a prod-like env (in a pipeline) has come in handy quite a few times. Especially when it came to weird issues with local env vs prod (like a misconfig) which caused the entire page to fail to load.
Everything should be automated, with manual intervention if diffs are found. Our team's was configured too strictly (we often had to review 1px diffs because of the rendering engine in the tests, which meant nothing actually changed, but the tools we used highlighted the diffs and made it clear so it only took 10-15 seconds for a full review). I'd say at least 3-4 times a year something would come up that'd save us thousands of dollars in downtime, not to mention the rep hit to our customers.
zarkwonz@reddit
Snapshots with git lfs, PR updated snapshots along with your code.
We have found these invaluable for detecting regressions in contract rendering.
iloverollerblading@reddit
Useless af
royboypoly@reddit
Hard disagree
UI snapshot tests have detected SEVs before they happen in my experience
How about a critical revenue driving CTA button being pushed out of view?
This take is kinda crazy
AndyKJMehta@reddit
It’s simply a “are you sure?” check
kobbled@reddit
snapshot testing was the greatest back when Enzyme was still a thing. unit tests were so much more useful than the garbage that RTL gives you
BenZed@reddit
Why'd you keep doing it for 12 years then?
dead-first@reddit
I mean ok, but who actually writes code or tests anymore? Just have AI do it... Who cares about how effective you think it is; it doesn't hurt or take time now
Foreign_Addition2844@reddit
Testing? Thats what the customers are for.
intertubeluber@reddit
I found it very useful for a project that generated a script in a proprietary scripting language. We’d use snapshot testing to catch regressions in the logic that would spit out the script.
Pretty novel use case I suppose but really valuable.
daraeje7@reddit
Usually i understand how something is not useful, but i genuinely feel like visual tests are always useful. However i use it with storybook and chromatic, so it’s easy to use
Idea-Aggressive@reddit
Do you mean visual regression tests?
If you call it “UI snapshot” of course you find it useless, you don’t understand it.
Sottti@reddit
I do, and they have saved us multiple times. Either you don't look at them, or the people reviewing the PRs don't look at them, or both.
throwaway_0x90@reddit
It depends,
Some design teams put in a lot of effort into designing the drop shadows or rounded corners with little hi-res rose petals and it's super important that those elements are always lining up.
2old2cube@reddit
I found it useful daily (iOS). Doing refactoring it is very useful tool.
gluhmm@reddit
Screenshot testing is very useful. HTML is absolutely useless. That's why I migrated my component tests to Storybook.
normalmighty@reddit
I find them useful for PR reviews to see how the snapshots have changed, and some QA testers have liked seeing them as a pointer as to what changes in the UI have occurred. Never found much use beyond that.
EkoChamberKryptonite@reddit
Ahh finally some deliberations on the craft of software development and not another AI glaze or AI doom post. Very refreshing this.
30thnight@reddit
Genuinely worthless when used as a replacement for E2E testing
pseudo_babbler@reddit
I agree they're useless, particularly because you also need other tests that actually test what you care about in the element structure, so the snapshot tests are just redundant, time-wasting complexity that everyone ignores and regenerates.
Personally for web I prefer it if I can do only black box playwright tests. You can set up to run them really fast now, as long as you can run everything locally or mocked. You can check loads of things in your page, plus it tests if they're actually visible and can be interacted with, unlike a snapshot or jsdom test. It's a tough sell trying to convince a team not to unit test react dom output, but I would only unit test state and business logic if I had free rein.
dnunn12@reddit
Found the guy that doesn’t vibe code.
codescapes@reddit
Broadly I agree - we do not use them on my app - but they have some utility. If you're trying to modernise a legacy UI that has overly coupled code / styles and you're scared of breaking things they act as a smoke detector.
But for an app that's under heavy development they're basically a pointless annoyance. My favourite combo right now is Vitest + Playwright. Since we're an internal app I can just run against a modern Chromium build (it's the 'official' browser everything is built for) and it's glorious.
We've got a fully mocked out environment to execute it against so it means we can properly test UI flows independently of network / API instability. I can also obviously just run that mock setup locally too for development purposes.
Honestly UI development is in such a better place than 10 years ago. People can complain about React all they like but man the ecosystem in general is vastly improved.
redditnotmereddit@reddit
Playwright is good in your case when you have everything mocked. Otherwise it's hard since it's E2E with playwright. I think many in the thread think of jest, which won't test browser render.
In many cases you have dynamic content within components, and you have to guarantee that localhost/stage can be deterministic
dbxp@reddit
I haven't found them useful in my work but I can see how they can be useful in a platform engineering role. If say you work for Shopify, you want to roll out an update which could affect thousands of customised sites; snapshots let you regression test with those customisations.
brainhack3r@reddit
The problem I've had is that:
- it's difficult to setup and usually requires changes to the build
- most other devs don't buy into it. If they don't then I just feel like a jerk pushing everyone to use something they don't want to use.
- there IS some value in it but doesn't outweigh the above.
I can see you doing it if you were someone like Uber and you had an app with thousands of screens and workflows that you wanted stable.
But even then you could do canary deploys to see if people report errors.
mRWafflesFTW@reddit
I just implemented UI snapshot testing with playwright and I'm vibing through the roof.
StTheo@reddit
I used it a week ago. A guy set up our Tailwind project with font-size: 62.5%, then added hundreds of lines of CSS overrides to get the Tailwind-based component library’s components back to the right size. I wanted to revert all that.
We had a ton of component tests that used Playwright, so we used them to spin up temporary snapshot tests, then flag the ones that were flaky and ignore those. Those snapshot tests verified the overall migration, but I deleted them before merging.
I wouldn’t rely on them long-term, but it was nice to have that brute force verification tool there during a large migration.
MoreRespectForQA@reddit
It's technically very tricky to get right. You have to isolate all sources of nondeterminism from the test. This means everything from:
- Pinning the exact version of every piece of software in the testing stack - especially the browser.
- Isolating parts of the snapshot which vary.
- Being disciplined about fixing "bugs" which users might not actually care about but which make the app behave nondeterministically.
and tons more.
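Isolating the parts that vary usually means normalizing the captured output before comparing. A sketch (the patterns below are examples, not a complete list):

```javascript
// Scrub known sources of nondeterminism out of a captured snapshot before
// comparing, so timestamps and generated ids don't cause noise.
function normalizeSnapshot(html) {
  return html
    .replace(/\b\d{4}-\d{2}-\d{2}T[\d:.]+Z?\b/g, "<TIMESTAMP>") // ISO dates
    .replace(/\bid="[a-z]+-\d+"/g, 'id="<GENERATED-ID>"');      // generated ids
}
```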
Despite being a lot of dev work (never seen a QA team which could handle it), it pays off handsomely when you nail it.
redditnotmereddit@reddit
Good points. And when you consider that it's mainly an E2E test (like Playwright's) you run into a lot of problems. It should be component tests
NatoBoram@reddit
Oh hey I've only used that once. I thought it was cool, but I couldn't make good use of it. It sounds very interesting, though.
Instead, one thing I liked was Storybook.
bigAssFkingRoooobots@reddit
Hard disagree: we have a huge product with 20 years of different features built on top of each other with different (mostly deprecated) frameworks. Because of this it happens often that changing a small thing breaks something in a different part of the product.
We have monthly releases and during the release process each engineering team has to review the UI snapshot changes with the previous release. The process is automatic where we only see the snapshots that changed and we usually have 100/200 images to go through and it saved us so many times.
Note that we are profitable and doing OKish but no plan on rewriting any existing logic, even AI is clueless when working on it
driftking428@reddit
I don't love them but they've helped me.
I created a component in a code base. Came back months later and tweaked it. I had not realized how widely shipped my component had become. My update broke a lot of snapshots.
iamgrzegorz@reddit
The fact that something is not useful for you doesn't make it a cargo cult behavior.
abandonplanetearth@reddit
Then you aren't working in software where it matters
SixFigs_BigDigs@reddit (OP)
That could certainly be true
Nezrann@reddit
SDET here, visual regression is incredibly important, especially if you have anything feeding data to populate basically any UI component.
allknowinguser@reddit
In a unit test? It’s pretty nice in large code bases that depend on multiple internal libraries. Any change you add can be compared to existing and make sure it was meant to be. All the other items you mentioned in another comment is POST release to test or prod environment.
zninjamonkey@reddit
I am finding it pretty useful right now. It’s an internal business tool and they hate if something changes.
redditnotmereddit@reddit
We have snapshot on components with dynamic content as E2E. It's horrible. Flaky and unreliable. Diff on 5% of pixels could be compression artifacts (OK) or incorrect styling (not OK).
I would love to explore with AI though. Either to compare snapshots (to remove diffs from anti-aliasing/compression artifacts) or to run against real prod and have it analyze how content has changed. If a product information page looks roughly the same but has new product information, then it could be powerful.
WiseHalmon@reddit
I don't know why we don't have full video recordings of playwright tests in GitHub copilot yet
LonelyProgrammerGuy@reddit
Github Recall lol
SixFigs_BigDigs@reddit (OP)
that's next 💀
LeoPelozo@reddit
It's useful when you share ui elements between different teams. If someone somewhere changes something on that ui element they don't have to check 75 screens to see if they fucked up something, they just run the tests. This is mobile btw, no idea about web.
pacman326@reddit
It's incredible when say you are swapping out design systems with a lot of themeing and want to easily validate you havent made significant regressions (or understand whether the expected changes worked)
PricedOut4Ever@reddit
The one time I found it successful was a very specific use case.
We had a product forked off of an open source platform. We were constantly pulling in changes from the upstream open source project as well as managing our own changes. Sometimes, the upstream changes would randomly include changes that were hard to predict and would break or conflict with our changes. So, it wasn’t really to validate our changes, but just the changes from upstream.
Was still a massive PITA and one of the least enjoyable jobs I’ve ever had.
Budget-Length2666@reddit
approve. browsers are flaky beasts.