Is it just me or are nvme drives less durable?
Posted by loozerr@reddit | hardware | 41 comments
I've had pretty miserable luck with my NVMe drives compared to SATA SSDs and HDDs - like half of my drives have given up in relatively light desktop use, where they just grind to a halt with extremely long response times and low throughput. Some have also been temperature sensitive, as in they won't register as bootable when cold or they start acting up when warm.
This has happened with and without heatsinks and in various devices, and all have been reputable brands like Intel and Samsung.
Does anyone share this sentiment?
GenZia@reddit
Depends on the controller more than anything, I imagine.
There are Gen 4 drives with 28nm controllers (likely fabbed at third-class foundries) that get so hot after a few hours of basic internet browsing, you can’t even touch the controller.
And I’ve also seen Gen 4 drives on 14–16nm FinFET that barely feel warm after a long gaming session.
wintrmt3@reddit
That's a weird comparison, web browsers read and write the disk pretty much continuously, while games just do short bursts when loading a new chunk of the map or writing a save.
ExeusV@reddit
Why?
wintrmt3@reddit
Every piece of content that doesn't prohibit it gets written to cache, all cookies and local storage get durably written, things like that. So yes, if you leave it alone and only sites with well-behaved JS are loaded, it doesn't, but in a realistic browsing session there is a lot of small I/O.
nickidk4@reddit
I have always used Samsung SSDs and never experienced any problems.
What SSD brands did you experience failed on you?
loozerr@reddit (OP)
Samsung, Intel, Hynix
Z3r0sama2017@reddit
Imo it depends on the manufacturer. I've only ever had two storage drives die out of about 30 over the years. One was a piece of shit ocz vertex something that shat itself after two years, the other was the crippled adata xpg sx8200pro that was downgraded after fantastic reviews.
total_cynic@reddit
Are you relatively short of RAM? Lots of paging will be tough on an SSD, but you won't feel the slowdown the way you would if the pagefile were on slower storage.
loozerr@reddit (OP)
I used to be, which might have something to do with it indeed.
RedTuesdayMusic@reddit
Across a WD SN850 4TB, SN750 1TB, HP NVME of uncertain make in my Omnibook Ultra, Samsung 990 Pro 4TB, 990 Pro 2TB and an SN9100 8TB (only had this one for a few weeks) I've so far had no problems anecdotally.
The SN750 is also installed in banana mode due to missing standoff on a PCIe addin card and has by far the most power cycles and power on hours
jenny_905@reddit
I had my first failure of one this year, it just became flaky and unreliable - it was a Hynix 1TB PC611 (OEM) from 2020. Similar issues to what you report: it's not entirely dead but started to give I/O errors, occasionally not appearing as a bootable device, etc. I did wonder if that recent Windows-killing-SSDs thing was related but who knows.
Aside from that no issue but the fastest drive I own is a PCIe 4.0 so I'm definitely not on the bleeding edge.
loozerr@reddit (OP)
My issues have been os agnostic though, maybe I've just underestimated how fragile QLC really is. But Linux likely hammers the drive a lot less, though baloo has sometimes been pretty intense.
Nicholas-Steel@reddit
QLC is really shit, like really shit compared to TLC, MLC (which usually means 2 bits per cell, though some vendors also use it for 3) and especially SLC.
If you have an Nvidia card with Geforce Experience (or Nvidia App) installed, turn off the Instant Replay setting in the Overlay to stop it constantly writing to the storage drive. AMD's Instant Replay setting instead writes to RAM and only writes to storage when you want to save.
kommz13@reddit
do SLC/MLC nvmes exist?
BatteryPoweredFriend@reddit
New ones that are explicitly SLC or MLC don't exist anymore. The closest you can get to those now are enterprise drives with programmable SLC caches.
Xurbax@reddit
You can buy used MLC Enterprise drives. Many of them are SAS drives - U.2 ones are a bit less common. I have a 3.2TB Micron one which works great, except for some reason some motherboards won't boot off of it.
SLC (Enterprise) ones exist, but are exceedingly rare, and mostly weird (like PCIe that need special drivers). Easier to get an Optane drive instead.
kommz13@reddit
kinda wished i got an optane...or that the tech didnt die out....
glitchvid@reddit
Basically no, not anymore.
The last good MLC series was the Samsung 970 Pro line, after that it's all TLC.
Nicholas-Steel@reddit
I'm not sure, I have a feeling they don't outside of enterprise tier drives.
loozerr@reddit (OP)
I've been using obs and gpu-screen-recorder for that, both utilising ram.
Nicholas-Steel@reddit
Two of the big downsides of QLC SSDs are that performance drops off a cliff whenever the cache is exhausted and that write durability is abysmal.
jenny_905@reddit
Hmm yeah this was a TLC drive, I've got a few QLC/SLC's sitting around but I've never used them really, just drives I've pulled from laptops and upgraded.
theevilsharpie@reddit
I've had two (I think) enterprise SATA SSDs die in servers that I maintained, but they were pretty heavily loaded with major write workloads. I can't speak for enterprise NVMe drives, as my professional life is pretty much all cloud-based these days, and details of the underlying hardware are abstracted.
I've never had a personal SSD die. If anything, they seem to be more durable than the mechanical disks of old. I once slipped on some stairs and landed square on my backpack that had my laptop in it. The laptop chassis was severely damaged, but the SSD was fine.
If you're finding that NVMe disks in particular are dying more frequently than SATA disks, then there are a few things that I'd check:
Make sure it's seated firmly in the M.2 connector, and that it's screwed down snugly (but not over-tightened). An over-tightened NVMe SSD will bend, and over time, work its way out of the slot.
If you've got a high performance NVMe SSD that needs a heat sink, make sure that there's actually airflow moving the heated air away from the heatsink. Modern tower air coolers or water coolers can starve various parts of the motherboard of cooling without adequate case cooling, especially since today's NVMe drives are often obstructed by a massive GPU.
Make sure your SSDs have the latest available firmware from the drive vendor. Many firmware updates are intended to address reliability issues (or other issues that can cause the drive to appear dead).
Also, modern consumer drives with TLC (and particularly QLC) NAND are going to have poor write durability. This shouldn't be a big deal for consumers, because consumer use cases (playing games, watching movies, browsing the web) are mainly going to involve reading data, rather than writing it.
However, swap files (or page files, as they're called in Windows) are very frequently written to in a way that results in significant write amplification on the NAND cells, and hibernation files can result in many gigabytes of data being written to disk every time your computer goes to sleep. If you're concerned about SSD durability, disabling the swap file and hibernation support will go a long way toward reducing the number of writes being made to your drive.
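To put the hibernation point in rough numbers, here is a minimal back-of-the-envelope sketch. Every figure in it is an assumption for illustration (16 GB written per hibernate, two hibernates a day, a hypothetical 600 TBW endurance rating), not a measurement from any drive in this thread, and it ignores write amplification entirely.

```python
# Rough estimate only. All inputs below are illustrative assumptions.

def yearly_writes_tb(gb_per_event: float, events_per_day: float) -> float:
    """Host writes per year in TB (decimal), ignoring write amplification."""
    return gb_per_event * events_per_day * 365 / 1000


def share_of_rated_endurance(yearly_tb: float, rated_tbw: float) -> float:
    """Fraction of a drive's rated TBW consumed per year by those writes."""
    return yearly_tb / rated_tbw


if __name__ == "__main__":
    hibernate_tb = yearly_writes_tb(gb_per_event=16, events_per_day=2)
    fraction = share_of_rated_endurance(hibernate_tb, rated_tbw=600)
    print(f"Hibernation writes per year: {hibernate_tb:.2f} TB")
    print(f"Share of a hypothetical 600 TBW rating: {fraction:.1%}")
```

Plug in your own RAM size, usage pattern and the TBW figure from your drive's spec sheet. Swap traffic is much harder to estimate this way, since it depends on memory pressure.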
kneeonball@reddit
I’ve never had an issue but I’ve only had Samsung nvme drives.
hollow_bridge@reddit
Definitely less durable, you can check their health and see it goes faster. Older SATA drives frequently used SLC or MLC, and newer drives use TLC or QLC, which are less durable. If you want to extend durability, make sure to turn off virtual memory (pagefile), as that writes a lot and you probably don't need it as long as you have 16GB+ RAM.
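For anyone who wants to check drive health as described above, here is a small sketch of reading the wear figures from smartmontools JSON output (for example from `smartctl -a -j /dev/nvme0`). The sample values are made up, and the field names are an assumption based on recent smartmontools releases, so check them against your own output. The conversion assumes the NVMe convention that one data unit is 1000 blocks of 512 bytes.

```python
import json

# Hypothetical sample, shaped like the NVMe health section of smartctl's JSON output.
SAMPLE_REPORT = """
{"nvme_smart_health_information_log":
    {"percentage_used": 2, "data_units_written": 78125000}}
"""

# Assumption: NVMe reports data units in thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 512 * 1000


def summarize(report_json: str) -> dict:
    """Extract the wear percentage and total host writes (TB) from a report."""
    log = json.loads(report_json)["nvme_smart_health_information_log"]
    tb_written = log["data_units_written"] * BYTES_PER_DATA_UNIT / 1e12
    return {"percent_used": log["percentage_used"], "tb_written": tb_written}


if __name__ == "__main__":
    print(summarize(SAMPLE_REPORT))
```

SATA drives expose wear through vendor-specific attributes instead, so this sketch only applies to NVMe.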
waitmarks@reddit
Are you buying QLC or TLC drives? QLC drives are inherently not going to be as performant or last as long as TLC drives. This is because in QLC there are 4 bits stored per NAND cell. If you need to change 1 bit, all 4 bits have to be erased and then rewritten. This is slow and creates extra wear on the flash. By comparison, TLC stores 3 bits per NAND cell. If you are just buying the cheapest SSD available, even from "reputable" brands, you are most likely getting QLC drives.
In my opinion, QLC is really only good for a secondary drive that is just extra storage, and your main drive should be TLC.
JuanElMinero@reddit
That's not the mechanism making QLC less durable and performant than TLC. Both of them store data by setting a single voltage for each cell of roughly the same range. TLC divides that range into 8 different levels, while QLC has 16.
It's less performant, because tuning a cell to hit 1 of 16 finely grained voltage levels and verifying that it's set correctly takes longer.
It's less durable, because equal wear will make a QLC cell not able to differentiate between 16 voltage levels earlier. They don't experience a lot more degradation, they just do something that's more difficult for a capacity tradeoff.
In theory, it also means that there's a higher risk of bit rot for QLC when the drives are powered off for a long time, assuming equal amounts of current leakage.
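The level counts above follow from 2 to the power of bits per cell. A minimal sketch of how the spacing between levels shrinks, assuming (purely for illustration) a normalized voltage window of 1.0 with evenly spaced levels; real NAND does not space its levels this neatly:

```python
# Illustration only: the 1.0 voltage window and even spacing are assumptions.

CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def levels(bits_per_cell: int) -> int:
    """Number of distinct voltage levels needed to store this many bits."""
    return 2 ** bits_per_cell


def level_spacing(bits_per_cell: int, window: float = 1.0) -> float:
    """Gap between adjacent levels: n levels across the window leave n - 1 gaps."""
    return window / (levels(bits_per_cell) - 1)


if __name__ == "__main__":
    for name, bits in CELL_TYPES.items():
        print(f"{name}: {levels(bits):2d} levels, spacing {level_spacing(bits):.3f}")
```

Under these assumptions the gap between neighbouring QLC levels is less than half the TLC gap, which is why the same amount of wear or charge leakage causes read errors sooner.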
f3n2x@reddit
Yeah, it's actually much worse than people realize. 4 bits per cell is 4x the capacity but not 4x the writes, it's much much more. The controller has to absolutely HAMMER that QLC cell to tune in the precise voltage. The endurance of individual cells goes down by at least an order of magnitude with each additional bit, while the capacity gain from TLC to QLC is already only +33%, compared to +100% from SLC to MLC.
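The capacity percentages quoted above are just the ratio of bits per cell. A quick sketch of that arithmetic (it only covers the capacity side and says nothing about the endurance claim):

```python
def capacity_gain_percent(bits_before: int, bits_after: int) -> float:
    """Extra capacity per cell, in percent, when going to more bits per cell."""
    return (bits_after - bits_before) / bits_before * 100


if __name__ == "__main__":
    steps = [("SLC", 1, "MLC", 2), ("MLC", 2, "TLC", 3), ("TLC", 3, "QLC", 4)]
    for old, old_bits, new, new_bits in steps:
        gain = capacity_gain_percent(old_bits, new_bits)
        print(f"{old} -> {new}: +{gain:.0f}% capacity")
```

Each extra bit buys proportionally less capacity than the one before it.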
loozerr@reddit (OP)
Both. Most failures were QLC though.
michaelsoft__binbows@reddit
I have an SN850 I took the heatsink off of, and it was on there real tight and it was my first time so that drive got BENT during that process. it did not come online for a few days but has been solid since then.
However I have had a 4tb inland performance plus drive straight up just die and take my savegames with it. Microcenter actually refunded my original purchase price with no questions asked and i picked up the same product as a replacement for half the price. In retrospect it would have been smarter to spend a little more on a samsung or hynix manufactured product but knock on wood it's still going strong and has a fresh 5 year warranty.
Sisaroth@reddit
I feel like it's impossible to say as an individual consumer. In my whole life and family, I have had one drive fail. It was an HDD that used to be in a laptop and it was being used as an external hard drive without any protective shell.
With these kinds of sample sizes you just can't draw conclusions.
loozerr@reddit (OP)
It is yeah, was wondering if others had similar feelings that ssds don't really last anymore, but looks like it's just rotten luck.
Cubanitto@reddit
I have 2 laptops and 3 desktops, and I have at least 2 NVMe SSDs in each. I've had no problems, but I tend to buy only Samsung drives.
rancor1223@reddit
In my whole life, well, in the 20 years I've been around and owned a PC, I've had 1 HDD fail. Out of, well, like 10? I currently run the cheapest, shittiest SSDs in my NAS and my desktop is SSD only as well (there are some midrange TLC Kingston drives). No issues so far, fingers crossed.
I think you are just very unlucky.
SJGucky@reddit
No failures so far. I've been using SSDs for several years.
My most used drive, a Samsung 960 512GB PCIe 3.0 NVMe, was my boot drive, and it lost about 1-1.2% per year of the "life expectancy" according to SMART. But it was a PRO model with a higher TBW.
Windows writes the most...
All my other NVMe or SATA SSDs (860 EVO/870 QVO) were purchased later and each has still only lost ~2% with 30-50TB written to them.
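As a rough sanity check on figures like these, you can extrapolate what total writes the wear indicator implies. This sketch assumes the SMART wear percentage scales linearly with data written, which is a simplification and may not match how a given vendor computes it:

```python
def implied_total_writes_tb(tb_written: float, percent_used: float) -> float:
    """Projected TB written at 100% wear, assuming wear grows linearly with writes."""
    if percent_used <= 0:
        raise ValueError("percent_used must be positive to extrapolate")
    return tb_written / (percent_used / 100)


if __name__ == "__main__":
    # The 30-50TB at roughly 2% wear reported above
    for tb in (30, 50):
        projected = implied_total_writes_tb(tb, 2)
        print(f"{tb} TB at 2% wear -> about {projected:.0f} TB at 100%")
```

Treat the result as a ballpark figure. Low wear percentages are usually reported in whole numbers, so the rounding error at 2% is large.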
zaxanrazor@reddit
A lot of manufacturers cheap out on the controllers, even at the high end, and this is the result.
SignalButterscotch73@reddit
The only solid state storage I've ever had fail were USB thumb drives and SD cards.
3x crucial SATA drives and 2x Samsung NVMe (pcie3) drives still going strong with the newest being one of the SATA drives from 2021.
I've had HDDs fail, but only a couple of older 6TB drives from the mid to late 2010s and an external 40GB drive from the mid 2000s that I dropped one time too many.
Drives of any flavour failing significantly before the manufacturer's estimated max writes, while rare, can happen, and if you've had more than one failure from the same brand then it could be a dodgy batch of some component that slipped through QC.
forgottenendeavours@reddit
I think you've just been unlucky.
The best info resource I'm aware of that's publicly accessible and which seems reasonably consistent and revealing is Amazon product reviews, where you can generally take a 1-star score to be representative of some sort of failure. Going by that, 1-star rates tend to be far higher for mainstream consumer mechanical drives than for consumer NVMe drives - NVMe drives tend to get around 2%, while SATA and USB mechanical drives can be anywhere from 6 to 12%. Things do improve substantially for enterprise-class mechanical drives, which tend to be equal to or better than premium consumer NVMe drives, with 1-star rates of 1 to 3%.
Obviously, this is all to be taken with a pinch of salt. There are no hard numbers here, just reasonably consistent trends.
loozerr@reddit (OP)
Yeah I figured that too, but I'm getting pretty unlucky so I figured I'd ask for anecdotes. I don't think Amazon is that relevant here, though, since mechanical drives tend to fail when dropped and I'm mainly talking about failures outside the warranty period. Like most drives I've melted have been 3-5 years old, except for one which had the cold boot issues pretty much from the beginning.
ILikeFlyingMachines@reddit
Never had any NVME fail, and only ever 1 SSD which is an ancient 64GB SSD
loozerr@reddit (OP)
Other than the five or so nvme drives which have "soft failed" I've just had some external and laptop hdds give up due to... external factors. I've even got a 20GB SATA SSD which works perfectly fine.