Are HDDs sequential access?
Posted by leavetake@reddit | hardware | View on Reddit | 11 comments
Are HDDs sequential access or random? What about virtual memory/paging? With sequential access it would be a mess... thank you for explaining this to a noob
oldprecision@reddit
HDDs are random access. An example of sequential access is tape.
malastare-@reddit
It's not quite that simple.
HDDs are spinning media: there's no guarantee that the sector that needs to be read is below the read head at the moment it's needed. Additionally, from a mechanical perspective, the drive actually ends up reading along a track for a while before it reaches the target data.
We sometimes ignore that, but this is where the "seek time" and "rotational latency" statistics for HDDs came from. It was the time it took the read head to move from its current track to the target track and then wait for the correct sector to rotate into position.
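To make the "rotational latency" part concrete, here's a quick back-of-the-envelope sketch (my own numbers, not from the thread): on average the target sector is half a revolution away, so the average rotational latency follows directly from the spindle speed.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: time for half a revolution, in ms."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

# A 7200 RPM drive completes a revolution in ~8.33 ms,
# so the target sector is, on average, ~4.17 ms away.
print(avg_rotational_latency_ms(7200))  # ~4.17 ms
print(avg_rotational_latency_ms(5400))  # ~5.56 ms
```

Seek time (moving the head between tracks) comes on top of this, which is why random HDD reads are so much slower than sequential ones.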
This is definitely not the same sort of sequential access as tape, where a data location can be described as a distance along the tape and all data between the current location and the target must be traveled. However, it ultimately just swaps a one-dimensional location for a two-dimensional one. The disk is still spinning in the same way that the tape is getting pulled. The disk spins faster and has a very short cycle, but it's also a long way from the true random access seen in RAM and SSDs, where a data location is translated into an absolute section of a circuit.
HDDs feel like random access, but that is simply because the drive controller abstracts the non-random nature.
laffer1@reddit
And it’s possible the data you need happens to be stored sequentially on an HDD. Defragmenting a drive helped put file pieces together.
malastare-@reddit
Yes, but...
Since the data on an HDD is stored in various sectors of a track, there's no guarantee that the next block of a defragmented file is going to be read sequentially from the perspective of the drive. Very few defragmenters (none that I've ever used) actually defragmented files with the intention of laying them into sectors that would optimize seek time and rotational latency.
Put simply, the drive could easily read all the data that's within a single sector, and modern drives would have caches to store extra sectors along the way, but by the time the application on the other side got back and asked the drive to read the next sectors, they might have rotated out of position. Or, the next block of the file could be on the next track, forcing the read head to seek to a new track.
The results were still about as good as you could hope, even with the occasional misses like that. But it's important to realize that defragmenters tended to treat hard drives like records with one continuous spiral of data. The reality is more complex, and that's before we get into drives reporting virtual geometries or SMART drives reordering/replacing sectors.
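A toy cost model (entirely my own simplification, not a real drive spec) shows why contiguous layout still wins on average even if it isn't truly optimal: each block read pays a per-track seek plus a rotational wait for the sector to come around.

```python
SECTORS_PER_TRACK = 8
SEEK_MS_PER_TRACK = 1.0   # assumed head-movement cost per track crossed
MS_PER_SECTOR = 0.5       # assumed rotation time for one sector to pass

def read_cost(blocks):
    """Total time to read blocks given as (track, sector) tuples, in order."""
    track, sector = blocks[0]
    total = 0.0
    for t, s in blocks:
        total += abs(t - track) * SEEK_MS_PER_TRACK              # seek to track
        total += ((s - sector) % SECTORS_PER_TRACK) * MS_PER_SECTOR  # wait for sector
        track, sector = t, s
    return total

contiguous = [(0, 0), (0, 1), (0, 2), (0, 3)]   # defragmented file
fragmented = [(0, 0), (5, 3), (2, 6), (9, 1)]   # scattered blocks
print(read_cost(contiguous), read_cost(fragmented))
```

Under this model the contiguous layout costs a fraction of the fragmented one, but note it still pays rotational waits between sectors, which is the "yes, but..." above.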
laffer1@reddit
I never said it was guaranteed. There's also the problem of the file system in use. FAT variants stored things a lot differently than NTFS or UFS/FFS on old *NIX systems. More modern file systems are also different. Metadata for the files could be in quite a different part of the disk.
malastare-@reddit
Completely agree. I wasn't saying you were wrong or didn't understand, just sharing information.
As you said: Defragmenting helped because keeping files together increased the chance that a drive would be ready to read the data when an application wanted to read it.
I was just tossing some shade at the old idea that defragmenting was fully optimizing a drive. Lots of companies that made them claimed a bunch of weird things, and only sometimes did the things they fixed actually cause improvements.
anival024@reddit
It's random access. You can access any sector on the disk with direct commands. Not all requests will be served with the same latency, but that's a different matter.
If you don't want to count that as random access, then SSDs aren't either, and neither is DRAM with multiple ranks.
malastare-@reddit
It's random access. You can access any sector on the disk with direct commands without having to read (or spool) through everything else.
Yes, somewhere in my rambling my intention was to agree with this.
But the point of the explanation was to point out --if nothing else-- that the idea of "pure" random access and "pure" sequential access are pretty rare in the real world. Tape systems are viewed as "pure" sequential, but most modern systems do have the ability to "skip" portions of the data. The tape still needs to spool through, but it runs much faster and the system waits until the tape slows down to start reading and pick up where it skipped to.
That same feel happens on HDDs. The read arm moves to a different track and then begins reading to figure out where it is. It may then read multiple sectors before it finds the one it's looking for.
In short, it's not the same as SSD or DRAM because it can't seek directly to the sector. It can seek to the right area, then it needs to sequentially scan to find the right one.
This is still closer to "true" random access than sequential, but it's not the same ability to select the sector to read directly.
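The "degrees of randomness" idea can be sketched as rough latency models (all constants are illustrative assumptions of mine, only the shape of each function matters): tape cost grows linearly with distance, HDD cost is a seek plus a rotational wait, and SSD cost is roughly flat regardless of address.

```python
def tape_seek_s(distance_m: float, spool_speed_m_s: float = 5.0) -> float:
    # Must physically pass all intervening tape; cost is linear in distance.
    return distance_m / spool_speed_m_s

def hdd_seek_ms(track_delta: int, avg_rot_ms: float = 4.2,
                ms_per_track: float = 0.01) -> float:
    # Seek to the track, then wait (on average) half a revolution.
    return track_delta * ms_per_track + avg_rot_ms

def ssd_read_us() -> float:
    # Address translates directly to a flash page; no mechanical component,
    # so cost is (roughly) independent of location.
    return 50.0
```

The point isn't the numbers; it's that tape scales with how far away the data is, the HDD pays a bounded mechanical cost, and the SSD pays essentially none.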
Netblock@reddit
Even in the same rank, DRAM is not perfectly random access: timings differ for same vs. different bank group, and same bank vs. different bank.
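For anyone curious what that means in practice, here's a sketch with ballpark DDR4-style numbers (the exact values vary by part and speed bin, so treat these as illustrative): back-to-back reads are cheapest across bank groups, pricier within one bank group, and worst when they hit a different row of the same bank, which must be precharged and re-activated first.

```python
# Illustrative timing constraints in ns (ballpark, not from any datasheet):
TCCD_S = 2.5   # column-to-column delay, different bank group
TCCD_L = 5.0   # column-to-column delay, same bank group
TRC    = 45.0  # row cycle time: same bank, different row

def min_gap_ns(same_bank_group: bool, same_bank: bool) -> float:
    """Minimum gap between two reads under this simplified model."""
    if same_bank:
        return TRC
    return TCCD_L if same_bank_group else TCCD_S
```

So even DRAM access cost depends on *where* the previous access landed, which is the commenter's point.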
regenobids@reddit
Imagine a tape, but instead of being wound over two spools with a single read/write head, it's instead laid out, all the way, with many read/write heads and a controller keeping track of their position.
With enough heads, it'd be like an HDD, and usable for the same purpose. Data could be erased and written ad hoc, wherever there is free space.
The actual bytes would become more fragmented with time, as they would on an HDD. It'd need a defrag, or infinite read heads and a lot of space. Both are innately sequential media.
An HDD is simply more practical for this purpose: it only needs a few read heads and fast-spinning platters, all of which is easy to fit in a 3.5" or 5.25" format, without the physical wear and tear of a tape.
DVD and Blu-ray are both layered and sequential, but use a single read/write head.
Even an LP record player, the most sequential storage medium thinkable, can read tracks randomly if you give it some sensors and plenty of time to move that needle into position.
malastare-@reddit
A primary identifying characteristic of sequential storage is the idea that the storage itself is used to define the location. Things like tape or punch cards include the storage locations so that the device reading the media knows where it is. Even really good card readers (I'm told) still wanted card sequence numbers on the cards to be able to validate the location in the sequence.
Tape systems (again, the ones I'm familiar with, which are dated) encoded the location in the data on the tape, because there were worries that the tape would stretch with age. Trying to set a storage location as something like 3.682m from the start of the tape wouldn't be reliable, because six months later that tape might stretch and the new location might be 3.683 or 3.678m. So, instead, the reader had to spool and read the tape until it found the location it needed.
HDDs do a tiny subset of this, in that they are set up as rings split into sectors. The drive can mechanically define a track as the distance from the center, since drive platters are metal/glass/ceramic and they don't (notably) stretch. But within a track, the drive may read a decent amount of the track looking for the right sector. It feels like the drive is reading a track the same way that a tape reader is reading tape, but the HDD is seeking through storage boundaries written to the media and skipping (logically, at least) the data itself.
Sequential access does indeed suck. It's notoriously slow. From an algorithmic perspective, sequential processing of an array in order to get a single piece of information is regarded as fairly wasteful. There are, however, situations where sequential access has no waste and can be very efficient: if my operation naturally operates on sequential data --for instance, decrypting or decompressing-- then the initial seek is the only cost and every other read takes place with no seek time. Modern SSDs have sequential access with similarly low sequential cost, so tape has lost all of its performance benefits except for two: cost and density. The old CS joke holds true: never underestimate the bandwidth of a diesel truck filled with magnetic tape. The latency, though...
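That joke checks out arithmetically. With rough numbers of my own choosing (say, ~18 TB native per modern LTO-style cartridge, ~10,000 cartridges on a truck, a 48-hour drive), the truck's "bandwidth" is staggering:

```python
cartridges = 10_000
tb_per_cartridge = 18
hours = 48

total_tb = cartridges * tb_per_cartridge  # 180,000 TB on the truck
bandwidth_gbps = total_tb * 1e12 * 8 / (hours * 3600) / 1e9

print(f"{bandwidth_gbps:,.0f} Gbit/s")  # → about 8,333 Gbit/s
```

Thousands of gigabits per second of throughput... with a first-byte latency of two days.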
Now, tossing a last handful of nails into the coffin of sequential access is the nature of modern computing. Sequential access is at its worst in a multiprocessing environment. The storage device holds state (the current read location) which is almost guaranteed to be optimal for at most one process. For any other running process, the read location is wrong, and the cost of jumping between them becomes so high that no modern computing system uses sequential-access storage for primary storage. It's only ever used for archival storage now.