Keep in mind that none of the following bad things happened:
No buffer overflows.
No use-after-free.
No double-free.
No data races on shared mutable state.
No null-pointer dereferences.
No uninitialized memory reads.
That means, even if the tools were (and probably still are) buggy, they never had a bug that could be exploited to read arbitrary memory.
GNU coreutils has shipped CVEs in every single one of those categories.
I’m not writing this to criticize the uutils team. Quite the contrary; I actually want to thank them for sharing the audit results in such detail so that we can all learn from them.
That comparison is bypassed by anything that resolves to / but isn’t spelled /. So /../, /./, /usr/.., or a symlink that points to /.
Uristqwerty@reddit
Don't rewrite it in Rust unless one of:
It's a personal project for fun or learning that'll never be used in production. Do not be tempted to use it in production in the distant future, even if it seems to be reaching feature parity, because early decisions made while the project wasn't intended for production will hide important bugs that never got fully thought through.
The code's so bad that you're going to rewrite it no matter what, and the only question is which language.
Most of the team who wrote and maintained the non-Rust original will also work on the rewrite, because they'll be aware of many of the subtle edge cases and logic bugs that forced most of the complexity that the original evolved over time.
You're prepared to spend a full decade or two discovering missed edge cases that the original handled, some of which will have security implications that overshadow hypothetical memory safety wins (assuming the original's been well-maintained so far, at least, with no known vulnerabilities at the time of the rewrite).
You've benchmarked it, and the performance improvement is worth everything else. Ideally, you've also run your replacement in parallel on live data for long enough that you're confident any incompatibilities are extremely rare. Not really an option for system tools shipped to run on others' machines, though.
Full-Spectral@reddit
Well, you have to balance that against the reality of possibly ending up 20 years from now with a large code base that's written in a language that very few people want to work on anymore and not enough know to go around.
Some to many code bases are also amenable to incremental conversion as well, so it doesn't have to be a step back and punt type scenario. Particularly in those cases where the product is composed of multiple processes that only interact on the wire or some such.
jet_heller@reddit
Huh. This article should only be one line: Almost all of them.
Because bugs aren't a part of the language, they're a part of the fallibility of the creator of programs. Any creator.
Tornado547@reddit
from the article:
there are entire classes of bugs that can be written in c(++) that can't be written in (safe) rust (unless there's a compiler bug and any compiler bugs in the big 26 probably require you to write some pretty out there code). And it's not just Rust either. Garbage collected languages also prevent these classes of bugs. To say that languages can't prevent bugs is a really bizarre take tbh.
yasamoka@reddit
Please read the article properly.
jet_heller@reddit
Then the headline shouldn't be stupid.
yasamoka@reddit
Why are you commenting on something you haven’t read…
jet_heller@reddit
I have read the headline.
I'm commenting on the headline.
What are you talking about?
Plank_With_A_Nail_In@reddit
Can you please check you have your shoes on the correct feet?
PaintItPurple@reddit
This seems a bit like if you saw a headline that said "Poisons this poison-testing kit won't catch" and answered "Poisons aren't part of a poison-testing kit." Like sure, that is true, but helping you avoid them is an explicit goal of the project, so knowing its limits is useful.
jet_heller@reddit
Uh. Not at all. Rust is not a bug catching tool. A poison testing kit IS a poison catching tool.
Full-Spectral@reddit
What? Rust actively disallows (or requires a very positive override) a whole range of accidental human foibles. So, by your definition, if bugs are the result of human fallibility, languages that don't allow those to happen are catching bugs on behalf of the fallible human.
It just can't catch bugs for which you are unable to express sufficient semantics for it to validate. The type system though extends that ability considerably and into the problem domain issues, preventing human fallibility yet more.
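As a rough illustration of "expressing semantics" in the type system, here is a minimal sketch using newtypes. The names `Signal`, `Pid` and `describe_kill`, and the accepted range 1..=64, are hypothetical choices made for this example and are not taken from the article or from any real tool.

```rust
/// A signal number that has already been range-checked.
/// The 1..=64 range is an illustrative assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Signal(u8);

/// A process id, kept distinct from other integers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pid(u32);

impl Signal {
    /// Returns `None` for values outside the accepted range, so an
    /// invalid signal number never reaches the rest of the program.
    fn new(n: u8) -> Option<Signal> {
        (1..=64).contains(&n).then_some(Signal(n))
    }
}

/// Because the parameters have different types, swapping them at the
/// call site is a compile error rather than a runtime surprise.
fn describe_kill(signal: Signal, pid: Pid) -> String {
    format!("send signal {} to pid {}", signal.0, pid.0)
}

fn main() {
    let signal = Signal::new(1).expect("1 is within the accepted range");
    let pid = Pid(4242);
    println!("{}", describe_kill(signal, pid));

    // describe_kill(pid, signal); // would not compile: mismatched types

    // Out-of-range input is rejected at construction time.
    println!("{:?}", Signal::new(0));
}
```

What the compiler checks here is only that a signal and a pid are never confused and that a `Signal` was validated once. Whether signal 1 is the right signal to send is still a logic question the types cannot answer.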
jet_heller@reddit
What? Foibles aren't bugs. Bugs are bugs. All languages disallow things that they don't allow, I mean, that's by definition. Saying those are bugs is stupid.
Full-Spectral@reddit
From your own first post. By keeping MORE human fallibility out of the process, some languages prevent more bugs than others. Rust leans heavily into that and prevents whole families of bugs that other languages allow.
jet_heller@reddit
You're welcome to consider it that.
I would rather consider bugs bugs and have mechanisms for catching them.
Full-Spectral@reddit
I'd rather have far fewer bugs be possible, and hence need to do far less extra work to catch them after the fact. Yeh, you'll still need unit tests and such to catch logical bugs that you cannot semantically encode with the language and its type system. But logical bugs are really the only kind you can reliably find with testing anyway. You can't generally prove the memory and thread safety of a complex code base via testing.
jet_heller@reddit
Yea. Ok. This is done by education and testing. Not a language.
But if that's what you want to think, you're welcome to that.
fexonig@reddit
if you had a function that takes a string, would you write unit tests to validate that it has defined behavior when you pass in a corrupted series of bits that can’t be parsed into the type system?
probably not, because basically every language prevents such a thing from ever happening.
if you were writing assembly, you probably should write that test. it’s not necessary in high level languages.
rust makes more errors unnecessary to test for. sure, i could just remember to always write a good test, but i am human. i make mistakes. and my coworkers make even more. the rust compiler is infallible.
jet_heller@reddit
The tests would 100% verify that the function is functioning as required if you are making use of that function in your code. That's literally what tests are meant to do.
fexonig@reddit
you are assuming, somehow, that tests themselves can’t have bugs.
i get you’re prob gonna say “git gud and you won’t write buggy tests” but like we don’t say that about the code itself, for good reason.
the rust compiler will test certain things for you, and we can be confident there are no bugs in the rustc tests because it is a mature system.
jet_heller@reddit
Soooo. . .the tests have bugs but the rust ones are too good for that?
Ok. Thanks for saying "ignore me".
fexonig@reddit
yes, you sound really dumb. tests that you write by hand are more likely to have bugs than tests automatically generated from your code’s data contracts by a piece of software battle tested by thousands of production systems. like, obviously dude.
BruhMomentConfirmed@reddit
Cool stuff, thanks for sharing. Let me start by saying I fully agree with the author's preface of
However, using paths instead of FDs, and doing string equality on paths, still seems kind of... naive, no? Absolutely major, MAJOR respect to open source maintainers and I'm not expecting this kind of scrutiny on normal code, but honestly even just from a skill perspective I wouldn't expect these specific 2 kinds of mistakes from stdlib maintainers. Again, no disrespect, but it did surprise me somewhat, since this is the kind of stuff I tend to take into account even when writing non critical userland code.
link23@reddit
Agreed. I'm surprised that none of these bugs were caught by integration tests or regression tests (which surely exist... right?), too.
cosmic-parsley@reddit
Testing will pretty much never catch TOCTOU. The article touches on that.
link23@reddit
TOCTOU wasn't the only class of bug, though.
0lach@reddit
Rust stdlib is portable.
There are some unix/win32/etc. specific extensions, but there is really no cross-platform way to do some things with similar code.
BruhMomentConfirmed@reddit
I think file descriptors/handles are quite universal, right? At least the concepts translate pretty well to all platforms afaik.
ShinyHappyREM@reddit
Some embedded devices don't have any files, they just have different memory types (ROM, SRAM, flash) and special ways to access them.
Not sure if Rust supports a system like that...
BruhMomentConfirmed@reddit
Right, in which case std::fs wouldn't even be relevant.
oiimn@reddit
I’m not ashamed to admit I would have fallen for the file descriptor trap. I personally think that’s an easy mistake to make
cosmic-parsley@reddit
Ditto there. I wonder if anything that got flagged is from back when it was a scratch project, and might not have gotten properly re-reviewed.
norude1@reddit
Well, this just makes me think that Rust's std::fs family of functions is poorly designed. It shouldn't compel you to use file paths everywhere
moltonel@reddit
File:: methods operate on open files, not paths. std::fs exposes the OS APIs, that seems like a must-have. Can you point at a language's stdlib which isn't, according to you, poorly designed?
ericonr@reddit
Not designed this way, but conveniently falls into this shape: the C standard library.
It's annoying to mess with strings for paths (even considering PATH_MAX, which is not available everywhere, and which can't access every file in a system), so it's simply easier to navigate around using *at syscalls (if you're doing recursive code).
That's a bunch of caveats to my attempt at a hot take, to be fair :p
gmes78@reddit
Those aren't from the C standard library. They're from POSIX.
So, if you want to include C with POSIX, you should compare it to Rust with POSIX, through the rustix crate, for example.
moltonel@reddit
AFAICT, File and DirEntry provide the same functionality as *at in libc. It's what the walkdir crate will give you if you use it to recurse. Not as obvious as passing a path to open(), but I would say the same of libc. std::fs also has niceties like remove_dir_all(), and experimental ones like *_nofollow() and Dir.
ericonr@reddit
Glad to hear walkdir does the right thing.
I think remove_dir_all was also the subject of a ~recent improvement to remove stack overflow.
DivideSensitive@reddit
I think it's the good choice for 99% of users, i.e. people just wanting to use files and not building fundamental infrastructure.
cake-day-on-feb-29@reddit
I'm sorry but how the hell do you go along developing a unix command-line tool and not think to canonicalize a path before checking it? Especially when it's for this specific root check, which is specifically designed to prevent users from accidentally acting on root via a messed up indirect path?
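For readers who want to see the difference being argued about, a minimal sketch follows. It assumes a Unix-like system where the root is `/`. The function names `is_root_by_string` and `is_root_canonical` are hypothetical and are not the names used in uutils.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Naive check: compares how the path is spelled, not what it resolves to.
fn is_root_by_string(path: &str) -> bool {
    path == "/"
}

/// Resolves `.`, `..` and symlinks first, then compares against the root.
/// Returns an error if the path does not exist.
fn is_root_canonical(path: &str) -> io::Result<bool> {
    let resolved = fs::canonicalize(path)?;
    Ok(resolved.as_path() == Path::new("/"))
}

fn main() -> io::Result<()> {
    // All three spellings refer to the root directory on a Unix-like system.
    for candidate in ["/", "/../", "/./"] {
        println!(
            "{candidate:>5}: string check = {}, canonical check = {}",
            is_root_by_string(candidate),
            is_root_canonical(candidate)?
        );
    }
    Ok(())
}
```

Canonicalizing only fixes the spelling problem. The resolved path is a snapshot, so anything done later by path can still race with a filesystem change, which is the TOCTOU concern raised elsewhere in this thread.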
happyscrappy@reddit
Filesystems are one big garden for TOCTOU attacks. And UNIX doesn't help with this problem at all. To be fair, it was designed long before this kind of thing was really a big concern. Even if you didn't have this symlink problem at the target item (file), an attacker could change the upstream portion of the path. The only fix is to resolve an item to a descriptor once and then use that repeatedly. And UNIX isn't quite really set up for this. It does have fds, but those are for open files. It doesn't have them for directories, partial paths, fully-specified paths, etc. The closest thing to an fd for a directory is an inode number and it's not really designed to solve this kind of problem. It also isn't specific enough since across filesystems inode numbers are reused.
This filesystem TOCTOU stuff gets laid on Rust for no other reason than that people think laying it on the UNIX OS is admitting it'll never be fixed. UNIX just wasn't designed with a proper way to canonicalize all the things I listed above.
Little of this has anything to do with rust other than the article writer knows that security-mindedness overlaps with people who are willing to pay to help improve their code security. It's not really Rust's fault.
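The "resolve once, then reuse" idea can be sketched with the standard library alone. This is an illustration under assumptions, not the fix applied in uutils: the function names are hypothetical, and it relies on `File::metadata` querying the already-open file rather than looking the path up again.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Racy pattern: the path is resolved twice (once to check, once to read),
/// so whatever it points to can be swapped between the two calls.
fn read_if_regular_racy(path: &Path) -> io::Result<Option<String>> {
    if !fs::metadata(path)?.is_file() {
        return Ok(None);
    }
    fs::read_to_string(path).map(Some)
}

/// Handle-based pattern: resolve the path once, then check and read
/// through the same open file.
fn read_if_regular(path: &Path) -> io::Result<Option<String>> {
    let mut file = File::open(path)?;
    if !file.metadata()?.is_file() {
        return Ok(None);
    }
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(Some(contents))
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file name, made unique with the process id.
    let path = std::env::temp_dir().join(format!("handle_demo_{}.txt", std::process::id()));
    fs::write(&path, "hello")?;

    println!("racy:   {:?}", read_if_regular_racy(&path)?);
    println!("handle: {:?}", read_if_regular(&path)?);

    fs::remove_file(&path)?;
    Ok(())
}
```

The single `File::open` call still follows symlinks anywhere in the path, so this narrows the window rather than closing it. Hardening the open itself needs platform-specific options that this sketch leaves out.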
Much-Cellist-44@reddit
kill -1 going from "send signal 1 + ask for pid" to "send default signal to every process you can see" is the kind of thing that only bites the first time someone types it from muscle memory on prod
bug-for-bug compatibility framed as a security feature is a solid take. most rewrites treat divergence as cleanup, but every divergence is a shell script somewhere making a wrong call
programming-ModTeam@reddit
No content written mostly by an LLM. If you don't want to write it, we don't want to read it.
Klutzy_Pin9611@reddit
Most of what surprises people about Rust is what it does catch — the list of eliminated bugs is genuinely impressive. The tricky part is that it builds a kind of learned helplessness around the category it can't touch.
Logic errors and wrong assumptions about the domain are entirely on you. The compiler's confidence is contagious enough that you can forget that.
programming-ModTeam@reddit
No content written mostly by an LLM. If you don't want to write it, we don't want to read it.
Full-Spectral@reddit
One of the most fundamental advantages that Rust provides is that so much of the time you might have otherwise spent just trying to watch your own back over mechanical errors can go into working on logical correctness. And that it tends to force you to work in ways that also enhance logical correctness at the code level, and provides tools to help you do that.
But correct problem domain logic is always going to be our problem. Of course it also provides very nice ways to mechanically help enforce that as well. But that can only go so far, and attempts to go full on 'no invalid state' mode in a complex system can introduce sufficient complexity that it overwhelms its own benefits.
Smallpaul@reddit
Doesn't the first fix described allow the hacker to trick you into creating a file in the wrong place by replacing a parent directory with a symlink? And the confused program might write a file where it has permissions but the attacker does not.