Linux Built-In Tools Are So Powerful, You Can Build a Database With Them. Here's How
Posted by FoxInTheRedBox@reddit | linux | View on Reddit | 108 comments
Richard_Masterson@reddit
"Linux built-in tools"
So, GNU tools. Not made by the Linux Foundation and completely unrelated to them.
nononoitsfine@reddit
You can build a database out of text file lol
0x1f606@reddit
Yeah, because text files are just that powerful.
nononoitsfine@reddit
yup
randomatic@reddit
The file system is a database. A damn good one. It even maintains a cache in memory.
redonculous@reddit
CSV gang represent!
michelbarnich@reddit
Obviously you should use JSON or XML for that /s
AlterTableUsernames@reddit
Well, what is a JSON if not a small database? Genuine question.
BrianHuster@reddit
It is a data format. Not a database
AlterTableUsernames@reddit
And what is a database if not formatted data? Again a genuine question.
HiPhish@reddit
A real database does many more things than a plain data file:
If all you want to do is just store some small amount of data for later use and there is never more than one process reading and writing the file you can use anything you want. But as you scale up in size or access you will start hitting limits with this naive approach. Look up ACID for a minimum every serious database software must fulfill.
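To make the "more than one process" limit concrete, here is a minimal sketch of serialising writers to a plain text file. It assumes the `flock` utility from util-linux is installed; the `add_task` helper, the file names, and the record layout are hypothetical. It addresses only concurrent appends, not the rest of what ACID asks for.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
tasks="$workdir/tasks"
lock="$workdir/tasks.lock"

# Hypothetical helper: append one record while holding an exclusive lock.
add_task() {
  (
    flock -x 9                       # block until this process owns the lock on fd 9
    printf '%s\n' "$1" >> "$tasks"   # the append happens only while the lock is held
  ) 9> "$lock"
}

# Ten writers at once; the lock makes them take turns.
for i in 1 2 3 4 5 6 7 8 9 10; do
  add_task "task $i:2025-01-10:3:open" &
done
wait

lines=$(wc -l < "$tasks")
echo "records written: $lines"
```

Even with the lock, a crash halfway through a multi-line update would leave the file in an intermediate state, which is the kind of gap a real database engine closes.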
AirTuna@reddit
None of those are a core requirement to meet the definition of a database, though.
Irverter@reddit
The same way a book is not a library?
It may hold the data, but it's lacking the infrastructure used to manage it.
BrianHuster@reddit
I didn't say database is not formatted data
AlterTableUsernames@reddit
You made a distinction between a database and a 'data format', which is just data formatted the way the format specifies. Be that as it may, what is the difference between a database and a data format, then, if JSON is not a small database?
BrianHuster@reddit
You are the reason why logic is important. A containing B as a feature doesn't mean B is A.
Your question can already be answered with just a Google search, but I will take the strongest point: A database is a software. Is JSON a software?
FistBus2786@reddit
"database: A collection of data arranged for ease and speed of search and retrieval. An organized body of related information."
I'd say JSON fits the definition of a database.
See: dictionary.
BrianHuster@reddit
OK, I was wrong that a database is a software, but still, JSON doesn't fit that definition. JSON is not "arranged for ease of search and retrieval". Technically it is just text. To actually get information from a JSON file, you need a parser to convert that text into an object from which the data can be read. So it's not a database.
DueToRetire@reddit
A database is not a software lol
Virtual_Ordinary_119@reddit
Ok, now tell me about referential integrity in JSON
Nicksaurus@reddit
Calling it a database implies that it has some functionality to index and query the data, so it's not entirely on the client to do that
discordhighlanders@reddit
Just an easy way to interact with similar data without needing a class. If you want to actually use it as a database, you have MongoDB.
_illogical_@reddit
Especially if you want it to be web scale
Sanderhh@reddit
It has sharding!
arwinda@reddit
*and
SuperGr33n@reddit
So echo and grep?
Phosquitos@reddit
What's wrong using SQLite?
hazyPixels@reddit
This is some serious early days of UNIX shit. Like 1970s. Back in the day when being an "accomplished awkster" was a status thing.
OK I guess I'm old. I'll go back to cleaning my VT100 keyboard. Carry on.
BranchLatter4294@reddit
Why do people do this?! There are plenty of good database management systems. Don't reinvent the wheel.
Chance_of_Rain_@reddit
Well, we’re talking about Linux here
BranchLatter4294@reddit
There are lots of good databases for Linux.
s1gnt@reddit
you mean https://www.gnu.org/software/recutils/manual/recutils.html?
kronik85@reddit
That's a neat tool, didn't know about it until now
Chance_of_Rain_@reddit
I know, I was talking about people liking to reinvent the wheel on Linux
nirvana1289@reddit
Because the point is not to build a database but to introduce readers to common data manipulation command line tools for Linux
BranchLatter4294@reddit
And yet, the headline is about building a database... Go figure.
nirvana1289@reddit
The headline is “Linux Built-In Tools Are So Powerful…”. The rest is the example that is used to demonstrate the claim. The fact the example is a database is only a fancy pick for an example.
zquzra@reddit
Well, I'm an amateur carpenter. I could simply buy a new chair, but I enjoy the craft.
lurco_purgo@reddit
Here's how you can practise drawing a dog...
"Why do people do this?! There is plenty of great art already! Also ChatGPT exists, don't reinvent the wheel"
The fact that your comment is getting upvotes and in a Linux subreddit of all places is kind of depressing to me...
BranchLatter4294@reddit
Off to go snow skiing on roller skates.
Leliana403@reddit
It'd probably be good for you to get outside actually. You seem way too upset over someone spending their own time making a tool that has nothing to do with you and doesn't affect you in any way.
A_for_Anonymous@reddit
It's great to be proficient in these commands for quick hacks. When you're dealing with stuff interactively, you don't want CREATE TABLE, query optimisation, etc. You want quick and dirty as long as you can afford to execute it. Also for pipes and streaming, which is as easy as it is powerful.
Sure, bash and coreutils hacks are hackish. But it's so bad it's good. Quick, compact, easy to remember, gets the job done. Until it doesn't, which is when you want to start using Python and whatever.
s1gnt@reddit
No Python please, why downgrade from shell?
natermer@reddit
If you know what you are doing then awk/grep/sed/cut/etc will blow any database out of the water in terms of performance. These things were optimized to run fast as hell on systems from the 1980s.
If your goal is to simply process information then it is a mistake to turn your nose up at them.
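For anyone who hasn't used these tools this way, a minimal sketch of a one-off query follows. The colon-separated layout (description:date:priority:status) is assumed from the `echo` example quoted elsewhere in this thread; the sample rows and file path are made up. It shows the style of processing being described, not a benchmark.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
tasks="$workdir/tasks"

# Assumed fields: description:date:priority:status
printf '%s\n' \
  'Take out the trash:2025-01-10:3:open' \
  'Pay rent:2025-01-01:1:done' \
  'Renew passport:2025-02-15:2:open' > "$tasks"

# Roughly: SELECT description FROM tasks WHERE status = 'open' ORDER BY priority
open_tasks=$(
  awk -F: '$4 == "open" { print $3 ":" $1 }' "$tasks" |  # filter rows, keep priority and description
    sort -t: -k1,1n |                                     # order by priority, numerically
    cut -d: -f2                                           # drop the priority column again
)
printf '%s\n' "$open_tasks"
```

Each stage reads a stream and writes a stream, so the same pipeline works on a file, on stdin, or on the output of another command.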
BranchLatter4294@reddit
On my way to drive some nails with a blender.
natermer@reddit
By the time it takes for most tools to even start you could be through 100GB of data.
Leprecon@reddit
I don't think this is meant as a serious database proposal...
BranchLatter4294@reddit
Off to make a smoothie with my printer.
Leprecon@reddit
That's the spirit. You will be an excellent software developer one day!
zargex@reddit
because it is fun
BranchLatter4294@reddit
Well, I'm off to set up my company's payroll system on Photoshop.
Fast-Top-5071@reddit
Don't you mean GIMP?
BranchLatter4294@reddit
It doesn't really matter... If you're willing to use a text file as a database, anything goes.
zargex@reddit
I would choose another project, but go ahead lol
emmfranklin@reddit
That's ok brother. Let people try something from scratch for their own enjoyment..
BranchLatter4294@reddit
Off to take a Caribbean cruise in my Ford 150.
emmfranklin@reddit
😅
gitcheckedout@reddit
People who have free time.
Michaeli_Starky@reddit
Free time? What's that?
PearMyPie@reddit
You're on reddit, don't pretend you don't know what free time is lol
Michaeli_Starky@reddit
Ahhhh so that's what free time is
duva_@reddit
It's illustrative
jr735@reddit
I assume people would do it as a learning exercise, not to have something to use daily. CS courses routinely have people write programs that have already been written. Being tasked to write a bubble sort to pass CS is not reinventing the wheel.
AryanPandey@reddit
Just for fun, not for production, I promise
ReallyEvilRob@reddit
I'm sure this will scale beautifully.
A_for_Anonymous@reddit
It doesn't have to. This is for hacking, personal lists, streaming with pipes, etc.
If you start building a broom shack with whatever plank leftovers your dad stored for decades, will somebody come and say "hey anon, surely this can't be 100 floors high or resist nuclear disasters hahaha, you should invest 10 man years of engineering"? No, because you just want a broom shack, and you have 2 brooms. Maybe 3. Scales to 5.
HiPhish@reddit
That's awfully specific.
no_brains101@reddit
Or, and here's a thought, if you want a database as a file, use SQLite?
SirArthurPT@reddit
...or any other, all of them are just files.
But the article is more about a set of Linux commands where using them as db is just the use case example.
roadit@reddit
They're not, they are servers that store data in files. SQLite has no server.
SirArthurPT@reddit
When you create a mysql db, for instance, what you're doing is creating a folder with the db name at /var/lib/mysql (if no other path at my.ini) and each table is a file (or more for indexes) in that folder.
roadit@reddit
Yes, I know, and you need a mysqld or mariadb to turn it into a database. The database is not just the files.
SirArthurPT@reddit
You need the software to interpret the files, just like you need a filesystem to interpret what files are in your computer, a word processor to open a word document, an image processor to display a JPG, a sound processor to play an MP3 and so on. But the database itself is just those files, there's no "magical place" to store data.
Likewise a SQLite db is a file but you need its software to interpret the contents of that file.
Coffee_Ops@reddit
Talking about what datatypes "are" always leads to the absolute best kinds of pedantry.
Everything is binary, databases are a myth, and this is where I make my stand.
SirArthurPT@reddit
No argument there, everything is just 0/1; how it is interpreted depends solely on the convention by which those bits were ordered.
roadit@reddit
True, but beside the point.
no_brains101@reddit
no, not all of them are just files in the way that you can just copy the file somewhere else and use it as a database there. All of them are like, technically files, sure.
But yeah fair.
Lawnmover_Man@reddit
...not quite sure about fair. Technically correct? Yes. Fair? I mean... no. Not really.
no_brains101@reddit
The second part of their comment. They mentioned that my offhand comment about not rolling your own db missed the original author's point. And I said yeah, fair, it did. But I probably should have been more clear about what part of my comment went with what part of their comment.
mattias_jcb@reddit
The point of this article flew right over your head there. :)
ethicalhumanbeing@reddit
H2
emmfranklin@reddit
That was sweet and polite..
Coffee_Ops@reddit
Get out.
SaltedPaint@reddit
Skip filesystem overhead and use a raw disk
GlumWoodpecker@reddit
:^)
s1gnt@reddit
yeah but dd looks ugly!
-lousyd@reddit
I feel like calling those commands "built-in" fails to give enough credit to the awesome programmers and team that develop and maintain the coreutils package, which is not built-in to Linux. Those tools come from somewhere! Somebody had to choose to include them in your Linux distro!
"Standard" or "basic" might have been a better choice of words.
s1gnt@reddit
btw as everything in linux someone did it before you https://www.gnu.org/software/recutils/manual/recutils.html
s1gnt@reddit
Lol that was funny to read, this article really stretches what "database" means
I have an alternative solution:
and there you have it!
SELECT FIELD WHERE ID=PRIMARYKEY FROM TABLE
is as simple as cat /TABLE/PRIMARYKEY/FIELD
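The directory-per-row idea can be sketched as follows. This is only an illustration of the TABLE/PRIMARYKEY/FIELD layout named above, not the commenter's own commands; the `users` table, the key `42`, and the field values are hypothetical, and everything lives under a scratch directory instead of `/`.

```shell
# Scratch "database" root, standing in for / in the comment above.
db=$(mktemp -d)

# INSERT: one directory per row, one file per field.
mkdir -p "$db/users/42"
printf '%s\n' 'alice' > "$db/users/42/name"
printf '%s\n' 'alice@example.org' > "$db/users/42/email"

# SELECT name FROM users WHERE id = 42
name=$(cat "$db/users/42/name")
echo "$name"

# SELECT id FROM users (the primary keys are just directory names)
ids=$(ls "$db/users")
echo "$ids"
```

Lookups by primary key are a single path resolution, but any query on a non-key field means walking every row directory.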
Jahf@reddit
I did this for a CGI (not graphics, think pre-PHP server side web applications) back in the mid 90s to drive a local realty database.
I was a hack. My code was spaghetti. It was entirely in C shell (Perl was just becoming popular at the time, JavaScript hadn't quite happened yet, and for whatever reason I had a hate for sh).
Have fun with this as a learning exercise, but don't use it for anything significant.
Awesimo-5001@reddit
But did it work? That's all that matters, really.
urmyheartBeatStopR@reddit
Filesystem is like a database tbh.
moderately-extremist@reddit
Somebody needs to add support for this to Sqlalchemy.
Glowworm04@reddit
it's just a fun exercise, no one is saying Amazon should start using this
ourlastchancefortea@reddit
Maybe they should
_-Kr4t0s-_@reddit
Please don’t do this.
If you’re storing large enough amounts of data there are real SQL and NOSQL databases to work with, and if you’re not, then just dump a dict/hash to JSON or YAML and load it entirely into memory when you need it.
Working with text files like this is the dumbest idea ever. It’s tons of added work for something that’s less performant and less useful than the alternatives.
BraneGuy@reddit
lol love how the second command they wrote:
echo "Take out the trash:$(date -I):3:open" > tasks
Will overwrite your entire database. Should be >>
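A small sketch of the difference, using a scratch file and made-up records with fixed dates so the result is repeatable:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
tasks="$workdir/tasks"

echo "Pay rent:2025-01-01:1:open" >> "$tasks"            # >> appends (and creates the file if missing)
echo "Take out the trash:2025-01-10:3:open" >> "$tasks"  # >> keeps the first record
appended=$(wc -l < "$tasks")

echo "Buy milk:2025-01-11:2:open" > "$tasks"             # > truncates first: earlier records are gone
truncated=$(wc -l < "$tasks")

echo "lines after >>: $appended, lines after >: $truncated"
```

If the risk of a stray `>` is a concern, `set -o noclobber` makes the shell refuse to overwrite an existing file with plain `>`.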
FryBoyter@reddit
The saying "Why make it simple when you can make it complicated" is probably quite true here.
I definitely stick with tools like DBeaver.
turtle_mekb@reddit
or you could just use a proper database format/software? apart from being a fun exercise hobby thing, why reinvent the wheel?
matj1@reddit
File system is a database, and most operating systems have a file system. So they already contain and manage databases with no extra effort.
CaptainObvious110@reddit
Wow
Healthy-Intention-15@reddit
or you could just use sqlite.
PeriodicallyYours@reddit
Wrap it into SSI, and here we go, a DB driven site without any DB or even a scripting language.
dr_entropy@reddit
It would be very fun to take a sql parser and see how far you can get converting queries to executing with only core utils.
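As a taste of what such a translation might look like, here are two queries converted by hand. This is a sketch, not a parser: the colon-separated layout (description:date:priority:status) and the sample rows are assumptions, and `awk`, which is not part of coreutils, is used only to tidy the `uniq -c` output.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
tasks="$workdir/tasks"
printf '%s\n' \
  'Take out the trash:2025-01-10:3:open' \
  'Pay rent:2025-01-01:1:done' \
  'Renew passport:2025-02-15:2:open' > "$tasks"

# SELECT status, COUNT(*) FROM tasks GROUP BY status
counts=$(cut -d: -f4 "$tasks" | sort | uniq -c | awk '{ print $2 "=" $1 }')
printf '%s\n' "$counts"

# SELECT description FROM tasks ORDER BY priority LIMIT 1
top=$(sort -t: -k3,3n "$tasks" | head -n 1 | cut -d: -f1)
printf '%s\n' "$top"
```

The pattern is fairly regular: projection maps to `cut`, ORDER BY to `sort`, LIMIT to `head`, and GROUP BY with COUNT to `sort | uniq -c`. Joins are where it gets harder, though `join` exists for pre-sorted inputs.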
elatllat@reddit
At least use an index with postmap, search, etc.