Single file Python CLIs when do you split, when do you keep it monolithic?
Posted by Beneficial_String411@reddit | Python | View on Reddit | 44 comments
Working on a tool that's grown to ~4000 LOC in one .py file. argparse + 18 subcommands, stdlib + pyyaml only. Tests are in a separate dir.
Single-file has been great for:
- Debugging (one file to grep)
- Distribution (one wheel, no package layout decisions)
- Onboarding contributors
But I'm starting to wonder if it's worth keeping monolithic at this size. What's your threshold for splitting? Is it LOC, or coupling, or "I can't navigate it anymore"?
teleprint-me@reddit
Don't think about it too much. It's just opinion and personal preference.
For example, bottle is just one file.
A lot of developers pride themselves on that alone. IDK why either; it doesn't mean anything to me personally.
Regardless, what matters is that your style is consistent and you follow good programming principles when it makes sense to.
I'll split into many files too early, then abstractions creep in, and then I have to refactor because I abstracted too early.
Once the code takes shape, and if it makes sense to, you can easily migrate it, provided the code is already logically composed and easily maintainable.
pvkooten@reddit
I would recommend using my library that I released recently. Just tell an LLM to "replace argparse with cliche" - I think it will one shot it and remove A LOT of code :) see https://github.com/kootenpv/cliche
StrayFeral@reddit
I am in this boat right now too.
Normally I always split my sources: 1 class per file. But since I am currently building a CLI/TUI tool, for the sake of portability and simplicity for the end user I keep it all in a single file, currently 2300 lines. I find it harder to navigate, but I will keep it like this to the end.
Again - I dislike it a lot and normally would split it into files. This project is an exception.
ZeD_est_DeuS@reddit
Ok, I'll byte. First of all... for obvious reasons, at the end of the day you do you. These are my feelings about it, and you did ask for them, but in the end you'll do what you want.
I'm sceptical about the "greatness" you have listed...
Grepping one or three files is trivial, and moreover I would not call it debugging. On that note, I suggest you look at an IDE (I'm an old fart and suggest eclipse+pydev, but anything with a visual debugger will do)... and you'll see that having even 10 files open at the same time is trivial.
Even the distribution woes are really not an issue... just follow a "standard" layout and the tools will do the work for you. Does it really bother you to choose between a "src" folder and "all together"? You already have tests in another folder, so I assume you have already chosen between 'tests in the wheel' or not.
For onboarding... I feel this project is so small that it really doesn't matter whether you have 1 or 3 files...
FWIW, I prefer to put the optparse setup (yeah, I really prefer it over argparse) in a separate module in my projects, because I like having the "core" of my tools in a dedicated place. As a bonus, it's easier to recycle my module as a library in GUI applications.
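The pattern being described, shown here with argparse for familiarity (function and option names are hypothetical): the core logic never touches the parser, so a GUI or a test can import it directly.

```python
import argparse

# "core" module, shown inline: pure logic with no CLI knowledge,
# so it can be reused from a GUI or imported by tests.
def summarize(text: str) -> str:
    return f"{len(text.split())} words"

# "cli" module: the only place that knows about option parsing.
def main(argv=None):
    parser = argparse.ArgumentParser(prog="summarize")
    parser.add_argument("text")
    args = parser.parse_args(argv)
    return summarize(args.text)

print(main(["hello brave new world"]))  # 4 words
```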
In reality I think that you have already decided to split, but you have not admitted it to yourself :) Why bother asking strangers for opinions otherwise?
mtik00@reddit
Dang, optparse! Old school, nice.
Why do you prefer it over argparse? I'd love to hear your thoughts. I haven't thought about optparse in a decade or so.
_Alexandros_h_@reddit
There is a way to make a multi-file project single-file: zipapp. You create a directory with all your code, add a `__main__.py` file as your entrypoint, bundle your pip packages, and it produces what is basically a zip file (.pyz) that you can run without even worrying about dependencies.
amroamroamro@reddit
nice solution, i haven't seen this before :)
didntplaymysummercar@reddit
Exactly. This is what I did in the past (and recommend doing) for a large self-contained script with many sub-commands.
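For reference, a runnable sketch of the zipapp approach described above, using only the stdlib (directory and file names are made up; a real tool would copy its package into the source dir, plus any dependencies installed via `pip install --target`):

```python
import os
import subprocess
import sys
import tempfile
import zipapp

# Build a toy app directory with a __main__.py entrypoint.
src = os.path.join(tempfile.mkdtemp(), "myapp")
os.makedirs(src)
with open(os.path.join(src, "__main__.py"), "w") as f:
    f.write("print('hello from the pyz')\n")

# Bundle the directory into a single runnable archive.
pyz = src + ".pyz"
zipapp.create_archive(src, pyz, interpreter="/usr/bin/env python3")

# The archive runs like any script: `python myapp.pyz`.
out = subprocess.run([sys.executable, pyz], capture_output=True, text=True)
print(out.stdout.strip())  # hello from the pyz
```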
SteelRevanchist@reddit
Most of my scripts are below 200 lines.
Substantial-Cost-429@reddit
The coupling threshold is usually a better signal than LOC. 4000 lines in one file is fine as long as you can answer "what does this function need to know about?" for any part of it.
Two practical heuristics I've found useful:
**Split when imports diverge** — if different parts of the file would need different optional dependencies (or different test setups), that's a natural module boundary.
**Split when ownership splits** — if two people would ever reasonably be working on different parts simultaneously, single-file becomes a merge conflict factory.
18 subcommands in one file is on the edge — I'd probably go with a `commands/` directory at that point but keep a thin `cli.py` entry that stitches them together. Still one wheel, easy to grep, but each command is an independently navigable unit.
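A minimal sketch of that layout (command and module names are hypothetical): each module under `commands/` exposes a `register(subparsers)` hook, and the thin `cli.py` just stitches them together. The two registration functions are shown inline here to keep the sketch runnable, but each would live in its own `commands/<name>.py`.

```python
import argparse

# commands/validate.py (hypothetical) would expose this hook.
def register_validate(subparsers):
    p = subparsers.add_parser("validate", help="check a config file")
    p.add_argument("path")
    p.set_defaults(func=lambda args: f"validating {args.path}")

# commands/render.py (hypothetical) would expose this hook.
def register_render(subparsers):
    p = subparsers.add_parser("render", help="render a config file")
    p.add_argument("path")
    p.set_defaults(func=lambda args: f"rendering {args.path}")

# cli.py: the thin entry point that stitches the commands together.
def main(argv=None):
    parser = argparse.ArgumentParser(prog="tool")
    subparsers = parser.add_subparsers(required=True)
    for register in (register_validate, register_render):
        register(subparsers)
    args = parser.parse_args(argv)
    return args.func(args)

print(main(["validate", "app.yaml"]))  # validating app.yaml
```

With the real split, `cli.py` would loop over imported command modules instead of a local tuple, so adding a 19th subcommand means adding one file and one import.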
Side note: we build CLI-driven AI agent tools and recently open-sourced a config repo with 888 stars at https://github.com/caliber-ai-org/ai-setup — our CLI tooling started single-file too and we split around the 3k LOC mark. Exactly what you're describing.
tunisia3507@reddit
How are your users interacting with the file? If it is installed or requires an environment, then it should have been in modules 3500 lines ago. If you are emailing them a script to run, I can see the argument for keeping it single-file.
I can see you use pyyaml. If that's specified using inline script metadata, fine. If not, you should be packaging it with a pyproject.toml anyway to specify the dependency, in which case you may as well use multiple modules.
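For reference, inline script metadata (PEP 723) is a comment block at the top of the script that runners read before executing it; a minimal sketch declaring the pyyaml dependency (the runner names are examples, not requirements):

```python
# tool.py: a single-file script declaring its own dependency (PEP 723).
# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "pyyaml",
# ]
# ///
# Runners that understand the block (e.g. `uv run tool.py` or
# `pipx run tool.py`) install pyyaml into a throwaway environment first;
# plain `python tool.py` ignores the comment and needs pyyaml preinstalled.
```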
aminoy77@reddit
4000 LOC is past the threshold for me, but LOC isn't the real signal — it's when you can't hold the whole thing in your head during a debugging session.
My rule: split when a subcommand needs its own mental model to understand. If reading subcommand A requires knowing how subcommand B works internally, that's coupling that a monolith hides and a package exposes.
Went through this with HelloChusquis, a CLI agent that started monolithic. Split when the provider fallback logic started tangling with the tool execution logic. Each now lives in core/ and the main file is just wiring. Still greppable, easier to test in isolation.
18 subcommands with stdlib + pyyaml only sounds like you're close to the split point but not there yet. The question is: can a new contributor understand one subcommand without reading the whole file?
ComprehensiveJury509@reddit
I personally hate navigating files that are longer than ~500 lines. If handling multiple files is a problem for you, then that sounds like your IDE sucks and you should probably change that. I'd be curious to understand how poorly organized code makes onboarding easier.
Beneficial_String411@reddit (OP)
the 500 line ceiling resonates; that's roughly the point where i lose the mental map of "where is X handler" and have to ctrl+f. fair.
on onboarding: it's not "easier", it's "different." single file lets a new contributor read top to bottom in one pass and understand the whole surface. multi file forces them to follow imports and build the mental model bottom up. neither is universally better, but for tools with one logical entry point and a tight surface, a monolith reads fast.
curious what your line ceiling is in practice, and whether you split by domain or by visibility (private helpers vs public api).
ComprehensiveJury509@reddit
You do you, but I honestly don't believe any serious programmer would consider following imports a hurdle.
liitle-mouse-lion@reddit
Maybe OP is using Notepad
sylfy@reddit
CLI commands and subcommands already provide a natural boundary along which to split and organise your functions. There is no scenario under which a single file is better. It is most definitely not better for a new contributor if they have to parse a 4000 line file.
RazorBest@reddit
I have to say that splitting your project into submodules doesn't mean you immediately have to create a deep hierarchical structure. You could also flatly split your file into 3.
If you do the splits by responsibility, someone who's going to add a new feature will probably be able to work with one file at a time, and only start writing in the next file when they're done with the current one.
If someone has to frequently jump between files throughout development, I consider that to be bad code.
And for looking up function or class definitions, IDEs can very easily solve that.
No_Soy_Colosio@reddit
You split as it grows and you get a good feeling of what goes where
NerdEnPose@reddit
I mean, you do you. HTMX is one file. If there’s no reason to split into files don’t do it just to do it.
Beneficial_String411@reddit (OP)
totally fair. bottle.py stayed single file forever and worked. i think the trigger isn't LOC it's when you start "importing from yourself" across logical boundaries inside the same file via class instances or module-level state. once that gets gnarly, splitting actually helps.
curious if anyone has a heuristic beyond gut feel.
Ran4@reddit
Because you don't have a gut? 😂
NerdEnPose@reddit
Sorry, I never answered your questions. It’s never LOC that’s arbitrary. I split by design patterns or more basically by responsibility. Throw file IO into a file and later maybe a package. Same for API clients etc
Beneficial_String411@reddit (OP)
that's the cleanest framing i've heard for it: "by responsibility, not by LOC." the part i'm wrestling with: at what point does a responsibility deserve its own file vs just its own clearly named section of one file? for a CLI with ~18 subcommands, each command could be its own file, but that's also 18 imports for what's essentially one logical entry point.
leaning toward: file per responsibility once the responsibility has its own state, dependencies, or test surface. comment section in the monolith if it's just "here's a function group."
NerdEnPose@reddit
Utmost respect here. It just doesn't really matter. It's going to come down to preference, and this thread is just a lot of people stating preferences. There are a few good reasons, but the ones I can think of are all organizational. For example, if one team owns the file IO, you put that in a package, update CODEOWNERS, and put an AGENTS.md in that directory. The other is that you run more risk of merge conflicts in bigger files.
But the rest is just preference. At work I have 5k+ LOC files and repos that have tons of imports and really broken out structure. I navigate both fine, and if you can’t it’s a skill issue where you need to get better with your IDE.
The hill I will die on is consistency. Pick a strategy and stick to it in the same repo.
OrthelToralen@reddit
I usually do a file per command in my CLI apps. 18 imports is worth the cost to know exactly where every command’s code lives at a glance.
AliMas055@reddit
Start writing in the same file. You can always move code to a new file and import it.
Effective-Total-2312@reddit
I don't do files longer than a few hundred lines, ideally keeping them below 100. The only exception is vibecoded tools I don't really care about because they're literally very small personal tasks, so if the AI made it 1000 LOC inside one file I don't mind.
misterfitzie@reddit
My rule is to go to multiple files when I notice things have become messy/scattered. "Cannot navigate it anymore" is one heuristic, but it's more a matter of taste. These days when I know a program is going to be of a certain size I chop it up from the start, so I don't really run into this issue much. But usually a big reason for me deciding to split up some code is that I want to "strengthen a concept": pulling related code into a separate file gives it a stronger individual identity, even if it only has one "user".
The_Seeker_25920@reddit
That's huge lol, use click and keep them to <300 lines IMO. It's not hard to manage.
Meleneth@reddit
for me the minimum viable product is a python module, so I'm already DQ'd by the time we're talking about 'one file scripts'.
I like dependencies. I like being installed as an executable. I like being able to work on one tiny piece of the functionality, without risk to the entire codebase.
WiseDog7958@reddit
I’ve tried both. single file feels great early on -> easy to grep, easy to ship. but once it grows, it’s not really the line count that hurts, it’s when changes start touching random parts of the file. that’s when it gets annoying. I usually split when I catch myself scrolling too much or jumping around to understand one change.
Challseus@reddit
I would do a module per subcommand, then just import them into the core CLI file. You can grep over multiple files just fine.
Think about how easy it would be to onboard someone by saying, "go checkout X module. That's where the code for subcommand X lives. It needs some fixes." I'd rather open a very small file and make changes and feel comfortable I'm not going to break everything, versus a large one.
Finally, if you're going to have contributors and presumably using GitHub, splitting up the files is going to save you massive merge conflicts down the road.
Chasian@reddit
I don't have a strong opinion on your real question
but I just think you're an AI. Like three of your comments have the "it's not x. It's y" pattern and it's just, i'm sorry i can't get past it lol
if you are a real person, I would encourage you to not use this pattern
Legendary-69420@reddit
I believe all the CLI-related code (basically the function calls) should be in the same file, but the logic itself should be in separate files.
Also, Typer is probably the best thing for Python CLIs.
billFoldDog@reddit
I write alternative entrypoints.
Say the package is `converter`.
One entrypoint will be `python -m converter.convertcli`, another will be `python -m converter.clipcli`, and another will be `python -m converter.moviecli`. Then I write bash wrappers that have something like
`python -m converter.clipcli "$@"`
and I'll name that script `clip` and drop it in the bin folder on my path.
Atlamillias@reddit
Not really a matter of line size for me but complexity. A handful of small components that represent a complete system go together. If a component becomes too complex, has too many proprietary parts, or I have a lot of internal utilities for it specifically, that single component gets split into its subcomponents and goes in a separate file. If I do this several times, that system becomes a package (I typically stop at 1 level because nesting is kinda gross imo).
If your file was 4k lines and you have a handful of structures that, while large, are clearly individual self-contained units that form a system I'd probably keep it as is.
434f4445@reddit
It’s not so much a line count. I follow the principle that each feature class should be its own file. Similar to how compiled languages lay out their framework. You should clump like functions together in a class and that class should be a file. You can have multiple like classes or parent child classes in the same file but I tend to try to keep them separate. It works better for modularity and when editing things on a multi person code repo it’s easier to avoid merge conflicts than the monolithic script. But idk I’m also open to other ideas. There really isn’t one correct way to handle it.
OrthelToralen@reddit
I look at it like this. Is it easier or harder to reason about the code split into more than one file? Usually, a clear separation of concerns, with functionality split into separate files, makes it easier.
The file system serves as a proxy for how the app works. In my codebases, core business logic lives in repositories, and a service layer provisions the logic to the rest of the callers in the app. An API service handles API requests, and a database service handles database calls.
So, if I change how a key logical function works, or swap out parts of the infrastructure, I just need to make the change in one place rather than refactoring every caller. For example, if I change database providers, I just change one bit of code and the rest of the app goes on functioning normally.
You could do the same thing in one file I suppose, but dividing things up explicitly in a separate file makes the code easier to understand.
grep doesn't really care how many files it is searching through; it will work just the same. So this is a non-issue.
For me, a 4,000 line single file would drive me nuts. But, it’s really a matter of personal preference of you and the other contributors to your app.
diegotbn@reddit
I would not refactor since you're already doing what you're doing and I would hate to introduce a regression. 4000 lines is a decent amount of risk. Last time I refactored a file that big (models.py in a Django app, about 2000+ lines) I caused a huge headache and ended up reverting to a prior commit and loading database backups. Luckily I never hit prod with my changes and it only affected subscribers to the opt-in experimental update channel.
But last time I designed a CLI for work I deliberately built it from the beginning to be modular with respect to Separation of Concerns and to be easily expandable. I've got no complaints from other devs building on top of my work. I think it was the right way to do it and I'm glad I did.
kBajina@reddit
Experiment and see how you like it. Personally I would dread digging through a 4000 LOC file, or like having to have the same file open in multiple tabs just to look at different sections of code.
sausix@reddit
When you don't split into multiple modules you may regret it later.
Many packages now have to maintain compatibility imports after transitioning to multiple modules, usually an `__init__.py` importing from the submodules. Not too bad, but I would avoid it by splitting early.
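A runnable sketch of that compatibility pattern (the package and function names here are made up): the `__init__.py` re-imports from the submodule, so old flat-style imports keep working after the split. The package is built on disk in a temp dir just to make the sketch self-contained.

```python
import os
import sys
import tempfile

# Build a tiny split-up package on disk, where __init__.py re-exports
# from a submodule as a compatibility shim.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "mytool")
os.makedirs(pkg)
with open(os.path.join(pkg, "parsing.py"), "w") as f:
    f.write("def load_config(path):\n    return {'source': path}\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    # Old code did `from mytool import load_config` back when mytool was
    # one file; this re-import keeps that working post-split.
    f.write("from mytool.parsing import load_config\n")

sys.path.insert(0, root)
from mytool import load_config  # old-style import, served by the shim

print(load_config("app.yaml"))  # {'source': 'app.yaml'}
```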
pacific_plywood@reddit
I generally don't like to get above 500-1000 lines per file, but also, libraries like click help a lot with brevity
Beneficial_String411@reddit (OP)
agreed on the click point; it removes a ton of argparse boilerplate. the only reason i stayed on argparse was the "stdlib only" constraint i set early; in retrospect click would've cut maybe 400 LOC and made the file feel smaller.
on grep / distribution: yeah, 4k is nothing. i think the real win of the single file constraint isn't ergonomic, it's psychological. when you can't add files, you start asking "does this really need to be its own concept" earlier. multi file makes it too cheap to add abstractions.
probably overrotated, but it's been a useful forcing function for this project.