PyNeat: Deep structural code refactoring in Python using LibCST
Posted by AssociateEmotional11@reddit | Python | View on Reddit | 11 comments
Hi r/Python,
I spend a lot of time reviewing code and noticed that while formatters fix styling, they don't fix structural logic flaws. So, I built PyNeat, a CLI tool based on `LibCST` to perform deep structural refactoring.
### What My Project Does
PyNeat scans your Python AST in a single pass and automatically refactors structural anti-patterns while preserving 100% of your original comments and whitespace. It currently fixes:
- **The Arrow Anti-pattern:** Flattens deeply nested `if/else` using guard clauses.
- **Dangerous Evals:** Safely converts `eval()` calls to `ast.literal_eval()`.
- **Empty Excepts:** Detects silent `except: pass` failures.
- **Mutable Defaults:** Fixes the infamous `def func(items=[])` shared-default bug.
- **Identity Checks:** Upgrades `== None` to `is None`, and `is 200` to `== 200`.
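For example, the eval and mutable-default fixes correspond to rewrites like the following (a hand-written before/after sketch to illustrate the idea, not PyNeat's actual output):

```python
import ast

# Before: eval() executes arbitrary code, and the default list is shared
# across every call that doesn't pass `items`.
def parse_value_unsafe(text, items=[]):
    items.append(eval(text))
    return items

# After: ast.literal_eval() only accepts Python literals, and the
# sentinel-default idiom gives each call a fresh list.
def parse_value_safe(text, items=None):
    if items is None:
        items = []
    items.append(ast.literal_eval(text))
    return items

parse_value_unsafe("1")
print(parse_value_unsafe("2"))  # prints [1, 2] (state leaked across calls)
print(parse_value_safe("2"))    # prints [2]
```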
### Target Audience
This tool is intended for production use by developers, reviewers, and teams dealing with legacy codebases (or quickly generated boilerplate) that need structural clean-up without breaking logic. The current v1.0 MVP is pure Python, but a Rust rewrite (`pyneat-rs`) for massive multi-threading is on the roadmap.
### Comparison
* **vs. Black / Ruff:** Black and Ruff are formatters and linters. They fix whitespace, line length, and warn you about bad code. PyNeat actually *rewrites the logic* (e.g., inverts an `if` condition and injects an early `return` to flatten the code).
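Concretely, that kind of guard-clause rewrite looks like this (a hand-written illustration of the transform, not PyNeat's actual output):

```python
# Before: the "arrow" shape, with the happy path buried three levels deep.
def discount_nested(user):
    if user is not None:
        if user.get("active"):
            if user.get("premium"):
                return 0.2
            else:
                return 0.1
        else:
            return 0.0
    else:
        return 0.0

# After: each condition inverted into a guard clause with an early return.
def discount_flat(user):
    if user is None:
        return 0.0
    if not user.get("active"):
        return 0.0
    if user.get("premium"):
        return 0.2
    return 0.1

# Both versions agree on every input:
for u in (None, {}, {"active": True}, {"active": True, "premium": True}):
    assert discount_nested(u) == discount_flat(u)
```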
* **vs. Built-in `ast` module:** The standard `ast` module drops your comments and whitespace when unparsing. PyNeat uses Instagram's `LibCST` (Concrete Syntax Tree) so the output preserves every single comment and blank line exactly as you wrote it.
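The stdlib limitation is easy to demonstrate: `ast.unparse()` (Python 3.9+) rebuilds source from the abstract tree, so comments and blank lines vanish on the round trip:

```python
import ast

src = """# configuration constants
TIMEOUT = 30  # seconds

RETRIES = 3
"""

# Parse to an AST and unparse back to source: both comments
# and the blank line are gone.
round_tripped = ast.unparse(ast.parse(src))
print(round_tripped)
# prints:
# TIMEOUT = 30
# RETRIES = 3
```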
**Links:**
* **GitHub:** https://github.com/khanhnam-nathan/Pyneat
* **PyPI:** `pip install pyneat-cli`
I would love to hear your brutally honest feedback on the architecture, the AST traversal approach, or any edge cases I might have missed!
Python-ModTeam@reddit
Hello from the r/Python moderation team,
We appreciate your contribution but have noticed a high volume of similar projects (e.g. AI/ML wrappers, YouTube scrapers, etc.) or submissions that do not meet our quality criteria. To maintain the diversity and quality of content on our subreddit, your post has been removed.
All showcase, code review, project, and AI-generated project posts should go in the pinned monthly Showcase Thread.
You can also try reposting in one of the daily threads instead.
Thank you for understanding, and we encourage you to continue engaging with our community!
Best, The r/Python moderation team
AssociateEmotional11@reddit (OP)
As far as I can see, the downvotes outweigh the upvotes by around 55 percent. Therefore, please give me feedback so I know exactly what is wrong with my project! Thank you for reading!
Alex--91@reddit
I've not voted either way, but I suspect part of the problem might be that you said Ruff is only a formatter and linter. Ruff does indeed operate on the AST, and it can indeed do `--fix`es like replacing `== None` with `is None`, etc. The other part seems to be that people who have tried it find it quite destructive? Ruff, for example, shows the issues but doesn't fix them unless you explicitly pass the `--fix` flag.
AssociateEmotional11@reddit (OP)
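The `== None` / `is None` distinction matters because `==` dispatches to `__eq__()`, which any class can override, while `is` compares identity:

```python
class AlwaysEqual:
    # A pathological but legal __eq__ that claims equality with everything.
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)  # prints True (the overridden __eq__ answers)
print(obj is None)  # prints False (identity cannot be faked)
```

Conversely, `is 200` relies on the interpreter's small-integer caching, an implementation detail, which is why the rewrite goes in the other direction there.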
yaxriifgyn@reddit
I'm going to try it out later today. I'm hoping that it has a dry run or diff output mode to let me review changes before applying them. The need for a backup copy of every changed file is essential as well.
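A dry-run mode like that can be built on the stdlib `difflib` module. A minimal sketch (the `preview` helper and file name here are hypothetical, not part of PyNeat):

```python
import difflib

def preview(original: str, refactored: str, filename: str) -> str:
    """Return a unified diff for review instead of writing the file."""
    return "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            refactored.splitlines(keepends=True),
            fromfile=filename,
            tofile=filename + " (refactored)",
        )
    )

before = "def f(x):\n    if x:\n        return 1\n    else:\n        return 0\n"
after = "def f(x):\n    return 1 if x else 0\n"
print(preview(before, after, "example.py"))
```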
I have many places where code flattening could be applied. When I wrote the code, I was thinking of a single exit from a function so that explicit code tracing / debug logging was only needed in one place. As the code has matured, the need for such logging has diminished.
Could you comment on your use of AI if any. Thx.
AssociateEmotional11@reddit (OP)
yaxriifgyn@reddit
I do use git, but I have a lot more tools available. I use this to compare the files:
I don't expect to have to repair the code before I can even test it.
AssociateEmotional11@reddit (OP)
wingtales@reddit
The solution to your first paragraph is to track your code with git. Either stage or commit your current changes before running PyNeat. Then you can see the diff with 'git diff' and undo the changes with 'git restore .'
yaxriifgyn@reddit
I finally got around to testing this code. I'm glad I was very careful, because the `*.clean.py` file was a completely destroyed version of the input file. The messy file was the definition for a large dataclass `class` with `__post_Install__()` and `__repr__()` methods. `import` statements were moved out of `try` blocks and placed with the other imports at the top of the file. The `class` token in class definitions was capitalized. The `Public_Key` property was renamed to `public__key`. Also, property `Pfx_ABC` became `pfx__a_b_c`, but property `ABC` was not renamed. My overall impression is that this code needs more testing with a large selection of real-world, hand-coded Python code. It's a start, but it needs much more work.
AssociateEmotional11@reddit (OP)
Hi u/yaxriifgyn,
First of all, thank you so much for taking the time to test PyNeat on a real-world, complex dataclass file. This incredibly detailed feedback is exactly the kind of "stress test" an MVP needs, and I deeply appreciate your brutal honesty. You are completely right—the tool was too aggressive and mangled the file.
Here is a breakdown of what went wrong and my plan to fix them based on your points:
1. **The Import Hoisting Bug:** The current import transformer is too naive. It blindly grabs all `Import` nodes. I will update the logic to check the parent context so it respects `try/except` blocks (for optional dependencies) and module-level docstrings.
2. **Aggressive Renaming & String Modifying:** Forcing `snake_case` was a mistake, especially since it modifies strings/comments and breaks public APIs. A structural formatter should not touch string literals. I will disable property renaming by default and scope the visitor strictly to `Name` nodes.
3. **I/O (EOL & Encoding):** I will update the file reader/writer to preserve the original `newline` configuration and encoding rather than forcing UTF-8/CRLF.
4. **Granular Control:** You made a great point. Adding an inline `# pyneat: ignore` or `# pyneat: off` mechanism, along with specific CLI flags to disable individual rules, is now the top priority for the next release.

I am opening GitHub issues for all of these edge cases right now. As you said, it's just a start, and real-world testing like yours is what makes open-source tools better.
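The parent-context check for imports can be prototyped with the stdlib `ast` module (the real fix would use LibCST metadata, but the idea is the same; `optional_import_names` is a hypothetical helper, not PyNeat's API):

```python
import ast

def optional_import_names(source: str) -> set:
    """Module names imported anywhere inside a try block
    (i.e. optional dependencies that must not be hoisted)."""
    guarded = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Try):
            # ast.walk on the Try node covers its body and except handlers.
            for child in ast.walk(node):
                if isinstance(child, ast.Import):
                    guarded.update(alias.name for alias in child.names)
                elif isinstance(child, ast.ImportFrom) and child.module:
                    guarded.add(child.module)
    return guarded

src = """
import os
try:
    import ujson as json
except ImportError:
    import json
"""
print(sorted(optional_import_names(src)))  # prints ['json', 'ujson']
```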
Thank you again for pointing me in the right direction!