Codebase has hundreds of isinstance() and getattr(). How to convince colleague to fix?
Posted by melesigenes@reddit | ExperiencedDevs | View on Reddit | 119 comments
codebase is littered with isinstance() and getattr(). I hate both and to me these are hallmarks of LLM generated code + not reviewing the code. getattr() to me should only be used for dynamically generated attributes/attributes whose names you don’t know in advance which is an extremely rare case. you could just do x = inst.attribute for 99% of cases. isinstance is generally atrocious code. like you should be typing at the boundaries and then trusting the types instead of passing Any in and then checking each attribute or each object if it’s the right type so you fail at input instead of some undetermined amount of steps later (and plus spreading these checks everywhere). if something can legitimately be multiple types (that’s already smelly to me) then ok maybe isinstance branches make sense but I can’t think of legitimate production level use cases of isinstance
I thought this was because my colleague was adding LLM code and then not even checking it but I found out today it’s on purpose as defensive checks. it makes no sense to me.
there’s code like this
def extract_text(input: Any) -> str:
    output = getattr(input, "text", None)
    if isinstance(output, str):
        return output
    return ""
like everywhere in the code. like whole dozen-line blocks parsing a dict deep into the code by checking isinstance everywhere. I was going to submit a bunch of pull requests where I fixed these. I wrote about these in my review and this person wants to keep them because you “don’t know if the input will change”.
above code could be solved by typing input and having text set to str. instead there’s hundreds of isinstance checks because none of the inputs are either validated or they are validated earlier but not trusted.
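for example the typed fix could look something like this (a sketch, the names are made up rather than from our codebase):

```python
from dataclasses import dataclass
from typing import Protocol


class HasText(Protocol):
    # Static contract: anything passed in must carry a str `text` attribute.
    text: str


@dataclass
class Comment:  # hypothetical caller-side type, for illustration only
    text: str


def extract_text(item: HasText) -> str:
    # The type checker guarantees `text` exists and is a str,
    # so no getattr/isinstance probing is needed at runtime.
    return item.text
```

a type checker like mypy then rejects something like extract_text(42) at lint time, which is the “fail at input” behavior I mean.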
I want to convince my colleague that this is shit code but apparently datadog encourages isinstance usage because it’s more flexible https://docs.datadoghq.com/security/code_security/static_analysis/static_analysis_rules/python-best-practices/type-check-isinstance/
am I being irrational for thinking isinstance in general is smelly code? Like it makes me irrationally angry. I don’t know what to do here without stepping on toes
martiantheory@reddit
It really hurts my heart to see the industry go this way where we just hand over the reins to LLMs. It almost feels like the early days where people got paid by KLoCs. As if code output is the primary measure of value. I honestly feel like we're cooked.
I know we're not literally handing *all the power* over to LLMs, but the blind push to integrate AI (in spite of quality concerns) is something I've seen on reddit and in my conversations with the friends I've gained in 15 years of coding in corporate America. About 80% of my conversations around engineering jobs have something to do with AI quality issues. It's kind of depressing to see.
red_hare@reddit
I'd fix this with linting:
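For example, a rough sketch of a home-grown check (not a real linter plugin — and, fittingly, walking an AST is itself one of the legitimate uses of isinstance):

```python
import ast

# Builtins we want surfaced for review (per the thread's complaint).
FLAGGED = {"isinstance", "getattr"}


def find_flagged_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, builtin_name) for each call to a flagged builtin."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in FLAGGED):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)
```

You could run something like this in CI and fail the build when new hits appear outside an allowlist of existing ones.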
2fplus1@reddit
A check with isinstance() isn't by itself a problem. I think the bigger issue is that, as you say, it's everywhere in the code. To me that sounds like there isn't really an architecture in place, or if there is/was, it's being ignored. Defensive checks like this should be done at the boundaries of the system where user or untrusted input is coming in, but the core should be properly typed and can expect to get known types as input and not have to check and re-check all over the place. I like the framing in https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
aaaaargZombies@reddit
This is the go to blog post on the subject https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
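The gist, in miniature: do one isinstance-laden parse at the boundary that either raises or returns a real type, so nothing downstream ever re-checks (names here are hypothetical):

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Message:  # hypothetical domain type for the example
    text: str


def parse_message(raw: Any) -> Message:
    # The only place raw input is inspected: bad data fails here,
    # loudly, instead of some undetermined number of steps later.
    if not isinstance(raw, dict) or not isinstance(raw.get("text"), str):
        raise ValueError(f"not a valid message: {raw!r}")
    return Message(text=raw["text"])


def shout(msg: Message) -> str:
    # Downstream code trusts the type; no defensive checks needed.
    return msg.text.upper()
```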
You need to have a way of enforcing invariants, or they are probably correct to include them.
ViperG@reddit
LLMs do this, a lot
Lucky-Acadia-4828@reddit
Even worse, LLMs will aggravate this when they see more of it. So it's a downward spiral
New_Enthusiasm9053@reddit
They also won't use pydantic when loading and validating JSON (or YAML, or dicts, etc.). They'll make their own spaghetti, bug-prone implementation of pydantic.
Lucky-Acadia-4828@reddit
Tbh it depends on your existing codebase + agents.md
My codebase is heavy on pydantic and I strictly add those instructions. With at least Sonnet 4.6 or Opus, it never fails to follow this rule
New_Enthusiasm9053@reddit
Sure, but if you don't it'll happily just send untyped dictionaries everywhere. AI really isn't useful in the hands of the inexperienced for this reason. You have to know you want to use libraries (and which ones) when possible, at minimum.
Laziness used to be the impetus for people to learn which libraries to use but with AI that doesn't exist anymore unfortunately.
Lucky-Acadia-4828@reddit
100%
xaraca@reddit
I imagine they were trained on a lot of code that predates type annotations. This used to be standard practice.
midniteslayr@reddit
Yup. I’m getting flashbacks to what it was like working on code in the early 2000s. This shi- was everywhere.
VEMODMASKINEN@reddit
Just tell them not to.
nullbyte420@reddit
Yeah, you're being irrationally angry. Python does not care what types you pass around. It's a fine strategy.
pgetreuer@reddit
Right, this just looks like defensive input handling. "Don't know if the input will change" is a reason to do that. If an API is meant to work with JSON or nested dicts of flexible structure, there will be some runtime type parsing like this.
New_Enthusiasm9053@reddit
Yes, but that's what pydantic is for. Little reason to do that yourself. It's also got a Rust core, so not only do you get good documentation and battle-tested validation code for free, it's also faster.
pgetreuer@reddit
If an API is meant to handle input flexibly, as in, structural differences like "this object might be a str or a dict," then yes we can use pydantic (or mypy, etc.) to annotate a sum/union type (str | dict). Agreed that that's a good thing to do to enable better static analysis than with Any.
The issue is that the runtime handling of an input like str | dict does often depend on which type it is (again, supposing the API is actually meant to handle both). Some conditional branches or other means of dispatch are needed to handle inputs of one type vs. another.
It's fair to question whether an API ought to have such flexibility to begin with, considering these complications.
New_Enthusiasm9053@reddit
Sure for that specific instance it's fine and the stdlib does it too. I'm just not generally convinced it's a great idea in most cases. If you do have an API surface that does this you'd do it at the boundary and then cast it to one consistent type as soon as possible.
Otherwise you end up revalidating the types everywhere constantly and it's just clutter that obscures the logic of your code.
You're right though it is sometimes necessary due to Python being dynamically typed but I'd consider it a code smell to see it anywhere except the external API surfaces of your code.
1One2Twenty2Two@reddit
That's how you end up with a shitty code base and people who hate Python.
Any decent Python codebase should be typed and use static analysis tools to avoid that kind of mess.
TheBinkz@reddit
I agree for the most part. Yet sometimes the checker can't correctly identify the type, so you may need to include that isinstance.
DrShocker@reddit
In what circumstances do we genuinely have both no idea what the incoming type is and need to care about the specific types with isinstance? Sorry if it's obvious, but I don't use Python for anything more than small scripts basically.
Buttleston@reddit
Any time you take input from an external source. So, you check type at those edges using something like pydantic, and after that you know what the types are and use type hints
DrShocker@reddit
Your external input would be something like a string or JSON or something, right? Which you parse into a known struct. I have never worked on something where the shape of the data coming into the program was entirely unknown, outside of template stuff like maybe `std::vector` or whatever, but even then you're able to lock down the properties you need for it to work, and it will reject things it's invalid for. Rust's traits solve that a bit better than C++ concepts, but still, it seems rare that I need the type to be of a certain instance, rather than just for the object to have certain properties or methods.
Buttleston@reddit
You can not trust data from external sources even if you "know" what the structure should be
That's what for example serde is used for in rust. Your structs are annotated with serde and when you deserialize into your struct it will fail if the json doc does not match. If it doesn't fail then you have guarantees on the structure of the data. Pydantic plays a similar role, once you have deserialized json with pydantic you know the data is in the correct shape
DrShocker@reddit
Parsing data from one representation (a string) to another (a specifically typed object) is an entirely separate problem from whether your internal functions are type-annotated. If I know my internal function requires a Car, there's never a reason to expect someone to feed in a raw JSON string; that's misusing the API.
The boundary function responsible for constructing a Car should take in the string, identify if it's valid JSON, and determine if it has the correct shape. At every step of that, you know the type you're able to work with (string, then maybe "any" while it's being validated for being a Car, then Car). At no point do we have an API that we can just throw a URL or a box of car parts into to get a Car, unless we specifically add the code to handle that case.
In Python, this boundary parsing is often where runtime type-checking (like isinstance or tools like Pydantic) actually happens, because json.loads() returns Any. But once it passes that boundary and becomes a valid Car object, the rest of the codebase shouldn't be riddled with isinstance() checks. That's completely different than allowing mystery data to get past the parsing steps and throwing isinstance() checks everywhere because you don't know what type it is anymore.
I can't imagine that ever being useful. Data doesn't just materialize from nowhere. It always enters the system as a known primitive, even if all you know at first is that it's a raw string or a binary blob, which you need to run some code against to see what comes out. For isinstance() to make sense imo, you'd need to be switching on the type, at which point why not just embed the function you need into the types that are accepted, or use a Union to fully document exactly which types work?
That said, maybe I'm just too used to statically typed languages. I'm genuinely open to the possibility that there's a really elegant way to handle things without constraining the types as much. I'm just struggling to think of when that would be, myself.
Buttleston@reddit
Right, that is exactly what I said in my first response to you. Validate at the edges, once you know what you have, you no longer need to validate
texruska@reddit
Yeah but you could assert a specific type, not branch based on it
TitanTowel@reddit
This is the shit the outsourcers at my work produce.
ChineseGravelBike@reddit
Yeah, honestly, adding type hints automatically was the first thing I used AI for; it was great at that
FatefulDonkey@reddit
It's not a matter if python cares or not. It's about readability and bugs going silent.
Using getattr everywhere is just a way to hide bugs.
New_Enthusiasm9053@reddit
Nah they're rationally angry. Use type hints and none of this code will ever be hit. Code like this is symptomatic of shit spaghetti code. The stdlib does it because they have to anticipate trash inputs from users but code you control doesn't need it as long as inputs are correctly validated.
Python not caring doesn't mean you should be passing any old inputs, because inevitably you'll get an input you don't correctly handle; it's not possible to write functions that handle every type ever correctly.
The only acceptable use is asserts to throw an exception if people use the wrong type to force them to use the function as expected.
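Something like this, roughly (sketch, names made up): fail fast at the call site instead of silently returning a default.

```python
def set_price(amount: float) -> str:
    # Fail loudly on misuse rather than degrading silently.
    assert isinstance(amount, float), f"expected float, got {type(amount).__name__}"
    return f"${amount:.2f}"
```

The difference from OP's example: a wrong type here is an AssertionError at the point of misuse, not an empty string that surfaces as a mystery bug later.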
oar335@reddit
It's perfectly fine python TBH, and whether it is "code smell" or not depends greatly on the context, specifically the existing conventions in the codebase and the development culture. There are definitely situations where such extreme defensive programming is warranted.
rlbond86@reddit
It's bad python, they could easily have made a Protocol.
oar335@reddit
Thank you for the correction. I didn't realize how way out of date my Python knowledge is!
VivisMarrie@reddit
What do you mean by Protocol?
rlbond86@reddit
Protocols are the way to use duck typing in Python's type annotation system.
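For example (a sketch with made-up names): if you genuinely need a runtime check, a runtime_checkable Protocol at least names the shape instead of a hand-rolled getattr dance.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class Textual(Protocol):
    text: str


@dataclass
class Note:  # hypothetical example type
    text: str


def extract_text(item: object) -> str:
    # One named check instead of getattr + isinstance; note that at
    # runtime this verifies the attribute exists, not that it is a str.
    if isinstance(item, Textual):
        return item.text
    return ""
```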
melesigenes@reddit (OP)
Yeah I don’t think our use case needs such defensive programming. That’s kind of the issue
nasanu@reddit
You're basically demanding a lot of work be done because you want a more or less identical thing with your letters in place instead of theirs.
spline_reticulator@reddit
isinstance is not a code smell. Mypy uses it for type narrowing. getattr is a code smell, though. The way to fix this is to see if your team is willing to introduce a type checker like mypy. It can be configured for varying levels of strictness. If you configure it in permissive mode then you can force your teammates to gradually make the code more type safe. If you configure it in strict mode then you will probably have to make these fixes yourself as part of rolling it out. If your team is not familiar with using mypy, expect some grumbling.
melesigenes@reddit (OP)
This is good advice. I’m going to advocate for mypy usage maybe in pre commit hooks or something
AintNoNeedForYa@reddit
So, isn’t the problem the codebase and not the developer who is trying to live with a poorly structured code base?
spline_reticulator@reddit
Most people usually do it on PR. Doing it on commit can be cumbersome.
awkward@reddit
Ripping out defensive practices based on smell alone isn't going to make you popular or solve any problems.
unflores@reddit
I tend to bring a bad practice up generally in a chapter and then i will bring it back up with an example when it eventually screws us.
awkward@reddit
This is tough because the reason this particular practice is bad is that it very slightly taxes development time, and the nature of a ripout has a high chance of breaking edge cases or causing production issues if the test suite is anything less than immaculate. That means there's very little chance to measure the drag or to deliver an "I told you so." later.
I've seen codebases where overly defensive programming represented a very significant performance issue, but I doubt that there's anything like an order of magnitude speedup available by removing these checks.
melesigenes@reddit (OP)
Yeah this is my core conundrum. It’s not like removing these defensive checks speeds anything up. It just adds to my development time because it’s not easy to follow the logic.
daHaus@reddit
Do you use an IDE for python? PyCharm is helpful for tracking who calls what
Beyond that IIRC type checks are expensive in python and by the looks of your example you're failing silently
melesigenes@reddit (OP)
Ok so like bite sized changes. Thanks for the advice
pgetreuer@reddit
From what you've described, it sounds like broader issues than code style ("there's hundreds of isinstance checks because none of the inputs are either validated or they are validated earlier but not trusted" and "you don't know if the input will change").
Shitty design and LLM coding could well be part of it. Beyond the code itself, it's worth considering what else might fuel the situation. A data quality issue, lack of communication among developers, lack of automated tests, type churn when updating dependencies, ... is there a bigger problem that is being missed? Look for where else your team might be needing help.
aa-b@reddit
For straight refactoring like this, just go ahead and make a pull request to demonstrate the before/after. Oddly, it often helps to have an LLM do most of the work because then the other dev will assume you haven't spent much time on it, so they won't feel as pressured by your idea. Also if you can't even get an LLM to explain why it's a good idea it might not be, so that's helpful too.
awkward@reddit
I strongly advise against making broad changes to the codebase that have no functional goal and remove defensive statements. That is an excellent way to catch a large number of bugs with your name on the blame list.
This pattern can't be removed without being sure the changes are bulletproof. That means significant churn. Churn without business value is waste.
aa-b@reddit
No, I think you've missed my point. I'm saying that OP should make this a low-risk waste of relatively little time, because there is no business risk in creating an experimental POC pull request that demonstrates an idea. Using an LLM to do the mechanical work reduces time cost.
Empty words about coding style are unconvincing, and if OP cares enough to persuade anyone they're going to have to demonstrate the change somehow
mxldevs@reddit
Well, the second part is the main issue isn't it?
If you can't trust the format of the inputs to be consistent, then what is the point of the validation?
If it's arbitrary data coming from 3rd party sources where you have no control over what you're getting, this might work, but "validated earlier but not trusted" means there's something wrong with the overall design.
Do you not trust systems within the organization?
otro-wey-mas@reddit
There are Python libraries out there that can do type checking to validate the input.
Pydantic is a great tool to do this, it does the validation and you can add the custom validations you might need. I would really recommend to keep the data validation, just do it more elegant.
1One2Twenty2Two@reddit
Datadog encourages the use of isinstance() instead of type(). It does not encourage the use of isinstance() as a remedy for missing static analysis and shit code.
Objectdotuser@reddit
if you're doing type checking and switching, absolutely use isinstance for type comparisons. that is what it's there for
spline_reticulator@reddit
Using isinstance is not mutually exclusive with static analysis. Mypy actually uses it to do safe type casting.
penguinmandude@reddit
It should be done on inbound data to validate its types at runtime. Internally you should be able to trust static types after that
spline_reticulator@reddit
That is true. I explained how OP should rewrite the function in another comment.
dipper_pines_here@reddit
He had the right idea but the wrong pattern. The simplest improvement is to replace that pattern with try/except: catch the AttributeError for the specific case he's validating. try/except also lets you catch more exceptions than just attribute errors.
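Roughly like this (keeping the original function's return-empty-string behaviour):

```python
from typing import Any


def extract_text(item: Any) -> str:
    # EAFP rewrite of the getattr version: attempt the access and
    # handle the specific failure, rather than probing first.
    try:
        text = item.text
    except AttributeError:
        return ""
    return text if isinstance(text, str) else ""
```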
AngusAlThor@reddit
Man, so you've only been coding Python for like 10 minutes, ey?
Both of you are wrong; Yes, your colleague should be using type-hints that are more restrictive than "Any", but "isinstance" is still a very important and useful function for when input can be genuinely flexible.
xXxdethl0rdxXx@reddit
I get where you’re coming from, but this is an opportunity to challenge your priors. Is it actually bad? If so, articulate exactly why, and enforce it in an automated way. Convincing other people to refactor things on vibes alone is never going to work well.
3ABO3@reddit
i feel like the obvious answer is Pydantic and mypy? they can be a pain in the ass to adopt, so go slowly
but they work
Wooden_Step_5691@reddit
It's a sign of a javascript dev.
clintecker@reddit
i hate to break it to you but this sort of stuff has existed well before LLMs, does the code work? removing stuff cos “you don’t like it” when the system is working well is not a good look no matter where the code came from
zorgabluff@reddit
I mean, is this causing any actual issues?
If not, is this the hill you want to die on?
moduli-retain-banana@reddit
I would die on this hill and I would quit this team. It's clear they don't know how to develop a production grade Python application. At minimum you should be validating data at the boundary with something like Pydantic and using mypy/pyright for static type checking. All post-boundary code should be using type hints and trusting that the input is what it says; the static type checks guarantee it.
This would be like a TypeScript codebase using `any` everywhere and instead checking the data type at runtime, which defeats the purpose of TS.
I bet this team is also using requirements.txt to declare their dependencies.
nog_ar_nog@reddit
People adding AI slop this bad into a repo that OP also works on will impact their productivity so they have a valid reason to bring this up as a concern.
jonnno_@reddit
Have you never debugged something where it just fails silently with zero output or clues?
1One2Twenty2Two@reddit
It's just a terrible way to write Python code. We're in 2026. Any decent Python codebase should use mypy and have types checked.
melesigenes@reddit (OP)
I mean it causes me mental anguish and strife primarily but secondarily it makes the code so hard to read and add on to
jl2352@reddit
You’re being irrational to taking it this seriously.
Write your own code without getattr. When you're working on code with getattr, change it. Just plow on, doing it a better way.
As for them; have you tried talking to them? I don’t think I saw any reference to talking to them in your post.
melesigenes@reddit (OP)
This is a reasonable take and the path I’ll take. I have tried talking to them and I get pushback with reasoning that in my opinion is not valid and makes the code harder to read, though it’s not technically or functionally wrong. Hence the post and request for advice.
ComprehensiveHead913@reddit
You seem to be focusing on the "consumer" side in your example, but you can't safely remove getattr and isinstance until you've found every single call to extract_text and verified what's being passed to this function. What if extract_text could conceivably receive something that doesn't have an `input` attribute, or if `input` isn't always a `str`? Has that happened in the past, since someone deemed it necessary to check?
I'd just start there, at the calling functions, and define (data)classes, named tuples or whatever is appropriate - one argument at a time.
Fabulous-Possible758@reddit
I’d say it’s even worth examining whether there is semantic intent behind the naming of the function even if it’s not being used in those situations currently. Without much context it’s hard to say, but a function name like “extract_text” definitely indicates a slightly higher expectation of what that function does than just returning the .text attribute, even if that’s all that it’s being used for currently.
melesigenes@reddit (OP)
So yeah this was a bad example but I did check and the way it’s used input.text is always a str or it can be None
And this is part of why this makes me angry is that I have to check the data types and call sites for every one of these instances to check if the check makes sense or not
DrShocker@reddit
Get buy in on lint rules. I don't like working in dynamic or weakly typed languages as a rule of thumb, so I don't work in code that has the choice of doing this unless things get really crazy.
Full-Hyena4414@reddit
How to install a new lint rule to START prohibiting a certain thing without refactoring all the codebase where it is already done?
DrShocker@reddit
You'd usually need to look into if the linter has an file you can add to indicate which violations already exist so that it can only enforce it on new code.
melesigenes@reddit (OP)
We have Union
NoUniverseExists@reddit
A "wise way" would be using something that is actually typed, as C# or Rust.
aruisdante@reddit
Your specific example isn’t a super great one. While we can debate if the API is the right one or not, I’m not really sure how else you’d write this if you need it to not fail and need it to return exactly str, and you can’t just rely on static type checking at lint time because “bad stuff” would happen.
The plain attribute access will raise AttributeError if there is no .text field, so you’d have to wrap it in try/except instead. And I’m not really sure that’s “better.”
It does sound like maybe this codebase is missing an understanding of how to properly duck-type systems via use of protocols. But without more context, there may be perfectly valid reasons why all this defensive code is written this way.
melesigenes@reddit (OP)
So you’re right this is a bad example and I didn’t want to write out the references and expose more of the codebase but the way the function is being used there’s always a text attribute and it’s always a str or None. It’s not used as a generic extract the text attribute if it exists and is a string otherwise empty string or None if the attribute doesn’t exist. Which is kind of why it makes me angry
If there was a good reason for defensive code I would be ok with it but most of the checks seem useless to me
aruisdante@reddit
It might be a learning opportunity to get more customer-centric and work on thinking from other’s perspectives. Have you tried talking to them in more detail about why they are worried about the input types changing?
I’ve worked with developers before who essentially litter the code with things designed to catch their own mistakes. For example I once had to convince a developer on my team not to special case a graph search algorithm starting or ending on a non-traversable node as throwing an exception when “cannot reach target from goal” was already a well supported, nominal condition for the router. His justification was “well, but if this has happened I missed an optimization chance and wastefully called the search algorithm, and I want to catch that bug.” Armed with actual understanding of what his motivations were, I was able to convince him of alternative methodologies that satisfied his concerns without putting needless sharp edges in the system.
If someone is writing defensively like this on the regular, it probably suggests someone who got burned before by people changing stuff without testing, then code he wrote failed, and he got blamed rather than the people that changed things. Now he’s over corrected by writing extremely defensive code so that he can’t be blamed. In other words, it smacks of a trust issue, which likely points to a more systemic organizational issue around testing and static analysis. That could be an approach to finding common ground on how to start cleaning this stuff up.
Alternatively, there actually might be some really good situations you aren’t considering where a failure in a particular location could cause a much larger outage, and so this developer believes that writing more defensive code in order to ensure that the system keeps working even if all the other quality gates fail is worth the cost.
You’ll never know unless you sit down and really talk it through with them, and do so from a state of mind where you assume there must be some actual reason they’re doing this even if it seems odd to you, and that reason should be respected even if ultimately it is determined to no longer be a good reason.
melesigenes@reddit (OP)
Do you have developers that you manage that do AI coding without checking?
All of your concerns are legit and worth considering and I appreciate you writing out such a thoughtful response. I really like your approach to this.
I did actually have brief conversations with the colleague because I was actually a little curious about all defensive checks and whether it was purposeful, and the colleague gave his reasoning and it’s not just a matter of disagreement of opinions. Like for example an input data is already typed earlier in the code (ie the output of the previous function can only lead to a single type) and later they’re checking the type in multiple places again as a defensive check even though it’s not possible for the type to be different. To me that’s not just a disagreement of opinion that’s just adding code that doesn’t add anything. I mentioned the same in a review. I left comments pointing this out but he just changed the defensive check instead of removing it and the reasoning is not very logical or sound. I’m not a manager, I’m not a tech lead, we’re just colleagues.
What’s a good way to have a discussion about this with the colleague? Do I bring it up again and insist or repoint out the logic? Do I defer outwardly but just change the code I touch? Advocate precommit hooks for typing? Do I just suck it up and live with it? I’m not great at conflict resolution as I tend to be people please-y so I was looking for advice from senior engineers. Our manager is not technical and it’s not something they would understand
aruisdante@reddit
Multiple places within a single scope (function)? Or in separate scopes? If it’s in separate scopes, then that goes back to a trust issue, because it’s absolutely possible for the type to change if someone later modifies the calling code. Such defects are the primary reason people that come from a background in strongly typed languages hate Python so much. But of course in modern Python this is solved with type hinting and using MyPy/PyLance. Do you use type hinting and those analyzers in your codebase? Maybe doing so would help alleviate his concerns.
Thankfully no. I work in an industry where AI use still isn’t commonly allowed (safety critical software development). On the other hand, this means I can’t give terribly good advice there if that is what is happening.
Axonos@reddit
What does it mean to “type at the boundaries”?
typehint a function and leave it at that?
melesigenes@reddit (OP)
No I mean like instead of checking the data deep into the code and at various call sites it should be checked and typed when the data first enters the code
Wenir@reddit
What do you suggest using instead of isinstance for sum types?
melesigenes@reddit (OP)
Isinstance for sum types is appropriate sometimes. l just don’t like it when it’s overused and there’s a gigantic tree of logic branches. If that’s the case then I think the data should be normalized early and there’s room for polymorphism to better express what’s going on
OriginalBiscotti769@reddit
careful with those endless course loops
Horror_Editor6292@reddit
whats going on with that one word
twowordsfournumbers@reddit
Isinstance by itself is not smelly.
You're seeing smoke, but you're attributing it to the chimney and not the fire.
You have a culture problem and this is what happens when you don't adopt good hygiene. This is literal tech debt.
There's a lot of ways to fix this, but none of them are going to make you happy.
lolCLEMPSON@reddit
You stop using shitty languages.
SignificanceShotc@reddit
oh sure, let me just tell my workplace to stop using the "shitty language". I'm sure everyone will be on board with that and I'll get a promotion on top of it!
lolCLEMPSON@reddit
glad I can help
melesigenes@reddit (OP)
Great suggestion let me go change the entire codebase
Entuaka@reddit
Just ask your agent to generate a binary /s
notger@reddit
Just ask an LLM to do it.
Boom ... you are leading the board on lines changed.
just_looking_aroun@reddit
I’m sure it will be easier than convincing someone of a refactor
TheRedRavenTR@reddit
I have a case where we have X different tabs in a page, where each of them fills a different structured request. These requests all go through the same process with different extra conditions for each, so we have a baseRequest, the other requests inherit it, and the process checks isInstance to apply the extra conditions. I think I would have to write the process 4 times with minimal differences to achieve the same thing otherwise.
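The usual alternative people suggest for that shape is a template-method hook, sketched here with made-up names — whether it actually beats the isinstance version is debatable:

```python
class BaseRequest:
    def extra_conditions(self) -> list[str]:
        # Hook: each subclass supplies its own conditions, so the
        # shared pipeline never branches on isinstance.
        return []

    def process(self) -> list[str]:
        steps = ["validate", "authorize"]
        return steps + self.extra_conditions()


class ReportRequest(BaseRequest):  # hypothetical tab-specific request
    def extra_conditions(self) -> list[str]:
        return ["check-report-quota"]
```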
crustyeng@reddit
Call me crazy, but I think Python (or any interpreted language) simply has no place in any production environment.
jaxsaxsf@reddit
You're right of course, but that battle was lost long ago. All we can do now is try to convince people to use the tools that exist in these languages. Python has static types ... sort of. And they work, for the most part.
In the case of LLMs, the chief problem is that the training data set consists mostly of Python without types, so that's what it will generate by default. But you can give the LLM directives to use types in generated code and avoid Any, and back that up with a linter.
Even this trash codebase could be cleaned up with the help of an LLM to add type hints.
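As a concrete illustration, the extract_text snippet from the post could be typed with a structural Protocol so the requirement lives in the signature (a sketch, not the only way to do it):

```python
from typing import Protocol

class HasText(Protocol):
    text: str

# Typed rework of extract_text from the post: the signature states the
# requirement, the getattr/isinstance probing disappears, and a type
# checker like mypy flags bad callers before runtime.
def extract_text(input: HasText) -> str:
    return input.text
```

Any class with a `text: str` attribute satisfies the Protocol without inheriting from it, so this stays duck-typed while still being checkable.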
crustyeng@reddit
Tacked-on, optional typing does virtually nothing to address the issue, I will agree.
OP_will_deliver@reddit
Ya you're crazy.
crustyeng@reddit
Finding bugs in production is a lot more fun, I suppose.
valence_engineer@reddit
Everything has bugs. Everything. Type systems don't prevent them. Python has plenty of terrible things, but anyone who thinks in absolutes about languages is not worth listening to. At one company we had such optimized Python that we told everyone it was Rust so they'd spend 10x the effort trying to copy it. Fun times.
crustyeng@reddit
Interpreted languages do not singularly make shipping bugs possible, they just make it a lot easier.
Think-Memory6430@reddit
The advice in here so far seems so bad!
Ideologically you are correct OP. I think there’s even an argument that takes the point of “the input could change” and makes that work WITH your proposal and not against it: If the input could change, this could all silently break at any moment!
Instead parsing a generic dict into something typed and validated that you can then use without all these checks is by far the best solution.
Pragmatically it may take some effort to get there, but it would definitely be the cleaner solve.
As far as whether an LLM would write this: yes, it would continue poor practices where those poor practices already existed. It could also easily follow good, typed practices though.
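A minimal sketch of the parse-at-the-boundary idea using a stdlib dataclass (pydantic would do the validation for you; the field names here are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    text: str
    author: str

def parse_message(raw: dict) -> Message:
    # Validate once, at the boundary. Malformed input fails loudly here
    # instead of drifting deeper into the code as an untrusted dict.
    text = raw.get("text")
    author = raw.get("author")
    if not isinstance(text, str) or not isinstance(author, str):
        raise ValueError(f"malformed message: {raw!r}")
    return Message(text=text, author=author)
```

Downstream functions then accept `Message`, never `dict`, so "the input could change" breaks exactly one function: the parser.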
Smok3dSalmon@reddit
Type checks force inheritance, but I agree... the amount of duck typing in Python is smelly. Or move the check into a helper function that describes the intent, like is_stringy(obj). Then optimize it later if possible
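The helper is trivial, but the point is that the definition of "stringy" lives in one place (is_stringy is this commenter's hypothetical name, not a stdlib function):

```python
def is_stringy(obj: object) -> bool:
    # Intent-revealing wrapper: one place to decide what "stringy" means,
    # instead of scattering raw isinstance(obj, str) checks everywhere.
    # If the definition later widens (e.g. to include bytes), only this
    # function changes.
    return isinstance(obj, str)
```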
NoOrdinaryBees@reddit
IME
isinstanceisn’t necessarily a code smell. It can be pretty sanity-rescuing when dealing with modules that abuse union types, for example.rovermicrover@reddit
Model coding performance on the repo is also going to get worse, because the model loses context as to what each variable is actually supposed to be.
I would argue if they want to keep vibing they are going to need to clean house so the models will actually be able to work.
_PanicattheCostco@reddit
This is the type of overly defensive Python that Claude outputs when it doesn't have decent steering. It's annoying to review and maintain, IMO, which is good enough reason for me. Can't this type of thing be handled better with dataclasses, pydantic, or mypy anyway?
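Yes; for dict-shaped data specifically, even a stdlib TypedDict gets you static checking without runtime probing (a sketch; the key name mirrors the post's example):

```python
from typing import TypedDict

class Payload(TypedDict):
    text: str

# With a TypedDict, mypy checks the dict's shape statically, so the
# runtime getattr/isinstance probing from the post becomes unnecessary
# for well-typed callers.
def extract_text(payload: Payload) -> str:
    return payload["text"]
```

At runtime a TypedDict is just a plain dict, so this costs nothing; the guarantees come from the type checker, not from runtime checks.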
MoreRespectForQA@reddit
Datadog isn't encouraging it; they're just saying "if you're going to check types in an if statement, use isinstance"
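That rule is only about which check to reach for when you do check, because a direct type() comparison ignores inheritance while isinstance respects it:

```python
class Base: ...
class Child(Base): ...

obj = Child()

# type() comparison ignores inheritance; isinstance respects it.
# That narrow preference is all the Datadog rule recommends.
type(obj) == Base        # False
isinstance(obj, Base)    # True
```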
melesigenes@reddit (OP)
I see. I read too quickly through the article. Thanks for pointing this out
IIALE34II@reddit
I can think of a few cases where I've had to use isinstance. We run Django as the backbone of one of our services, and it's probably more related to Django, but sometimes the types just don't match, and it's better safe than sorry to have some isinstance util method somewhere. But yes, type hints are absolutely preferred. I wouldn't pass Any without a good reason in my code review either. Type unions or type aliases are what I'd use.
Glad-Researcher2738@reddit
It's not because of the code written by the LLM, but rather because the codebase was bad in the first place and the LLM followed that, thinking it was the recommended practice.
Financial-Grass6753@reddit
> Codebase has hundreds of isinstance() and getattr(). How to convince colleague to fix?
Ask him to write automated tests. Not some silly unit tests, but at least integration or even E2E against real components. And coverage of, say, >70%.
Unless he outsources his mental activity completely to LLMs, he will start hating Pythonic reflection on day 2 or 3.
Curtilia@reddit
I hate python