What are your tips for keeping track of variables in complex data flows?

Posted by rainyengineer@reddit | ExperiencedDevs | 11 comments

I’m on a new-ish team where we have a lambda that should really be broken into smaller pieces and made part of a larger step function. Unfortunately, we probably won’t be allocated time to do so. It started out small, but over time we added more and more services, API calls, and orchestrators, and now we have some monstrous code.

It’s become quite difficult to follow what the values of the variables are behind the scenes as I’m working on it (they’re all usually some hefty JSON). What also doesn’t help is that I’ve always been a visual learner. I envy the engineers that can just ask where something comes from and remember it forever. I need to see it and diagram the more complex repositories to understand them fully, which can be time-consuming and can be seen as wasteful.

How do you prefer to stay on top of knowing exactly what the expected values are from one method call to the next in spiderwebs of code? I log the JSON to CloudWatch, which helps, but that can get costly and make the logs noisy. Do you maintain really detailed docstrings of the expected JSON response bodies/variable values? Do you put sample values under if __name__ == "__main__" for local runs and debugging?
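For reference, the docstring and local-run ideas from the question could look roughly like the sketch below. Everything in it is hypothetical: the function name summarize_order, the SAMPLE_EVENT payload, and its fields are invented for illustration and are not taken from the actual lambda.

```python
import json

# Hypothetical sample payload; the field names are invented for illustration.
SAMPLE_EVENT = {
    "order_id": "A-1001",
    "items": [{"sku": "X1", "qty": 2}, {"sku": "Y9", "qty": 1}],
}


def summarize_order(event: dict) -> dict:
    """Return a small summary of an order event.

    Expected input shape (assumed for this sketch):
        {"order_id": str, "items": [{"sku": str, "qty": int}, ...]}
    Returns:
        {"order_id": str, "total_qty": int}
    """
    total_qty = sum(item["qty"] for item in event["items"])
    return {"order_id": event["order_id"], "total_qty": total_qty}


if __name__ == "__main__":
    # Local run: set a breakpoint here or just print the result.
    print(json.dumps(summarize_order(SAMPLE_EVENT), indent=2))
```

The sample payload doubles as documentation of the expected shape, and the guard keeps it from running when the module is imported by the real handler.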

I’ll take any tips or advice you have for me. This is an area I want to improve upon because it’s holding me back as an engineer.


engineered_academic@reddit

Is using a debugger a lost art these days?

Shit you should have schemas, at the least. OP logging object JSON to CloudWatch is a terrible idea. This is how you get data breaches and privacy violations. Your logging team also hates you.
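A minimal sketch of what "schemas, at the least" might mean in Python, combined with logging only the shape of a payload instead of its values. The schema contents and the helper names (ORDER_SCHEMA, schema_errors, shape_of) are assumptions made up for this example; a real project might reach for a dedicated validation library instead.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)

# Hypothetical schema: required field name -> expected type.
ORDER_SCHEMA: dict[str, type] = {"order_id": str, "items": list}


def schema_errors(payload: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return human-readable schema violations (empty list if valid)."""
    errors = []
    for field, expected in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return errors


def shape_of(payload: dict[str, Any]) -> dict[str, str]:
    """Describe a payload by key and type name only, never by value."""
    return {key: type(value).__name__ for key, value in payload.items()}


def log_payload_shape(payload: dict[str, Any]) -> None:
    """Log which keys arrived and their types, without logging any values."""
    logger.info("payload shape: %s", shape_of(payload))
```

Logging the shape keeps the logs small and means a token that shows up in the payload later is recorded only as a key with type str, not as its value.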

rainyengineer@reddit (OP)

There’s literally nothing sensitive in the JSON I’m logging but point taken on the observability team

Nothing sensitive now until someone follows your example and logs a sensitive token.

pduck820@reddit

Dumb thing I do, all params inbound to a function get "in" prefixed to their name...

public void DoStuff(int inUserAccountID) { }

Makes it easier down the page to see and know what's passed in vs local... or just type "in" and have intellisense help me out (offer not valid in non-visual-studio, of course). And of course, there's going to be the "no Hungarian, you're evil, how dare you, it doesn't mean anything"... Except it does. It means it was a var passed into the function. :P

Another is I never use i in for loops... always loopIter... for(int loopIter = 0; loopIter < {something}; loopIter++). Nested loops get "inner" and "outer" prefixes. And if I have three nested loops, I know I done messed up.

Like others said.. be consistent, and you can rely on the naming conventions.

exomyth@reddit

Honestly, I never remember what my APIs contain or need because I am managing too many. So to answer your question.

I don't keep track of anything, but I do follow conventions, and then it is just tracing calling paths, which should be kept simple and easy to follow.

But I have worked in code where I had to note down part of the calls as I was trying to wrap my head around way too much information to follow through. The code was bad and overcomplicated though. That should not happen with good code.

foil_k@reddit

Just me, or does this seem more like the kind of thing a junior or mid-level dev would ask, especially since it refers to a single lambda?

I mean, there are countless ways to deal with complexity, but the initial steps are nearly always the same basic principle: renaming for clarity and documentation. This is something I learned pretty early on in my career.

Iz4e@reddit

Honestly, just ask Claude. It seems to have a good sense for it. Besides that, it sounds like you need better abstractions/APIs.

DeterminedQuokka@reddit

Truth. I commonly will ask cursor “how did calling x end up in this function”

PaulPhxAz@reddit

I don't even like AI and this seems like the perfect use for it. Explain, track, refactor it into a few big pieces so it's more split apart.

It's the same thing I would give a junior dev to work on.

Use dataclasses to pass the variables around. Or require them to be named (an options object in JS, forced kwargs in Python, etc.).
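A rough Python sketch of both suggestions, assuming a made-up OrderContext type and apply_discount function (neither comes from the original post):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OrderContext:
    """Hypothetical bundle of values that travel together through the flow."""

    order_id: str
    customer_id: str
    total_cents: int


def apply_discount(*, ctx: OrderContext, percent: int) -> OrderContext:
    """Return a new context with the discount applied.

    The bare * in the signature forces callers to name every argument.
    """
    discounted = ctx.total_cents * (100 - percent) // 100
    # frozen=True makes the instance immutable, so build a modified copy.
    return replace(ctx, total_cents=discounted)
```

A call then reads apply_discount(ctx=ctx, percent=10), while apply_discount(ctx, 10) raises a TypeError. The frozen dataclass also means a value cannot be silently mutated halfway through the flow, which makes it easier to reason about what a variable holds at any point.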

krmhd@reddit

Naming conventions. Nobody memorizes those paths. Projects just ensure the same logical entity always uses the same terminology everywhere in the codebase, so you can recognize it.