Stop Counting Lines of Code: Metrics That Actually Matter
Posted by eng1nuity@reddit | programming | View on Reddit | 182 comments
yes_u_suckk@reddit
Some people will always use moronic metrics to decide how good a developer is or how good or complex a piece of software is.
I once had a manager who decided, during salary reviews, to pull the number of commits of each developer to determine how good or bad they were. Oh, so you had only 120 commits in the last 6 months? That's not good enough, so no raise for you.
After this happened the first time, the developers started creating a separate commit for every tiny change in the source code.
Developers who used to have 10-20 commits a month now had 200-300 commits a month.
Of course, this made the messages in the VCS terrible, but the manager loved the numbers at the next salary review. He even bragged during the company party at the end of the year that his initiative of checking each developer's commit count during reviews had created an incredible "productivity boost". He genuinely believed that and was proud of his accomplishment.
Thank Satan I left that place shortly after.
applestem@reddit
We had a manager who used lines-of-code rate. So we started seeing if we could eliminate lines of code, and managed to force his metric low or negative in some modules. We actually ended up with better code.
Milligan@reddit
I remember reading about a manager who counted lines of code and developed a tool to count the semicolons (I think it was on Joel on Software). In C++ a line containing nothing but a semicolon is a valid line of code, so the developers started ending all of their statements with two semicolons, thus doubling productivity. An old management maxim: "You get what you reward".
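A minimal sketch of what such a counter could have looked like (the file glob is hypothetical), and why the two-semicolon trick doubles the score:

    # naive "productivity" metric: one semicolon = one line of code
    grep -o ';' src/*.cpp | wc -l

    # gaming it: "return x;;" still compiles, because the extra ';' is just
    # an empty statement, yet the counter above now reports twice the output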
Sovol_user@reddit
Crash if ; missing
putin_my_ass@reddit
My wife worked at a place many years ago that had a lead developer (CTO title) who bragged about his line count on his business card. Something asinine like "Over 200,000 lines of code shipped!".
I laughed when she told me and said "You should leave that company, they don't know shit about shit."
Pttrnr@reddit
timeframe? in 10 years as CTO? per year? what type of code? was it ever used? did it run without problems? i can write a program that ships a million lines of perfect assembly every day.
putin_my_ass@reddit
Right? On its own it's a completely irrelevant metric, it was clear he was using it to impress potential clients who weren't technologically savvy. When you see a lack of substance like that you have to wonder about how serious the company is.
IntelligentSpite6364@reddit
200,000 lines is also like… a medium-size legacy project
StarkAndRobotic@reddit
200,000 lines is a side project I do at home.
aeroverra@reddit
My side projects I made after dropping out of college have more lines than that.
antennawire@reddit
Meanwhile my (ex) wife:
"You think that's working hard, what you call programming? You don't know what working is."
aeroverra@reddit
That would be an instant deal breaker if anyone I was dating said that.
liquidivy@reddit
Too late to show her this, maybe. https://www.stilldrinking.org/programming-sucks
aeroverra@reddit
That's friggin awesome.
LoustiCraft@reddit
Literal definition of "task failed successfully"
pheonixblade9@reddit
I've had negative LoC at most places I have worked. Yes, while adding features and fixing bugs.
r0ck0@reddit
Similar to this story... https://www.folklore.org/Negative_2000_Lines_Of_Code.html
applestem@reddit
Funny, that's about the same time as we did our malicious compliance. Opposite coast, different industry, but same idea.
ZorbaTHut@reddit
Many years ago I led a subproject that involved vendoring and forking a major library we relied on, then deleting the code and features we didn't actually need. Thanks to that project I'm pretty sure my lifetime lines of code is negative.
avdpos@reddit
I measure our commits to get an average of how our productivity goes up.
But I do not compare different people on their number of commits. I look at each person's own statistics. If we commit code the same way for 2 years and every person improves +25% to +50% per half year, that shows something about us as individuals. We do note that person X makes 5 commits per task (me) while person Y makes 2 (my closest colleague). So we value the tasks, and use the commits as one of many ways to show how we improve to our boss, who mostly hears "they should do more" from the product owner.
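For reference, the raw per-author counting is a stock git one-liner (the time window is just an example):

    # commits per author over the last half year
    git shortlog -s -n --since="6 months ago"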
Pluckerpluck@reddit
This is still stupid. I've had periods of problem solving where I pushed incredibly few commits, and times when I was doing refactoring where I could pump out a commit for each file as I cleaned them up.
Hiring a new developer could literally lower existing team members' output as they spend more time helping them get up to speed or training them. Are you going to say "we got a new hire, but our productivity fell"?
avdpos@reddit
It has been a good measurement in our situation to show that we all do more work individually.
Generally? Probably not. We can easily track not "commits" but "commits tied to different tasks". So it has helped us show our boss, in a very concrete way, that we are better at producing now than 2 years ago. And we only take data over ~6-month periods (our release cycle), so it means something. It works in our situation, but of course not as the only thing.
wildjokers@reddit
This is dumb.
GravyMcBiscuits@reddit
Use squash on the MR/PR.
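For example, done locally (the branch name and commit message are hypothetical; GitHub and GitLab also offer squash as a merge-button option):

    # collapse a whole feature branch into one commit on main
    git checkout main
    git merge --squash feature/foo
    git commit -m "Add foo"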
throwawayforwork_86@reddit
As soon as you use a measure as a target it ceases to be useful as a measure, or something like that (looked it up; Goodhart's law: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.")
billie_parker@reddit
Clearly that's not a universal rule. If you're measuring your running team based on the lowest times, it won't collapse as a measure
LoyalSol@reddit
The spirit of the rule is about humans: once we know what we're being measured by, we start to fudge the shit out of it.
lupercalpainting@reddit
This is only true when the metric is a proxy for what you truly want to measure.
If what you want to measure is "how many kilos of cocaine did I buy" and you use mass, you'll get shorted by someone cutting your shit with baby laxative. You used mass as a proxy for "amount of cocaine".
If what you want to measure is "how fast is the running team over a set distance" then yes, you can directly measure that.
wildjokers@reddit
https://en.wikipedia.org/wiki/Goodhart%27s_law
saiastrange@reddit
Man, this reminds me of my last job where they had some tool integrated into Github that gave each developer a score based on the quantity and quality of their commits.
We were expected to make somewhere in the ballpark of 4-6 commits a day. The software had some kind of algorithm that determined how impactful your code was, and I believe it also scored you negatively if any of the code you had written was changed or replaced by a certain point.
Having to think and write in those atomic commits made it a nightmare to make any large cohesive changes.
On top of that, the level of micromanagement was bonkers. They wanted multiple draft PRs up throughout the day and at the end of the day so they could see exactly what you were working on at any point.
N546RV@reddit
Vaguely related story: we used to have a system that did a +/- summary of commit activity by individual; this dated back to before we migrated from self-hosted git repos to Github Enterprise.
Anyway, at some point prior to me starting here, one guy was making the point of how this summary was really easy to game, and to make this point he made a trivial whitespace change to every file in the repo. The stories I've heard vary on whether the subsequent push of this commit was an accident or not, but...well, it happened.
And that's why, if you're using git blame to find the history of a change in this big major repo, at some point you will run face-first into a commit from 2011 with the simple message "Refactor."
rysto32@reddit
I would flat out murder that guy. He just fucked everybody's merges up to prove his point.
suvepl@reddit
git blame actually has a useful trick for such cases: the --ignore-rev option.
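For example (the hash is hypothetical; the flags are stock git):

    # skip that 2011 "Refactor." commit when assigning blame
    git blame --ignore-rev a1b2c3d -- path/to/file.c

    # or record such commits once and have git skip them every time
    git config blame.ignoreRevsFile .git-blame-ignore-revs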
aeroverra@reddit
I would immediately make a script that committed every save.
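A rough sketch of that on Linux (tongue in cheek; assumes the inotify-tools package):

    # commit on every filesystem change, ignoring .git itself
    while inotifywait -r -e modify,create,delete --exclude '\.git' .; do
        git add -A
        git commit -m "save $(date +%s)"
    done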
Gizmophreak@reddit
A.K.A. Perverse Incentive or The Cobra Effect
eek04@reddit
As a manager, I've done that, but it was one of many, many metrics I looked at to get a feeling for how well things were going for each report. And I checked the contents of a fair number of sample commits for every report, including taking a look at just about every large commit. I also looked at the number of tickets filed and closed, the number of ticket changes, the number and quality of documents and comments on documents produced, clearing rates for OKRs, plus a ton of fuzzier evaluation. E.g., what kind of leadership and mentoring did they do? How advanced was the stuff they worked on? Did they notice and push for things we needed? Did they work on the things that had impact, or just pick things they found fun? Did they fully complete their projects, or do e.g. the fun parts of coding and then leave off documentation and integration work? I also provided all the input data to each of my reports, and had them write a brief document about the good things they'd done.
Number of commits is a useful data point, but it is only one of very many data points to look at to get a good understanding of a report.
wildjokers@reddit
git rebase makes it a totally worthless data point.
eek04@reddit
Where I've worked and did this, we didn't use git. (IMO, what we used was clearly better for the enterprise use case than git, but there can be different opinions.)
aparente_mente@reddit
Sounds reasonable. Would you say you would have detected it if some high-volume contributor was generating bad code that later made other devs take more time because it was hard to modify?
eek04@reddit
Almost certainly; in addition to sampling commits, I did a lot of pre-commit code review, a lot of refactoring to make the code easier to work with, and mentored on how to write better code.
The manager role had large bursts of unplanned work, so these activities were a good fit: they all had the advantage of either not being on the critical path or being fairly quick to do (and possible to hand off in a pinch).
wildjokers@reddit
This reminds me of a story I read on https://thedailywtf.com some years ago. (https://thedailywtf.com/articles/Productivity-20)
You are either a plagiarist or managers keep coming up with the same dumb ideas.
AimToMisbehave@reddit
My old delivery manager used to collect similar metrics for the number of lines of code added. I pretty-printed a single JSON file and pushed it back to git; I had a 400% increase in lines of code added that week.
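That trick is a one-liner (assuming jq is available; the filename is made up):

    # re-indent a minified JSON blob: thousands of lines "added" for free
    jq . vendor-data.min.json > vendor-data.json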
EveryQuantityEver@reddit
Did you get the raise after doing that, though? That's the important thing.
Sovol_user@reddit
I used to count clock cycles for each instruction, many many years ago. Some instructions could do the same thing but take fewer clock cycles, so be much quicker.
yanitrix@reddit
I love how stupid the explanations for these metrics actually are. I remember when I didn't understand how to measure story points, so I asked my manager "So is it just the time we need to deliver the feature?". He said "No, it's not about time. It's a measure of how well we know the codebase, think of it like velocity". So I replied "But velocity is distance over time, so it has to include time". "No, it's not about time", he replied. And I was like
what the fuck is this shit about then
Magallan@reddit
Points are a measure of complexity not time.
If I ask you to make a cup of tea, you know how to do it and you know exactly how long it'll take you. Very low points.
If I then specify that I want you to grow the tea leaves yourself. You don't know what a tea tree looks like, or where you'd get one, or what conditions it needs, or how to treat the leaves to make tea. Very high points.
This helps the problem of estimating software projects where you have a lot of unknowns.
yanitrix@reddit
yes, but
??
Magallan@reddit
Fair, if I just rephrase it as "you know exactly how to do it".
The point is, over longer timescales, estimates don't matter.
If I say something will take 5 minutes, it'll take 5 minutes.
If I say something will take 6 months, it'll take anywhere between 5 and 8 months, however, I've given a time estimate so there is an expectation it will take exactly 6 months.
This is the problem that story points solve. Feel free to stay unhappy about it, but that's why we have them.
UK-sHaDoW@reddit
They're quite easy to explain. They're a relative measure: a 2 is going to be roughly equal to another 2-point story, and a 4 will be roughly twice the size of a 2.
Pluckerpluck@reddit
It's not even that, it's just about unlinking tasks from direct work time. You start simply measuring team averages per sprint.
So the fact you hired a new developer and have to spend time training them? That doesn't affect the points on your stories, but it lowers the number you can complete each week.
It's effectively an abstraction that allows scaling and other factors to be included in estimates, without having individual tasks tracked through micromanagement.
elmuerte@reddit
And then you get
UK-sHaDoW@reddit
What's wrong with that?
neutronbob@reddit
Exactly. And if you miss, then you try to determine if you mis-estimated with the original 2 or whether the estimate was good except for some factor you didn't take into account.
After multiple cycles, you can get fairly good at this form of rough assessment. Never perfect, of course. But good enough that it's useful.
AlanOix@reddit
At this point I have done hundreds of estimates, and I can say with certainty that I never considered any of them useful. It is always for planning that serves no purpose except giving the manager a vague idea of when feature X will be ready. Mind you, that information is mostly useless, because the feature will be ready when the devs finish the development, not sooner, not later. Often they use it to set deadlines that are nothing but arbitrary, and that is not worth pulling multiple devs into an hour-long (or longer) meeting.
Jaggedmallard26@reddit
Agile is supposed to be tailored to each team, with things like points not being directly comparable between teams. If a team finds 2 points for a week's work is optimal for them, then that's it working as intended. I would argue that if 2 points is worth that much then something is going wrong with task splitting, but perhaps it's a complex system where work simply isn't parallelisable.
RiverRoll@reddit
Story points are terribly explained most of the time. They're a relative time measure: you start with a few reference tasks, you give them an arbitrary value (one that leaves enough room for the different kinds of tasks), and you estimate everything else by how it compares in terms of expected time and uncertainty.
wildjokers@reddit
Story points aren't about time at all. They are relative complexity compared to other stories.
RiverRoll@reddit
Why do you want to quantify complexity then?
Pluckerpluck@reddit
You're right that it is based on relative time, but it's separated out from individuals. So it works in reverse. Your team can, on average, fit twice as many 1-point stories into a sprint as 2-point stories. But imagine your team starts to work faster (somehow). Do tasks now get allocated fewer points? No. Stories maintain their scoring. But the number you can fit into that sprint? That varies. As your team changes you use the same point system, but your velocity changes. Get a new hire? Well, the number of points you can complete might go down because you must spend time training them.
So they're related to time, but they are not time, because they include everything else your team is doing within a sprint. They include meetings, they include time spent helping colleagues, they include time spent squeezing in quick troubleshooting, they include being sick.
Story points should only be used to measure how much work can be completed in a sprint by a team. You should never map them directly to time as a result.
LookIPickedAUsername@reddit
Ok, but... I still don't understand the point. We already have real-world absolute time measures, and we all know that internally people convert these "relative" measures into absolute ones even if they don't say it out loud. You know perfectly well that your PM is thinking "Ok, the team can complete about N story points a week, so we've got about four weeks until we hit the next milestone...".
And if everyone is already internally doing the math to convert story points to days, why jump through all of those hoops just to pretend you're not estimating in days? You could just estimate in days to begin with, and you'd end up in the same place with less fuss.
ryusage@reddit
I get what you're saying, that's happened on almost all my Agile projects as well. It feels like one of those things that's good in theory but flawed in practice.
In theory, it's the difference between "effort" and "delivery time". Maybe a typical feature ticket takes 4 hours of focused work for Alice, 8 for Bob, and 12 for Tom, and they all have varying amounts of meetings, PR reviews are inconsistent, their brains work better some weeks than others, etc. The work that has to be done is basically the same in all scenarios, but you need some conversion factor to get from the amount of work to the actual Delivery Time. Our intuition is usually to think of the work in terms of the "focused hours" and how that fits into our daily schedule, but when we do that we're really ignoring how many other factors are involved.
So if your ticket estimates need complex conversion factors anyway before you can use them for timeline predictions, then using a totally different unit of measure for "effort" kinda makes sense.
The biggest problem I've seen is that a unit of measure is meaningless without some reference point, and people do a shit job of defining that for their Story Points. Like, the government defined the Kilogram by picking a particular hunk of metal and saying, "it's whatever that thing weighs," and everything else is just compared to that. A Story Point really ought to be the same. You pick a user story and say, "it's however much thinking/typing/complexity/uncertainty that story involved", and then you compare everything else to that. But that's pretty abstract so people rarely do it. And in the absence of a clear definition, everyone ends up falling back on what they know, internally using an amount of time as their reference for this new unit.
Even if you define your points well, the other problem is that in practice there's always pressure on devs to maintain a certain velocity. And if the rate is fixed, then suddenly you have a constant relationship between effort and time and you're effectively just estimating delivery time again, now with more abstract units.
tl;dr: story points can make sense if people actually understand how to define them, and accept that velocity changes all the time, but in practice people usually don't so points end up being exactly the same as hours
fishling@reddit
Um, no we don't all do this. When our team estimates stories, we don't care about how many points it is. We just compare it to other stories, adjust for risk, and assign points based on what it is similar to.
That's different. Points per week is a rate of change per unit time. So, there's absolutely nothing wrong with looking at a team's historical velocity and using that as a way to estimate the burndown rate of story points to estimate a completion date. That does NOT somehow mean that points themselves become time.
Distance doesn't become time just because I know how long my commute takes and tell you I have a 30-minute commute on most days, because I know my average velocity is generally consistent. So why do you think points are different?
Venthe@reddit
Okay, story time.
Originally, "points" were indeed a time measure, albeit a non-straightforward one: 1 (story) point = one Ideal Day, so no interruptions, everything according to plan, etc., at least according to the coiner, Ron Jeffries. Kent Beck, on the other hand, came up with user stories. As both of them were founders of XP, this practice became part of XP: user stories and, consequently, (now) story points.
Agile movement took that and modified it a bit. Here again, we need to make a strong distinction between the intent and the misuse. You know for sure about how the idea is misused, so I'll go with the intent.
The intent is to divorce the measure from time: to have a unified metric for a single team that allows it both to judge complexity and to show gaps in knowledge, etc.
That's also why Fibonacci sequencing for SP came around, incidentally: to force discussion around the question "Is it roughly the same, or significantly easier/harder?". Linear increments promote lengthy discussions. Moreover, the ever-increasing distance means that higher numbers by default carry higher uncertainty.
What is important: concrete values are meaningless. A 3 in team A has a completely different meaning than a 3 in team B.
Ultimately, SP nowadays should be used as a yardstick internal to the team.
The SP-to-time mapping is, again, useful for the team in a few different ways. A simple mapping to time is helpful as a way to ask retrospective questions, address uncertainties, etc. Coincidentally, this mapping can be used to forecast delivery: "Usually, stories similar to the ones we have in the backlog took us around 1 month, so these will also take us around the same time".
The problem arises when the forecast becomes a metric and is used to set stone-bound expectations for the team, or worse, to judge performance. Similarly, when used directly across teams, story points lose all their intended purpose, becoming thinly-veiled days.
SaigonOSU@reddit
Thank you! So many people fail to understand story points
Plank_With_A_Nail_In@reddit
Because people fuck the conversion up if you mention time. Estimating is hard for people; they always think they can do things faster than they actually can... many times faster.
EveryQuantityEver@reddit
Because estimating in "days" doesn't take into account what happens if the ticket requires more effort. A story point estimate is also a measure of the amount of effort, and the amount of uncertainty a ticket has.
RiverRoll@reddit
I think it makes sense when you have a team with different experience levels, in general or in the project, and not everybody works at the same speed. With points you can kinda agree on some common reference frame.
hippydipster@reddit
It's a test, and you failed.
You behaved as though words have meaning, which was an error and got you pigeonholed as an IC to be dumped once the company can limp on without you.
MendesOEscriturario@reddit
What does IC stand for here?
hippydipster@reddit
Individual Contributor.
MendesOEscriturario@reddit
Thank you!
KaleidoscopeLegal583@reddit
That seems rather hyperbolic.
Kronikarz@reddit
I fucking wish
KaleidoscopeLegal583@reddit
Alright. Let me rephrase.
I think companies always need ICs who act as though words have meaning.
Setepenre@reddit
Clown face meme
fishling@reddit
I'm not sure what part of that you think doesn't make sense.
Points aren't about time. However, that has no bearing on your ability to use a rate of change of points over time to estimate when a project will complete.
Distance isn't about time. But I can use a change in distance over time to estimate how long a trip is.
Temperature isn't about time. But I can use a rate of change of heating up water in a kettle to estimate how long it will take to boil.
Setepenre@reddit
Speed is Distance per Time.
If points are not Time, Velocity is not speed.
N546RV@reddit
Fun conversation I had with a former PM:
PM: "So I'd like us to start reviewing completed tickets in our sprint retros and assigning 'actual story points' so we can refine our pointing."
Me: "How are we going to assess 'actual points?'"
PM: "Well, we can look at the time it took to complete the ticket."
Me: "But I thought story points weren't time-based? Isn't this contradictory of that?"
PM: "Hmmmmm..."
fishling@reddit
They were only partially wrong. It can be a useful exercise to see if past stories of the same size have some correlation to how much time they took to complete, and to analyze outliers to look for ways to improve estimation and risk analysis. Not all outliers are interesting or problematic though. Some might be a sign that a higher estimate due to risk correctly predicted the risk.
EveryQuantityEver@reddit
They wanted you to go back and reflect upon your estimation, now that you know how much effort it took to complete. Thus helping your team get better at it.
N546RV@reddit
Before the sprint: "So there's a good chance this change could be done in maybe half a day if everything goes right. But it does touch some really hairy and fragile legacy code; the chance of unexpected side effects is nontrivial. It'd be a one-point ticket if not for the legacy concerns, but with that uncertainty we're going to call it a three instead."
After the sprint: "Fortunately, we didn't have any problems with the legacy code, and this ended up being a straightforward change."
Should the takeaway here be that we should have stuck with the one-point assignment? Were we wrong to pad the points due to the uncertainty of the legacy concerns?
This is my problem with the idea. If uncertainty is a key input in story points, then returning to them after the fact, when all that uncertainty is gone, is literally Monday morning quarterbacking. Especially if the guidance given is "let's look at how much time we actually spent." Yeah, it ended up being easy, but we weren't wrong to consider the uncertainty, and that same uncertainty will exist with future tasks that touch that nasty legacy code.
That's my entire gripe. You start with "story points are not estimates of time," and then go to "let's evaluate our story point assignments based on how much time it took to complete the task."
Now if we want to generally discuss in a retro what we learned about $legacySystem and how it might affect future work that touches it, that's great - but IMO that's general knowledge-sharing, and not "we didn't point this correctly."
EveryQuantityEver@reddit
The takeaway is that your initial estimate, of how it would be without interruption, was accurate. You keep taking in that info, ticket by ticket, and your estimates get more accurate.
Your gripe is that you don't want to do any reflection on what you've done.
kooknboo@reddit
It's about bullshit metrics and dashboards to report them. Full stop.
chesterriley@reddit
This. The concept of "story points" is fundamentally flawed (along with most things in Scrum) because they don't measure anything real or objective. They only exist so that a manager can have a bullshit "velocity" on a bullshit spreadsheet. Since story points don't measure anything real, there is always story point inflation, so on the manager's meaningless spreadsheet it appears the team is "increasing velocity" and the clueless manager can brag that the team is increasing productivity.
kooknboo@reddit
Yep. Our "leader" guy daily inflates his metrics 3x, by taking a simple card and splitting it into 3.
"Conduct Project Kickoff"
becomes:
"Schedule Project Kickoff"
"Conduct Project Kickoff"
"Closeout Project Kickoff"
Then his charts and dashboards (he's all about the sparklines lately) trend nearly vertical whilst all his peers trend slowly but steadily upward. Everyone gives him a reach-around like he's the GOAT. Nobody has ever once dug into the data behind any of this. I don't think even the worker bees on the team know, or have bothered to look at what he's doing. Great place to work. But, holy hell, do you need a highly refined bullshit filter.
fishling@reddit
It's "points per sprint", so the time dimension is "sprints".
It sounds like you didn't understand something immediately and decided that must mean it was stupid instead of putting in some minimal effort to learn.
You asked your manager if story points were the time we need to deliver the feature. That's incorrect, so he said "no". Tying it back to physics, the points are the "distance", not the "time".
Your manager was still trying to answer your original question. The points aren't about time. He was saying this because another common mistake is to directly equate story points to hours or days, which is also wrong. Story points are purposefully an abstraction to get away from time-based estimation because people are terrible at that, especially in software, where there is a lot more uncertainty, risk, and variation in skill/knowledge to account for.
If you understood that velocity is distance over time, then I'm not sure how you didn't get that distance matched with points and time matched with a sprint.
bwmat@reddit
Given fixed-length sprints, the distinction is meaningless though?
mindless900@reddit
Story point estimates are complexity estimates. Usually that has a rough equality to some amount of time on a personal level, but time will actually vary based on the actual engineer that works on the ticket. The brand new entry-level engineer will take a week, the Lead who has been here since the dawn of time might take a day. It also is supposed to capture the "unknown unknowns" of a task as well. Changing the color of a text field, low complexity and low chance of discovering additional work not foreseen. On the other hand, integrating a brand new SDK that no one has used yet, high complexity and lots of potential to add additional complexity.
Velocity (in agile) is a measurement of the complexity (story points) a team can do in a given time period (sprint). The same team can have wildly different velocities sprint over sprint depending on a few factors (PTO, holidays, where they are in a project, how good they are at estimating); these should get captured as focus factors, so you can account for time not spent working on sprint tasks and reduce the expected velocity (and therefore the team's capacity) for the upcoming sprint.
Teams should have some example tickets from a past project to demonstrate what a 1, 2, 3, 5, and 8 point ticket are so when estimating you can be more consistent.
AlanOix@reddit
Velocity is not an agile concept; it is not even in the Scrum guide. It is just that velocity often comes along with a Scrum methodology, because Scrum requires planning the next sprint, so they are often bundled into the same "agile training".
Wonderful-Wind-5736@reddit
Story points measure exactly how much time I think I can get away with.
AlanOix@reddit
It is exactly like that for me. I don't even use the extra time for anything non-work-related; I use it to make sure I don't leave the code in a shitty state, to test it properly, and to document it if needed, with as little time pressure as possible.
booch@reddit
I have this argument every time someone tries to convince people it's not a measure of "how long it will take", that it's "not time". Because, as you noted, it is. Period. Full Stop. Sure, it's time with more calculations added in to hide the fact that it's time, but it's still time.
I have yet to see a single explanation that doesn't boil down to something that's still just "time, but with more words around it"
Markavian@reddit
I never force my team to estimate; at least not explicitly. I ask them to break work down into bullet points, that I then use as ticket titles.
The estimate is how many "tickets" they created, because, believe it or not, there is an "average ticket", and I can just take the team's velocity for the past three months, multiply that by the number of tickets, and boom... estimate.
Team velocity of 2.2 tickets per week? 9 tickets? That's 4.09 weeks of work right there.
mattgen88@reddit
Congrats, you invented kanban
Mrqueue@reddit
kanban isn't about estimation
mattgen88@reddit
It isn't. It's about a set amount of same-sized things flowing through a pipeline.
The author above basically did that by ensuring everything is the same size.
Kanban lets you estimate the time to delivery by ensuring things are all the same size, letting you say that when x enters the pipeline it'll be done by y, so long as WIP limits are held.
Mrqueue@reddit
You can do kanban without everything being the same size
Markavian@reddit
That's a small part of kanban; make your work visible, measure queues, identify bottlenecks, assign appropriate skills and resources to alleviate bottlenecks, discover new bottlenecks, continue to optimise the flow of delivery... estimating is a whole subtopic in itself.
The problem with story points is that they're so abstract from work at hand that unfolding them back into useful estimates is confusing for most people, and weak science at best.
/opinions
Wonderful-Wind-5736@reddit
Seems reasonable when ticket size varies little or the number of tickets is sufficiently large.
Markavian@reddit
A weird thing happens with teams and tickets; people like to stick with a ticket for a day or two, unless you get really lucky and the whole thing gets reviewed, tested, and merged on the same day. So usually, if something is worth a ticket, it's worth taking a few days to do... so teams naturally bundle a minimum amount of work into a ticket.
On the flip side, tickets which seem to drag on for weeks and weeks tend to be either under-resourced, overly complicated, or poorly defined. Either way it's the job of a delivery manager / tech lead to figure out when a ticket has stalled out and needs breaking up or rebooting, which pushes tickets back into the region of "a few days".
Obviously that's near to ideal (practical) from my perspective; but I've also seen tickets that take literal months to finish because of poorly defined dependencies (e.g. waiting on an external third party response), or other "reasons" (see: developer passion project).
eek04@reddit
An interesting point here: software estimates seem to be fairly decent and normally distributed if we assume people are estimating on an exponential curve, i.e. the error is multiplicative. So you'd expect more time lost on too-large tickets than saved on too-small tickets. And if you have data, you can estimate how much, and what the risk is.
Wonderful-Wind-5736@reddit
So ticket size approximately follows a log-normal distribution? Seems reasonable considering there's a natural lower bound (0). Although a log-normal distribution usually arises from the product of many positive-valued random processes, not sure why time taken for tasks should combine via product as opposed to e.g. sums. Maybe that's just how people think...
Markavian@reddit
Absolutely; the biggest risk to a project is untracked work, i.e. work performed for a project but not tracked on a board, simply because that work becomes impossible to account for. It's often the reason why people plan in contingency time: they don't have visibility of dependent tasks.
That's why it's important for team members to create their own tickets - they need the freedom to say "I've identified more requirements/new dependencies - that's what I'm working on" - and so long as that contributes to the goals of the project, it can be very quick to generate a new estimate and work out if (perceived) deadlines are at risk.
Wonderful-Wind-5736@reddit
Very interesting points! Having "natural" forces to keep ticket size variability low is such a cool method to simplify estimation.
civildisobedient@reddit
Technically you're measuring cycle time which is arguably a vastly more important measurement. i.e., I don't care how much time you think something takes - what matters is how much time it actually takes.
Wtygrrr@reddit
The entire point of story points is for developers to provide feedback on how complex a story is so that it can be broken down into less complex stories. Anything else anyone tells you is a corrupted version that isn't Agile.
wildjokers@reddit
No true Scotsman Fallacy.
One of the guiding principles of Agile is a team does what works for them. So there is no such thing as a "corrupted version that isn't agile"
wldmr@reddit
You (and others) are basically describing the No Bullshit Scale:
Everything beyond that is just pointless pencil pushing.
wildjokers@reddit
Story points have nothing to do with time. They are a measure of relative complexity compared to other stories.
Nanday_@reddit
Same; the manager kept saying it's not about time, no worries, chill. Then the team leader told me in a 1-1: this is how they translate story points to time, and what they expect you to deliver every couple of weeks.
QuantumQuack0@reddit
No no, you don't get it. If you tell a manager a time, they will take it literally. You have to trick them by telling them a "score" that they have to convert to time using math. Then somehow they become more lenient with estimates.
Mrqueue@reddit
The thing with time is that it's impossible to estimate accurately down to hours; on top of that, days are a very poor unit because the number of productive hours you have in a day varies.
Imo the best way to plan is split tickets into equal sizes and go from there
KingBig9811@reddit
Velocity
-Knul-@reddit
Vibes
DoneItDuncan@reddit
I view it as "how likely this story is going to get sidetracked". A low value would mean it's a quick one or two line change plus some testing with a low chance of distraction. A high value would mean you're all over the codebase with plenty of tangents to go off on and it's likely to scope creep.
ataboo@reddit
Yeah risk / complexity is a good one. It'd be nice to just estimate time but people just can't help judging unfairly.
hardware2win@reddit
People treat story points as hint for complexity, not velocity.
mr_birkenblatt@reddit
It's about Fibonacci
loup-vaillant@reddit
It's self-contradictory on purpose: it is critically important that you both understand what's expected of you, and do not say it out loud: what you don't spell out, you can't complain about.
nitram122@reddit
Goodhart's law is an adage often stated as, "When a measure becomes a target, it ceases to be a good measure".
nanotree@reddit
I'd like to think that no one measures lines of code anymore. "Stop counting lines of code" was advice from before the 90s... The fact that anyone would still try is just sad.
StruanT@reddit
It's not a completely useless metric. It just isn't useful for anything management wants to use it for.
KrochetyKornatoski@reddit
Yes ... please tell that to the moronic management ... typically a bad developer will write more (inefficient, unoptimized) lines of code than a seasoned developer ... what matters for me is how well the deliverable matches the expected result ... just a general statement ... new developers have a tendency to be light in the error checking / data validation arena, which results in errors in the UAT/SIT phase of the project ... how many times have I heard "well, you didn't tell me I should check that" ... to which I would say "correct, but you should've known to check that" ... off on a tangent ... in the 80s, I think? I was reading a trade magazine that stated a developer only writes about 10 lines of code a day ... which I immediately thought "yeah, so what's your point?" ... Keith
my_beer@reddit
Lines of code deleted is a more reasonable metric, especially with a mature system. It's still not a good metric, as it's just as gameable as lines of code created, but it encourages better behaviors.
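If you do want the ledger, git can produce it directly; a rough sketch (the author name is hypothetical):

    # total lines added and deleted by one author, plus the net
    git log --author="alice" --pretty=tformat: --numstat |
      awk '{ add += $1; del += $2 } END { printf "added %d, deleted %d, net %+d\n", add, del, add - del }'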
stfuandkissmyturtle@reddit
Literally yesterday I had to trim 400 lines of code down to 200. I had no idea lines of code was a metric until now.
my_beer@reddit
Lines of code produced is an old and particularly shit metric; lines deleted is partially a joke, but is actually quite a useful metric, especially in older systems.
loup-vaillant@reddit
Dijkstra said it best I believe: we shouldn't think in terms of lines of code produced, but in terms of lines of code spent.
For my personal projects this proved to be an excellent metric: as long as I don't game it with code-golf tricks, the number of lines of code is strongly correlated with actual complexity, which obviously I want to minimise.
eek04@reddit
Interesting research tidbit from long ago when we were writing in procedural languages and assembly:
No matter the language, there is approximately the same number of bugs per line of code.
This also tends to be true for a particular team/company culture.
So if you cut lines of code, you likely also cut number of bugs.
nachohk@reddit
I seriously doubt how broadly applicable this is, especially given the very significant differences in compile-time error checking tools available for different languages.
Drugbird@reddit
Meh. I often see an inverse correlation between lines of code and readability / maintainability / debuggability.
I.e. if there's a clever trick to do something on one line, but there's a more verbose way to do it in 3, then the 3 lines tend to be easier to debug because it's less surprising and you don't need to be clever to "get" the trick.
Furthermore, the 3 lines often e.g. introduce additional variables, which act as additional "comments".
eek04@reddit
There are tricky ways to do things which are bad. Large code typically doesn't come from avoiding the tricks, though, it comes from bad higher level architecture or using a verbose language.
I also tend to do things like introduce extra variables to make things clear, and avoid tricks. I am fond of the ternary operator, though, which some people consider tricky.
ggtsu_00@reddit
"If I had more time, I would have written a shorter function."
manystripes@reddit
At my previous job the only metric they tracked was confirmed bugs per thousand lines of code changed. There's opportunity for gaming it by writing more verbose code, but there are also arguments to be made that more verbose code is more maintainable than clever code so win/win if that happens.
StarkAndRobotic@reddit
I knew some QA devs who would intentionally keep quiet during code reviews so they could file bugs later, because that's what management wanted: metrics to justify their performance. In contrast, my team, which worked really well and fixed code during reviews, would get spoken to, because no one, not even any other team during a competition, could find any bugs in our code. I knew what was really going on, so in the next competition I filed all the bugs in the other teams' code before their own team could pretend to "find" bugs in their own code. For this kind of idiocy, management gave me a gift card at Best Buy or some rubbish and got off my back about there not being bugs in my team's code.
bwmat@reddit
Wait, nobody being able to find bugs in your code was a PROBLEM?
StarkAndRobotic@reddit
Also - we had a test spec to make sure all product code matched specifications, so we had test cases for everything, and everything had automation to test it. So we did everything right, and efficiently. Everything was reviewed both within the team and by outside team members.
StarkAndRobotic@reddit
Also, ironically, because of this incentive, QA devs wanted to work with bad devs who wrote bad code, so there would be more bugs and their managers would be impressed (which they were). Instead my team, which did excellent work, was scrutinised for not having bugs. Reading through this thread, this is also an example of the "cobra effect", or perverse incentives.
StarkAndRobotic@reddit
Yes. They did not believe it. Instead of everyone doing their job like they're supposed to, fixing things before they became a problem, they expected a certain number of "bugs" to be filed as proof that people were doing work, and the MORE bugs, the MORE work was being done (in their opinion). It was idiotic.
JustMeRandy@reddit
Cycle time and pull requests merged are two sides of the same coin. Better to measure both, to ensure your team isn't incentivised to prioritise one over the other.
Also, while measuring error rate over code coverage may be a better measure of how buggy your software is, I'd argue that any approach to quality management that does not have automated testing as its backbone is going to result in code that takes longer to deploy to customers and that will be more prone to regressions.
Wonderful-Wind-5736@reddit
Even for single-developer projects, automated testing is worth the investment many times over. It doesn't have to be sophisticated, but simply checking if basic properties hold removes a lot of mental load. With GenAI, writing most tests is almost free anyway.
bacan_@reddit
Could you point me in the direction of an example you recommend of how to use gen AI for writing tests?
Wonderful-Wind-5736@reddit
It's quite trivial. In our internal ChatGPT frontend we have prompt templates. I use one where I tell it to write docstrings, add type annotations, and write tests for a function I paste in. The tests are often quite basic but usually cover any stupid programming mistakes. They usually get the fixtures right, too. I sometimes add more specific tests based on how risky I perceive the function to be.
old-toad9684@reddit
It's bandwidth and latency. I prefer using those terms directly as it clues anyone with networking knowledge into the fact they get conflated in a very similar way, and that every project can have different priorities on which dimension is more important.
Loves_Poetry@reddit
I would definitely prefer cycle time over pull requests merged. Counting pull requests merged doesn't take into account how long a pull request was open. As a developer, I don't like when I need to wait more than a day for a review, because I want to move on to something else. Waiting long times forces context switches, which is bad, so the metric should take that into account
Focusing on cycle time also encourages developers to make pull requests that are easy to review
mysticreddit@reddit
Obligatory: -2000 Lines of Code
-grok@reddit
Boomers: Let's get these developers typing classes so we can increase output!
Also Boomers: Why isn't this crap working making me richer!!?!?!??!
neutronbob@reddit
How is this a generational thing? In addition, most boomers (born 1945-64) are retired.
-grok@reddit
People go with what they know, and boomers definitely know typing!
booch@reddit
It seems like the author is comparing metrics used to measure the performance of software developers (and I agree, they're bad metrics) with metrics used to measure product development. The first is the work of the software developers. The second is the work of the software developers, the project managers, the product managers, the clients, and the managers all the way up the tree, until you get to a point where they have no input on what features are worked on.
Those are two totally different things. And, in fact, you can do both at once. Though I agree you shouldn't be measuring on lines of code, or tickets, or whatever.
Yangoose@reddit
Whatever metric you come up with won't work for one of two reasons:
You can manage workers on a factory floor by the number widgets produced but for creative outlets like programming it just doesn't work.
The truth nobody wants to admit is that you can't measure a job like coding with a spreadsheet.
The only way to do it is with an actual engaged manager, which seems to be a completely lost art.
bwainfweeze@reddit
And even in manufacturing this notion went out in the 1990s. Eliyahu Goldratt sends his regards. Short version: chewing up raw materials to make intermediate parts that can be neither sold as finished product nor sold as unused raw materials is a liability, not an asset, and production should be strictly limited to no faster than the constrained resource (plus a buffer to make sure the constrained resource always operates near 100% of capacity).
scottix@reddit
Actually, the LoC trend is a decent signal if it's not abused; the problem is when you tie it to rewards. I'm sure you have stopped coding and told someone "god, I'm so tired, I just wrote 1000 lines of code". You know they worked a lot.
wildjokers@reddit
I would be more impressed if they removed 1000 lines of code and all tests still passed.
Ceedeekee@reddit
Metrics we use already are dumb as fuck
here use these metrics and just assume the causal link is engineering effort
this article propagates the problem it questions in the first place
Owengjones@reddit
This seems like a challenge for engineers who aren't in a position to dictate what they work on. I don't really see the author address this issue either. Computing engineer success through customer sentiment is great, but I'm not sure how it scales in a large company. If a PM has decided that engineering should work on feature X but the end product (assuming successful development) doesn't actually solve the problem customers are facing, should engineers pay the price in terms of performance reviews?
Granted, counting LoC or sprints or commits or any sort of quantity-based metric is also not a valid measure of quality, but at least it's under your control.
Terribleturtleharm@reddit
Our org measures the number of PRs.
myhf@reddit
oh my goodness, Feature Adoption Rate as a metric is the source of so many dark patterns
ThirstyWolfSpider@reddit
The only time I used lines of code as a metric was when I was eliminating code, and even then I didn't take it seriously.
davecrist@reddit
Back in the day I worked with a dev who reveled in diligently rewriting existing code so that he could sometimes report negative LOC metrics. Good times.
Kache@reddit
This can be either/both good or bad, it really depends
davecrist@reddit
Removing lines by being overly clever for lols is definitely not good!
Middlewarian@reddit
I'm like that. I'm proud of the fact that the front tier of my on-line code generator is portable and less than 30 lines. But every time I think about that, I'm annoyed that one of those lines is only for Windows -- WSAStartup. The middle tier of my code generator is just under 300 lines. It's a Linux-only program and not as annoying because of that.
Supuhstar@reddit
Sorry, people still count lines of code????
The only time that has ever mattered to me is when I've tried to open a file that's more than 10,000 lines long
maybeonmars@reddit
Wtf!
You cannot measure developers on whether a feature has increased revenue or traffic.
You have business drivers directing what features must be built. The devs build what is spec'd for them by business. If the feature doesn't make money, it's because business misread the market.
Really, this author is so far off the mark.
EveryQuantityEver@reddit
Not to mention, for a lot of developers, what you get assigned isn't entirely your choice. So this would just make it much easier for bad managers to manage someone out.
HomeTahnHero@reddit
This. None of these metrics are specific to engineering/development, which is the whole point. Yes, you want to measure these things, but they won't tell you much about how the engineering side is performing
Ravarix@reddit
The industry really hasn't figured out how to train engineering managers. The good ones are old engineers who understand the need for clear requirements. The bad ones are 'managers' first and look at code as another medium for politics.
ewoksith@reddit
Thank you! I was looking for confirmation; my initial impression was almost the same. I felt like I detected a category error, if I'm applying that term correctly. Your comment makes me feel like I was on track with that. I was also thinking, initially, that metrics like counting lines of code might still be potentially valuable, which I now think was off the mark.
eek04@reddit
This depends on the company. Some companies give more power to developers to pick what to write, and it can be reasonable to use that metric. That's especially true for senior developers.
BlindTreeFrog@reddit
Tech Debt is hard enough to get approval to deal with. Tying metrics to increased revenue means tech debt will never be cleaned up
manystripes@reddit
This is how you get companies only caring about security after a major breach occurs. There are plenty of foundational things that are essential to a good product that the end user won't necessarily see or care about until suddenly they care a lot when things go sideways
crash41301@reddit
Agreed. The article reads like it's for a CPO, product manager, GM, CTO, etc., even though it seems to be aimed at an EM. For the business as a whole it's absolutely spot on, and if anyone in those non-EM positions doesn't already realize that, then... my word, be concerned.
For an individual engineer and eng manager, on a team where product, GM, marketing, etc. pick what they work on? Stereotypical eng metrics are how you measure the eng. I'd throw system reliability and time to delivery in with many of the others mentioned too.
shevy-java@reddit
Number of lines of code actually is a metric that matters. One can say it is not the most important metric, but it definitely does matter.
bbuerk@reddit
I'm just hoping that traditional metrics don't get replaced with some sort of LLM grading my contributions
Lothrazar@reddit
As if story points are a metric that matter
WalkingTaco42@reddit
Metrics need to be considered per project. As many are pointing out, something like customer retention rate might not make much sense... and thus a project wouldn't use it.
The overall point is "lines of code" is silly.
Veggies-are-okay@reddit
One time I put an 80-character limit on lines. I must have gotten like 4x more productive overnight…
flerchin@reddit
Fuck's sake, people can't know if they provide value for most development. Great, you fixed a typo on the sign-up form. It needed doing, but how much value did it provide? Some, for sure: reputational, and lower friction to sign up, but the signal will hardly be measurable.
The devs worked a ticket. They'll work the next. Whether it was valuable or not is really not in their purview.
lotgd-archivist@reddit
This is not applicable to how many shops are run. The product owner is ultimately responsible for deciding what goes into the sprint, and thus the application. And those writing the user stories are responsible for the design of the individual features. Yes, devs have (or ought to have) input in that entire process, but measuring developer performance with something they are not responsible for is asking for trouble.
Closer, but if the backlog is badly prioritized by the PO, this also may lead to measurements that are ultimately not tracking developer activity, depending on how Time To Market is understood.
That is an important thing to keep track of regardless, even if it's not for evaluating the productivity of your devs. But there are some pitfalls here that I think the post should have mentioned, such as not attributing errors to the application that merely made them visible. Otherwise you may end up berating your frontend team while all the errors are actually coming from some backend system 3 layers deeper. Ask me how I know.
I think that has to be done very carefully, depending on the org. If the devs are not DevOps, they probably don't have much direct input into the costs of running the system they maintain. And even then you may have business requirements that increase the total costs dramatically (for instance by requiring the use of external services with an expensive contract).
michaemoser@reddit
I suspect that all metrics of this kind will be gamed eventually. Once upon a time we had code coverage of unit tests: a very important metric, apparently. Then I came across a project with unit tests that did not have any assertions in them, so there you have a scenario where this metric becomes meaningless.
Conradfr@reddit
Some of those metrics are usually not in the developer's control...
SwillStroganoff@reddit
One metric we (unofficially) use is lines of code deleted.