Does AI change what actually matters about Jupyter notebooks?

Posted by pplonski@reddit | Python | View on Reddit | 17 comments

I'd love to get some honest feedback from people who actually use notebooks in practice.

I've been experimenting with different workflow on top of Jupyter: instead of writing code first, you describe what you want in plain English, and Python runs behind the scenes. So the flow is:
prompt --> LLM generated code --> auto-execution --> results

One important implementation detail: the whole conversation is still staved as .ipynb file.

One thought I had. There has been a lot of criticism of notebooks for hidden state, mixxing code and outputs, hard to git review. But does AI change which of these problems actually matter. If code is generated and execution is automated then some of old pain points feel less important? At the same time, I'm pretty sure that we are introducing new problems, like trusting LLM generated code.

Would really appreciate critical feedback - do you think that AI makes classic notebook problems less important?

[-]

ManBearHybrid@reddit

The whole purpose of notebooks instead of just running .py scripts is human readability. They mix code and output, precisely because you can show intermediate outputs and "tell a story" with your code. If you're not planning on a human ever reading the code, then there's no point to having it in a notebook. You may as well just go straight to .py files. In the future, I think LLMs will dispense with human-readable code entirely and will go straight to writing machine code.

[-]

AKiss20@reddit

I mean there’s reasons beyond human readability why we don’t use machine code or assembly anymore. It’s incredibly non-portable.

It’s possible we will one day get to some form of incredibly dense compiled or interpreted language that essentially only LLMs can reasonably read quickly (a human can read machine code, it’s just incredibly slow and difficult) but honestly I doubt the software engineering and other industries that utilize software as a core part of their business (aka basically everything) will be willing to trust the magic black box that much for a while. Corporations love to have a head to chop when shit goes wrong. If you allow the magic box to do things that have, by design, essentially no way to hold a human accountable, execs won’t have someone to blame when shit catches fire. They hate that.

[-]

ManBearHybrid@reddit

Corporations love to have a head to chop when shit goes wrong.

While this is true, corporations also like money. Let's ignore the risk of making linear predictions for a minute, and assume that AI will continue to get better and better. It might not, but let's assume it does for the sake of argument. If it continues to improve, there will be a tipping point in the economics of it, where corporations' appetite to pay a team of engineers to maintain a system will be outweighed by the risk of letting AI do it. Right now that risk is high so it's worth it for them to keep paying dozens/hundreds/thousands of premium salaries. But I don't know that it always will be.

It is reasonable to think that there may soon come a day where AI is good enough for them to dispense with needing people to understand the code. So then human-readable code will be obsolete.

The recent news about Anthropic Mythos further confirms this to me. AI, which has only been around for the blink of an eye in historical terms, is finding zero-day vulnerabilities in code that humans have missed for decades. To me, the writing is on the wall.

Feeling_Ad_2729@reddit

yes, but the change isn't the one most people expect.

notebooks used to matter because they gave you REPL + narrative + visual output in one place. that was the exploration moat.

AI agents don't need any of that for exploration. they can iterate faster in a pure Python REPL than a human can in Jupyter, because the narrative-for-human-reader step is overhead for them.

what notebooks become, post-AI: output artifacts. you let the agent do the exploration in files/REPL, then the agent produces the notebook as a REPORT for other humans. the cells become a linearized explanation of what was discovered, not a workspace where discovery happens.

that's a genuinely different use case. the tooling (ipywidgets, cell ordering rules, kernel state management) that matters for the OLD use case is almost irrelevant for the NEW one. static .ipynb rendering and reproducibility matter way more.

CompetitiveAerie5904@reddit

yes! but not in the way most people first assume. AI doesn’t make Jupyter notebooks irrelevant; it shifts what matters when you use them.

Let’s unpack that.

h-mo@reddit

it solves one problem and makes another worse. hidden state and messy diffs matter less when you're not hand-writing the code, sure. but "trusting LLM generated code" in a notebook is actually harder than trusting hand-written code because there's no clear author intent to reason against - you just have output that looks plausible until it doesn't. the reproducibility problem also gets worse, not better.

BluebirdMiddle5121@reddit

Cursors ipynb notebooks effectively solve this while also letting you observe code and debug.

Individual-Flow9158@reddit

If you're doing the kind of experiment where the ends justifies the means this is fine. But for Data Science, if you don't understand what the code's doing, you can be unwittingly dishonest about the resulting metrics. AI massively increases the risk of Garbage In

the whole conversation is still staved as .ipynb file.

Why?