Does AI change what actually matters about Jupyter notebooks?
Posted by pplonski@reddit | Python | 17 comments
I'd love to get some honest feedback from people who actually use notebooks in practice.
I've been experimenting with a different workflow on top of Jupyter: instead of writing code first, you describe what you want in plain English, and Python runs behind the scenes. So the flow is:
prompt --> LLM-generated code --> auto-execution --> results
One important implementation detail: the whole conversation is still saved as an .ipynb file.
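A minimal sketch of what that persistence could look like, assuming each exchange becomes a markdown cell (the prompt) followed by a code cell (the generated code). The .ipynb format is plain JSON, so only the stdlib is needed; the prompt text, code, and filename below are illustrative, not the OP's actual tool.

```python
import json

# Hypothetical single exchange: a prompt and the code the LLM produced for it.
prompt = "Load sales.csv and show the first five rows"
generated_code = "import pandas as pd\npd.read_csv('sales.csv').head()"

notebook = {
    "cells": [
        # the user's natural-language prompt, kept as a markdown cell
        {"cell_type": "markdown", "metadata": {}, "source": prompt},
        # the LLM-generated code; outputs would be filled in after auto-execution
        {"cell_type": "code", "metadata": {}, "source": generated_code,
         "execution_count": None, "outputs": []},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# The conversation round-trips as a normal notebook any Jupyter frontend can open.
with open("conversation.ipynb", "w") as f:
    json.dump(notebook, f, indent=1)
```

Because the result is a standard notebook, all the usual tooling (nbconvert, viewers, git) still applies, for better or worse.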
One thought I had: there has been a lot of criticism of notebooks for hidden state, mixing code and outputs, and being hard to review in git. But does AI change which of these problems actually matter? If code is generated and execution is automated, then some of the old pain points feel less important. At the same time, I'm pretty sure we are introducing new problems, like trusting LLM-generated code.
Would really appreciate critical feedback - do you think that AI makes classic notebook problems less important?
ManBearHybrid@reddit
The whole purpose of notebooks instead of just running .py scripts is human readability. They mix code and output precisely because you can show intermediate outputs and "tell a story" with your code. If you're not planning on a human ever reading the code, then there's no point to having it in a notebook. You may as well just go straight to .py files. In the future, I think LLMs will dispense with human-readable code entirely and will go straight to writing machine code.
AKiss20@reddit
I mean there are reasons beyond human readability why we don’t use machine code or assembly anymore. It’s incredibly non-portable.
It’s possible we will one day get to some form of incredibly dense compiled or interpreted language that essentially only LLMs can reasonably read quickly (a human can read machine code, it’s just incredibly slow and difficult) but honestly I doubt the software engineering and other industries that utilize software as a core part of their business (aka basically everything) will be willing to trust the magic black box that much for a while. Corporations love to have a head to chop when shit goes wrong. If you allow the magic box to do things that have, by design, essentially no way to hold a human accountable, execs won’t have someone to blame when shit catches fire. They hate that.
ManBearHybrid@reddit
While this is true, corporations also like money. Let's ignore the risk of making linear predictions for a minute, and assume that AI will continue to get better and better. It might not, but let's assume it does for the sake of argument. If it continues to improve, there will be a tipping point in the economics of it, where the cost of paying a team of engineers to maintain a system outweighs the risk of letting AI do it. Right now that risk is high, so it's worth it for them to keep paying dozens/hundreds/thousands of premium salaries. But I don't know that it always will be.
It is reasonable to think that there may soon come a day when AI is good enough for them to dispense with needing people to understand the code. At that point human-readable code will be obsolete.
The recent news about Anthropic Mythos further confirms this to me. AI, which has only been around for the blink of an eye in historical terms, is finding zero-day vulnerabilities in code that humans have missed for decades. To me, the writing is on the wall.
Feeling_Ad_2729@reddit
yes, but the change isn't the one most people expect.
notebooks used to matter because they gave you REPL + narrative + visual output in one place. that was the exploration moat.
AI agents don't need any of that for exploration. they can iterate faster in a pure Python REPL than a human can in Jupyter, because the narrative-for-human-reader step is overhead for them.
what notebooks become, post-AI: output artifacts. you let the agent do the exploration in files/REPL, then the agent produces the notebook as a REPORT for other humans. the cells become a linearized explanation of what was discovered, not a workspace where discovery happens.
that's a genuinely different use case. the tooling (ipywidgets, cell ordering rules, kernel state management) that matters for the OLD use case is almost irrelevant for the NEW one. static .ipynb rendering and reproducibility matter way more.
CompetitiveAerie5904@reddit
yes! but not in the way most people first assume. AI doesn’t make Jupyter notebooks irrelevant; it shifts what matters when you use them.
Let’s unpack that.
h-mo@reddit
it solves one problem and makes another worse. hidden state and messy diffs matter less when you're not hand-writing the code, sure. but "trusting LLM generated code" in a notebook is actually harder than trusting hand-written code because there's no clear author intent to reason against - you just have output that looks plausible until it doesn't. the reproducibility problem also gets worse, not better.
BluebirdMiddle5121@reddit
Cursor's ipynb notebooks effectively solve this while also letting you inspect the code and debug.
Individual-Flow9158@reddit
If you're doing the kind of experiment where the ends justify the means, this is fine. But for data science, if you don't understand what the code is doing, you can be unwittingly dishonest about the resulting metrics. AI massively increases the risk of garbage in, garbage out.
Effective-Two-3926@reddit
You're a fucking idiot.
aloobhujiyaay@reddit
the git problem doesn’t go away. if anything, it gets worse since now diffs include generated code + outputs + prompts. harder to review meaningfully
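One common mitigation for the diff-noise problem is to strip outputs and execution counts before committing, which is roughly what the nbstripout tool does. A stdlib-only sketch (the function name is my own, not from any particular library):

```python
import json

def strip_notebook(path_in, path_out):
    """Drop outputs and execution counts from a .ipynb so the committed
    diff shows code (and prompts) only - roughly what nbstripout does."""
    with open(path_in) as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # rendered results: the largest source of diff noise
            cell["execution_count"] = None  # changes on every re-run even when code doesn't
    with open(path_out, "w") as f:
        json.dump(nb, f, indent=1)
```

This helps with outputs, but note it does nothing for the other half of the complaint: reviewing generated code plus the prompts that produced it is still a different job than reviewing hand-written code.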
DeerFew3903@reddit
the hidden state problem gets even worse when you can't see what code actually ran - at least before you could debug by reading through cells but now you're trusting some black box to write correct logic
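The out-of-order-execution part of the hidden state problem is at least detectable after the fact, because notebooks record each code cell's execution count. A heuristic sketch (my own hypothetical helper, not a complete reproducibility check):

```python
import json

def out_of_order_cells(path):
    """Return indices of code cells whose execution count decreases
    relative to an earlier cell - a sign the visible cell order is not
    the order the kernel actually ran, i.e. the classic hidden-state trap."""
    with open(path) as f:
        cells = json.load(f)["cells"]
    suspects, highest = [], 0
    for i, cell in enumerate(cells):
        count = cell.get("execution_count")
        if cell.get("cell_type") != "code" or count is None:
            continue
        if count < highest:
            suspects.append(i)
        highest = max(highest, count)
    return suspects
```

Detecting it is the easy part, of course; it tells you the notebook's results may not reproduce from a top-to-bottom run, not which ones are wrong.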
ManBearHybrid@reddit
Yep, this is just vibe coding. It works fine until it doesn't.
wRAR_@reddit
This sounds like a broader question of "if I use an LLM to generate code that solves a task, run it once and thus solve the task, does it matter if the code is crap". Of course it doesn't if the task is solved and you don't need to reuse the code.
Why?
dj_estrela@reddit
To be able to continue the high-level natural-language conversation later
wRAR_@reddit
You don't need to save the code, especially in a notebook format, for that.
_redmist@reddit
A lot of these problems can be solved with Marimo imho.
Capable-Wrap-3349@reddit
It seems that Marimo has much better ai integration