I built a world-record exact solver for the minimum line cover of prime points after watching a Numberphile video. It turned the previous 282-hour record into 22 minutes, then kept going to certify 20 new awkward primes that had never been certified before.
Posted by jespergran@reddit | programming | View on Reddit | 128 comments
After watching a Numberphile video on "awkward primes" I fell down a rabbit hole that turned into a month of obsessive C++ optimisation.
The problem: Plot the first N primes as points on a graph — the 1st prime (2) at position 1, the 2nd prime (3) at position 2, and so on. What is the minimum number of straight lines needed to pass through every point? Proving you've truly found the minimum is the hard part — it's an NP-complete set cover problem, and it gets exponentially harder as N grows.
The previous record stood at N=861, certified by Max Alekseyev (GWU) using an industrial MIP solver in 282 hours.
The solver replaces the MIP approach with an arithmetic-aware incremental architecture:
- 12,162 "heavy lines" (through 3+ primes) stored as 1024-bit bitmasks, keeping the full working set L2-resident (see the sketch after this list)
- 60% of steps certified instantly via witness propagation with no search at all
- Lagrangian relaxation with projected subgradient descent for tight lower bounds
- Parallel branch-and-bound with an Exclusive Dependency Rule that provably forces required lines without branching
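Roughly, the bitmask representation looks like this. This is a simplified sketch with illustrative names, not the actual source:

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Each heavy line is a 1024-bit mask whose i-th bit is set when the line
// passes through the i-th prime point.
constexpr std::size_t kBitCapacity = 1024;
using PointMask = std::bitset<kBitCapacity>;

struct HeavyLine {
    PointMask covers;  // which of the first 1024 prime points lie on this line
};

// With that layout, "does this set of lines cover the first n points?" is a
// few 128-byte ORs and one mask comparison -- cheap enough to stay cache-resident.
bool coversPrefix(const std::vector<HeavyLine>& chosen, const PointMask& firstN) {
    PointMask covered;
    for (const HeavyLine& line : chosen) covered |= line.covers;
    return (covered & firstN) == firstN;
}
```

The `firstN` mask here is just the first n bits set; extending the sweep by one prime is then a single extra bit in that mask.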
The results: N=861 reached in 22 minutes. Full sweep to N=1024 completed in under 40 hours, certifying f(1024)=143 and finding 20 new awkward primes.
Full paper, MIT-licensed C++ source, and a live browser demo that runs the actual algorithm in real time are all at the link above. For the OEIS people: https://oeis.org/A373813
walen@reddit
Sorry if this comes out the wrong way; I just want to confirm that my understanding is wrong (I hope).
You say:
Does this mean that the reason you got a 750x speedup vs the previous record... is just that you cached the previous record's results?
Again, I probably am reading it wrong. What am I missing?
jespergran@reddit (OP)
Good question, and no, the 50ms enumeration isn't using anyone's cached results. That step is the new solver computing from scratch, in 50 milliseconds on every run and using the actual prime values, which lines pass through 3 or more of the 1024 prime points. It's fast because it's a well-structured double loop with cheap GCD-normalized slope comparisons. The previous solver would have done something equivalent, just less efficiently — and critically, it would have repeated it independently for every value of N rather than doing it once.
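To make that concrete, a toy version of such an enumeration (my own illustrative sketch, not the solver's code) might key each pair of points by a GCD-normalized line equation and keep only the lines hitting 3 or more points:

```cpp
#include <cstdint>
#include <map>
#include <numeric>  // std::gcd
#include <set>
#include <tuple>
#include <vector>

// Toy enumeration of "heavy lines": lines through 3 or more of the points
// (i+1, p[i]), using exact integer arithmetic only.  A line through (x1,y1)
// and (x2,y2) is a*x + b*y = c with a = y2-y1, b = x1-x2, c = a*x1 + b*y1;
// dividing by the GCD and fixing a sign gives one canonical key per line.
using LineKey = std::tuple<int64_t, int64_t, int64_t>;

std::map<LineKey, std::set<int>> heavyLines(const std::vector<int64_t>& p) {
    std::map<LineKey, std::set<int>> pointsOn;
    const int n = static_cast<int>(p.size());
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            int64_t x1 = i + 1, y1 = p[i], x2 = j + 1, y2 = p[j];
            int64_t a = y2 - y1, b = x1 - x2, c = a * x1 + b * y1;
            int64_t g = std::gcd(std::gcd(a, b), c);   // gcd of absolute values
            if (a < 0 || (a == 0 && b < 0)) g = -g;    // canonical orientation
            LineKey key{a / g, b / g, c / g};
            pointsOn[key].insert(i);
            pointsOn[key].insert(j);
        }
    }
    std::map<LineKey, std::set<int>> heavy;            // keep lines with 3+ points
    for (const auto& [key, pts] : pointsOn)
        if (pts.size() >= 3) heavy.emplace(key, pts);
    return heavy;
}
```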
The speedup comes from the architecture being fundamentally different in several compounding ways:
The previous approach (MIP solver): Treated each N as a completely independent set-cover problem, handed it to a general-purpose mixed-integer programming solver, waited for an answer, then started over from scratch at N+1 with no memory of what just happened.
The new approach: Sweeps N=1 through N=1024 incrementally, carrying three pieces of state forward at each step: a witness cover (an explicit optimal solution from N-1), a warm primal seed, and a warm Lagrangian dual seed. This changes the character of most steps entirely:
- 60% of steps (witness mode): the new prime point already lies on a line in the cover inherited from N-1. Optimality is certified instantly, no search required.
- 23% of steps (root mode): the witness fails, but two lower bounds, a cover-inequality bound and a Lagrangian relaxation bound, already match the best known solution at the root, before any branching. Done in milliseconds.
- 17% of steps (full branch-and-bound): these are the hard cases, and they account for virtually all the wall-clock time. But even here, the Lagrangian bounds prune the search tree aggressively, the Exclusive Dependency Rule forces lines into the partial solution for free (provably without loss of optimality), and the bitset representation means every coverage check is a popcount on a 128-byte bitmask — the entire working set fits in L1/L2 cache.
So the 750× speedup is the product of many things: a purpose-built incremental structure instead of N independent restarts, Lagrangian bounds instead of generic LP relaxations, arithmetic-aware pruning rules, and a bitset engine tuned to the hardware. The 50ms preprocessing is almost a rounding error in that picture.
walen@reddit
Thanks for the explanation. That is indeed an impressive optimization.
jespergran@reddit (OP)
Here is an example: The previous world record spent 221942.3s (61 hours) certifying that N = 855 had 122 lines. One prime, 2 and a half days.
Here are the statistics for N = 855 from my run:
N=855 prime=6619 lines=122 time=301.980s mode=D
So the previous WR solver spent 61 hours certifying that N = 855 had 122 lines; my solver does the same in 302 seconds. But that isn't the crazy part... If you look at the statistics carefully, N = 855 was solved in "D" mode, meaning it's one of the 17% of primes that need full branch-and-bound in my solver, the cases that account for 99.998% of the time in the total sweep from N = 1 to N = 1024.
N = 854 and N = 856 were both solved in less than 10 milliseconds.
In fact, the primes from N = 1016 to N = 1024, which you'd expect to be the hardest to solve, were all solved in less than 0.05 seconds (50 milliseconds) IN TOTAL, because they all used either W or R mode:
N=1016 prime=8087 lines=143 time=0.005169s mode=W
N=1017 prime=8089 lines=143 time=0.010178s mode=R
N=1018 prime=8093 lines=143 time=0.004765s mode=W
N=1019 prime=8101 lines=143 time=0.004757s mode=W
N=1020 prime=8111 lines=143 time=0.004732s mode=W
N=1021 prime=8117 lines=143 time=0.004741s mode=W
N=1022 prime=8123 lines=143 time=0.004769s mode=W
N=1023 prime=8147 lines=143 time=0.004957s mode=W
N=1024 prime=8161 lines=143 time=0.004953s mode=W
If the next prime falls on an existing line from the previous best solution, you can already guarantee that the solve is optimal (f(N) can never be smaller than f(N-1)) and instantly verify the answer as correct.
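A rough sketch of that check (illustrative code, not the actual solver):

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kBits = 1024;

// Witness-mode check: `cover` is an optimal cover of the first n-1 points,
// each line stored as a bitmask of the points it passes through.
bool witnessCertifies(const std::vector<std::bitset<kBits>>& cover, std::size_t n) {
    for (const auto& line : cover)
        if (line.test(n - 1))   // some inherited line already hits the new point
            // The same |cover| lines now cover all n points.  Any cover of n
            // points also covers the first n-1, so f(n) >= f(n-1) = |cover|;
            // hence this cover is provably optimal for n, with no search.
            return true;
    return false;
}
```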
programming-ModTeam@reddit
No content written mostly by an LLM. If you don't want to write it, we don't want to read it.
TankorSmash@reddit
This sounds like an LLM wrote this part of the comment
TrainsareFascinating@reddit
What's your point? What is the point of making this comment? Are you adding something substantive to this conversation about the program and its result?
programming-ModTeam@reddit
Your post or comment was overly uncivil.
GaboureySidibe@reddit
You're right, I guess lying is fantastic after all.
TrainsareFascinating@reddit
Stop pretending to be a person.
TankorSmash@reddit
You don't find it dishonest to have someone write your comment for you and pass it off as your own?
TrainsareFascinating@reddit
You sound like someone is writing your comments for you.
TankorSmash@reddit
Why's that?
TrainsareFascinating@reddit
I'm tired of your AI slop.
TankorSmash@reddit
Huh?
TrainsareFascinating@reddit
Stop with the AI slop, bot.
LBGW_experiment@reddit
Same with
Not necessarily the em dash, but the "and critically, ..."
I use Claude code a lot at work and it always phrases things like this
LostCharmer@reddit
It's literally got an em-dash in the sentence.
This is the smoking gun!
LBGW_experiment@reddit
I said "not necessarily the em dash" as in, "here's another even more obvious tell than the obvious em dash"
floodyberry@reddit
a lot of their comments sound like an llm, really obnoxious reading what sounds like replies from the glazebot 3000
AvianPoliceForce@reddit
not to mention the emdash
integrate_2xdx_10_13@reddit
I mean… the previous record was 185 lines of Python using the Sage library.
OP threw over 300 hours of agentic AI, via 100 Gmail accounts, at producing a 3259-line monolithic C++ solution.
The self-aggrandising might need to be toned down a little methinks.
Cobayo@reddit
That "Python" solution is using a MIP solver, Gurobi or CPLEX. That's extremely optimized C++ software.
ManySugar5156@reddit
kinda explains the speed gap lol, python was just the wrapper
integrate_2xdx_10_13@reddit
It’s also off the shelf.
“did you know if you spend $150 on ingredients and 4 hours cooking, you can make a burger that's better than a Big Mac”
But
1) we don’t know for certain if Gurobi or CPLEX was used for the benchmark (Sage defaults to GLPK)
2) we don't know the hardware it was run on compared to OP's
3) it's not even optimised anyway. It's clearly some quick script, look at the definitions of `on_same_line` and `func_grows_by_jump` and how they're used in the hot path.
davl3232@reddit
I would pay to avoid eating a Big Mac
Cobayo@reddit
Of course! I have not looked at any details related to the 200+ hour run; I'm just guessing that if somebody ran that thing for like 10 days, I'm pretty sure they know better. I'd quickly lose a lot of respect if they didn't, to the point I would dismiss the results.
Yeah, I asked OP about some details but got no answer yet. OP claims to use a c4d-highcpu-8 which is not that fast, basically my desktop PC.
That's fine, you can write MIP scripts however you want, +99.9% of the runtime is gonna be all on the solver. The author is likely a mathematician rather than a programmer, doesn't really change how long it's gonna take to run.
RedRedditor84@reddit
You're really minimising how many "and there's the smoking gun" responses OP had to deal with without going on a murderous rampage.
ch1nacancer@reddit
Hahaha that's what I'm reading here too. All he did was skip the initial 861 N of compute and run the remaining 1024-861=163 lines in 40 hours. So his solution took 40 hours on 163 lines + calculation of new primes, and the original record of 861 lines took 282 hours all in with prime calc and line calc. Overall time complexity trend is still a linear time comparison, considering the current hardware it was run on vs the record. Don't worry, our prime conjectures and cryptography methods are still safe.
Vibe coding and AI have a tendency to convince the curious layman that they are somehow extraordinary in areas where they are not. I've fallen victim to this too, and it's not easy to spot when the AI is simply obliging your ignorance and displaying sycophancy.
This is nothing groundbreaking. It’s just the same math we always did with primes now wrapped in a bow thanks to LLM coding accelerating the presentation. It’s impressive mathematics and a nice way to make current computer programs sweat, but it’s nothing new. We’ve always known about the prime number conjectures and their properties for some time now. I encourage more people to watch numberphile and sink into deeper rabbit holes. There is a huge amount of number theory we can’t prove yet.
But when and if we do, we can finally put some open questions about the universe to rest.
gliese946@reddit
Wow, you are completely wrong. This is a state-of-the-art computation that blew the previous attempt out of the water. And mathematically, it is completely false to say this is what we've known about primes for a long time. Neil Sloane is an accomplished mathematician and in the numberphile video he says this is a brand new observation about prime distribution.
ch1nacancer@reddit
You guys are missing the bigger point. Fitting dots to lines is trivial, counting to n factorial is trivial, using a previously computed set of primes and lines to tune some heuristics is trivial.
Solving the Riemann hypothesis and avoiding the sieve is not trivial. The moment you have a single analytical form of computing a prime, you are bringing what is currently limited to nonlinear time complexity down to order 1 or order N. That is the nontrivial piece. Once you figure that part out, all of these other parts unravel and become lower order time complexity.
I’ll let someone else show all the math that proves that, but the fact that there are hundreds of proofs that are contingent on that conjecture, and that solving it connects all the dots (literally), we could turn this complex problem into a trivial one with that one single proof.
And yet, still, no one has been able to do it yet, or if they have, they aren’t claiming the millennium prize. (Both are likely, since possessing that knowledge would be worth way more than the $1m prize)
gliese946@reddit
I'm afraid you have no idea what you're talking about regarding the mathematics. No mathematician in the world has any idea why those plateaus in the growth of the function appear, and the problem was only first observed very recently. This has nothing to do with the Riemann hypothesis, or sieves, or computing primes -- this problem would not fall if the Riemann hypothesis was suddenly proved.
As for fitting lines to points: yes there are solvers to do this in the general situation. OP's algorithm uses elements specific to this context of the fitting-lines-to-points problem to achieve a massive speedup; this may or may not be exciting depending on your keenness about algorithmic efficiency, but the fact that the increase in efficiency opened up the problem and allowed OP to uncover new realms in the mathematical problem is very exciting.
Watch the original numberphile video to understand the problem before you post again. It has nothing to do with computing primes, and everything to do with describing the precise evolution of the pseudo-linearity of the dependency of the nth prime on n.
ch1nacancer@reddit
I'm simply stating that the frequency of primes will have some fundamental principle that makes solving this problem trivial. We disagree, but I don't think I'm wrong. You seem to think this problem has nothing to do with computing primes. I believe otherwise. I'm coming at it from a computer science and electrical engineering background. We rely on new number theories to optimize algorithmic complexity. I'm referring to Kolmogorov complexity for this specific problem. If we can solve the Riemann problem, we can collapse the Kolmogorov complexity of this particular calculation, which results in a faster compute time. Nothing I've said here is wrong; I think you just aren't getting what I'm saying. I might be wrong though and the count might have absolutely nothing to do with the frequency of primes, but I highly doubt that, seeing as how this has everything to do with how primes are mapped to integers.
gliese946@reddit
I am understanding what you're saying. What I'm saying is that your assumption is unfounded that any approach that solved the Riemann problem -- i.e., that proved that all zeroes of the zeta function lie on the critical line -- would also necessarily reveal where the plateaus in this line-cover problem will occur. OP's project is about computationally extending the numerical evidence regarding where the plateaus in the line-cover problem occur.
And even if a proof of RH would be of help in this particular problem, the potential future existence of that proof wouldn't mean that it's worthless to try to achieve some understanding of OP's problem without assuming RH.
ch1nacancer@reddit
Yes I know, which is why I said that I might be wrong. But I don't want to do all the math to see if I'm right. I'm just intuiting that because of the nature of the problem, this is just a rehash of the Riemann zeta. I just watched the numberphile video and it confirmed my suspicions. This just feels like string theory for primes. Where string theory is just a rehashing of quantum mechanics without offering any new testable predictions, these awkward primes remind me of the same feeling. While I agree those plateau frequencies and plateau lengths might offer something to do with solving the Riemann hypothesis, I still feel like we are nowhere near finding a solution, and at least for me, I don't see how this plateau analysis brings me any closer.
gliese946@reddit
Ok. I don't want to be insulting but when you say something like the contents of the numberphile video give you the vibe of string theory for primes, this just reveals that there's nothing mathematical we can talk about across that gulf. I'll just try one more time: there is nothing about the plateau lengths that purports to have anything to do with the RH, I don't know why you keep bringing that up, but I'm going to give up now. Have a good night.
ch1nacancer@reddit
Ok, but there is nothing purporting that it doesn’t. And that’s my point. You seem to think it doesn’t and I disagree. I’m not smart enough to figure out why, and I don’t feel bad admitting that, because no one else has been able to do it for the last hundred and some years.
jespergran@reddit (OP)
No. Getting from N = 1 to N = 800 takes less than 60 seconds. Getting from N = 800 to N = 861 takes another 21 minutes. At 22 minutes, I'm already at N = 861, beating the previous WR. I did that run 100 times a day for a month straight. I never ran the solver starting at 861 and continuing to 1024. I literally used N = 780 to N = 825 as the "benchmark" during testing at the end, because it covers the easy primes at 780-800 and the start of the exponential curve at N = 800. The 780-825 sweep takes around 2 minutes. It took the previous solver 282 hours to run from N = 1 to 861.
Cobayo@reddit
It means it's preprocessing all lines that pass through 3 primes within [1,1024]
That aside, it's so hard to discern this sort of stuff. You can generate so much slop noise, faster than smart people can verify.
programming-ModTeam@reddit
No content written mostly by an LLM. If you don't want to write it, we don't want to read it.
GloppyGloP@reddit
AI slop code isn’t cool.
S0phon@reddit
It's AI but how is it slop?
GloppyGloP@reddit
Because OP admits he doesn't understand what the AI tool did (“the math has grown so far beyond my comprehension”), and regurgitating the previous result and claiming a 750 times improvement is ridiculously dumb.
The very definition of slop.
EducationalBridge307@reddit
This isn't the definition of slop that I'm familiar with, which is much more about "high quantity, low quality". This result was produced with heavy reliance on AI, but if it is a good result, what does it matter?
By your definition, wouldn't something like AlphaGo be considered slop? The creators do not understand how it works, similarly to OP.
hpxvzhjfgb@reddit
no, calling alphago slop would be analogous to calling the LLM itself slop because the company that trained it doesn't understand what all of the weights mean. the actual analogy would be someone who isn't a go player using alphago to generate thousands of moves in random go positions where they don't understand the reasoning behind the moves.
Farados55@reddit
At what point does that matter when the results have been verified to be novel? OP was able to find new primes that fit this pattern that went beyond the previous solution.
hpxvzhjfgb@reddit
I didn't say it matters. the result is good
gliese946@reddit
You're quite wrong. I understand this problem well, and while it's an unorthodox and computationally expensive way for OP to have optimized his algorithm, the 750x improvement is absolutely valid. Read the paper, which is carefully argued. And OP did not regurgitate a solution, he recreated it in a tiny fraction of the time the original creator of the problem took, and extended it to a domain that was completely unachievable by previous methods.
It is a new problem and did not have decades of prior work on fine-tuning an algorithm, which would definitely mean diminishing returns on algorithmic efficiency but still. The person who generated the previous record is known to me and is no slouch. Neil Sloane would not have been excited about presenting the previous research on this problem, in the numberphile video, if there wasn't something interesting going on. I hate actual slop as much as anyone else but here OP has used AI to gain a vast improvement in an exciting new mathematical problem.
jespergran@reddit (OP)
I appreciate you seeing the achievement for what it is. Regardless of the tools used, the results speak for themselves: the solver processes the first 861 primes in 22 minutes, whereas the previous version required 282 hours to reach the same results.
Everything is mathematically proven in the paper. I’ve been meticulous in ensuring the logic never diverges from the optimal solution path, and the current run is already pushing toward the 2048th prime on a slightly modified version of the solver which is currently around the 1,680 mark, nearly doubling a record on a problem that hits a massive exponential wall after N=800.
Developing 3,000 lines of specialized C++ with this level of 'out-of-the-box' logic would typically take years of manual iteration. AI allowed me to condense that development into a month, but the rigor remains; I have spent hours verifying every module to ensure it is mathematically sound. I wouldn't refer to it as 'slop', it’s an architectural leap in a brand-new problem space.
pixartist@reddit
Which is the point of using ai
hpxvzhjfgb@reddit
the result isn't slop but the website and github repo are. just look at how stupidly long the readme is for what only needs to be a single .cpp file and the paper in a .pdf file. creating a website full of professional-looking designs and animations just for an improved bound on an obscure math problem is also rather ridiculous.
Farados55@reddit
You’re right, why make anything pretty if we care about it?
S0phon@reddit
This sentence is more ridiculous than anything OP has done. Do you realize how you sound?
hpxvzhjfgb@reddit
that sentence sounds perfectly sensible to me.
S0phon@reddit
God forbid someone cares about their work enough to present it well. So ridiculous.
Alex_Hovhannisyan@reddit
> read op's post
> remain skeptical, notice signs of AI/LLM-isms
> look through op's post history
> see similar patterns, including everyone's favorite: "You're absolutely right"
> OP responds to claim of AI use with an obvious AI-generated reply
> question reality
I'm tired, boss.
CandidAtmosphere@reddit
It’s interesting to see a finite, combinatorial branch-and-bound mechanism applied to this problem, given how heavily standard analytic number theory relies on continuous functional analysis to model prime distribution. I am curious about the specific constraints used to guide the generative AI. By requesting a strictly programmatic solution to a set-cover puzzle, the prompts likely bypassed the usual statistical bias of large language models. When a query includes standard mathematical framing, the AI defaults to continuous methods because that literature saturates its training data. Forcing the model into algorithmic finitism was clearly a computational necessity here, though without sustained theoretical steering, the AI-generated architecture operates without a broader conceptual compass.
You can see some of that tension where the framework straddles discrete execution and continuous geometry. The engine is optimized for exact bitset operations, yet the problem itself remains plotted on a standard Cartesian coordinate system. Plotting the prime sequence along a linear spatial axis relies on additive increments, which boxes their inherently multiplicative nature into a conventional grid. This raises a deeper theoretical issue regarding the purpose of the calculation, beyond the physical achievement of computing the N=1024 horizon. A strictly discrete approach makes intuitive sense since the integers do not form a continuous metric space. If the apparent randomness in prime distribution is the result of the interactions of rigid, finite rules, it invites questions about whether foundational metrics require continuous evolution.
jespergran@reddit (OP)
You’ve assumed a conversational prompting model: “tell the AI to solve the problem.” That’s not what happened.
This was a project‑scale agentic workflow over about a month, 300+ hours, building a 3000+ line C++23 solver. The AI (Codex 5.4 and 5.5 for the most part) ran autonomously for hours at a stretch using a fixed optimisation protocol:
Codex compiles the solver, runs it over a benchmark window (at the end N=780–825, ~2 minutes per run), does this twice in case of any small wall-time differences between runs, and pipes full per-step statistics to logs.
Read the logs — every dense row like `N=950 prime=7499 lines=137 time=9687.872s mode=D active=10878 nodes=510868854 frontier=65536 ub0=138 lb_floor=137 lb_cov=89 lb=137 gap=1 ub_raw=144 lb_raw=122 gap_raw=22 root_ub_frac=0.861 warm_size=125 greedy_heavy=119 prod=10878 pinc=36531 forced_root=0 forced=1896647 lag_iters=12591626629 polish_sweeps=69988928 lag_prune=238420400 strong_branch=7 depth=108` — and treat the experiment log embedded in the header as a hard constraint (what has already failed).
Form a hypothesis grounded in measurements, not intuition. Copy the code into a sandbox, add tools to the sandbox version of the solver to confirm the hypothesis, perform an update, re‑run twice, compare to baseline.
- Faster with same line counts → apply to source.
- Slower or equal → retry at least 3 modifications, then append to experiment log (“what not to try again”), revert.
(This explanation is simplified: in reality, ~200 carefully crafted prompts over ~3500 lines, each aimed at a specific observed bottleneck.)
The log was the only persistent memory across sessions. Without it, each fresh session would rediscover the same dead ends. Over a thousand compiler runs, the log accumulated entries until the piecewise multiplier schedule finally settled. Every so often I reset the log in case an earlier failed attempt had since become feasible.
So no, I did not “prompt an AI to design a line cover solver” with a single query about set cover. I built an environment where a model with no memory could still reason about evidence, discard failures, and converge on a working algorithm through iterated, measured trial. While Codex was running, I also had Claude Code and DeepSeek verify and confirm that what Codex was doing was correct. The math (Exclusive Dependency Rule, Lagrangian bounds, child caps) was human-proved (exchange arguments, not LLM generation) and then implemented incrementally with AI assistance.
Many algorithm ideas came from me. I would ask the AI about the feasibility of an idea I had that would reduce branch-and-bound depth or get more primes to land in witness or root closure modes (83% of them land in one of these two modes in the final code, and a prime that lands there is resolved within microseconds to milliseconds); only D mode (full DFS) requires the full branch-and-bound, and even that benefits from things like the Exclusive Dependency Rule. I never added anything until I could prove the line count would stay unchanged for every single N up to the world record, and that it couldn't diverge from the optimal answer as N keeps increasing. I spent entire days figuring out the root of some small divergences I had early on in the solver (the 786th and 813th primes gave me the most issues, diverging by one line), until I could prove every single line of code preserved optimality.
The outcome is a 749.9× speedup over the prior record, extending OEIS A373813 to N=1024 with f(1024)=143. In the early stages of development, I experimented with several different algorithms before finally settling on branch‑and‑bound with Lagrangian relaxation, the approach that ultimately worked the best. And throughout, the mathematics and the proofs guarantee the solver’s correctness (Section 6 in the paper).
MushinZero@reddit
I'd be curious to see your project setup and prompts. I didn't dig through the paper because I assume it's not in there and the math would go over my head. :)
Farados55@reddit
I second this. I’d like some insight into how OP prompted his agents to dump their progress for the next session. That sounds really interesting.
CandidAtmosphere@reddit
I'm not being reductive of your achievement, I'm just pointing out that your program is highly antithetical to classical methods. If you had primed the AI with a lot of information about classical analysis, which almost any analytic number theorist would do, it would not have made the program that you are running now. Unless you actively direct the program's design, it will default back into those assumptions, exactly as it does with regard to definite Cartesian coordinates which are essentially counter to the overall requirements of your program. In other words I don't think your program is broken, it is just that there is more you could learn about the unorthodox theory it depends on which you could then use to better instruct the ongoing design of the program.
The basic point I am trying to make is that continuity is generated by the additive operator, which tells us nothing about the nature of the primes (adding 1 to any prime completely breaks its internal structure). Therefore you are in fact losing information through this default basis, and I merely suggest it's the cause of a limit on the upper end of computation your program can currently achieve.
jespergran@reddit (OP)
The coordinate system isn't a computational choice I made, it is the problem. OEIS A373813 is defined as the minimum line cover of the points (i, p_i). Changing the basis (e.g. to log-scale) wouldn't make the solver more natural to primes; it would define a different sequence asking about collinearity under a different metric. There's nothing to re-optimize here.
More importantly, the solver is already entirely discrete. The coordinate system is used exactly once, during the 50ms one-time enumeration of heavy lines as soon as the script runs, and even there, collinearity is detected via GCD-normalized rational slopes using exact integer arithmetic. After enumeration, everything runs on 1024-bit bitmasks with popcount operations. There is no floating-point, no continuous approximation, and no calculus anywhere in the hot path.
The actual computational ceiling is also not what you describe. The paper is explicit: the limit is the bitmask width (kBitCapacity = 1024). Extending the record beyond N=1024 means widening the bitmask to e.g. 2048 bits — a straightforward engineering change. The coordinate system has no bearing on this. (I'm already running both a 2048 version and a 4096 version on Google Cloud)
On the theoretical point: collinearity of (i, p_i), (j, p_j), (k, p_k) reduces to the integer identity (p_j − p_i)(k − i) = (p_k − p_i)(j − i). That's a multiplicative/additive mix, yes, but it's exact and coordinate-system agnostic. The 'additive basis losing multiplicative information' framing doesn't connect to anything in the implementation; the solver finds every such coincidence exactly, by construction.
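Spelled out as a tiny illustrative function (not the solver's code), the test is just exact 64-bit arithmetic:

```cpp
#include <cstdint>

// Points (i, p_i), (j, p_j), (k, p_k) are collinear exactly when the cross
// product of the two difference vectors vanishes; for primes below 10^4 and
// indices up to a few thousand, everything fits comfortably in 64 bits.
bool collinear(int64_t i, int64_t pi, int64_t j, int64_t pj, int64_t k, int64_t pk) {
    return (pj - pi) * (k - i) == (pk - pi) * (j - i);
}
```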
ch1nacancer@reddit
Well it’s just a list of heuristics and cached information compressed from pre-computed data. Not as clever as a pure analytical solution, which is the holy grail.
The significance of a pure analytical solution to the Riemann hypothesis would unlock a whole new branch of physics and mathematics. If the Ai was actually super intelligent it would have just solved this with a solution to the Riemann hypothesis instead of relying on historical data and pre-computed solutions.
That's why the field currently focuses on this and not the other thing. The other thing is just computational reduction, but in order for mathematics to move further, prime number solvers would need to apply a new analytical method for finding primes without brute forcing all the previous ones.
That’s why this problem is so hard and has yet to be solved by anyone. If someone has solved it, and the public knew about it, we’d have no more privacy, security or secrecy in information anywhere on the internet. And that’s obviously not the case yet, and if someone had this idea and wasn’t sharing it with the public, they’d be able to access any digital vault they wanted and be able to make arbitrary changes to any information they wanted without being traceable.
So even if someone had the solution, we wouldn’t even know about it, unless it was published to the public.
CandidAtmosphere@reddit
If you're saying this program is a broken failed heuristic, we should just falsify it by attacking the Lagrangian relaxation bounds, finding a flaw in the Exclusive Dependency Rule, or proving where the dominance map drops an optimal branch. I don't think it's going to happen. I think the OP created something excellent, which got their results posted to OEIS. Some of the replies higher in the thread are cope about what OP has achieved, regardless of what extent the AI participated in writing the code.
I'm just pointing out that if OP doesn't engage in much more serious theoretical contemplation about why they are making this program, they will inevitably hit a dead end. The program works largely because of the discrete methods I'm assuming OP is unaware are totally bizarre constraints that their program relies on, but it will run out of memory and space before it computes anything important because of its grounding in classical geometry.
ch1nacancer@reddit
I'm not saying it's broken nor failed. It's obviously not failed at doing what was intended, and it's already proven to have worked, so it's not broken. All I'm saying is that Gauss already solved this problem hundreds of years ago with his x / ln(x) correlation and we've been unable to improve on it ever since. So we are all just piggybacking off of Gauss to compute bigger and bigger numbers, and that is all this is. Now if we could compute such big numbers that we find something beyond Gauss, that would be a step in the right direction.
CandidAtmosphere@reddit
Gauss describes a smooth asymptotic density, which provides no leverage for navigating the discrete branching of a finite set-cover puzzle. That’s why the jump to discrete methods is being underestimated here: the solver succeeds specifically because it abandons classical, continuous models in favor of a strictly finite engine. Furthermore, modern encryption is secured by the discrete algebraic reality of prime factorization. As far as I can tell you seem to be insinuating that the OP's program must be a failed approximation, in order to justify the point you're trying to make. My own explanation is that their discrete methods are extraordinary, they fly in the face of what any analytic number theorist would do or even what any AI directed by a respectable analytic number theorist would do, but it is still ultimately insufficient. It is a result of the AI working to meet OP's instructions for pure functional programming constraints, without a theoretical understanding of why that is a much deeper issue than an optimization problem.
jespergran@reddit (OP)
The 'dead end' you're predicting is already documented in the paper and is a mundane engineering limit, not a theoretical one. The solver uses 1024-bit bitmasks; extending to N=2048 means widening the bitmask. It doesn't 'run out of space because of classical geometry' — it hits a fixed compile-time parameter that you increase. That's not a conceptual barrier. I already have a version running to 2048 and 4096, they both run slightly slower, but they're both beyond 1024 and will keep running for the next weeks to months. Seeing how quickly the problem becomes exponentially harder after N = 800 makes me think that the "dead end" is not really the bitmasks, but the compute time, especially beyond N =1650 or so.
jespergran@reddit (OP)
This problem has nothing to do with the Riemann Hypothesis or cryptography. This is a geometric combinatorics problem: given fixed points (i, p_i) in the plane, what is the minimum number of straight lines that pass through all of them? It doesn't involve finding primes (other than the first few thousand which you can do in microseconds), factoring numbers, or anything that touches RSA. A proof of the Riemann Hypothesis also wouldn't break internet security, that would require solving integer factorization or P vs NP, which are separate open problems. These three things are routinely conflated but are genuinely unrelated.
The solver is also not a heuristic. It closes the gap between its lower bound and its best feasible solution to exactly zero before terminating.
Computerist1969@reddit
The greatest part of what you did was getting Google to fund 300 hours of their own AI without them getting any money for it.
Farados55@reddit
Fr bro is doing more to bring down the powers that be than any of us
UnintentionallyEmpty@reddit
I don't feel like reading the paper, the math is beyond me anyway.
I did skim it to see if there was any mention of some sort of peer review, but I didn't find it.
Has this been published (or been sent for publication) to anywhere where we can expect the publishers to apply the required scientific rigor to the proofs and claims being made?
Farados55@reddit
Neil Sloane is a mathematician who maintains the database where OP has submitted his sequence to, and it seems that Neil has accepted it. OP linked to it.
https://oeis.org/A373813
I don’t think this is research or peer review worthy, tbh. OP just (with AI) engineered a faster framework to find those primes. It still took an hour or something to go like 20 or so terms beyond the previous solution, so this is still some NP problem.
Farados55@reddit
300 hours of dev time with an agent seems like a lot. What did this cost in tokens?
jespergran@reddit (OP)
Haha, the honest answer is approximately zero dollars. I have 100+ Gmail accounts, and it takes about 10 seconds to log out and swap to a new one in Codex, so I just cycled through free sessions. By the time I looped back to account 1 a week had passed and they had all refreshed. Extremely janky, but it worked.
The workflow was a combination of Codex, Claude Code and DeepSeek. Codex would run the solver directly in the terminal to establish a baseline, analyze the output, identify the biggest bottleneck, apply a fix in a sandbox, and run the solver twice. If it improved, it handed over the upgraded code. If it did not, it left a detailed experiment log for the next session to pick up from. Each Codex session ran for about one to two hours, and in the meantime DeepSeek was hunting for issues and drafting prompts for Claude Code to act on. (I had a list of over 200 prompts so this is a simplification).
After 300 hours and over 1000 iterations of the file, I had the 750x speedup. Toward the end I ran 20 to 30 Codex sessions in a row without a single improvement, and honestly the math has grown so far beyond my own comprehension that at this point I understand the core ideas of the solver but could not have written the deeper machinery myself.
Farados55@reddit
That's kinda funny. Well, as an AI slop skeptic I am pretty impressed. I don't understand the math either, but if this has been verified by Neil Sloane on his database then that's pretty cool. This is probably the most useful thing I have seen an agent do nearly autonomously from what you describe. From an architecture standpoint, the code is pretty simple so I guess agents are very good at concentrated problems with limited dependencies, which is good to know. As a software engineer, I think this is also a fairly good proof of what agents can accomplish, at minimum.
saint_marco@reddit
Having a simple objective function to improve takes a lot of the magic out of this.
Farados55@reddit
Meh, optimization is an end goal for software. The fact that OP can boast a 750x speed up is absolutely amazing. As I said, keeping an agent constrained to hyperoptimize a small script is probably one of its best use cases tbh
gramathy@reddit
It really seems like the agents were doing optimization rather than pure code generation too, which is a much more constrained problem to solve.
catch_dot_dot_dot@reddit
This is such a great comment and it's good to have an open mind. I mostly code for work and I was a big AI skeptic but now use agents heavily. Some of the work they've produced is far beyond what I could've practically done, they're incredibly useful tools.
Most of us using AI are not producing slop, it's still the same code I'd be happy to put up for review myself, I just didn't have to write it. Plus the tests and documentation are better than what I would've written. And no this isn't some astroturfed bot comment like people accuse anyone making positive AI comments.
tj-horner@reddit
I am in the same boat. The AI tooling landscape is rapidly changing and improving, even just over the past several months. There are still plenty of valid criticisms and reasons to be skeptical, but "LLMs write bad code" just... isn't really one of them anymore. You can get them to write bad code, but it's generally pretty solid.
BananaPeely@reddit
Yeah I think the problem is that AI makes it easier to launder low effort stuff as high effort. High effort vibe coded projects are basically just normal projects, maybe done quite a bit faster.
catch_dot_dot_dot@reddit
And to be fair, my perspective is as a professional in an engineering group that holds people to high standards
Cobayo@reddit
Can you expand a little bit on the second paragraph? I guess I have to read the paper. How have you verified that your answer is actually correct?
I've worked on a problem that normally gets solved with min-max flow, and I've changed the solution to a MIP solver so I can add some features that make the problem NP. Amazingly the commercial solver runs at more or less the same rate as a heavily optimized flow algorithm, which I couldn't really believe.
Why could you optimize your problem by that many orders of magnitude? At the end of the day if it's NP we have to eventually go through the options, you know.
jespergran@reddit (OP)
Correctness first: the solver is a certified branch-and-bound, not a heuristic. At every single N it closes the gap between its lower bound and its best found solution to exactly zero before reporting an answer. The lower bound comes from two sources: a Lagrangian relaxation of the set-cover LP (tightened iteratively via subgradient updates), and a cover-inequality bound derived from the line structure itself. If those two bounds meet the best feasible solution, the answer is proven optimal with no sampling and no approximation. The paper formalizes this and the results are posted to OEIS. I also had Claude Code and DeepSeek independently verify the core invariants throughout development.
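If it helps to see what that kind of bound looks like, here is a bare-bones, generic sketch of a Lagrangian set-cover lower bound with projected subgradient updates. This is a toy illustration under simplified assumptions (names and step rule included), not the solver's actual code:

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kBits = 1024;

// lines[j] marks which points line j covers, nPoints is how many points must
// be covered, ub is the best known cover size (used for a Polyak-style step).
// Every value of L(lambda) with lambda >= 0 is a valid lower bound on the
// optimal cover size.
double lagrangianLowerBound(const std::vector<std::bitset<kBits>>& lines,
                            std::size_t nPoints, int ub, int iters = 200) {
    std::vector<double> lambda(nPoints, 0.0);    // one multiplier per point
    double best = 0.0;
    for (int it = 0; it < iters; ++it) {
        // L(lambda) = sum_i lambda_i + sum_j min(0, 1 - sum_{i in line j} lambda_i)
        double bound = 0.0;
        for (double l : lambda) bound += l;
        std::vector<double> coverCount(nPoints, 0.0);
        for (const auto& line : lines) {
            double reduced = 1.0;                // reduced cost of taking this line
            for (std::size_t i = 0; i < nPoints; ++i)
                if (line.test(i)) reduced -= lambda[i];
            if (reduced < 0.0) {                 // the relaxation takes this line
                bound += reduced;
                for (std::size_t i = 0; i < nPoints; ++i)
                    if (line.test(i)) coverCount[i] += 1.0;
            }
        }
        best = std::max(best, bound);
        // Subgradient g_i = 1 - (#taken lines covering point i); project onto >= 0.
        double norm2 = 0.0;
        for (std::size_t i = 0; i < nPoints; ++i) {
            double g = 1.0 - coverCount[i];
            norm2 += g * g;
        }
        if (norm2 == 0.0) break;                 // relaxed choice covers each point exactly once
        double step = (ub - bound) / norm2;      // Polyak-style step (scale factor omitted)
        for (std::size_t i = 0; i < nPoints; ++i)
            lambda[i] = std::max(0.0, lambda[i] + step * (1.0 - coverCount[i]));
    }
    return best;                                 // ceil(best) is a valid integer lower bound
}
```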
On the workflow: Codex can run commands in the terminal and basically takes full control of your PC. The prompt made it do the following:
1. Download and compile the solver.
2. Run the solver twice as a baseline (I usually had it do a 780-825 sweep, which took approximately 2 minutes; this test range kept changing as the solver got faster). It ran twice just in case the timing fluctuated by a few seconds over the 2-minute run (literally scrolling in Visual Studio could cause a 1 sec change because it took some CPU).
3. Once both baseline runs were complete, save the terminal outputs, analyse the entire solver (3000+ lines), then analyse both sweeps (a detailed log of how every algorithm acted) and form a hypothesis about the biggest bottleneck slowing the solver down.
4. Once it had a theory, copy the solver into the working directory as a sandbox duplicate and add tools to it to verify the hypothesis. Once it was 100% sure it had found a potential improvement (around 30 min into the thinking process usually), it landed on the biggest issue that was not already present in the experiment log (previously failed attempts).
5. Copy the file again, edit it with the new solution, run it twice through the same 780-825 range, and compare the statistics against the baseline.
6. If it improved, deliver the updated file. If it stayed the same or regressed, add a detailed experiment-log entry so future AI conversations know to avoid it, preserving the 30+ minutes of sandboxing and thinking so it doesn't need to be repeated.
It was guided to ALWAYS stay away from ANY solution that could change the optimal line count. Any improvement in speed is pointless if the line count would be calculated incorrectly.
On the speedup: your instinct is right that NP means you eventually have to branch, and raw hardware only gets you a linear factor. The 750x didn't come from a faster machine — it came from the vast majority of N values never needing to branch at all. I can run the sweep to 800 in less than a minute on a 10 year old school laptop. About 60% of steps are resolved in microseconds because the new prime point already lies on a line inherited from the previous optimal cover (witness mode). Another 23% are resolved at the root before any branching, because the Lagrangian bound already matches the primal. Only the remaining ~17% require full search, and even those benefit from aggressive pruning via the Exclusive Dependency Rule (a structural property of this specific problem that forces certain lines into any optimal solution, provably). The previous MIP solver treated every N as a cold-start independent problem and got none of those gains.
Your flow-to-MIP result is interesting — commercial solvers are remarkably good at exploiting structure that a hand-rolled flow algorithm already encodes implicitly, so I can believe the parity. The difference here is that the problem has enough geometric structure (collinearity, incremental prime additions, line inheritance) that a purpose-built solver can exploit things no general-purpose solver would ever find.
Cobayo@reddit
I'm not fully convinced yet, but if I had one big takeaway, it would be your workflow. It seems like a generic approach of trying stuff out until something better sticks, applicable to any problem. It would be nice if you could apply the same methodology to more popular NP problems, like TSP, Knapsack, SAT, etc, so it would be much easier to review for way more people.
jespergran@reddit (OP)
Yeah, it's a genuinely niche problem. Part of the reason I chose it was that the ground truth is unambiguous: the optimal line count either matches OEIS or it doesn't, which made the feedback loop clean. I also found the problem really satisfying. That's actually a prerequisite for the workflow: you need a signal the AI can measure objectively, otherwise 'did this help?' becomes a judgment call and the whole thing falls apart.
TSP, SAT or Knapsack would have that same property, and they'd be much easier to benchmark against since there are decades of established solvers and known hard instances to test on. Next time I go this deep I'll pick something like that — both to get more eyes on the methodology and to see whether the same log-driven, hypothesis-testing loop can move the needle on a problem where people would immediately recognize whether the result is meaningful.
Fornicatinzebra@reddit
I'd be careful about stating that - pretty sure it would be breaking the T&S
MrChocodemon@reddit
Good start. Lying within 2 words.
Economy-Rip5676@reddit
I love how so many insane programming projects start with someone casually watching a YouTube video and accidentally inventing something way beyond normal human productivity
gliese946@reddit
I can only imagine how exciting this was to put together, congratulations! I watched the numberphile video when it came out and saw the new "double plateau" that was found, inserted into the video after Neil Sloane recorded it. Did you uncover a series of additional, ever longer double plateaus in the function? And what's the status of any conjecture about the existence of always more and longer plateaus as the function gets extended?
Great work! You should cross post to /r/math.
jespergran@reddit (OP)
There is indeed a new plateau, though it's not in the released N=1024 sequence; it's from a new solver run currently targeting N=2048, about a week in.
The longest plateau in the Numberphile video (and for all primes up to 1024) runs 112 primes, from N=464 to N=575 at f=69. The new run appears to have shattered that: a plateau of 237 consecutive primes sitting at f=145, somewhere in the N≈1100–1300 range! I will publish that whole sweep to 2048 once it's complete. The issue is that single primes are starting to get so computationally heavy that they take almost entire days.
gliese946@reddit
That is very exciting! Is it also a double plateau like the two earlier ones (13/14 and 68/69)? I find this a very strange phenomenon and can't begin to imagine what would cause it to occur, when the rest of the function has quite a consistent growth pattern.
I also can't imagine why my question was downvoted.
jespergran@reddit (OP)
I don't have the numbers in front of me right now, but there is a 237-prime plateau at 145 lines, a 48-prime plateau at 143 lines and a 25-prime plateau at 142 lines (the first, fourth and seventh longest plateaus across the sweep I have running to 2048, currently around N = 1680).
If you run the demo until the line count is 69 here: https://prime-line-cover.vercel.app (takes around 20 sec to get there; set the slider to instant speed), you sort of get a clue as to why the plateaus happen. Since it gets exponentially harder, when the solutions "curve" down and then back up again, you find optimal solutions in the first half that the 2nd half lands on. Imagine the line count looking like a curve, and you draw a thick line of optimal solutions through that curve: once the first half of the curve is figured out, the 2nd half of that curve somehow all lands on optimal lines.
Imagine drawing horizontal lines through a U-shape. Because of the symmetry, once the solver 'covers' the first half of the curve, the second half effectively mirrors those requirements. The solutions for the latter half 'land' on the optimal lines already established by the first half. In reality, it's a complex, super-linear curve, but this symmetry is likely why the line count stalls even as N increases. It's as if the complexity for the next set of numbers has already been 'paid for' by the previous optimal configuration. I have no clue why hundreds of them happen to land on the existing lines, though, without a single prime diverging for 237 primes straight. And it seems like the plateaus increase in length as N grows, which I also find very interesting; I really doubted I'd see anything similar to the 112-step plateau around N = 500.
I went to bed around N = 1100, woke up the next day expecting to see around N = 1150, but the plateaus boosted me to around N = 1600, thanks to all the witness steps where primes land on existing lines.
onwardforward@reddit
Is there a way to see the actual primes you are talking about? (the 20 new ones and the ones you mention in this comment)
I see the positions of the primes noted, but not the primes themselves. Do you have them noted somewhere?
buildingstuff_daily@reddit
mods removing interesting math/cs content while leaving the 47th "which framework" post up is peak reddit moderation
Farados55@reddit
I don’t blame them tbh because at first glance this seems all LLM generated. Tbh, I’m still not sure if OP is just using LLMs to respond or not. Either he is or he just happens to respond extremely similarly. OP is also not American so that might explain it.
People are still skeptical of agents doing novel work. So am I, tbh.
Constant-Zebra-9752@reddit
100% OP is using LLM's to write their comments.
I hate the "but English isn't their language" argument as well, as if translators haven't existed and worked perfectly well for years.
_kelvindecosta@reddit
It has been a while since I studied math in Uni, so I cannot truly appreciate your work, but I must say that you have a good eye for design. Absolutely love your website, the way you presented your work and even the interactive visuals. Cheers!
jespergran@reddit (OP)
Thank you so much! The website was honestly a project in itself, about 8000 lines of code with the entire solver ported to JavaScript so it could run live in the browser. The demo reaches the 250th prime in about 5 seconds, while the C++ solver covers the same range in 103ms. Seeing it animate in real time in a browser still puts a smile on my face though. Really glad the presentation landed well!
eracodes@reddit
why port to JS instead of WASM?
Constant-Zebra-9752@reddit
AI told him to.
jespergran@reddit (OP)
WASM would probably be much better. But to be honest, the first thing I built was the visualisation, before I even started on the C++ code. That was originally my only goal. I got fascinated by the Numberphile video and wondered how hard it was to get to a high prime. The original JavaScript version got to the 39th prime and got stuck for several minutes. I intended to stop there, but went to C++ to try a more serious attempt, with the goal of reaching N = 1024. After a month of work with C++ I achieved the goal and wanted to create a port of the C++ to visualise it, and instead of starting from scratch, I ported it into the old JS visualisation. The old version from a month ago still runs at https://primelines.vercel.app; it has the original JS code before I even started on the C++ version. It doesn't even compare to the ported C++ version at https://prime-line-cover.vercel.app that includes all the algorithms, but you can see the similarities.
purg3be@reddit
The design has AI all over it.
Suppafly@reddit
How so?
purg3be@reddit
The buttons up top, the main graph as a card, the shadow on the card, color scheme, font, deployed on vercel, etc.
It's all very polished, yet super generic, and looks like every other website.
gravteck@reddit
As a backend guy, I would bask in this compliment. It's the one consistent side of the job I use it for.
Nine99@reddit
The site keeps ignoring many of my up/down arrow presses. It also features tiny, hard-to-read light grey on beige text. Once you've entered the large diagram, it's unclear how to get back to the previous screen. The diagram lines start moving on their own, and you have to search for the pause button to stop that.
So there's plenty of room for improvement. At least it's not neon/purple with useless links, I guess (most vibecoded websites).
drink_with_me_to_day@reddit
It's using the Claude theme
deja-roo@reddit
Reddit thinks everything is AI
purg3be@reddit
I am reddit
Farados55@reddit
Well to be fair, this is AI. The author used agents for all of this.
_kelvindecosta@reddit
Still, I think they've done a good job, even if it is all vibe coded
meaui_cat@reddit
I’m nowhere near smart enough to even comprehend how significant this is BUT this sentence made my day:
“Lagrangian relaxation with projected subgradient descent for tight lower bounds”
Kudos to OP for the 100 Gmail accounts tho! Smart.
AlSweigart@reddit
I'll say what I say to all AI slop claims:
Wow, big if true.
DigThatData@reddit
nice, congrats!
MrBajt@reddit
The problem is you are responding with an LLM. It is genuinely hard, borderline impossible, to engage with that, because I can basically get it to output whatever I want and have it sound completely right. How do I know? I use it as a research assistant regularly in novel fields and it's currently hit and miss, having led me to wrong conclusions numerous times when I wasn't careful. So if you want to engage with people, write your results in your own words and not as an LLM wall of text.
Interesting_Level943@reddit
Recruitment: We need someone to address the risk control issues related to the batch production and login of the APP.
icecoldgold773@reddit
Ai sloppy and garbage C++ code
programming-ModTeam@reddit
r/programming is not a place to post your project, get feedback, ask for help, or promote your startup.
Technical write-ups on what makes a project technically challenging, interesting, or educational are allowed and encouraged, but just a link to a GitHub page or a list of features is not allowed.
The technical write-up must be the focus of the post, not just a tickbox-checking exercise to get us to allow it. This is a technical subreddit.
We don't care what you built, we care how you build it.
programming-ModTeam@reddit
No content written mostly by an LLM. If you don't want to write it, we don't want to read it.
OkkeHendriks@reddit
Why was this removed? This was interesting!?
jespergran@reddit (OP)
I'm so confused too, no reason to have this removed...
Physical-Compote4594@reddit
Super cool! Congrats on coming up with such an excellent solution!
SepiaHawk@reddit
Please don’t break any cryptography while you’re in the rabbit hole.
_l33ter_@reddit
Freaking awesome! I love when the rabbit hole kicks in :)
jespergran@reddit (OP)
Yeah, my original intention was just to make a simple visualization because I found the problem so satisfying after watching a Numberphile video. The first solver was coded in JavaScript, and it hit an exponential wall somewhere around the 39th prime, grinding for several minutes before getting to the next prime. So I decided to tackle it properly in C++, and after about 300 hours of work, it now blasts past the 39th prime in microseconds and just keeps going all the way to 1024!
_l33ter_@reddit
yeah - I know the Numberphile video :)