A new breed of analyzers: the state of AI when we get to enjoy some positive aspects of this technology.
Posted by JohnDoe_John@reddit | programming | 7 comments
Ameisen@reddit
Someone I know ran some assembly generated by my virtual machine's JIT that I had annotated for why each step was being done.
It decided that there were like 9 critical bugs - and offered solutions to some.
7 weren't bugs at all.
1 wasn't a bug and their solution would break things. It said to replace my `sub` + `add` (effectively) with just a `test [reg]` for a null check. The problem was that I needed to check whether a sized read/write would touch null, so I relied on the overflow flag. The LLM "fix" would have only worked for literal address zero - not 0xFF.. for 2+ byte accesses, etc.

1 was a legitimate problem caused by a refactor. It resulted in the wrong address being reported by an exception. Not huge but still present.
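To make that concrete, here's a minimal C sketch of the difference (hypothetical illustration, not my JIT's actual check; it assumes addresses wrap modulo the pointer width and size >= 1):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration. A `test reg, reg` null check is
 * equivalent to only the `base == 0` half. A sized access covers
 * base .. base+size-1 (mod 2^N), so it can touch address 0 even
 * when base is non-null: a 2-byte read at 0xFF..FF wraps through 0. */
static bool access_touches_null(uintptr_t base, size_t size) {
    return base == 0 ||                     /* literal address zero */
           base > UINTPTR_MAX - (size - 1); /* last byte wraps past 0 */
}
```

The overflow flag is what captures that second, wraparound condition in the generated assembly.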
currentscurrents@reddit
For context, this is the same curl maintainer who was previously complaining about AI slop bug reports.
It seems AI can also find legitimate bugs when used by people who know what they're doing.
psaux_grep@reddit
Merely calling Daniel a «curl maintainer» is like calling Linus Torvalds a «Linux maintainer».
xelrach@reddit
Sounds like a pretty positive development. Neural networks have been used for pattern matching for a very long time and they're better at it than they are at generation. Having another class of analysis to use when reviewing code seems like a win.
Who knows how much it will cost though.
currentscurrents@reddit
The neat thing about generation is that you can do other tasks, like classification or pattern matching, within the generative output.
And that's what they're doing here. These tools work by prompting an LLM with some function or file (sometimes preprocessed into a syntax tree) and instructions to look for vulnerabilities.
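Roughly, in C (the `llm_complete()` call is a hypothetical stand-in for whatever API a given tool uses, and the prompt wording is invented):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the model API call; returns the reply. */
extern char *llm_complete(const char *prompt);

/* Classification via generation: embed the function under review in a
 * prompt that pins the output to a fixed report format, then let the
 * model "generate" the analysis. */
static char *review_function(const char *source) {
    static const char fmt[] =
        "You are a code auditor. Find memory-safety and logic bugs in\n"
        "the following C function. Output one finding per line as\n"
        "TYPE | ISSUE | SEVERITY, or the single word NONE.\n\n%s\n";
    size_t len = sizeof(fmt) + strlen(source); /* generous upper bound */
    char *prompt = malloc(len);
    if (!prompt)
        return NULL;
    snprintf(prompt, len, fmt, source);
    char *reply = llm_complete(prompt);
    free(prompt);
    return reply;
}
```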
The one they mention in the blogpost costs $200/month. Which is steep for an individual but affordable for a company.
JohnDoe_John@reddit (OP)
Going forward
We do not yet have any AI-powered code analyzer in our CI setup, but I am looking forward to adding one. Maybe several.
We can ask GitHub Copilot for pull-request reviews, but from the little I've tried Copilot for reviews, it is far from comparable to the reports I have received from Joshua and Stanislav, and quite frankly it has been mostly underwhelming. We do not use it. Of course, that can change and it might turn into a powerful tool one day.
We now have an established constructive communication setup with both these reporters, which should enable a solid foundation for us to improve curl even more going forward.
I personally still do not use any AI at all during development – apart from occasional small experiments. Partly because they all seem to force me into using VS Code and I totally lose all my productivity with that. Partly because I've not found it very productive in my experiments.
Interestingly, this productive AI development happens pretty much concurrently with the AI slop avalanche we also see, proving that one AI is not necessarily like the other AI.
ReginaldBundy@reddit
The Telnet example is really interesting. Just for the fun of it, I ran the current version of suboption() from lib/telnet.c through ChatGPT asking it to look for issues, result:
| Type | Issue | Severity |
|------|-------|----------|
| Typo | "Tool long" typo | Minor |
| Safety | Potential buffer overflow in NEW_ENVIRON case | Medium |
| Safety | No handling of partial writes | Medium |
| Safety | Null pointer possible in `strlen()`/`bad_option()` | Medium |
| Logic | Possible off-by-one in `printsub()` | Low |
| Style | Inconsistent error messages, magic constants | Low |

I then followed up and asked specifically about the calls to `bad_option()`. In the response ChatGPT actually refers to libcurl (obviously an older version without the fix) and states: