AI was used to recreate deadly plane crash audio, prompting regulators to step in
Posted by Shoddy_Act7059@reddit | aviation | View on Reddit | 77 comments
Been a while since I posted here (trying to limit social media usage and live more offline); but, after hearing about this scandal involving the UPS 2976 plane crash in Kentucky, I decided to do some investigating. I believe the OP of the post I'll link below, Yosh145, said that people might have been reconstructing the audio from the NTSB-released spectrogram. Well, we have official confirmation of that now...
And that reconstruction used AI.
I don't think I need to explain further why this is bad, so I won't. But, yeah, a very scummy thing to do there. Certainly fooled a lot of people, though.
Link to the aforementioned post by Yosh145: https://www.reddit.com/r/aviation/comments/1tjv73c/ntsb_removes_ups_flight_2976_spectrogram/
sourcefourmini@reddit
I feel like I’m living in an alternate reality right now. Because 1) this isn’t something that was “rumored to be happening” or that we “had to wait for official confirmation” of; the recreated audio is all over by now, including being posted to this subreddit before being removed. And 2) this isn’t “fake”, nor is it some new gen AI-powered technique. It’s spectrographic analysis, the exact same principle underlying a CD. Sound makes waves, we can convert those waves to an image, and we can subsequently reproduce the sound from that image. You could do this right now with a copy of MATLAB.
cheetuzz@reddit
How is spectrographic analysis related to a CD? As far as I understand, CDs are simple, uncompressed waveforms (sampled digitally).
sourcefourmini@reddit
That's fair, and it's a clunky comparison, I'll admit. My point was that both waveforms and spectrograms are just ways to represent a signal visually, and it's practically definitional that you can get the audio signal back from its representation. A waveform is a direct representation of the signal, and a spectrogram is a representation of its fast Fourier transform - which is fully reversible.
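A small illustration of the reversibility point, offered as a sketch rather than anything posted in the thread: a Fourier transform inverts exactly only when both magnitude and phase are kept. A spectrogram image normally stores magnitude alone, so the phase has to be estimated before audio can be rebuilt. The helper names `dft` and `idft` and the toy signal are assumptions made for this demo.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for a short demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT; returns the real part, since the demo signal is real."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

# Toy "audio": a short asymmetric click followed by silence.
signal = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
spectrum = dft(signal)

# Complex spectrum (magnitude and phase): inversion is exact up to rounding.
full_roundtrip = idft(spectrum)

# Magnitude only, which is what a spectrogram picture shows: phase is discarded.
magnitude_only = idft([abs(c) for c in spectrum])

full_error = max(abs(a - b) for a, b in zip(signal, full_roundtrip))
mag_error = max(abs(a - b) for a, b in zip(signal, magnitude_only))
print(f"with phase: {full_error:.2e}, magnitude only: {mag_error:.2f}")
```

The first error is at rounding level while the second is large, which is why magnitude-only inversion needs a phase-estimation step and yields an approximation rather than a bit-exact copy.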
KehreAzerith@reddit
There's nothing AI about converting a spectrograph into audio, this has existed for a long time now. People need to stop shoving AI buzz words into everything.
Anonymous017447@reddit
AI didn’t allow people to recreate it but it probably made it easier. Converting a sound spectrum image to audio has been around for decades.
It’s just that with better technology, it’s much less time consuming.
KeynoteBS@reddit
Yeah I’m not understanding what the issue is here. NTSB released the audio spectrogram. That’s like releasing the FFT visualization of a song. Remember Winamp? It would create pretty visualizations of music. And you can certainly reverse those graphs back to music.
So, if you don’t want people doing that then maybe don’t release the actual spectrogram? This is like releasing the x-rays or CT scan and then expecting that other people with the know-how won’t be able to pick up on the information in those scans.
reebokhightops@reddit
Was Winamp the one that really whipped the llama’s ass?
Shoddy_Act7059@reddit (OP)
Yeah, I've seen a boatload of people say the NTSB kinda brought this on themselves for releasing the spectrogram in the first place.
Still doesn't mean people using AI to reconstruct the audio of the UPS 2976's CVR is any less sh*tty, though.
IllegalStateExcept@reddit
Independent security research is extremely important. From everything I can tell, that was the intent of the people doing this audio reconstruction.
That being said, there are ethical and practical guidelines to how such research is done with software. Two of the most important rules are: 1. Don't disclose personal identifiable or damaging information not related to the vulnerability and 2. Disclose the vulnerability to the affected organization before disclosure to the public (usually months in advance so the problem can be fixed).
This is clearly an edge case of software security. However, I think the researchers could have done a better job with responsible disclosure.
https://en.wikipedia.org/wiki/Coordinated_vulnerability_disclosure
Shoddy_Act7059@reddit (OP)
So, just so we're clear, the actual problem is that the people who did the reconstruction breached those ethical and practical guidelines. Is that right?
If so, I am sorry for what I said, and will update my post again to reflect this.
IllegalStateExcept@reddit
That is my read on the situation as someone who has been involved with security research. The worst case scenario is that someone politically motivated figures this out instead of someone who is just trying to get social media likes. E.g. imagine Russia figuring out how to do this and releasing out of context damaging clips about the air India accident. You can do an incredible amount of damage by selectively releasing evidence. I'd rather have a big embarrassing "oops it's all out" moment.
Shoddy_Act7059@reddit (OP)
Ah, okay. Thank you; I've already updated the post to reflect this.
qalpi@reddit
AI just makes it even easier
GIJoeVibin@reddit
What was the actual AI usage here? The article referenced “AI” but every detail about what actually happened seems to indicate that the process used is something you can brute force using non-AI means. The process can be accelerated thanks to AI, of course, but saying “AI was used to recreate the audio” implies some sort of synthetic text to speech type stuff at play, as opposed to what appears to be “AI tools might have helped someone convert the spectrogram into audio”. The former, in my opinion, is a bit more objectionable than the latter.
I just don’t really see the relevance of the AI bit here from the actual information presented.
Shoddy_Act7059@reddit (OP)
Perhaps I chose the wrong article to link (happens to me from time to time), but this was how AI was used in this whole thing, according to Flying Magazine:
"Yet on Thursday the NTSB announced it had become aware of someone using artificial intelligence (AI) to reconstruct approximations of CVR audio from sound spectrum imagery released as part of the agency’s investigations, including the ongoing probe of the 2025 crash of UPS Flight 2976 in Louisville, Kentucky."
Link to that article: https://www.flyingmag.com/ntsb-ups-cockpit-voice-recordings-fabricated-with-ai/
I think it's more that AI being used at all for the reconstruction just added to the overall sliminess of it all, as MANY people, especially those in the art fields, do not like AI; or, more specifically, generative AI. Now, I could be wrong, and this may NOT be an example of generative AI being used here. So, if I am, please feel free to let me know.
KeynoteBS@reddit
Agreed.
jrw01@reddit
Why is everyone uncritically blaming this on AI?
> “Nobody was aware that you can recreate audio from a picture”
Anyone who understands what a spectrogram actually represents should be able to intuitively understand that the process is reversible (with limitations); this has been done since the 1940s: https://en.wikipedia.org/wiki/Pattern_playback
ranrotx@reddit
Exactly this. Creating a spectrogram from audio is deterministic, and with most things that are deterministic you can reverse engineer it back into its original form.
IaNterlI@reddit
Yeah exactly: what used to be hard to understand and implement, AI has lowered those barriers making things easier for everyone including unscrupulous idiots.
Shoddy_Act7059@reddit (OP)
I feel like it was such a big deal here because, at least of the many 'reconstructions' of CVRs I have seen on YouTube, either using text-to-speech bots or using Mayday: Air Disasters audio, MANY of them covered older crashes, ones that didn't happen relatively recently.
But, this was a crash that happened last year, and a pretty notable one, too, given what happened. Also, this MAY be the first time generative AI did this sort of thing, too.
At least, that's my opinion; I also might be wrong about some of this stuff and please feel free to correct me.
Shoddy_Act7059@reddit (OP)
Exactly.
Blue_Etalon@reddit
There was no “AI Emerging technology” involved here. The jpg of that spectrogram being converted back to audio is basic FFT processing. Sure, digitizing the jpg may have been faster using some sort of AI enabled image processing, but this stuff is decades old. The blame for this rests solely on the NTSB for putting out something they absolutely should have known would be reverse processed. And the gaslighting of the NTSB statement blaming people for decrypting it was infantile. What’s really frightening is no one in the data release review process would know this.
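For readers wondering what "digitizing the jpg" could involve, here is a minimal sketch. It assumes a grayscale image whose brightness is linear in decibels over a known range; the function name `pixels_to_magnitude`, the -80 dB to 0 dB range, and the sample pixel values are all hypothetical. A real figure would also need its colormap and axis scaling worked out first.

```python
def pixels_to_magnitude(pixels, db_min=-80.0, db_max=0.0):
    """Map 8-bit grayscale pixels to linear amplitudes.

    Assumes brightness is linear in dB between db_min (black) and
    db_max (white). Rows are frequency bins, columns are time steps.
    """
    magnitudes = []
    for row in pixels:
        converted = []
        for p in row:
            level_db = db_min + (p / 255) * (db_max - db_min)
            converted.append(10 ** (level_db / 20))  # dB to linear amplitude
        magnitudes.append(converted)
    return magnitudes

# Hypothetical 2x3 image patch, not taken from any real report.
image = [[0, 128, 255],
         [255, 64, 0]]
mags = pixels_to_magnitude(image)
```

The output is a magnitude grid with no phase information, so it is only the input to a phase-estimation step, not audio by itself.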
Main_Violinist_3372@reddit
What was the purpose of the NTSB releasing the spectrogram image in the first place?
FelisCantabrigiensis@reddit
To aid understanding and prevent future accidents. When you do serious research, you present your data as well as your conclusions so others can understand (and verify) that your conclusions are correct from your data. Only with a good reason, such as widely agreed confidentiality, do you not do this.
The mistake was not to realise that someone can now easily (rather than previously finding it much more difficult) reconstruct audio from a spectrogram, which breaches the widely-agreed principle that CVR recordings are not released to anyone other than accident investigators. Since no-one has done this before, it's not unreasonable to have missed that it was now possible.
There are also now more publicity-seeking ghouls than ever before, which also increases the chance of this happening.
mduell@reddit
I don’t know what this claim is, but let’s generously call it misleading. The algorithm used (Griffin-Lim) was published in 1984 and has readily available implementations for a while (you can find decade old ones on GitHub).
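A rough sketch of the Griffin-Lim idea, under stated assumptions: it alternates between the known magnitudes and a phase estimate taken from re-analysing the current audio guess. The frame length, hop size, Hann window, iteration count, and two-tone test signal are demo choices, not values from the NTSB docket or from anyone's actual reconstruction, and the naive DFT is only there to keep the example dependency-free.

```python
import cmath
import math
import random

N, HOP = 32, 8  # frame length and hop size (demo values only)
WINDOW = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]  # Hann
TWIDDLE = [cmath.exp(-2j * math.pi * k / N) for k in range(N)]

def dft(frame, sign=1):
    """Naive DFT; sign=-1 gives the unnormalized inverse transform."""
    return [sum(frame[t] * TWIDDLE[(sign * k * t) % N] for t in range(N))
            for k in range(N)]

def stft(x):
    """Windowed DFT of each overlapping frame."""
    return [dft([x[s + n] * WINDOW[n] for n in range(N)])
            for s in range(0, len(x) - N + 1, HOP)]

def istft(frames, length):
    """Least-squares overlap-add inverse of stft()."""
    out, norm = [0.0] * length, [0.0] * length
    for i, spec in enumerate(frames):
        frame = dft(spec, sign=-1)
        for n in range(N):
            out[i * HOP + n] += (frame[n] / N).real * WINDOW[n]
            norm[i * HOP + n] += WINDOW[n] ** 2
    return [o / w if w > 1e-12 else 0.0 for o, w in zip(out, norm)]

def griffin_lim(magnitude, length, iterations=30, seed=0):
    """Estimate a signal whose STFT magnitude matches `magnitude`."""
    rng = random.Random(seed)
    # Start from random phase, because the spectrogram kept none.
    spec = [[m * cmath.exp(2j * math.pi * rng.random()) for m in row]
            for row in magnitude]
    errors = []
    for _ in range(iterations):
        x = istft(spec, length)
        reanalysed = stft(x)
        # Distance between the target magnitudes and the current estimate's.
        errors.append(math.sqrt(sum((abs(c) - m) ** 2
                                    for row, mrow in zip(reanalysed, magnitude)
                                    for c, m in zip(row, mrow))))
        # Keep the estimated phase, restore the known magnitude.
        spec = [[m * (c / abs(c)) if abs(c) > 1e-12 else m
                 for c, m in zip(row, mrow)]
                for row, mrow in zip(reanalysed, magnitude)]
    return x, errors

# Two-tone test signal standing in for real audio.
signal = [math.sin(2 * math.pi * 0.05 * n) + 0.5 * math.sin(2 * math.pi * 0.12 * n)
          for n in range(256)]
magnitude = [[abs(c) for c in row] for row in stft(signal)]
estimate, errors = griffin_lim(magnitude, len(signal))
print(f"spectral error: first {errors[0]:.3f}, last {errors[-1]:.3f}")
```

The spectral error shrinks across iterations, but the result is still an approximation of the original waveform, which matches the "approximations of CVR audio" wording in the NTSB statement quoted elsewhere in the thread.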
FelisCantabrigiensis@reddit
Yes, I know. You can assume I have some familiarity with this field.
The novelty here is taking an image of a spectrogram in an accident report and turning that back into a sound recording. No-one has done that before (to my knowledge), and even reconstructing any sound recording from an image of a spectrogram is rare.
That's why the NTSB didn't have this on their threat/confidentiality radar and thought it was OK to publish this to aid transparency.
mduell@reddit
Again, absolutely not novel, here's a nearly 20 year old project doing just that: https://arss.sourceforge.net/
"We had no idea someone might apply software to images" is extremely uncompelling, to say the least.
FelisCantabrigiensis@reddit
That is a valid point.
TinyCopy5841@reddit
Public release of any report does not do anything to prevent future accidents, because the general public without a need to know does not have the means to do anything with the information. Dissemination of final reports could be limited to engineers, pilots, safety personnel or anyone working in the industry in a relevant field with actual need to know and safety would not be compromised in any way.
FelisCantabrigiensis@reddit
There's no way to identify everyone who works in aviation (including on the ground) so restricting the information to "just those with a need to know" is impossible.
It's also unwise: NTSB reports do a great deal to support public confidence in aviation, and therefore perform a useful civic, economic, and aviation industry purpose.
"Airline CEO says trust me bro" is not quite as strong as "independent investigation agency is investigating the cause of any serious incident or accident", is it?
TinyCopy5841@reddit
There is already plenty of information that is restricted in this manner. Anything to do with security, threat management, proprietary business information and so on. It would be trivially easy to only make final reports available through an internal system with limited and monitored access.
Sure, but that has little to do with safety and more to do with keeping up appearances and making the industry appear safe and trustworthy in the public eye.
FelisCantabrigiensis@reddit
You could limit the access, but only by destroying the main point of accident investigation reports - to inform everyone involved in aviation as widely as possible - and also enabling secrecy by officials to avoid embarrassment but damaging safety.
That would also be in breach of Annex 13 to the Convention on International Civil Aviation, section 6.5:
Your attempts to redefine the entire basis of international accident investigation and reporting to suit your opinions on one inadvertent information leak are misguided, to say the least.
TinyCopy5841@reddit
It wouldn't damage safety. It's just a theater, people would fly regardless because they have no other choice and the officials would have to be the ones overseeing the actual implementation of the various NTSB or other recommendations.
The general public today does not have any actual effect on this process.
mduell@reddit
I think it was mostly due to the unidentified 6350 Hz tone.
TabsAZ@reddit
There’s a high pitched sound that becomes audible in all the microphones right as the engine separation happens. Initially was thought to be the fire warning bell but appears to not be. The spectrogram was made as part of the attempt to analyze what the sound actually is and there was a whole separate sub-report about it in the docket.
Funkytadualexhaust@reddit
Not clear, like AI made it up (common), or AI was used to accurately reconstruct?
elprophet@reddit
I expect it's two-fold, but this is not an example of a "Deep Fake" - no hallucinations, inventions, confabulations, or anything else.
I think on the one hand, AI coding agents have gotten better where it's much lower cost to run with an idea. AIUI Scott Manley actually tipped this off on Twitter, saying something like "Hey these spectrograms look quite high quality, are they high enough quality to reverse?" And then with just a couple prompts, "Write a program to extract the spectrogram from this image" and "write a program to convert a spectrogram to the original audio" should take 15 to 20 minutes. (Honestly. I should go and find some other spectral data to try it.)
The second part is that I think the quality that the NTSB is releasing is a lot better. Remember these are images in a PDF from the program's source of truth. When they were first released (In the... 90s? I can't find a good reference to when the NTSB first included these), I expect the visual fidelity was quite low. Enough to show the item of concern, which were things like "And here's the explosion" or "here's the clunking of the jesus nut", but not enough fidelity to also get the "well we're fucked" audio. This is the first one where the quality of the spectral data is high enough that someone could reasonably extract that. Going forward, I bet the NTSB will be able to continue to release these, but will need to intentionally limit both the temporal and spatial resolution for the pictures. Enough to show the mechanical effects, but not enough to show the voices.
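To make the resolution point concrete, here is a small sketch of what "limiting temporal and spatial resolution" could mean numerically. The sample rate, frame length, hop size, and the `coarsen` helper are hypothetical, and nothing here establishes how much coarsening would actually prevent speech from being recovered.

```python
def coarsen(spectrogram, time_factor, freq_factor):
    """Average a spectrogram over blocks of frames and bins.

    `spectrogram` is a list of time frames, each a list of frequency bins.
    """
    coarse = []
    for t in range(0, len(spectrogram), time_factor):
        rows = spectrogram[t:t + time_factor]
        frame = []
        for f in range(0, len(rows[0]), freq_factor):
            block = [v for row in rows for v in row[f:f + freq_factor]]
            frame.append(sum(block) / len(block))
        coarse.append(frame)
    return coarse

# Hypothetical analysis settings: 16 kHz audio, 512-sample frames, 128-sample hop.
sample_rate, frame_len, hop = 16000, 512, 128
bin_width_hz = sample_rate / frame_len    # width of one frequency bin in Hz
frame_step_ms = 1000 * hop / sample_rate  # spacing of time columns in ms

# Tiny made-up grid: 6 time frames by 4 frequency bins.
fine = [[float(t * 4 + f) for f in range(4)] for t in range(6)]
coarse = coarsen(fine, time_factor=3, freq_factor=2)
print(bin_width_hz, frame_step_ms, coarse)
```

Averaging three frames and two bins per block turns the 6x4 grid into a 2x2 one, so each published pixel would then cover roughly 24 ms and 62.5 Hz under these assumed settings.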
Shoddy_Act7059@reddit (OP)
The latter. From what I read in this article from Flying Magazine:
"Yet on Thursday the NTSB announced it had become aware of someone using artificial intelligence (AI) to reconstruct approximations of CVR audio from sound spectrum imagery released as part of the agency’s investigations, including the ongoing probe of the 2025 crash of UPS Flight 2976 in Louisville, Kentucky."
Link: https://www.flyingmag.com/ntsb-ups-cockpit-voice-recordings-fabricated-with-ai/
I will say, I should have been more clear in my post, and I'll go back to change it now.
AmbitiousEconomics@reddit
AI just made it easier to do though, you could also do it manually. For whatever reason reading your post it makes it seem like AI is the problem rather than the reconstruction, which is the actual issue.
FoxFyer@reddit
While it's true that the ability to reverse-engineer spectrographs into audio has existed for decades, I don't think it's trivial that nobody ever did that to CVR spectrograms until one person with an AI tool and no social filter decided it would be a fun project.
AmbitiousEconomics@reddit
On the contrary they stopped releasing actual data and started releasing pictures because the actual data made stuff like recovering voices trivial. In the past they’ve also usually been careful to not release spectrograms with pilot voices, usually by limiting the time series they’re sampling.
AI tools do speed up the process and also make it available to everyone, but the data shouldn’t have been there in the first place.
Shoddy_Act7059@reddit (OP)
Yeah, after some thought, you're completely right; I guess my very anti-AI bias got in the way, and I made way more about that than the actual reconstruction. I decided to update the post again to reflect this. Thanks.
AmbitiousEconomics@reddit
You’re welcome. I think it is fine to be anti-AI, but just being blanket anti-AI reduces the impact when AI is actually the problem, if that makes sense.
EJoule@reddit
What was the purpose of sharing the sound spectrum imagery?
mduell@reddit
Releasing the spectrogram is releasing the audio waveform. Converting it between formats using decades-old algorithms hardly seems like the problem here.
DingleBurg2021@reddit
This is just the beginning of AI misinformation. People around me can’t seem to differentiate between real or AI anymore it’s getting so good. Can you imagine what the governments are gonna use it for? 1984 and Brave New World here we come.
SpitefulSeagull@reddit
It's ok the voters have proven they can be trusted to put smart, competent, sympathetic people in charge.
aviation-ModTeam@reddit
This content was removed for breaking the r/aviation rules.
This subreddit is dedicated to aviation and the discussion of aviation, not politics and religion. For discussion of these subjects, please choose a more appropriate subreddit.
If you believe this was a mistake, please message the moderators through modmail. Thank you for participating in the r/aviation community.
Shoddy_Act7059@reddit (OP)
Well, Martin (and Ruth) Ginsburg did say this once:
"The true symbol of the United States is not the bald eagle—it is the pendulum. When it goes very far in one direction, you can count on its swinging back."
So, there's some solace for you.
swirler@reddit
People not being able to distinguish real from fake is not new with AI. War of the Worlds comes to mind. https://en.wikipedia.org/wiki/The_War_of_the_Worlds_(1938_radio_drama)
And don't forget the photoshop fakes era.
Shoddy_Act7059@reddit (OP)
While on the subject, I find it funny how the original War of the Worlds was about how one should not manipulate any kind of media...
Only for the 2025 remake to do the exact opposite of that (If I had a nickel for every time stock footage was either manipulated or just straight up ripped off the internet for the movie, I'd have way more than two nickels).
Shoddy_Act7059@reddit (OP)
AI misinformation has been a thing long before this, and I think it's actually kinda prevalent now.
I don't generally pay much attention to the news now, anyways, due to how it often affects my mental and emotional health -- as well as knowing anything insanely big will eventually find its way to me through either people talking about it or through any sort of emergency alert.
Also, partly due to being in my early 20s, I still have an optimistic view on humanity and the rest of the world, although it's certainly been tested time and time again lately (but, it still hasn't been broken yet, though).
NoSwimmers45@reddit
> optimistic view on humanity…that is certainly been tested time and time again
Welcome to the entire life of a millennial.
Shoddy_Act7059@reddit (OP)
I'm actually a late Gen Z (born in 2003), but I feel this connection still works, lmao.
FlyNSubaruWRX@reddit
Can someone explain the difference of posting a transcript vs the audio? Obviously I know they censor foul language but what else are they redacting in the CVR transcript?
biggsteve81@reddit
The biggest difference is it is against federal law to release CVR recordings.
MadMonksJunk@reddit
which the NTSB was stupid enough to do, not understanding that what was easy for those with the knowledge to do is becoming easy for AI to do as well.
NTSB is at fault due to ignorance. The knowledge and tech to convert formats is far from new, it's just more available.
Shoddy_Act7059@reddit (OP)
A transcript, in this case, is literally just what the CVR picked up the pilots (or anyone else in the cockpit) saying during the incident/accident flight, which investigators then wrote out.
The audio, in this case, is the actual voices of the pilots (or, again, anyone that's generally near the cockpit) saying what investigators wrote for the transcript.
FlyNSubaruWRX@reddit
Thanks for the info.
Shoddy_Act7059@reddit (OP)
No problem.
qalpi@reddit
Why do you think the audio isn't "real"?
Shoddy_Act7059@reddit (OP)
I guess I just thought, based on what I researched before making this post, the pilots' voices were AI-generated or were affected in some way by AI.
However, guess I didn't research far enough, and I made a few careless errors. I am extremely sorry about that.
malcifer11@reddit
I don’t understand what you mean about fooling people. Who was fooled, and how?
Shoddy_Act7059@reddit (OP)
Many people across various social media platforms thought it WAS the real audio, or people posted the audio claiming it to be "Real".
Again, probably should have made that clear in my post. I'll go back and change it now. Thanks for letting me know.
ihavebeesinmyknees@reddit
If it's an unmodified spectrogram, and it's accurately reinterpreted back into audio, then it is the real audio
Shoddy_Act7059@reddit (OP)
But, was it the REAL pilots' voices being used, though?
I'm kinda thinking THAT's the issue here regarding this whole thing.
ihavebeesinmyknees@reddit
If "it's accurately reinterpreted back into audio", then it is their real voices
Shoddy_Act7059@reddit (OP)
Hm, alright then.
Well, guess I'll update the post again to reflect this.
Admirable_Site_8337@reddit
Simply, it is against federal law to release CVR tapes for good reasons.
While there has been a way to turn a spectrograph into sound before, it is much easier to do now.
Admittedly, I am not familiar with why a spectrograph would be released, so I could be missing good context here……but maybe just stop releasing a spectrograph?
jrw01@reddit
Why is everyone uncritically blaming this on AI? Did the writers of the articles covering this situation do any research at all?
> “Nobody was aware that you can recreate audio from a picture”
Anyone who understands what a spectrogram actually represents should be able to intuitively understand that the process is reversible (with limitations); this has been done since the 1940s: https://en.wikipedia.org/wiki/Pattern_playback
TabsAZ@reddit
Yep, Fourier transform is not new. There are videos on YouTube demonstrating the technique for creating audio from spectrograms that predate AI by a decade plus.
flying_wrenches@reddit
Please do not post links to the CVR footage.
We (the mod team) would not like that footage on the subreddit.
Good men died.
If you have any questions, please send a modmail.
Kidvette2004@reddit
I was looking for the actual CVR of Flight 2976 (if it had been released) on YouTube and kept finding AI generated versions.. it was just so weird and felt disrespectful to the pilots to me.
Shoddy_Act7059@reddit (OP)
Yeah, that's what I thought, too.
Kidvette2004@reddit
The spectrograph reconstructions are a bit less weird but that’s because it’s real
Shoddy_Act7059@reddit (OP)
I also don't know if this somehow breaks Rule 8 or not; but, I feel like, if it somehow does, this is too big of a news piece to have deleted, because it highlights how dangerous AI usage can be in the aviation industry.