Heads up: Kernel 6.19 regression causing silent crash loops in MongoDB (and eating Btrfs storage)

Posted by newaaa41@reddit | linux

I just spent my entire weekend chasing a "phantom" storage leak on my Fedora homelab (Threadripper 2920X) and figured I’d share the findings in case anyone else is seeing weird disk usage after updating to Kernel 6.19.7.

TL;DR: There's a regression in the 6.19 memory management subsystem that causes MongoDB’s tcmalloc to SIGSEGV every ~30 seconds. If you're on Btrfs with snapshots, each crash triggers a WiredTiger journal recovery that generates hundreds of megabytes of new CoW extents. I lost 800GB of space in 4 days.

The symptoms:

du reported 60GB, but btrfs fi usage showed the 1TB drive was nearly full.

docker inspect was totally useless—it showed exit code 0 for the crashing container because it was restarting so fast.

docker events was the only thing that showed the constant "die (exit code 139)" loop.
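
For anyone who wants to check for the same loop, here is a rough sketch of how the docker events output could be tallied. It assumes the events are emitted as one JSON object per line (for example via docker events --filter event=die --format '{{json .}}') and that the exit code and container name sit under Actor.Attributes as exitCode and name. The function name count_segfault_deaths is made up for illustration. Exit code 139 is 128 + 11, i.e. the process was killed by SIGSEGV.

```python
import json
from collections import Counter

SIGSEGV_EXIT_CODE = "139"  # 128 + signal 11 (SIGSEGV); Docker reports it as a string


def count_segfault_deaths(event_lines):
    """Count container 'die' events with exit code 139, keyed by container name.

    event_lines: iterable of strings, each assumed to be one JSON-formatted
    Docker event. Lines that are blank or not valid JSON are skipped.
    """
    counts = Counter()
    for line in event_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON event
        if event.get("Type") != "container" or event.get("Action") != "die":
            continue
        attrs = event.get("Actor", {}).get("Attributes", {})
        if attrs.get("exitCode") == SIGSEGV_EXIT_CODE:
            counts[attrs.get("name", "<unknown>")] += 1
    return counts
```

Feeding it the lines of a captured events log (or sys.stdin) gives a per-container count. A container that racks up a new entry every ~30 seconds is the one to look at.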

I eventually had to do a binary-search isolation test on my 70+ containers to find the culprit. It wasn't my config or a bug in the app—rolling back to Kernel 6.18 fixed it instantly.
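
The binary-search isolation can be sketched roughly like this. It is not the exact procedure from the post-mortem, just the general idea, and it assumes exactly one container is responsible and that the problem reproduces reliably whenever that container is running. The reproduces callback is hypothetical: in practice it would mean "start only this subset, wait, and check whether disk usage keeps growing".

```python
from typing import Callable, List, Sequence


def bisect_culprit(
    candidates: Sequence[str],
    reproduces: Callable[[List[str]], bool],
) -> str:
    """Narrow a list of containers down to the single one causing the problem.

    reproduces(subset) must return True if the problem shows up when only
    the containers in `subset` are running. Assumes exactly one culprit.
    """
    suspects = list(candidates)
    if not suspects:
        raise ValueError("need at least one candidate")
    while len(suspects) > 1:
        mid = len(suspects) // 2
        first_half = suspects[:mid]
        if reproduces(first_half):
            suspects = first_half      # culprit is in the half we just ran
        else:
            suspects = suspects[mid:]  # otherwise it must be in the other half
    return suspects[0]
```

With 70 containers this takes at most seven rounds instead of up to 70 one-by-one tests, which matters when each round means waiting long enough to see whether the disk is still filling up.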

If you're running MongoDB on Btrfs, you might want to hold off on the 6.19 update or keep a close eye on your snapshot growth. I wrote a post-mortem with the full debugging steps and logs here if you're interested in the deep dive:

https://ali.rabeei.com/blog/database-that-ate-my-disk