Built a High-Performance Key-Value Datastore in Pure Java

[-]

noswag15@reddit

I was looking for something similar to this, but with support for streams instead of byte arrays. I checked RocksDB, but it seems to expect both the key and the value to be byte[]. From the README, this library seems similar. Does stream support exist, or is it planned for the future?

A library like this could be very useful as a temporary storage/cache for large files and blobs (potentially downloaded from external sources) but if they first have to be eagerly read into memory as byte[] before being stored in the cache, it may not work well.

[-]

Familiar-Level-261@reddit

> A library like this could be very useful as a temporary storage/cache for large files and blobs (potentially downloaded from external sources) but if they first have to be eagerly read into memory as byte[] before being stored in the cache, it may not work well.

Have you heard of file systems? Those can stream AND are KV stores!

[-]

noswag15@reddit

I'm not sure what you're implying here. Of course I know of filesystems. What I'm looking for is to be able to store files downloaded from external systems and have them indexed by some ID. Think of user profile images, for example. The advantage of putting them in a key-value store like this is that values can be memory mapped, so if memory is available, it will be as good as reading from memory, and the system takes care of swapping content. Furthermore, if the key-value store supports size/TTL-based eviction, I don't have to worry about cleaning up files. Essentially, what I'm looking for is an LRU cache that serves content as fast as possible if memory is available and, if not, falls back to disk and handles swapping/eviction of both keys and value chunks.

[-]

Familiar-Level-261@reddit

In the context of specifically what I replied to (I'm not denying there are other useful cases for KV stores), namely temporary download files:

> What I'm looking for is to be able to store files downloaded from external systems and have them indexed by some id.

That's a file name.

> The advantage of putting it in a key-value store like this is that values can be memory mapped so if memory is available, it will be as good as reading from memory and the system takes care of swapping content.

You can just mmap a file.
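In Java, the "just mmap a file" suggestion could look roughly like the sketch below. `MappedBlobReader` and `map` are hypothetical names chosen for illustration; the calls themselves are standard `java.nio`. It assumes the file fits in a single mapping (at most `Integer.MAX_VALUE` bytes), so larger blobs would need to be mapped in chunks.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of any library discussed in this thread.
final class MappedBlobReader {

    /** Maps the whole file read-only and returns the mapped buffer. */
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            // The OS page cache decides which pages stay resident in RAM.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }
}
```

Reads through the returned buffer are served from the page cache when memory is available and from disk otherwise, which is the behavior being discussed here.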

> Furthermore, if the key-value store supports size/TTL based eviction, I don't have to worry about cleaning up files.

If you remove a file while still holding an FD, it will exist until the FD is closed or the app exits, i.e. automatic cleanup.

If you want persistence, TTL is also very easy to do on disk files.

The point is that using a DB for simple, temporary stuff is massive overkill, as you're essentially making a worse file system.
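One way the TTL-on-disk-files idea might be done in Java is a periodic sweep that deletes files older than a cutoff. This is only a sketch: `TtlSweeper` and `sweep` are hypothetical names, and it assumes the last-modified time is an acceptable proxy for when an entry was stored.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of any library discussed in this thread.
final class TtlSweeper {

    /**
     * Deletes regular files directly inside dir whose last-modified time
     * is older than (now - ttl). Returns the number of files deleted.
     */
    static int sweep(Path dir, Duration ttl, Instant now) throws IOException {
        Instant cutoff = now.minus(ttl);
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)
                        && Files.getLastModifiedTime(entry).toInstant().isBefore(cutoff)
                        && Files.deleteIfExists(entry)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

Running `sweep` from a scheduled task would expire files without waiting for the process to exit.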

[-]

noswag15@reddit

Well that's the point. I don't want to wait for JVM exit for file cleanup. I don't have infinite disk storage so I need to evict disk files after they reach a certain maxTotalSize.

> If you want persistence TTL is also very easy to do on disk files.

Well, yeah, if I have to implement everything by hand, I can if there's no other option (which is exactly what I ended up doing). But if a library can do it for me, and if it's lightweight enough, why wouldn't I use it?
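For reference, a hand-rolled version of the maxTotalSize eviction described above might look something like this. It is a sketch, not the poster's actual implementation: `SizeCappedDir` and `evictToFit` are hypothetical names, and it approximates LRU by deleting the least recently modified files first.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of any library discussed in this thread.
final class SizeCappedDir {

    private static final class Entry {
        final Path path;
        final long size;
        final long modifiedMillis;

        Entry(Path path, long size, long modifiedMillis) {
            this.path = path;
            this.size = size;
            this.modifiedMillis = modifiedMillis;
        }
    }

    /**
     * Deletes the oldest regular files in dir until the total size is at most
     * maxTotalBytes. Returns the total size of the files that remain.
     */
    static long evictToFit(Path dir, long maxTotalBytes) throws IOException {
        List<Entry> entries = new ArrayList<>();
        long total = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                long size = Files.size(p);
                entries.add(new Entry(p, size, Files.getLastModifiedTime(p).toMillis()));
                total += size;
            }
        }
        // Oldest first, so the least recently modified files are evicted first.
        entries.sort(Comparator.comparingLong(e -> e.modifiedMillis));
        for (Entry e : entries) {
            if (total <= maxTotalBytes) {
                break;
            }
            Files.deleteIfExists(e.path);
            total -= e.size;
        }
        return total;
    }
}
```

A true LRU would need to track reads as well, for example by touching the file's timestamp on access, which this sketch does not do.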

[-]

theuntamed000@reddit (OP)

Hmm, you really have a nice use case there.

I don't know if I'm going to support streams in the near future, but what if a stream breaks in the middle of a data transfer? We might want to discard it, which is a transactional property: it's 1 or 0, nothing in between.

So I might add transactional support first, as it's a widely required feature.

But currently the focus is still on increasing performance and concurrency throughput.
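The all-or-nothing concern for streams can be illustrated at the file level, independent of how this project might eventually handle it. In the sketch below (`AtomicBlobWriter` and `store` are hypothetical names), the stream is copied to a temporary file and only renamed into place once the copy completes, so a stream that breaks mid-transfer leaves no partial value behind. It assumes the key is a safe file name and that the directory's file system supports atomic moves.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not this project's API.
final class AtomicBlobWriter {

    /**
     * Copies the stream into dir under the given key. Either the complete
     * value becomes visible under that key, or nothing does.
     * Returns the number of bytes stored.
     */
    static long store(Path dir, String key, InputStream in) throws IOException {
        // Temp file lives in the same directory so the final rename stays on one file system.
        Path tmp = Files.createTempFile(dir, "upload-", ".part");
        try {
            long bytes = Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            // Whether an existing target is replaced is implementation specific;
            // on POSIX file systems the rename typically replaces it.
            Files.move(tmp, dir.resolve(key), StandardCopyOption.ATOMIC_MOVE);
            return bytes;
        } finally {
            // No-op after a successful move; removes the partial file on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```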

[-]

noswag15@reddit

makes sense. I'll keep an eye on the project. thanks.

[-]

psychelic_patch@reddit

Hey man, I'm also writing databases. I'm not using Java, but feel free to reach out. I'm using paper and benchmarking a lot of behaviors beforehand. I probably won't be testing your app, but who knows, maybe we can still help each other. I'd mainly have some questions about your choices here. Truly inspiring work, keep it up!

[-]

theuntamed000@reddit (OP)

Thanks, man. I'd like to hear your questions.

[-]

HosseinKakavand@reddit

Impressive work. To build trust, add a JMH suite with pinned CPU and warmed JVM, and publish p50, p95, p99 plus tail latency. Compare to RocksDB, Chronicle Map, and MapDB with identical durability. Document crash consistency, WAL or checkpoints, compaction, file layout, and GC vs off heap. A YCSB profile and failure testing would make the story compelling.
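As a small illustration of the p50/p95/p99 reporting suggested here, percentiles can be computed from recorded latency samples with the nearest-rank method. This is a plain-Java sketch with hypothetical names (`LatencyPercentiles`, `percentile`); it is not a substitute for a JMH harness, which handles warm-up and measurement pitfalls that this does not.

```java
import java.util.Arrays;

// Hypothetical helper for summarizing latency samples.
final class LatencyPercentiles {

    /**
     * Returns the p-th percentile (0 < p <= 100) of the samples using the
     * nearest-rank method. The input array is not modified.
     */
    static long percentile(long[] samplesNanos, double p) {
        if (samplesNanos.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesNanos.clone();
        Arrays.sort(sorted);
        // Nearest rank: the smallest 1-based rank covering at least p percent of samples.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

With samples collected around each get or put call via `System.nanoTime()`, calling `percentile(samples, 99)` would give the p99 latency in nanoseconds.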

We’re experimenting with a backend infra builder, think Loveable but for your infra. In the prototype, you can: describe your app → get a recommended stack + Terraform, and managed infra. Would appreciate feedback (even the harsh stuff) https://reliable.luthersystemsapp.com

[-]

theuntamed000@reddit (OP)

Hmm, yeah, I'll think about that.