moka-py: A high performance caching library for Python written in Rust with TTL/TTI support
Posted by del1ro@reddit | Python | 13 comments
Hello!
I'm excited to share my first Rust lib for Python: moka-py!
What My Project Does
moka-py is a Python binding for the highly efficient Moka caching library written in Rust. This library allows you to leverage the power of Moka's high-performance, feature-rich cache in your Python projects.
Key Features:
- Synchronous Cache: Supports thread-safe, in-memory caching for Python applications.
- TTL Support: Automatically evicts entries after a configurable time-to-live (TTL).
- TTI Support: Automatically evicts entries after a configurable time-to-idle (TTI).
- Size-based Eviction: Automatically removes items when the cache exceeds its size limit using the TinyLFU policy.
- Concurrency: Optimized for high-performance, concurrent access in multi-threaded environments.
- Fully typed: mypy/pyright friendly, including the decorators.
Example (`@lru_cache` drop-in replacement, but with TTL and TTI support):
from time import sleep
from moka_py import cached

@cached(maxsize=1024, ttl=10.0, tti=1.0)
def f(x, y):
    print("hard computations")
    return x + y

f(1, 2)  # calls computations
f(1, 2)  # gets from the cache
sleep(1.1)
f(1, 2)  # calls computations (since TTI has passed)
One more example:
from time import sleep
from moka_py import Moka
# Create a cache with a capacity of 100 entries, a TTL of 30 seconds,
# and a TTI of 5.2 seconds. Entries are always removed after 30 seconds,
# and are removed after 5.2 seconds if no `get` has touched them in that time.
#
# Both the TTL and TTI settings are optional. If a setting is omitted,
# the corresponding policy will not expire entries.
cache: Moka[str, list[int]] = Moka(capacity=100, ttl=30, tti=5.2)
# Insert a value.
cache.set("key", [3, 2, 1])
# Retrieve the value.
assert cache.get("key") == [3, 2, 1]
# Wait for 5.2+ seconds, and the entry will be automatically evicted.
sleep(5.3)
assert cache.get("key") is None
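To make the difference between the two timers concrete, here is a minimal pure-Python sketch of the TTL/TTI semantics described in the comments above. It is a toy model, not how moka-py or Moka is implemented: the `TtlTtiCache` class and its injectable `clock` parameter are hypothetical names made up for this illustration, and it ignores capacity and thread safety entirely.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TtlTtiCache:
    """Toy cache illustrating TTL and TTI expiry (hypothetical, not moka-py)."""

    def __init__(
        self,
        ttl: Optional[float] = None,
        tti: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._tti = tti
        self._clock = clock
        # key -> (value, time written, time last written or read)
        self._data: Dict[Hashable, Tuple[Any, float, float]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        now = self._clock()
        self._data[key] = (value, now, now)

    def get(self, key: Hashable) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, written_at, accessed_at = entry
        now = self._clock()
        ttl_expired = self._ttl is not None and now - written_at >= self._ttl
        tti_expired = self._tti is not None and now - accessed_at >= self._tti
        if ttl_expired or tti_expired:
            del self._data[key]
            return None
        # A successful read resets the idle timer but never the TTL timer.
        self._data[key] = (value, written_at, now)
        return value


# A fake clock lets the example run instantly instead of sleeping.
fake_now = [0.0]
cache = TtlTtiCache(ttl=30, tti=5.2, clock=lambda: fake_now[0])
cache.set("key", [3, 2, 1])

fake_now[0] = 5.0
hit_before_idle_limit = cache.get("key")  # idle for 5.0 s, still cached

fake_now[0] = 10.0
hit_after_reset = cache.get("key")  # idle for 5.0 s since the last read

fake_now[0] = 15.3
miss_after_idle = cache.get("key")  # idle for 5.3 s, past the 5.2 s TTI
```

The point of the sketch is that reads keep an entry alive under TTI, while the TTL deadline is fixed at write time and no amount of reading can extend it.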
Target Audience
moka-py might be useful for short-term, in-memory caching of frequently requested data.
Comparison
- cachetools: a pure-Python caching library. It is 10-50% slower and has no type annotations
TODO:
- Per-entry expiration
- Choosing between eviction policies (LRU/TinyLFU)
- Size-aware eviction
- Support async functions
Links
- https://github.com/deliro/moka-py
- https://pypi.org/project/moka-py/
twopointthreesigma@reddit
Very nice, any chance to add a user defined callback that's triggered on cache deletion? That would be quite useful.
Much_Raccoon5442@reddit
Is it possible to have the cached version returned while kicking off the long-running function call in the background, so a fresh result is ready for the next call?
del1ro@reddit (OP)
Can you clarify what you mean? Maybe with an example?
nicwolff@reddit
This is what's called "serve-stale" functionality in Web caching.
del1ro@reddit (OP)
Oh, TIL. I think it should be done outside the cache itself, since spawning Threads or even asyncio.Tasks (depending on the sync/async nature of a function) in the background is a bit tricky and not obvious. But this is a good idea to consider.
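For anyone curious what "outside the cache itself" could look like for a synchronous function, below is a rough sketch of serve-stale behaviour built with a plain dict and `threading`. Nothing here is part of moka-py: `ServeStale`, `fresh_for`, and `load_report` are hypothetical names, error handling is minimal, and positional hashable arguments are assumed.

```python
import threading
import time
from typing import Any, Callable, Dict, Set, Tuple


class ServeStale:
    """Hypothetical wrapper: return the cached value, refresh stale ones in a thread."""

    def __init__(self, func: Callable[..., Any], fresh_for: float) -> None:
        self._func = func
        self._fresh_for = fresh_for  # seconds before a value counts as stale
        self._lock = threading.Lock()
        self._values: Dict[Tuple[Any, ...], Tuple[Any, float]] = {}
        self._refreshing: Set[Tuple[Any, ...]] = set()

    def _store(self, key: Tuple[Any, ...], value: Any) -> None:
        with self._lock:
            self._values[key] = (value, time.monotonic())

    def _refresh(self, key: Tuple[Any, ...]) -> None:
        try:
            self._store(key, self._func(*key))
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def __call__(self, *args: Any) -> Any:
        with self._lock:
            entry = self._values.get(args)
            if entry is not None:
                value, stored_at = entry
                stale = time.monotonic() - stored_at >= self._fresh_for
                if stale and args not in self._refreshing:
                    # At most one background refresh per key at a time.
                    self._refreshing.add(args)
                    threading.Thread(
                        target=self._refresh, args=(args,), daemon=True
                    ).start()
                return value  # possibly stale, but returned immediately
        # Nothing cached yet: the first call has to pay the full cost.
        value = self._func(*args)
        self._store(args, value)
        return value


calls = []


def load_report(name: str) -> str:
    calls.append(name)
    return f"{name}-v{len(calls)}"


cached_report = ServeStale(load_report, fresh_for=0.0)  # 0.0: every hit is already stale
first = cached_report("a")   # computed synchronously
second = cached_report("a")  # old value returned at once, refresh runs in the background
```

The tricky parts mentioned above show up even in this small version: deduplicating concurrent refreshes, deciding what to do when the refresh raises, and the fact that an async function would need a different mechanism (tasks on an event loop rather than threads).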
SatoshiReport@reddit
Is this better than Redis?
del1ro@reddit (OP)
It's not better or worse than Redis, it's just different. Redis is a database with client-server interaction, while moka-py is more like a Python dict than Redis.
With Redis you can expect >=1ms per request (~1ms in the best case, when Redis is hosted on the same server and requests go through loopback).
With moka-py the timings are MUCH more pleasant. pytest-benchmark shows a 160ns average time for `Moka.get`, which is 6250 times faster than the fastest GET request to Redis.
But moka-py lives in your Python process's memory, so each process has its own cache. It isn't persistent and can't be shared across the network or even between processes, only between threads (since threads share the same memory).
daivushe1@reddit
How does it compare to cachehox?
del1ro@reddit (OP)
Never heard of it. Google doesn't turn up anything meaningful
daivushe1@reddit
https://github.com/awolverp/cachebox
del1ro@reddit (OP)
cachebox looks like a much more mature tool. A quick glance shows a few differences:
* The `cachebox.cached` decorator doesn't use ParamSpec, so every decorated function is typed as just Any
* cachebox doesn't have time-to-idle functionality (this was a killer feature when I was choosing a cache lib for Rust)
* cachebox has a lot more eviction policies. moka-py has just one for now (TinyLFU)
Performance is literally the same. %timeit of `moka_py.cached(128)` vs `cachebox.cached(cachebox.LFUCache(128))` shows 576 ns ± 3.92 ns per loop for cachebox and 576 ns ± 1.57 ns per loop for moka-py
Electrical-Top-5510@reddit
Does it work with multiple instances of a service? Is it possible to use it in a distributed setup (how is the data kept in sync)? Where is the data stored? Is it client-server like Redis?
del1ro@reddit (OP)
Every process has its own cache. Sharing is available between threads though. You can think of it as a Python dict with some additional logic.
Client-server solutions like Redis use the network and hence have at least a 1ms delay (even over loopback). moka-py has a 500-800ns delay on average