CacheLite is the engine behind Redis, Memcached and Guava — rebuilt from first principles in the Python standard library. Hand-rolled data structures, four eviction policies, TTL expiry, thread safety, snapshots, an HTTP API and a CLI. Nothing imported beyond stdlib.
Every request starts with a get. CacheLite checks the hash map, confirms the entry hasn't passed its TTL, bumps its recency/frequency metadata, and hands back the value — all under one lock.
A miss and a stored None are different things. The decorator and API both use a sentinel so caching a function that legitimately returns None still works.
hit_rate, miss_rate and every counter update on the way through.
from cachelite import Cache cache = Cache(max_items=1000, policy="lru") cache.set("user:1", {"name": "Ada"}) cache.get("user:1") # -> {"name": "Ada"} cache.get("ghost") # -> None (counted as a miss) cache.stats().hit_rate # -> 0.5
A set writes the value, records a deep byte-size estimate, and registers the key with the active eviction policy. Give it a TTL inline, or lean on a cache-wide default.
The cache is bounded by item count, by memory, or both. Sizes recurse into nested containers — a list of strings reports its real footprint, not just the pointer.
TTLs use a monotonic clock, so NTP corrections never expire a key early.
cache = Cache(
max_items=10_000,
max_bytes=8_000_000, # 8 MB budget too
default_ttl=60,
)
cache.set("session:42", payload) # 60s default
cache.set("otp:42", code, ttl=30) # per-key
cache.ttl("otp:42") # -> ~30.0
When the cache fills, a policy picks the victim in O(1). No scanning, no sorting — the data structures keep the answer ready. Swap the policy with one string.
LRU rides a hand-built doubly-linked list. LFU tracks frequency buckets with a live minimum. FIFO tombstones lazily. Random does swap-remove on an array.
Eviction fires an on_evict event, so you can ship the victim somewhere else.
from cachelite.events import ON_EVICT cache = Cache(max_items=2, policy="lfu") cache.events.on(ON_EVICT, lambda k, v: print("dropped", k)) cache.set("a", 1); cache.set("b", 2) cache.get("a"); cache.get("a") # a is hotter cache.set("c", 3) # dropped b
Expiry is two-pronged, just like Redis. Lazy: every read checks the deadline and drops the key on the spot — zero overhead for keys nobody touches.
Active: an optional background thread samples random keys, evicts the dead ones, and sweeps again whenever more than a quarter of a sample was expired. Memory never bloats with stale keys.
invalidate("user:*") clears a whole namespace by glob in one call.
from cachelite.core.config import CacheConfig cache = Cache(config=CacheConfig( max_items=10_000, active_expiry=True, # background sweeper sweep_interval=1.0, sweep_sample=20, )) cache.invalidate("user:*") # bulk drop -> count
Snapshot the live cache to disk as JSON or pickle, then restore it into a fresh process. TTLs are stored as remaining seconds, so deadlines rebuild correctly against the new clock — no expired-on-load surprises.
Reach it over HTTP or from the shell. The same engine powers a RESTful server and a file-backed CLI.
Corrupt or foreign files raise a clear SnapshotError, never a silent half-load.
# in-process cache.snapshot("cache.json") fresh = Cache(max_items=1000) fresh.restore("cache.json") # TTLs intact # or over HTTP $ python -m cachelite serve --port 8080 $ curl -X PUT localhost:8080/cache/k -d '{"value":1}'
Sentinel head/tail nodes mean no null checks on the hot path. Access moves a node to the front; eviction pops the tail.
Three structures cooperate so the least-used key is always one lookup away. Ties break to the oldest entry at that frequency.
Insertion order in a deque; deletes leave tombstones that eviction skips. Access never reorders anything.
Swapping the victim with the last element gives O(1) removal of any key — no shifting, surprisingly competitive in practice.
Decorate any function. TTL, custom keys, None-safe, with cache_clear() and cache_info().
Many concurrent readers or one exclusive writer, plus striped locking to cut contention.
Hits, misses, evictions, expirations, hit rate and memory usage, snapshotted atomically.
on_hit, on_miss, on_evict, on_expire and more. Handlers are isolated — one crash never sinks the cache.
A threaded server on stdlib http.server: GET/PUT/DELETE, /stats, /keys, /invalidate, /flush.
set, get, keys, stats, flush, serve, demo — file-backed for persistence between calls.
Clone it, run the tests, follow a request through every stage. It's ~1,900 lines of commented Python and a clean reference for how production caches actually work.