Why is Redis so fast?
A first-principles look at the design decisions that make an in-memory store answer in microseconds
TL;DR
Redis is a librarian who keeps every book on a desk, never walks to the shelves, answers one question at a time so no one fights over a book, and writes in a single neat ledger instead of locking a filing cabinet.
The idea
Redis is fast because every value lives in RAM (no disk seek), its event loop handles all client commands on a single thread (no lock contention), its data structures are purpose-built to avoid wasteful allocation, and its network layer is I/O-multiplexed so thousands of connections add almost no overhead. Those four decisions compound: each one removes a different class of latency.
Where it shows up
System-design interviews — examiners expect you to explain why you'd reach for Redis over Memcached or a relational cache. Saying "it's in-memory" is table stakes; explaining the single-threaded event loop and O(1) data structures scores points.
On-call — Redis latency spikes most often trace to one of three causes: a blocking command (KEYS *, SMEMBERS on a huge set, or a Lua script) hogging the single thread; memory pressure forcing the eviction policy to run on every write; or a single large value serialised over a slow NIC. Knowing the architecture tells you where to look first.
Real systems — Twitter used Redis sorted sets for timelines (ranked by tweet ID). GitHub uses it for rate-limiting counters. Sidekiq stores its job queue in Redis lists. Stack Overflow's tag engine is backed by sorted sets. In every case the choice was driven by a specific data structure, not just "caching".
1. RAM eliminates the seek tax
A spinning-disk random read takes ~5–10 ms. An NVMe SSD gets that down to ~100 µs. A DRAM access is ~100 ns: about three orders of magnitude faster than NVMe and nearly five faster than a spinning disk. Redis keeps its entire dataset in RAM by design. There is no buffer pool, no page cache negotiation, no read-ahead. When you issue GET foo, the value is already in memory; the only variable latency is how fast the OS can move the request and reply bytes between the socket and userspace.
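The gap between those figures is easier to appreciate as a quick calculation. The sketch below uses only the rough latencies quoted above (taking the lower end of the disk range); the dictionary and function names are illustrative, not part of any Redis tooling.

```python
import math

# Rough figures quoted in the text, in seconds. These are ballpark numbers,
# not measurements from any particular machine.
LATENCY = {
    "spinning disk (random read)": 5e-3,  # lower end of ~5-10 ms
    "NVMe SSD": 100e-6,                   # ~100 µs
    "DRAM": 100e-9,                       # ~100 ns
}


def orders_of_magnitude_slower(medium: str, baseline: str = "DRAM") -> float:
    """Return log10 of how many times slower `medium` is than `baseline`."""
    return math.log10(LATENCY[medium] / LATENCY[baseline])


for name in LATENCY:
    gap = orders_of_magnitude_slower(name)
    print(f"{name}: {gap:.1f} orders of magnitude slower than DRAM")
```

With these inputs NVMe comes out at about 3 orders of magnitude and the spinning disk at about 4.7.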
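Persistence is kept off the fast path. RDB snapshots are written by a forked child process. AOF is disabled by default; when enabled, its default policy is appendfsync everysec, which runs fsync in the background about once a second. Only appendfsync always makes each write wait for the disk.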
2. Single-threaded event loop eliminates lock contention
Redis processes commands on one thread using an I/O-multiplexed event loop (built on epoll on Linux, kqueue on BSD/macOS). Every command runs to completion before the next one starts — there are no mutexes, no read-write locks, no condition variables around the data structures.
This sounds like a bottleneck, but contention is usually more expensive than serialisation. A multi-threaded cache with fine-grained locks pays cache-coherence traffic between CPU cores, lock-acquire overhead, and the occasional convoy effect. Redis avoids all of that. The throughput ceiling for a single thread on modern hardware is roughly 100 000–1 000 000 ops/sec depending on operation size — higher than most applications need from a single instance.
Redis 6 added optional I/O threading (off by default, enabled with the io-threads directive): multiple threads read from and write to sockets, but command execution itself is still single-threaded. This separation means network I/O and protocol handling no longer cap throughput on high-bandwidth hardware, without reintroducing data-structure locking.
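The run-to-completion loop described in this section can be sketched with Python's standard selectors module, which picks epoll or kqueue where available. This is a toy under stated assumptions, not how Redis is implemented: the command syntax is a simplified inline form, each recv is assumed to contain whole lines, sockets are left in blocking mode for brevity, and the names store, execute, on_readable and run_once are made up for the example.

```python
import selectors
import socket

store = {}                         # the dataset: touched by one thread only, so no locks
sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS when available


def execute(line: bytes) -> bytes:
    """Run one command to completion. Toy inline syntax: SET k v / GET k."""
    parts = line.split()
    if len(parts) == 3 and parts[0].upper() == b"SET":
        store[parts[1]] = parts[2]
        return b"+OK\r\n"
    if len(parts) == 2 and parts[0].upper() == b"GET":
        value = store.get(parts[1])
        if value is None:
            return b"$-1\r\n"
        return b"$%d\r\n%s\r\n" % (len(value), value)
    return b"-ERR unknown command\r\n"


def on_readable(conn: socket.socket) -> None:
    """Handle every command currently available on one connection."""
    data = conn.recv(4096)
    if not data:                   # peer closed the connection
        sel.unregister(conn)
        conn.close()
        return
    for line in data.splitlines():
        if line:
            conn.sendall(execute(line))


def run_once(timeout: float = 0.1) -> None:
    """One loop iteration: ask the kernel which sockets are ready, then serve them."""
    for key, _events in sel.select(timeout):
        key.data(key.fileobj)      # key.data holds the callback registered below


# Demo over an in-process socket pair instead of a real listening socket.
server_side, client = socket.socketpair()
sel.register(server_side, selectors.EVENT_READ, on_readable)
client.sendall(b"SET foo bar\r\nGET foo\r\n")
run_once()
print(client.recv(4096))
```

Because execute is only ever called from the loop thread, the dictionary needs no locking, which is the property the section describes. A production loop would also use non-blocking sockets and per-connection read and write buffers.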
3. Purpose-built data structures avoid wasted work
Redis doesn't store generic blobs with a key. It ships with strings, lists, hashes, sets, sorted sets, bitmaps, HyperLogLogs, streams, and more — each tuned for a specific access pattern.
Sorted sets are backed by a skip list (for range queries) and a hash table (for O(1) point lookups). That dual structure means ZRANGEBYSCORE and ZSCORE are both fast without compromise.
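A minimal sketch of that dual-structure idea follows. It swaps in a sorted Python list with binary search for the skip list, so inserts here cost O(n) rather than the O(log n) a skip list gives; only the pairing of an ordered index with a hash table is the point. ToySortedSet and its method names are illustrative.

```python
import bisect


class ToySortedSet:
    """Hash table for point lookups plus an ordered index for range queries."""

    def __init__(self):
        self._score_of = {}   # member -> score, O(1) lookups (like ZSCORE)
        self._ordered = []    # sorted (score, member) pairs, stand-in for the skip list

    def zadd(self, member, score):
        old = self._score_of.get(member)
        if old is not None:
            # Remove the stale pair so both structures stay in sync.
            idx = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[idx]
        self._score_of[member] = score
        bisect.insort(self._ordered, (score, member))

    def zscore(self, member):
        return self._score_of.get(member)

    def zrangebyscore(self, lo, hi):
        # Binary-search to the first pair with score >= lo, then walk forward.
        i = bisect.bisect_left(self._ordered, (lo,))
        out = []
        while i < len(self._ordered) and self._ordered[i][0] <= hi:
            out.append(self._ordered[i][1])
            i += 1
        return out


z = ToySortedSet()
z.zadd("alice", 30)
z.zadd("bob", 10)
z.zadd("carol", 20)
print(z.zscore("alice"))
print(z.zrangebyscore(10, 20))
```

Keeping two structures costs extra memory and means every write touches both, which is the trade made to keep both query shapes fast.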
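Lists are quicklists: a doubly-linked list whose nodes are compact listpacks (ziplists before Redis 7). Head and tail operations stay O(1), which suits queues and stacks, while per-element pointer overhead is much lower than in a plain linked list.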
Small hashes and small sorted sets use a compact listpack encoding (formerly ziplist) when they're below configurable size thresholds, and small sets made up only of integers use a similarly compact intset. A listpack is a contiguous byte array: no pointer chasing, excellent CPU cache locality. Once the collection grows past the threshold, Redis promotes it to a proper hash table or skip list. You get compact storage for small collections automatically.
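The encoding switch can be illustrated with a toy hash that stores entries in one flat list until it crosses a threshold, then converts itself to a dictionary. A Python list is not a contiguous byte array, so this shows only the promotion logic, not the memory layout. ToyHash and max_compact_entries are hypothetical names; in Redis the real thresholds are set by directives such as hash-max-listpack-entries.

```python
class ToyHash:
    """Small hashes live in one flat list; larger ones are promoted to a dict."""

    def __init__(self, max_compact_entries=4):
        self.max_compact_entries = max_compact_entries  # hypothetical threshold
        self._flat = []      # [field1, value1, field2, value2, ...]
        self._table = None   # becomes a dict after promotion

    @property
    def encoding(self):
        return "compact" if self._table is None else "hashtable"

    def hset(self, field, value):
        if self._table is not None:
            self._table[field] = value
            return
        # Linear scan: acceptable while the collection is tiny.
        for i in range(0, len(self._flat), 2):
            if self._flat[i] == field:
                self._flat[i + 1] = value
                return
        self._flat += [field, value]
        if len(self._flat) // 2 > self.max_compact_entries:
            # One-way promotion once the threshold is exceeded.
            self._table = dict(zip(self._flat[0::2], self._flat[1::2]))
            self._flat = []

    def hget(self, field):
        if self._table is not None:
            return self._table.get(field)
        for i in range(0, len(self._flat), 2):
            if self._flat[i] == field:
                return self._flat[i + 1]
        return None


h = ToyHash(max_compact_entries=2)
h.hset("name", "ada")
h.hset("lang", "python")
print(h.encoding)
h.hset("city", "london")
print(h.encoding)
```

The linear scan looks wasteful, but for a handful of entries it can beat hashing because everything sits close together in memory.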
Strings are Simple Dynamic Strings (SDS) — a length-prefixed byte array rather than null-terminated C strings. STRLEN is O(1) because the length is stored, not computed. Appending is amortised O(1) because SDS tracks available capacity.
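The two properties above, a stored length and tracked spare capacity, can be sketched as follows. This is loosely modelled on the idea of SDS rather than its actual header layout, and the plain doubling growth policy is an assumption of the sketch. ToySDS and the reallocations counter are illustrative names.

```python
class ToySDS:
    """Length-prefixed buffer with spare capacity, loosely modelled on SDS."""

    def __init__(self, initial: bytes = b""):
        self.length = len(initial)                 # bytes in use: stored, not computed
        self.buf = bytearray(max(self.length, 1))  # allocated capacity
        self.buf[:self.length] = initial
        self.reallocations = 0                     # counts buffer growth events

    def strlen(self) -> int:
        return self.length                         # O(1): read the stored length

    def append(self, data: bytes) -> None:
        needed = self.length + len(data)
        if needed > len(self.buf):
            # Grow geometrically so repeated appends are amortised O(1).
            # Doubling is this sketch's assumption, not Redis's exact policy.
            new_buf = bytearray(max(needed, 2 * len(self.buf)))
            new_buf[:self.length] = self.buf[:self.length]
            self.buf = new_buf
            self.reallocations += 1
        self.buf[self.length:needed] = data
        self.length = needed

    def value(self) -> bytes:
        # Binary safe: embedded NUL bytes do not end the string.
        return bytes(self.buf[:self.length])


s = ToySDS(b"foo")
s.append(b"\x00bar")
print(s.strlen(), s.value())
```

Because the length is explicit, a value containing a zero byte is stored intact, which a null-terminated C string could not do.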
4. I/O multiplexing handles thousands of connections cheaply
epoll (Linux) lets the kernel notify Redis of which file descriptors are readable/writable in a single syscall, regardless of how many connections are open. The cost of idle connections is near zero. Contrast this with the old select() model, which scanned every fd on every call, or a thread-per-connection model, where 10 000 connections means 10 000 stacks sitting in memory.
Pipelining compounds this: a client can send many commands in a single write without waiting for each reply, then read all the responses together. That collapses N round trips into roughly one RTT and amortises syscall overhead across the batch.
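A back-of-the-envelope model makes the saving concrete. The function below assumes a fixed round-trip time and a fixed per-command execution cost; the example figures (0.5 ms RTT, 5 µs per command, 1 000 commands) are hypothetical, chosen only to show the shape of the result.

```python
def total_time_ms(n_commands: int, rtt_ms: float, exec_ms: float, pipelined: bool) -> float:
    """Estimate wall-clock time for a batch under a simple latency model.

    Without pipelining every command pays a full round trip; with pipelining
    the whole batch shares one.
    """
    round_trips = 1 if pipelined else n_commands
    return round_trips * rtt_ms + n_commands * exec_ms


# Hypothetical figures, not measurements.
N, RTT_MS, EXEC_MS = 1000, 0.5, 0.005

sequential = total_time_ms(N, RTT_MS, EXEC_MS, pipelined=False)
batched = total_time_ms(N, RTT_MS, EXEC_MS, pipelined=True)
print(f"sequential: {sequential:.1f} ms, pipelined: {batched:.1f} ms")
```

Under these assumptions the sequential batch takes about 505 ms and the pipelined one about 5.5 ms: the network round trip, not command execution, dominates the unpipelined case.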
5. Protocol simplicity reduces parsing overhead
RESP (Redis Serialisation Protocol) is line-oriented and length-prefixed. There's no XML/JSON parsing, no schema negotiation, no compression negotiation on the hot path. A server reading *2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n can determine the argument count and each argument's length with a handful of pointer increments, and never has to scan a payload for delimiters. This is boring engineering, which is exactly why it works.
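To show how little work that framing requires, here is a minimal encoder and parser for the request shape quoted above: an array of bulk strings. It is a sketch that assumes the whole request is already in the buffer and covers only this one message type, not the full protocol. The function names are illustrative.

```python
def encode_command(*args: bytes) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        out.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(out)


def parse_command(buf: bytes):
    """Parse one array-of-bulk-strings request.

    Returns (args, bytes_consumed). Raises ValueError on malformed or
    incomplete header lines; this sketch does not buffer partial input.
    """
    if not buf.startswith(b"*"):
        raise ValueError("expected array header")
    end = buf.index(b"\r\n")
    count = int(buf[1:end])
    pos = end + 2
    args = []
    for _ in range(count):
        if buf[pos:pos + 1] != b"$":
            raise ValueError("expected bulk string header")
        end = buf.index(b"\r\n", pos)
        length = int(buf[pos + 1:end])
        start = end + 2
        # The length prefix says exactly how many bytes to take, so the
        # payload is never scanned and may itself contain \r\n.
        args.append(buf[start:start + length])
        pos = start + length + 2   # skip the payload and its trailing CRLF
    return args, pos


wire = encode_command(b"GET", b"foo")
print(wire)
print(parse_command(wire))
```

The only searches are for the short header lines; payload bytes are sliced by length, which is what keeps the format binary safe and cheap to parse.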
6. Memory allocator tuning
Redis uses jemalloc as its default allocator on Linux. jemalloc uses multiple arenas and size-class buckets to reduce fragmentation and lock contention compared with the system's default malloc. Memory fragmentation in Redis is tracked via INFO memory (mem_fragmentation_ratio), which is the process RSS divided by the memory Redis has accounted for (used_memory). A ratio significantly above 1.0 means Redis is holding more RSS than its own bookkeeping says it needs, a sign that the allocator has fragmented, which can waste tens or hundreds of MB and slow allocation paths.
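The ratio can be recomputed from the raw fields of INFO memory, which helps when deciding whether a reading is worth acting on. The sketch below parses a hand-written sample of that output; the byte values are hypothetical, and the 1.5 alert threshold is an assumption for illustration rather than an official recommendation.

```python
# Hypothetical excerpt of `INFO memory` output (values invented for the example).
SAMPLE_INFO = (
    "# Memory\r\n"
    "used_memory:1073741824\r\n"
    "used_memory_rss:1610612736\r\n"
)


def parse_info(text: str) -> dict:
    """Turn INFO-style 'key:value' lines into a dict, skipping section headers."""
    fields = {}
    for line in text.splitlines():
        if line and not line.startswith("#") and ":" in line:
            key, _, value = line.partition(":")
            fields[key] = value
    return fields


def fragmentation_ratio(info: dict) -> float:
    """RSS divided by the memory Redis itself has accounted for."""
    return int(info["used_memory_rss"]) / int(info["used_memory"])


ALERT_THRESHOLD = 1.5  # assumed threshold for this sketch

info = parse_info(SAMPLE_INFO)
ratio = fragmentation_ratio(info)
wasted_mb = (int(info["used_memory_rss"]) - int(info["used_memory"])) / 2**20
print(f"ratio={ratio:.2f}, RSS above used_memory: {wasted_mb:.0f} MB, alert={ratio > ALERT_THRESHOLD}")
```

In this sample the ratio is 1.5, meaning the process holds 512 MB more resident memory than Redis's own accounting explains.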
Putting it together
The speed is not one trick. It is the compounding of: no disk seek (RAM), no lock wait (single-threaded command execution), no pointer chasing for small collections (listpack), no idle-connection overhead (epoll), and no heavy serialisation (RESP). Remove any one of these and Redis is still fast. All together, they produce sub-millisecond p99 latency at hundreds of thousands of ops/sec on commodity hardware.