
Redis in Production: Caching Strategies That Actually Work (And Some That Don't)

A production-tested guide to Redis caching strategies — cache-aside, write-through, invalidation patterns, TTL design, and the mistakes that cost us real money.


I’ve been running Redis in production on e-commerce platforms for several years now, and I can tell you that most caching tutorials leave out the parts that actually matter. They show you SET key value EX 300 and call it a day. Real production caching is about trade-offs, invalidation strategies, failure modes, and knowing when NOT to cache something.

This is everything I wish someone had told me before I learned it the hard way.

The Three Core Strategies (And When Each One Fits)

Cache-Aside (Lazy Loading)

This is the most common pattern, and for good reason. The application checks the cache first. On a miss, it reads from the database, writes to the cache, and returns the data.

async function getProduct(id: string): Promise<Product> {
  const cacheKey = `product:${id}`;

  // 1. Check cache
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // 2. Cache miss — read from DB
  const product = await productRepo.findById(id);
  if (!product) throw new NotFoundException();

  // 3. Populate cache
  await redis.set(cacheKey, JSON.stringify(product), 'EX', 600);

  return product;
}

When it works: Read-heavy workloads where stale data is acceptable for short periods. Product catalogs, user profiles, configuration data.

When it doesn’t: Data that changes frequently and must be immediately consistent. If you cache a user’s cart with cache-aside and they add an item, they’ll see the old cart until the TTL expires (unless you actively invalidate — more on that later).

Trade-off: Simple to implement, but every cache miss is a slow request. Cold cache after a deploy or Redis restart means a thundering herd to the database.
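One common way to soften the thundering herd is to coalesce concurrent misses for the same key inside each application process, so only one caller hits the database while the others wait for its result. The sketch below is a minimal illustration, not the setup described in this post: `singleFlight` and `inFlight` are hypothetical names, and it only deduplicates within a single process, not across instances.

```typescript
// In-process request coalescing ("single flight"): concurrent callers for the
// same key share one in-flight load instead of each querying the database.
const inFlight = new Map<string, Promise<unknown>>();

async function singleFlight<T>(key: string, loader: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  // Remove the entry once the load settles (success or failure) so that
  // later misses trigger a fresh load instead of reusing a stale promise.
  const pending = loader().finally(() => inFlight.delete(key));
  inFlight.set(key, pending);
  return pending;
}
```

In the cache-aside function above, the database read and cache write on a miss could be wrapped as the `loader`, keyed by the same cache key.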

Write-Through

Every write goes to both the database and the cache as part of the same operation. Reads can then be served from the cache without waiting for a miss to repopulate it.

async function updateProduct(id: string, data: UpdateProductDto): Promise<Product> {
  // 1. Update database
  const product = await productRepo.update(id, data);

  // 2. Update cache (synchronously — part of the write path)
  const cacheKey = `product:${id}`;
  await redis.set(cacheKey, JSON.stringify(product), 'EX', 600);

  return product;
}

When it works: Data that’s read frequently and updated occasionally, AND where reads need to reflect the latest write almost immediately. Session data, feature flags, pricing rules.

When it doesn’t: Write-heavy workloads. You’re paying the latency cost of a Redis write on every single database write, even for data nobody might read.

Trade-off: Cache is always fresh, but writes are slower. And you still need a TTL as a safety net — if the write-through fails silently, the cache diverges forever without one.

Write-Behind (Write-Back)

Writes go to the cache first, and the cache asynchronously flushes to the database. This is the fastest for writes but the most dangerous.

When it works: High-frequency counters, analytics events, real-time metrics — where losing a few data points is acceptable.

When it doesn’t: Anything transactional. If Redis crashes before flushing to the database, that data is gone. I would never use this for orders, payments, or anything financial.

Trade-off: Maximum write performance, but you’re accepting potential data loss. In my experience, this pattern is rarely the right choice for core business data. We use it for view counts and analytics pipelines — that’s about it.
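For a view counter, write-behind might look roughly like the sketch below. It is an illustration under assumptions: the `CounterCache` interface, the `views:pending:` key prefix, and the `persistDelta` callback are all hypothetical, and the method names mirror ioredis-style clients (`incrby`, `getdel`). `GETDEL` requires Redis 6.2 or newer.

```typescript
// Minimal slice of a Redis client that this sketch relies on (assumed interface).
interface CounterCache {
  incrby(key: string, amount: number): Promise<number>;
  getdel(key: string): Promise<string | null>; // GETDEL needs Redis 6.2+
}

// Hot path: only touches Redis, never the database.
async function recordView(cache: CounterCache, productId: string): Promise<void> {
  await cache.incrby(`views:pending:${productId}`, 1);
}

// Background job: atomically read and clear each pending counter, then persist
// the delta. If persistDelta fails after GETDEL, that delta is lost, which is
// exactly the data-loss risk this pattern accepts.
async function flushViews(
  cache: CounterCache,
  productIds: string[],
  persistDelta: (productId: string, delta: number) => Promise<void>,
): Promise<number> {
  let flushed = 0;
  for (const productId of productIds) {
    const raw = await cache.getdel(`views:pending:${productId}`);
    if (raw === null) continue; // nothing recorded since the last flush
    await persistDelta(productId, Number(raw));
    flushed += 1;
  }
  return flushed;
}
```

Reading and clearing in one command avoids losing increments that arrive between a separate read and delete, but it does nothing for a Redis crash before the flush runs.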

TTL Strategies Beyond “Set It to 5 Minutes”

The number one TTL anti-pattern I see is slapping the same TTL on everything. Your product catalog and your real-time inventory count have wildly different freshness requirements. Treat them differently.
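One way to make per-data-type TTLs explicit is a small policy table keyed by cache-key prefix. This is only a sketch: the prefixes, the `ttlForKey` helper, and every number except the 600-second product TTL (which appears in the examples in this post) are placeholder assumptions, not recommendations.

```typescript
// Hypothetical TTL policy keyed by cache-key prefix. The numbers are
// placeholders that illustrate the shape, not tuned values.
const TTL_SECONDS_BY_PREFIX = new Map<string, number>([
  ['catalog', 3600], // assumed: changes rarely
  ['product', 600], // matches the 10-minute TTL used in this post's examples
  ['inventory', 15], // assumed: must stay close to real time
]);

const DEFAULT_TTL_SECONDS = 300;

// Resolve the TTL from the first segment of a key such as "product:42".
function ttlForKey(key: string): number {
  const prefix = key.split(':')[0] ?? '';
  return TTL_SECONDS_BY_PREFIX.get(prefix) ?? DEFAULT_TTL_SECONDS;
}
```

A call site would then pass `ttlForKey(cacheKey)` instead of a hard-coded number, which keeps the freshness decision in one reviewable place.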

Here’s how we think about TTL design:

We also use adaptive TTLs based on access patterns:

async function getWithAdaptiveTTL(key: string, fetcher: () => Promise<any>): Promise<any> {
  const cached = await redis.get(key);
  if (cached) {
    // Extend TTL on access (most-recently-used stays warm)
    const currentTTL = await redis.ttl(key);
    if (currentTTL < 120) {
      await redis.expire(key, 600); // Reset to 10 min on access
    }
    return JSON.parse(cached);
  }

  const data = await fetcher();
  await redis.set(key, JSON.stringify(data), 'EX', 600);
  return data;
}

Frequently accessed data stays warm. Rarely accessed data naturally evicts. Your cache memory stays focused on what matters.

Cache Invalidation: The Real Hard Problem

Phil Karlton said there are two hard things in computer science: cache invalidation and naming things. He wasn’t wrong about the first one.

Here are the invalidation patterns we use, from simplest to most robust:

TTL-Based Expiration (Passive)

Let it expire naturally. This is fine when stale data is acceptable. It’s simple, and simple is underrated.

Explicit Invalidation (Active)

When data changes, delete the cache key:

async function updateProduct(id: string, data: UpdateProductDto): Promise<Product> {
  const product = await productRepo.update(id, data);

  // Delete, don't update — let the next read repopulate
  await redis.del(`product:${id}`);

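  // Don't forget related caches!
  await redis.del(`catalog:${product.categoryId}`);

  // DEL does not expand glob patterns, so find matching search keys with SCAN first.
  // Assumes an ioredis-style scan(); in cluster mode each node must be scanned separately.
  let cursor = '0';
  do {
    const [next, keys] = await redis.scan(cursor, 'MATCH', 'search:products:*', 'COUNT', 100);
    await Promise.all(keys.map((key) => redis.del(key)));
    cursor = next;
  } while (cursor !== '0');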

  return product;
}

Important: Delete, don’t update. If you update the cache and the database write fails or rolls back, your cache has phantom data. Delete is idempotent and safe.

But notice the search:products:* — that’s where invalidation gets ugly. A product change might affect category listings, search results, recommendation feeds, and homepage features. Missing even one creates a stale data bug that’s incredibly hard to track down.

Event-Driven Invalidation

This is what we settled on for anything non-trivial. Database changes publish events, and a dedicated cache invalidation consumer handles the cleanup:

// Event handler — separate service/consumer
@OnEvent('product.updated')
async handleProductUpdate(event: ProductUpdatedEvent) {
  const { productId, categoryId, previousCategoryId } = event;

  const keysToInvalidate = [
    `product:${productId}`,
    `catalog:${categoryId}`,
    `homepage:featured`,
  ];

  // If category changed, invalidate old category too
  if (previousCategoryId && previousCategoryId !== categoryId) {
    keysToInvalidate.push(`catalog:${previousCategoryId}`);
  }

  await Promise.all(keysToInvalidate.map(key => redis.del(key)));
}

This centralizes invalidation logic. When a new cache is added, you add its invalidation rule to the event handler — one place to maintain instead of scattered redis.del() calls across the codebase.
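To make "one place to maintain" concrete, the key selection can be pulled out into a list of small rules that is easy to unit test. The sketch below reuses the event fields from the handler above; the `InvalidationRule` type, the `productUpdatedRules` list, and the `keysToInvalidate` function are hypothetical names for illustration.

```typescript
// Event shape taken from the handler above; the rule list is a hypothetical design.
interface ProductUpdatedEvent {
  productId: string;
  categoryId: string;
  previousCategoryId?: string;
}

type InvalidationRule = (event: ProductUpdatedEvent) => string[];

// Each cache contributes one rule. Adding a new cache means adding one entry here.
const productUpdatedRules: InvalidationRule[] = [
  (e) => [`product:${e.productId}`],
  (e) => [`catalog:${e.categoryId}`],
  (e) =>
    e.previousCategoryId && e.previousCategoryId !== e.categoryId
      ? [`catalog:${e.previousCategoryId}`]
      : [],
  () => ['homepage:featured'],
];

// Collect keys from every rule and drop duplicates, preserving order.
function keysToInvalidate(event: ProductUpdatedEvent): string[] {
  const keys: string[] = [];
  for (const rule of productUpdatedRules) {
    keys.push(...rule(event));
  }
  return keys.filter((key, index) => keys.indexOf(key) === index);
}
```

The event handler would then only need to call `keysToInvalidate(event)` and delete the result, so the rules can be tested without a Redis connection.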

Redis Cluster on AWS ElastiCache: What We Learned

Running Redis on ElastiCache in cluster mode has some sharp edges:

When NOT to Cache

This is the section most articles skip. Here’s when caching makes things worse:

The rule of thumb: If your cache hit rate is below 80%, investigate whether that cache is earning its keep. Below 50%, it’s probably hurting more than helping.
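The rule of thumb translates directly into a small check. In this sketch the hit and miss counts are assumed to come from per-pattern counters or from the `keyspace_hits` and `keyspace_misses` fields of Redis `INFO stats`; the function names and verdict labels are made up for illustration.

```typescript
type CacheVerdict = 'healthy' | 'investigate' | 'likely-harmful' | 'no-traffic';

// Hit rate as a fraction between 0 and 1, or null when there was no traffic.
function hitRate(hits: number, misses: number): number | null {
  const total = hits + misses;
  return total === 0 ? null : hits / total;
}

// Apply the 80% / 50% thresholds from the rule of thumb above.
function assessCache(hits: number, misses: number): CacheVerdict {
  const rate = hitRate(hits, misses);
  if (rate === null) return 'no-traffic';
  if (rate < 0.5) return 'likely-harmful';
  if (rate < 0.8) return 'investigate';
  return 'healthy';
}
```

Note that the server-wide `INFO` numbers blend every key pattern together, so a per-pattern breakdown is usually needed to tell which cache is underperforming.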

Monitoring Cache Health

A cache you don’t monitor is a cache waiting to betray you. Here’s what we track:

We use OpenTelemetry to instrument all Redis calls and ship traces to our observability stack. Every cache operation gets a span with the key pattern (not the full key — you don’t want high-cardinality labels), the operation type, and whether it was a hit or miss.

// Simplified OTEL instrumentation wrapper
async function cachedGet<T>(key: string, fetcher: () => Promise<T>, ttl: number): Promise<T> {
  const span = tracer.startSpan('cache.get', {
    attributes: { 'cache.key_pattern': key.split(':')[0], 'cache.ttl': ttl },
  });

  try {
    const cached = await redis.get(key);
    if (cached) {
      span.setAttribute('cache.hit', true);
      cacheHitCounter.inc({ pattern: key.split(':')[0] });
      return JSON.parse(cached);
    }

    span.setAttribute('cache.hit', false);
    cacheMissCounter.inc({ pattern: key.split(':')[0] });
    const data = await fetcher();
    await redis.set(key, JSON.stringify(data), 'EX', ttl);
    return data;
  } finally {
    span.end();
  }
}

The Bottom Line

Redis is one of the most powerful tools in a backend engineer’s toolkit, but it’s not magic. It’s a trade-off machine: memory for speed, consistency for performance, simplicity for resilience. Every caching decision should start with “what’s the cost of stale data?” and “what’s the cost of a cache miss?” — not with “let’s cache everything and hope for the best.”

Start with cache-aside for most things. Add write-through only where consistency matters. Use event-driven invalidation for anything complex. Monitor your hit rates religiously. And never, ever forget: the fastest cache miss is the one you prevent by not caching data that shouldn’t be cached in the first place.
