Understanding Caching in Scalable Systems

By Oleksandr Andrushchenko

Caching speeds up page loading and reduces the load on servers and databases. In this model, for example, the dispatcher first checks if a request was made previously and returns the stored result to avoid redundant processing. Databases often perform better with uniform read/write distribution, and caching can help handle uneven loads and traffic spikes.
Cache use cases

Client caching

Caches can exist on the client (OS or browser), on the server, or in a dedicated cache layer.

CDN caching

Content delivery networks (CDNs) act as distributed caches to improve performance and availability.

Web server caching

Reverse proxies and caching servers like Varnish can serve static and dynamic content directly. Web servers can cache responses to avoid repeated calls to application servers.

Database caching

Databases often include default caching mechanisms optimized for general use. Adjusting these settings for specific usage patterns can improve performance further.

Application caching

In-memory caches like Memcached or Redis act as fast key-value stores between your application and storage. Since data is held in RAM, access is much faster than from disk-based databases. Cache invalidation strategies like least recently used (LRU) help maintain frequently accessed data in memory. Redis also provides persistence and advanced data structures such as sorted sets and lists.
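A minimal sketch of the LRU eviction policy mentioned above, built on Python's `collections.OrderedDict` (the `LRUCache` class name and capacity of 2 are illustrative; production systems would rely on Redis or Memcached's built-in eviction):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)        # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" becomes the most recently used entry
cache.set("c", 3)   # capacity exceeded: "b" is evicted
```

Reading a key refreshes its position, so hot entries survive eviction while cold ones fall out.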
Cacheable items generally fall into two categories, database queries and objects:

  • Row-level
  • Query-level
  • Complete serializable objects
  • Fully-rendered HTML

File-based caching is usually discouraged in auto-scaling environments, since each instance keeps its own copy and cached files are lost whenever instances are replaced.

Caching at the database query level

Queries can be hashed and stored in cache. Challenges include:

  • Deleting cached results for complex queries
  • Invalidating cached queries when underlying data changes
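The hashing step can be sketched as follows; `run_query` is a hypothetical callable standing in for the real database client, and a plain dict stands in for Redis or Memcached:

```python
import hashlib

cache = {}  # stand-in for Redis/Memcached

def query_cache_key(sql, params=()):
    """Build a deterministic cache key by hashing the normalized query text."""
    normalized = " ".join(sql.split()).lower()
    payload = normalized + "|" + repr(params)
    return "q:" + hashlib.sha256(payload.encode()).hexdigest()

def cached_query(sql, params, run_query):
    """Serve a query result from cache, falling back to the database on a miss."""
    key = query_cache_key(sql, params)
    if key not in cache:
        cache[key] = run_query(sql, params)
    return cache[key]

calls = []
def fake_db(sql, params):
    calls.append(sql)
    return [("alice",)]

rows = cached_query("SELECT name FROM users WHERE id = ?", (1,), fake_db)
rows_again = cached_query("select name  from users where id = ?", (1,), fake_db)
```

Normalizing whitespace and case before hashing means trivially different spellings of the same query share one cache entry; the invalidation challenges listed above remain, since nothing here knows when the underlying rows change.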

Caching at the object level

Consider your data as objects, similar to application code. Assemble datasets into objects and cache them:

  • Remove objects from cache if underlying data changes
  • Enable asynchronous processing with workers using the latest cached objects

Examples of cacheable objects:

  • User sessions
  • Rendered web pages
  • Activity streams
  • User graph data
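Taking user sessions as an example, object caching with explicit invalidation might look like this (a dict stands in for an in-memory store such as Redis; the function names are illustrative):

```python
import json

cache = {}  # stand-in for an in-memory store such as Redis

def store_session(user_id, session):
    """Cache a fully serialized session object."""
    cache[f"session:{user_id}"] = json.dumps(session)

def load_session(user_id):
    raw = cache.get(f"session:{user_id}")
    return json.loads(raw) if raw is not None else None

def invalidate_session(user_id):
    """Remove the object when its underlying data changes."""
    cache.pop(f"session:{user_id}", None)

store_session(1, {"cart": ["book"]})
```

Serializing the whole object keeps reads to a single cache lookup instead of reassembling it from multiple rows.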

Cache update strategies

Limited cache capacity requires choosing an appropriate update strategy.

Cache-aside

The application handles storage access. Steps:

  • Check cache for entry
  • If missing, load from database
  • Add entry to cache
  • Return the result

Cache-aside

Also called lazy loading. Only requested data is cached, reducing unnecessary memory use.
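The four steps above can be sketched in a few lines; the dicts stand in for Memcached/Redis and the real database, and `get_user` is an illustrative name:

```python
cache = {}                               # stand-in for Memcached/Redis
database = {"user:1": {"name": "Ada"}}   # stand-in for the real database

def get_user(user_id):
    key = f"user:{user_id}"
    entry = cache.get(key)       # 1. check the cache for the entry
    if entry is None:
        entry = database[key]    # 2. on a miss, load from the database
        cache[key] = entry       # 3. add the entry to the cache
    return entry                 # 4. return the result
```

The application owns all the logic; the cache and database never talk to each other directly.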

Disadvantages of cache-aside

  • Each cache miss causes multiple trips, adding latency
  • Cached data can become stale; mitigated by TTL or write-through strategies
  • Node replacement starts empty, temporarily increasing latency

Write-through

The cache acts as the primary store; updates propagate to the database:

  • Application updates cache
  • Cache synchronously writes to database
  • Return the result
Write-through cache

Read performance is fast; write operations are slower but data remains consistent.
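A minimal sketch of the write path, again using dicts as stand-ins for the cache and database:

```python
cache = {}     # treated as the primary store
database = {}  # stand-in for the backing database

def write_through(key, value):
    cache[key] = value      # application writes to the cache...
    database[key] = value   # ...which synchronously writes to the database
    return value

def read(key):
    return cache[key]       # reads are always served from the cache

write_through("user:1", "Ada")
```

Because every write lands in both stores before returning, a read never sees data the database does not also have.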

Disadvantages of write-through

  • New nodes start empty and must wait for database updates
  • Unaccessed data may be written unnecessarily; TTL can limit this

Write-behind (write-back)

The application writes to the cache, which asynchronously updates the database. This improves write performance.
Write-behind cache (write-back)

How it works:

  • When the application writes data, it updates the cache only — not the database immediately.
  • The cache queues or batches the writes and asynchronously flushes them to the database later (after some delay or when the batch is full).
  • This allows for very fast write operations from the app’s perspective.
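The queue-and-flush behavior can be sketched as below. The batch size is illustrative, and the flush is synchronous here for brevity; a real write-behind cache would flush from a background thread or timer:

```python
from collections import deque

cache = {}
database = {}
pending = deque()   # queued writes waiting to be flushed
BATCH_SIZE = 3

def write_behind(key, value):
    cache[key] = value             # fast path: only the cache is updated
    pending.append((key, value))
    if len(pending) >= BATCH_SIZE:
        flush()                    # real systems flush asynchronously

def flush():
    while pending:
        key, value = pending.popleft()
        database[key] = value

write_behind("a", 1)
write_behind("b", 2)   # still only in the cache and the queue
write_behind("c", 3)   # batch is full: all three writes reach the database
```

Anything sitting in `pending` when the cache node dies is lost, which is exactly the data-loss risk noted below.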

Disadvantages of write-behind

  • Risk of data loss if cache fails before writing to database
  • More complex to implement than cache-aside or write-through

Read-through

How it works:

  • The application reads data through the cache.
  • If the data is not in the cache (a cache miss), the cache itself fetches it from the underlying data source (e.g., a database), stores it, and returns it to the app.
  • Future reads for the same key hit the cache directly.

Goal: Simplify logic for the client — it never needs to fetch from the database directly.
When used: When reads are frequent and you can tolerate the first-miss latency.

Read-through cache
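The key difference from cache-aside is that the loading logic lives inside the cache itself. A minimal sketch, with `from_database` as a hypothetical loader function:

```python
class ReadThroughCache:
    """The cache itself fetches from the data source on a miss."""
    def __init__(self, loader):
        self._loader = loader   # e.g. a database lookup function
        self._data = {}

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._loader(key)   # miss: fetch, store, return
        return self._data[key]

loads = []
def from_database(key):
    loads.append(key)
    return key.upper()

users = ReadThroughCache(from_database)
first = users.get("ada")    # cache miss: goes to the database
second = users.get("ada")   # cache hit: no database call
```

The client only ever calls `get`; it never needs to know a database exists, which is the simplification the strategy aims for.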

Refresh-ahead

The cache automatically refreshes recently accessed entries before expiration, reducing latency for future reads if predictions are accurate.
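One way to sketch this policy: refresh an entry whenever a read finds it inside a configurable window before its TTL expires. The class name and parameters are illustrative, and the refresh is synchronous here, whereas a real implementation would refresh in the background:

```python
import time

class RefreshAheadCache:
    """Re-fetches an entry shortly before its TTL expires."""
    def __init__(self, loader, ttl, refresh_window=0.5):
        self._loader = loader
        self._ttl = ttl
        self._refresh_window = refresh_window   # fraction of the TTL
        self._data = {}                         # key -> (value, expires_at)

    def get(self, key):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            return self._refresh(key, now)      # miss or already expired
        value, expires_at = entry
        if expires_at - now < self._ttl * self._refresh_window:
            return self._refresh(key, now)      # close to expiry: refresh ahead
        return value

    def _refresh(self, key, now):
        value = self._loader(key)               # synchronous here for brevity
        self._data[key] = (value, now + self._ttl)
        return value

cache = RefreshAheadCache(lambda k: k * 2, ttl=60)
```

Only keys that are actually being read get refreshed, which is why mispredicting which entries will be needed (the disadvantage below) turns the extra fetches into wasted work.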

Disadvantages of refresh-ahead

  • Incorrect predictions can degrade performance

General cache considerations

  • Maintaining consistency between cache and database requires careful invalidation
  • Determining when to update the cache adds complexity
  • May require application changes, like integrating Redis or Memcached