Understanding Caching in Scalable Systems

By Oleksandr Andrushchenko

Caching speeds up page loading and reduces the load on servers and databases. In this model, for example, the dispatcher first checks if a request was made previously and returns the stored result to avoid redundant processing. Databases often perform better with uniform read/write distribution, and caching can help handle uneven loads and traffic spikes.
Cache use cases

Client caching

Caches can exist on the client (OS or browser), on the server, or in a dedicated cache layer.

CDN caching

Content delivery networks (CDNs) act as distributed caches to improve performance and availability.

Web server caching

Reverse proxies and caching servers like Varnish can serve static and dynamic content directly. Web servers can cache responses to avoid repeated calls to application servers.

Database caching

Databases often include default caching mechanisms optimized for general use. Adjusting these settings for specific usage patterns can improve performance further.

Application caching

In-memory caches like Memcached or Redis act as fast key-value stores between your application and storage. Since data is held in RAM, access is much faster than from disk-based databases. Cache invalidation strategies like least recently used (LRU) help maintain frequently accessed data in memory. Redis also provides persistence and advanced data structures such as sorted sets and lists.
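A minimal sketch of the LRU eviction policy mentioned above, built on Python's `collections.OrderedDict` (the `LRUCache` class name and capacity of 2 are illustrative; production systems would rely on Redis or Memcached's built-in eviction):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)        # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" becomes the most recently used entry
cache.set("c", 3)   # capacity exceeded: "b" is evicted
```

Reading a key refreshes its position, so hot entries survive eviction while cold ones fall out.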
Cacheable items generally fall into two categories, database queries and objects:

  • Row-level
  • Query-level
  • Complete serializable objects
  • Fully-rendered HTML

File-based caching is usually discouraged in auto-scaling environments, since each instance keeps its own copy and cached files are lost whenever instances are replaced.

Caching at the database query level

Queries can be hashed and stored in cache. Challenges include:

  • Deleting cached results for complex queries
  • Invalidating cached queries when underlying data changes
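The hashing step can be sketched as follows; `run_query` is a hypothetical callable standing in for the real database client, and a plain dict stands in for Redis or Memcached:

```python
import hashlib

cache = {}  # stand-in for Redis/Memcached

def query_cache_key(sql, params=()):
    """Build a deterministic cache key by hashing the normalized query text."""
    normalized = " ".join(sql.split()).lower()
    payload = normalized + "|" + repr(params)
    return "q:" + hashlib.sha256(payload.encode()).hexdigest()

def cached_query(sql, params, run_query):
    """Serve a query result from cache, falling back to the database on a miss."""
    key = query_cache_key(sql, params)
    if key not in cache:
        cache[key] = run_query(sql, params)
    return cache[key]

calls = []
def fake_db(sql, params):
    calls.append(sql)
    return [("alice",)]

rows = cached_query("SELECT name FROM users WHERE id = ?", (1,), fake_db)
rows_again = cached_query("select name  from users where id = ?", (1,), fake_db)
```

Normalizing whitespace and case before hashing means trivially different spellings of the same query share one cache entry; the invalidation challenges listed above remain, since nothing here knows when the underlying rows change.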

Caching at the object level

Consider your data as objects, similar to application code. Assemble datasets into objects and cache them:

  • Remove objects from cache if underlying data changes
  • Enable asynchronous processing with workers using the latest cached objects

Examples of cacheable objects:

  • User sessions
  • Rendered web pages
  • Activity streams
  • User graph data
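Taking user sessions as an example, object caching with explicit invalidation might look like this (a dict stands in for an in-memory store such as Redis; the function names are illustrative):

```python
import json

cache = {}  # stand-in for an in-memory store such as Redis

def store_session(user_id, session):
    """Cache a fully serialized session object."""
    cache[f"session:{user_id}"] = json.dumps(session)

def load_session(user_id):
    raw = cache.get(f"session:{user_id}")
    return json.loads(raw) if raw is not None else None

def invalidate_session(user_id):
    """Remove the object when its underlying data changes."""
    cache.pop(f"session:{user_id}", None)

store_session(1, {"cart": ["book"]})
```

Serializing the whole object keeps reads to a single cache lookup instead of reassembling it from multiple rows.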

Cache update strategies

Limited cache capacity requires choosing an appropriate update strategy.

Cache-aside

The application handles storage access. Steps:

  • Check cache for entry
  • If missing, load from database
  • Add entry to cache
  • Return the result

Cache-aside

Also called lazy loading. Only requested data is cached, reducing unnecessary memory use.
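The four steps above can be sketched in a few lines; the dicts stand in for Memcached/Redis and the real database, and `get_user` is an illustrative name:

```python
cache = {}                               # stand-in for Memcached/Redis
database = {"user:1": {"name": "Ada"}}   # stand-in for the real database

def get_user(user_id):
    key = f"user:{user_id}"
    entry = cache.get(key)       # 1. check the cache for the entry
    if entry is None:
        entry = database[key]    # 2. on a miss, load from the database
        cache[key] = entry       # 3. add the entry to the cache
    return entry                 # 4. return the result
```

The application owns all the logic; the cache and database never talk to each other directly.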

Disadvantages of cache-aside

  • Each cache miss causes multiple trips, adding latency
  • Cached data can become stale; mitigated by TTL or write-through strategies
  • Node replacement starts empty, temporarily increasing latency

Write-through

The cache acts as the primary store; updates propagate to the database:

  • Application updates cache
  • Cache synchronously writes to database
  • Return the result
Write-through cache

Read performance is fast; write operations are slower but data remains consistent.
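A minimal sketch of the write path, again using dicts as stand-ins for the cache and database:

```python
cache = {}     # treated as the primary store
database = {}  # stand-in for the backing database

def write_through(key, value):
    cache[key] = value      # application writes to the cache...
    database[key] = value   # ...which synchronously writes to the database
    return value

def read(key):
    return cache[key]       # reads are always served from the cache

write_through("user:1", "Ada")
```

Because every write lands in both stores before returning, a read never sees data the database does not also have.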

Disadvantages of write-through

  • New nodes start empty and must wait for database updates
  • Unaccessed data may be written unnecessarily; TTL can limit this

Write-behind (write-back)

The application writes to the cache, which asynchronously updates the database. This improves write performance.
Write-behind cache (write-back)

How it works:

  • When the application writes data, it updates the cache only — not the database immediately.
  • The cache queues or batches the writes and asynchronously flushes them to the database later (after some delay or when the batch is full).
  • This allows for very fast write operations from the app’s perspective.
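The queue-and-flush behavior can be sketched as below. The batch size is illustrative, and the flush is synchronous here for brevity; a real write-behind cache would flush from a background thread or timer:

```python
from collections import deque

cache = {}
database = {}
pending = deque()   # queued writes waiting to be flushed
BATCH_SIZE = 3

def write_behind(key, value):
    cache[key] = value             # fast path: only the cache is updated
    pending.append((key, value))
    if len(pending) >= BATCH_SIZE:
        flush()                    # real systems flush asynchronously

def flush():
    while pending:
        key, value = pending.popleft()
        database[key] = value

write_behind("a", 1)
write_behind("b", 2)   # still only in the cache and the queue
write_behind("c", 3)   # batch is full: all three writes reach the database
```

Anything sitting in `pending` when the cache node dies is lost, which is exactly the data-loss risk noted below.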

Disadvantages of write-behind

  • Risk of data loss if cache fails before writing to database
  • More complex to implement than cache-aside or write-through

Read-through

How it works:

  • The application reads data through the cache.
  • If the data is not in the cache (a cache miss), the cache itself fetches it from the underlying data source (e.g., a database), stores it, and returns it to the app.
  • Future reads for the same key hit the cache directly.

Goal: Simplify logic for the client — it never needs to fetch from the database directly.
When used: When reads are frequent and you can tolerate the first-miss latency.

Read-through cache
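The key difference from cache-aside is that the loading logic lives inside the cache itself. A minimal sketch, with `from_database` as a hypothetical loader function:

```python
class ReadThroughCache:
    """The cache itself fetches from the data source on a miss."""
    def __init__(self, loader):
        self._loader = loader   # e.g. a database lookup function
        self._data = {}

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._loader(key)   # miss: fetch, store, return
        return self._data[key]

loads = []
def from_database(key):
    loads.append(key)
    return key.upper()

users = ReadThroughCache(from_database)
first = users.get("ada")    # cache miss: goes to the database
second = users.get("ada")   # cache hit: no database call
```

The client only ever calls `get`; it never needs to know a database exists, which is the simplification the strategy aims for.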

Refresh-ahead

The cache automatically refreshes recently accessed entries before expiration, reducing latency for future reads if predictions are accurate.
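One way to sketch this policy: refresh an entry whenever a read finds it inside a configurable window before its TTL expires. The class name and parameters are illustrative, and the refresh is synchronous here, whereas a real implementation would refresh in the background:

```python
import time

class RefreshAheadCache:
    """Re-fetches an entry shortly before its TTL expires."""
    def __init__(self, loader, ttl, refresh_window=0.5):
        self._loader = loader
        self._ttl = ttl
        self._refresh_window = refresh_window   # fraction of the TTL
        self._data = {}                         # key -> (value, expires_at)

    def get(self, key):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            return self._refresh(key, now)      # miss or already expired
        value, expires_at = entry
        if expires_at - now < self._ttl * self._refresh_window:
            return self._refresh(key, now)      # close to expiry: refresh ahead
        return value

    def _refresh(self, key, now):
        value = self._loader(key)               # synchronous here for brevity
        self._data[key] = (value, now + self._ttl)
        return value

cache = RefreshAheadCache(lambda k: k * 2, ttl=60)
```

Only keys that are actually being read get refreshed, which is why mispredicting which entries will be needed (the disadvantage below) turns the extra fetches into wasted work.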

Disadvantages of refresh-ahead

  • Incorrect predictions can degrade performance

General cache considerations

  • Maintaining consistency between cache and database requires careful invalidation
  • Determining when to update the cache adds complexity
  • May require application changes, like integrating Redis or Memcached