A CDN’s ability to deliver fast, up-to-date content depends on one crucial decision: when to refresh cached files that have become stale. This decision isn’t random. It follows a layered system of HTTP standards, CDN rules, traffic behavior, and background revalidation checks.
Here’s a deep, reader-friendly explanation of how CDNs actually decide when to refresh stale content.
1. Cache-Control Headers: The Primary Decision Maker
The origin server tells the CDN how long content should stay fresh by using Cache-Control headers.
The most important directives include:
-
max-age: how long, in seconds, the content can be served as fresh.
-
s-maxage: like max-age, but applies only to shared caches such as CDNs, where it takes precedence over max-age.
-
must-revalidate: once stale, the CDN must confirm freshness with the origin before reusing the content.
-
no-cache: the CDN can store the content but must revalidate it before each use.
-
no-store: the CDN must not store the content at all.
Example:
Cache-Control: max-age=3600
This means the CDN treats the content as fresh for 1 hour. After that hour, it marks the content as stale and begins the refresh decision process.
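As a rough illustration of how a shared cache might read these directives, here is a minimal Python sketch. The function names parse_cache_control and shared_cache_ttl are made up for this example, and returning 0 for both no-cache and no-store is a simplification (a real CDN would not store a no-store response at all).

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_ttl(header: str) -> Optional[int]:
    """Freshness lifetime in seconds as seen by a shared cache such as a CDN.

    Returns 0 when the response must not be reused without revalidation
    (simplification: no-store is lumped in here), and None when the header
    gives no explicit lifetime.
    """
    d = parse_cache_control(header)
    if "no-store" in d or "no-cache" in d:
        return 0
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        value = d.get(name)
        if value is not None:
            return int(value)
    return None
```

With the header from the example above, shared_cache_ttl("max-age=3600") gives 3600 seconds, and adding s-maxage=600 to the same header would lower the CDN's lifetime to 600.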
2. Expires Headers for Additional Time Rules
Some servers still use the older Expires header, which gives an exact timestamp.
Example:
Expires: Tue, 25 Nov 2025 12:00:00 GMT
Once the current time surpasses this timestamp, the CDN knows the content is stale.
This is a fallback: when a response carries both headers, max-age (or s-maxage) in Cache-Control takes precedence over Expires.
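The relationship between the two headers can be sketched in a few lines of Python. The function name expiry_time and its parameters are assumptions for this example. It relies on the standard library's email.utils.parsedate_to_datetime to read the HTTP date, and treats an unparseable value such as "0" as already expired.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def expiry_time(received_at: datetime,
                max_age: Optional[int],
                expires: Optional[str]) -> Optional[datetime]:
    """Return the moment a response becomes stale, or None if no header says."""
    if max_age is not None:
        # Cache-Control wins whenever it is present.
        return received_at + timedelta(seconds=max_age)
    if expires is None:
        return None
    try:
        parsed = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        # Invalid dates (for example "0") mean "already expired".
        return received_at
    if parsed.tzinfo is None:
        # Assume UTC when the parser returns a naive datetime.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```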
3. CDN Custom Rules That Override Origin Headers
Website owners often override origin rules directly inside the CDN dashboard.
For example:
-
Set images to cache for 30 days
-
Set HTML pages to 10 minutes
-
Force caching even if the origin forgot
-
Bypass caching for specific paths like /admin
If these rules exist, the CDN follows them first when deciding whether to refresh stale content.
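To make the precedence concrete, here is a small Python sketch in which dashboard-style rules are checked before the origin's own TTL. The rule list, the glob patterns, and the convention that a TTL of 0 means "bypass the cache" are all assumptions for illustration, not any particular CDN's rule syntax.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass(frozen=True)
class EdgeRule:
    pattern: str  # shell-style glob matched against the URL path
    ttl: int      # seconds to cache; 0 means bypass the cache


# Hypothetical rules mirroring the examples in the list above.
RULES: List[EdgeRule] = [
    EdgeRule("/admin*", 0),
    EdgeRule("*.jpg", 30 * 24 * 3600),
    EdgeRule("*.html", 10 * 60),
]


def effective_ttl(path: str, origin_ttl: Optional[int]) -> Optional[int]:
    """Return the TTL the edge should use for this path."""
    for rule in RULES:  # first matching rule wins
        if fnmatchcase(path, rule.pattern):
            return rule.ttl
    return origin_ttl  # no rule matched: fall back to the origin's headers
```

Note that effective_ttl("/index.html", None) returns 600 here, which is the "force caching even if the origin forgot" case.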
4. Stale-While-Revalidate: Serving Old Content While Fetching New
One of the smartest mechanisms is stale-while-revalidate.
Example:
Cache-Control: max-age=3600, stale-while-revalidate=300
This means:
-
Content is fresh for 1 hour
-
After 1 hour, the CDN may still serve it for up to 5 minutes
-
While serving it, the CDN fetches a new copy in the background
This allows users to get instant responses even during a refresh cycle.
The site stays fast, and the origin stays protected.
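The three time windows can be expressed as a tiny classifier. This is only a sketch: the function name cache_state and the state labels are invented for the example, and age is assumed to be the number of seconds since the response was fetched.

```python
def cache_state(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Classify a cached response by its age in seconds."""
    if age < max_age:
        return "fresh"  # serve directly, no origin contact
    if age < max_age + stale_while_revalidate:
        # Serve the stale copy now and refresh it in the background.
        return "stale-serve-and-revalidate"
    return "expired"  # must fetch or revalidate before answering
```

For the header above, cache_state(3700, 3600, 300) falls in the middle window, so the user gets the cached copy immediately while the CDN refreshes it.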
5. Stale-If-Error: Serving Stale Content During Origin Problems
Another mechanism is stale-if-error.
Example:
Cache-Control: stale-if-error=600
Meaning:
-
If the origin is down, slow, or returning errors
-
The CDN may continue serving stale content for 10 minutes
This keeps the site available through short origin outages.
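A minimal sketch of that fallback, assuming a hypothetical respond function that receives the cached body, its age, and a callable that contacts the origin. Treating 500, 502, 503, and 504 as the error statuses, and mapping connection failures to 502, are choices made for this example.

```python
from typing import Callable, Optional, Tuple

ERROR_STATUSES = (500, 502, 503, 504)


def respond(cached_body: Optional[bytes],
            age: int,
            max_age: int,
            stale_if_error: int,
            fetch_origin: Callable[[], Tuple[int, bytes]]) -> Tuple[int, bytes]:
    """Answer a request, falling back to stale content if the origin fails."""
    if cached_body is not None and age < max_age:
        return 200, cached_body  # still fresh
    try:
        status, body = fetch_origin()
    except OSError:
        # Timeouts and connection failures are treated like a gateway error.
        status, body = 502, b""
    within_grace = age < max_age + stale_if_error
    if status in ERROR_STATUSES and cached_body is not None and within_grace:
        return 200, cached_body  # serve the stale copy instead of the error
    return status, body
```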
6. Conditional Requests: Revalidation Instead of Full Refresh
Sometimes content is stale but unchanged. Instead of downloading the entire file again, the CDN first sends a conditional request to the origin, using one or both of these request headers:
-
If-None-Match, carrying the ETag of the cached copy
-
If-Modified-Since, carrying its Last-Modified date
If the origin responds with 304 Not Modified, the CDN resets the freshness timer without downloading the body again.
This saves:
-
Bandwidth
-
Requests
-
Server processing
-
Time
If the origin returns 200 OK instead, the content has changed, and the response body is the fresh copy that replaces the cached one.
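The revalidation flow can be sketched as follows. The Entry class and the send callable (which stands in for a real HTTP client and returns a status, response headers, and body) are assumptions for this example, and only the 304 and 200 cases are handled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], bytes]


@dataclass
class Entry:
    body: bytes
    etag: Optional[str] = None
    last_modified: Optional[str] = None
    age: int = 0  # seconds since the entry was last validated


def conditional_headers(entry: Entry) -> Dict[str, str]:
    """Build the validators to send with the revalidation request."""
    headers: Dict[str, str] = {}
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def revalidate(entry: Entry, send: Callable[[Dict[str, str]], Response]) -> Entry:
    """Return the entry to keep after asking the origin."""
    status, resp_headers, body = send(conditional_headers(entry))
    if status == 304:
        entry.age = 0  # same body, freshness timer restarts
        return entry
    # Anything else is treated as a full 200 response in this sketch.
    return Entry(body, resp_headers.get("ETag"), resp_headers.get("Last-Modified"))
```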
7. Heuristic Caching: When No Cache Rules Exist
Not all websites use cache headers. In such cases, CDNs apply heuristic caching.
They look at:
-
Last-Modified date
-
Content type
-
Historical patterns
Example heuristic:
Take 10% of the time since the file was last modified.
If a file was last updated 10 days ago → CDN might assign a 1-day TTL.
Heuristics ensure even poorly configured sites still benefit from caching.
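The 10% rule is easy to write down. In this sketch the function name heuristic_ttl and the 7-day upper cap are assumptions. Real caches pick their own limits.

```python
from datetime import datetime, timedelta, timezone


def heuristic_ttl(last_modified: datetime,
                  now: datetime,
                  fraction: float = 0.10,
                  cap: timedelta = timedelta(days=7)) -> timedelta:
    """Estimate a TTL as a fraction of the time since the last modification."""
    since_change = now - last_modified
    if since_change <= timedelta(0):
        return timedelta(0)  # Last-Modified in the future: do not guess
    return min(since_change * fraction, cap)


now = datetime(2025, 11, 25, 12, 0, tzinfo=timezone.utc)
print(heuristic_ttl(now - timedelta(days=10), now))  # 1 day, 0:00:00
```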
8. Popularity-Based Refresh Decisions
Popular content is refreshed more actively and kept longer in cache.
CDNs track:
-
Request frequency
-
Traffic bursts
-
Regional demand
-
Seasonality
Examples:
-
Viral videos get refreshed often
-
Unpopular files are allowed to stay stale longer
-
Rarely accessed files may be evicted entirely to save space
This keeps the cache efficient and relevant.
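One simple way to picture popularity-aware eviction is a least-frequently-used cache. The class below is a toy sketch under that assumption. Production CDNs use more elaborate policies, and the name PopularityCache is invented here.

```python
from collections import Counter
from typing import Dict, Optional


class PopularityCache:
    """Tiny cache that evicts the least-requested entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # assumed to be at least 1
        self.store: Dict[str, bytes] = {}
        self.hits: Counter = Counter()

    def get(self, key: str) -> Optional[bytes]:
        self.hits[key] += 1  # count every request, hit or miss
        return self.store.get(key)

    def put(self, key: str, body: bytes) -> None:
        if key not in self.store and len(self.store) >= self.capacity:
            # Evict the stored entry with the fewest recorded requests.
            coldest = min(self.store, key=lambda k: self.hits[k])
            del self.store[coldest]
        self.store[key] = body
```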
9. Machine Learning and Predictive Caching
Some modern CDNs integrate ML systems that analyze:
-
User behavior
-
Content update frequency
-
Time-of-day patterns
-
Device types
-
Historical refresh cycles
The CDN predicts:
-
When the origin is likely to change the file
-
How long a file stays useful
-
When certain regions will need fresher versions
This results in adaptive TTLs that respond to real-world usage.
10. Request Collapsing: Preventing Multiple Refresh Attempts
When a stale item is requested by hundreds of users at once, you don’t want the CDN contacting the origin hundreds of times.
So CDNs use request collapsing:
-
The first user triggers the refresh
-
All other users either wait for that one fetch or temporarily get the stale version
-
Once the refresh completes, everyone gets the new version
This avoids the “thundering herd problem” that can crash origins.
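Here is a minimal sketch of request collapsing using threads. In this version the followers wait on the in-flight fetch rather than receiving a stale copy. The Collapser class and its fetch callable are assumptions for the example.

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class Collapser:
    """Ensure only one origin fetch per key is in flight at a time."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._inflight: Dict[str, Future] = {}

    def get(self, key: str) -> bytes:
        with self._lock:
            future = self._inflight.get(key)
            leader = future is None
            if leader:
                # First requester registers the in-flight fetch.
                future = Future()
                self._inflight[key] = future
        if leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:
                future.set_exception(exc)  # followers see the same error
            finally:
                with self._lock:
                    del self._inflight[key]
        # Followers block here until the leader has finished.
        return future.result()
```

A request that arrives after the leader has finished starts a new fetch, so in practice this sits in front of a cache that stores the refreshed copy.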
11. Grace Periods for Unexpected Demand
CDNs often extend TTL temporarily during:
-
Flash crowds
-
Viral spikes
-
Seasonal surges
-
High-load periods
This keeps the content stable even when traffic is overwhelming.
The refresh decision is delayed until the CDN can communicate safely with the origin.
12. Soft Expiration vs Hard Expiration
CDNs classify stale content in two ways:
Soft Expiration
-
Content is stale but can still be served
-
CDN begins revalidation
-
User never sees delays
Hard Expiration
-
CDN must fetch new content now
-
Stale content cannot be served unless stale-if-error applies
The choice depends on headers and CDN configuration.
13. Path-Based Refresh Logic
Different parts of a website behave differently.
Examples:
-
/images/ → rarely changes → long TTL
-
/news/ → changes frequently → short TTL
-
/api/ → must be fresh → no caching
-
/static/js/ → versioned → safe for very long TTL
CDNs apply such path patterns, usually through configured rules, and refresh content accordingly.
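The path examples above could be written as a prefix table with longest-prefix matching, so that a more specific path can refine a broader one. The TTL numbers and the default below are illustrative assumptions. Only the paths come from the list.

```python
from typing import Dict

# Seconds per path prefix; 0 means "do not cache". Values are illustrative.
PATH_TTLS: Dict[str, int] = {
    "/images/": 30 * 24 * 3600,
    "/news/": 5 * 60,
    "/api/": 0,
    "/static/js/": 365 * 24 * 3600,
}
DEFAULT_TTL = 3600  # assumed fallback for paths with no entry


def ttl_for_path(path: str) -> int:
    """Pick the TTL of the longest matching prefix."""
    matches = [prefix for prefix in PATH_TTLS if path.startswith(prefix)]
    if not matches:
        return DEFAULT_TTL
    return PATH_TTLS[max(matches, key=len)]
```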
14. Origin Shielding and Mid-Tier Caches
In some CDNs, not every edge server talks directly to the origin. Instead, they talk to a shield server.
This improves refresh efficiency because:
-
Shield fetches once
-
All edge servers get updates from the shield
-
Origin load drops dramatically
The shield server’s own logic also controls when updates occur.
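The effect on origin load can be shown with a two-tier toy model. CacheTier and build_cdn are hypothetical names, and TTLs are left out entirely to keep the focus on who talks to whom.

```python
from typing import Callable, Dict, List


class CacheTier:
    """A cache that asks its upstream only on a miss (no TTLs in this sketch)."""

    def __init__(self, upstream: Callable[[str], bytes]):
        self.upstream = upstream
        self.store: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        if key not in self.store:
            self.store[key] = self.upstream(key)
        return self.store[key]


def build_cdn(origin_fetch: Callable[[str], bytes], edge_count: int) -> List[CacheTier]:
    """Create edges that all share a single shield in front of the origin."""
    shield = CacheTier(origin_fetch)
    return [CacheTier(shield.get) for _ in range(edge_count)]
```

If three edges built this way each request the same URL, the origin callable runs once: the first edge misses through to the shield, and the other two are answered from the shield's copy.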
15. Admin Purges and Manual Refresh Policies
Finally, administrators can:
-
Purge cache
-
Invalidate URLs
-
Trigger full refresh
-
Use cache tags (e.g., purge only product-related assets)
-
Auto-purge on deployment
Admin actions override all automated logic.
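Tag-based purging boils down to an index from tag to URLs. The sketch below assumes a hypothetical TaggedCache class. Real CDNs expose this through their own purge APIs, which are not modeled here.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, Set


class TaggedCache:
    """Cache with a tag index so related URLs can be purged together."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}
        self.by_tag: DefaultDict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: bytes, tags: Iterable[str] = ()) -> None:
        self.store[url] = body
        for tag in tags:
            self.by_tag[tag].add(url)

    def purge_url(self, url: str) -> bool:
        """Remove one URL; returns True if it was cached."""
        return self.store.pop(url, None) is not None

    def purge_tag(self, tag: str) -> int:
        """Remove every URL carrying the tag; returns how many were purged."""
        urls = self.by_tag.pop(tag, set())
        # Index entries left under other tags are harmless: purging an
        # already-removed URL simply returns False.
        return sum(self.purge_url(url) for url in urls)
```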
Final Summary
A CDN decides when to refresh stale content using a powerful mix of:
-
Origin rules (Cache-Control, Expires, ETags)
-
CDN custom overrides
-
Stale-while-revalidate and stale-if-error
-
Revalidation checks
-
Heuristic caching
-
Popularity patterns
-
Machine learning
-
Request collapsing
-
Grace periods
-
Manual purges
-
Origin shielding
All these layers ensure content stays fast, fresh, reliable, and cost-efficient, even during global traffic spikes.
