Refetching Dispatcher Flush Agents in AEM: A Smarter Caching Strategy
Caching plays a critical role in ensuring high performance and scalability in Adobe Experience Manager (AEM) architectures. The Dispatcher sits between the Publish tier and end users, caching rendered content to reduce load on the Publish instances. However, how cache invalidation is handled can significantly impact system stability, especially during traffic spikes.
This article explains the concept of refetching Dispatcher flush agents, why they are needed, how they work, and when they should (and should not) be used.
The Problem with Plain Dispatcher Flush Agents
A standard (plain) Dispatcher flush agent invalidates cached content when a page or asset is activated. Once flushed, the content is not immediately re-cached. Instead, it is fetched again from the Publish instance only when the next user request arrives.
At first glance, this behavior seems reasonable, but it introduces an important risk.
Traffic Spike Risk
Consider this common scenario:
- A large number of pages or assets are flushed during offโpeak hours (for example, at night).
- In the morning, users start visiting the site as usual.
- Multiple users request the same recently flushed pages at the same time.
In this case, the Dispatcher may forward multiple concurrent requests for the same content to the Publish tier before the cache is reโpopulated. Instead of one render per page, the Publish instance may end up rendering the same page many times in parallel.
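The effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the Dispatcher cache is modeled as an in-memory map with no request coalescing, the names ColdCacheBurst and rendersForBurst are hypothetical, and a barrier forces every simulated user to check the cache before any render completes.

```java
import java.util.Map;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of a burst of users hitting one path (not Dispatcher code). */
final class ColdCacheBurst {

    /** Returns how many times the simulated Publish tier renders the path for one burst. */
    static int rendersForBurst(int users, String path, boolean warmedUp) {
        Map<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger renders = new AtomicInteger();
        if (warmedUp) {
            cache.put(path, "pre-rendered"); // what a refetch would have left behind
        }
        CyclicBarrier allChecked = new CyclicBarrier(users);
        ExecutorService pool = Executors.newFixedThreadPool(users);
        for (int i = 0; i < users; i++) {
            pool.submit(() -> {
                // Each user looks at the cache first...
                boolean miss = !cache.containsKey(path);
                try {
                    // ...and all users have looked before any render finishes.
                    allChecked.await();
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (miss) {
                    renders.incrementAndGet(); // request forwarded to Publish
                    cache.put(path, "rendered");
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return renders.get();
    }
}
```

In this model, a burst of 8 users against a cold cache produces 8 renders of the same page, while the same burst against a pre-populated cache produces none.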
HighโRisk Content Types
This issue becomes more severe when dealing with large or expensiveโtoโrender resources, such as:
- Highโresolution images
- Large video files
- Pages with complex backend logic or heavy integrations
Under peak load, this can lead to:
- Increased Publish CPU and memory usage
- Longer response times
- Potential availability issues
What Is a Refetching Dispatcher Flush Agent?
A refetch flush agent enhances the standard flush behavior by instructing the Dispatcher to immediately reโfetch the content from the Publish instance right after the cache invalidation happens.
In other words:
- Cache is flushed
- Dispatcher proactively requests the updated content
- The cache is repopulated before real users hit the page
This approach largely avoids the thundering-herd problem, because the cache is already warm by the time concurrent user requests arrive.
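On the wire, a refetch flush is an invalidation request that also carries a body listing the URIs to re-fetch. The sketch below builds such a request with the JDK HTTP client without sending it. The header names and the /dispatcher/invalidate.cache endpoint follow the documented Dispatcher invalidation request; the class name RefetchFlushRequest, the base URL, and the paths are placeholders chosen for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

/** Hypothetical helper that assembles a refetch-style invalidation request. */
final class RefetchFlushRequest {

    /** One URI per line, as the Dispatcher expects in a text/plain body. */
    static String body(List<String> urisToRefetch) {
        return String.join("\n", urisToRefetch) + "\n";
    }

    static HttpRequest build(String dispatcherBaseUrl, String flushedPath,
                             List<String> urisToRefetch) {
        return HttpRequest.newBuilder(
                        URI.create(dispatcherBaseUrl + "/dispatcher/invalidate.cache"))
                .header("CQ-Action", "Activate")      // replication action type
                .header("CQ-Handle", flushedPath)     // path whose cache entry is invalidated
                .header("Content-Type", "text/plain") // body is a plain list of URIs
                .POST(HttpRequest.BodyPublishers.ofString(body(urisToRefetch)))
                .build();
    }
}
```

In a real setup the flush agent sends this request itself; building it by hand is mainly useful for testing a Dispatcher configuration in isolation.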
Benefits of Using Refetch Flush Agents
1. Controlled Cache WarmโUp
With refetching enabled, cache warmโup happens as part of the publication workflow rather than being left to chance through user traffic.
This means:
- The first user does not pay the cost of rendering
- Subsequent users are served directly from the Dispatcher cache
2. Reduced Load on Publish Instances
Since the Dispatcher fetches the content only once per flushed item, Publish instances are protected from multiple redundant rendering requests.
This is particularly beneficial for:
- Resourceโintensive pages
- Mediaโheavy content
- Sites experiencing sharp traffic peaks
3. Predictable and Planned Rendering
Refetch agents give you full control over when rendering occurs. Because refetching is tied to replication events, you know exactly when the Publish tier will be involved.
A common and effective practice is:
- Perform activations and refetch flushes during lowโtraffic windows (e.g., overnight)
- Ensure the cache is fully populated before users arrive
As a result, daytime traffic is served almost entirely from Dispatcher cache.
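If activations are triggered by a scheduled job, a simple guard can keep them inside the low-traffic window. The following is a minimal sketch; RefetchWindow is a hypothetical helper, not part of AEM, and the window boundaries are whatever the site's traffic pattern suggests.

```java
import java.time.LocalTime;

/** Hypothetical guard for a daily low-traffic window such as 23:00 to 05:00. */
final class RefetchWindow {
    private final LocalTime start; // inclusive
    private final LocalTime end;   // exclusive

    RefetchWindow(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    /** True when the given time falls inside the window, including windows that cross midnight. */
    boolean contains(LocalTime now) {
        if (start.isBefore(end)) {
            // Same-day window, e.g. 01:00 to 04:00.
            return !now.isBefore(start) && now.isBefore(end);
        }
        // Window wraps past midnight, e.g. 23:00 to 05:00.
        return !now.isBefore(start) || now.isBefore(end);
    }
}
```

A job would check `contains(LocalTime.now())` before starting a large activation and postpone the work otherwise.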
Live Example: Custom Refetch Dispatcher Flush for Media Content
A practical implementation of refetching can be seen in a custom Dispatcher flush content builder that sends a list of URIs to be re-fetched immediately upon flushing, with video assets being a common example.
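import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.apache.sling.api.resource.ResourceResolverFactory;
import org.apache.sling.jcr.resource.api.JcrResourceConstants;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Modified;
import org.osgi.service.component.annotations.Reference;
import org.osgi.service.metatype.annotations.Designate;
import org.osgi.service.metatype.annotations.ObjectClassDefinition;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.day.cq.replication.ContentBuilder;
import com.day.cq.replication.ReplicationAction;
import com.day.cq.replication.ReplicationContent;
import com.day.cq.replication.ReplicationContentFactory;
import com.day.cq.replication.ReplicationException;

@Component(service = ContentBuilder.class, property = {"name=re_fetch_dispatcher_flush"})
@Designate(ocd = DispatcherFlushContentBuilder.Config.class)
public class DispatcherFlushContentBuilder implements ContentBuilder {

    // Assumption: the configuration type behind this.configuration.renditions() is not
    // shown in the article; this is one plausible OSGi definition for it.
    @ObjectClassDefinition(name = "Re-fetch Dispatcher Flush Content Builder")
    public @interface Config {
        String[] renditions() default {};
    }

    public static final String NAME = "re_fetch_dispatcher_flush";
    public static final String TITLE = "Re-fetch Dispatcher Flush";
    private static final String NT_DAM_ASSET = "dam:Asset";
    private static final String RENDITION_PATH = "/_jcr_content/renditions/";
    private static final Logger LOG = LoggerFactory.getLogger(DispatcherFlushContentBuilder.class);

    @Reference
    private ResourceResolverFactory resourceResolverFactory;

    private volatile Config configuration;

    @Activate
    @Modified
    protected void activate(Config configuration) {
        this.configuration = configuration;
    }

    @Override
    public ReplicationContent create(Session session, ReplicationAction action,
            ReplicationContentFactory factory) throws ReplicationException {
        return create(session, action, factory, new HashMap<>());
    }

    @Override
    public ReplicationContent create(Session session, ReplicationAction action,
            ReplicationContentFactory factory, Map<String, Object> parameters)
            throws ReplicationException {
        // Wrap the replication session in a resource resolver.
        Map<String, Object> authInfo = new HashMap<>();
        authInfo.put(JcrResourceConstants.AUTHENTICATION_INFO_SESSION, session);
        try (ResourceResolver resourceResolver = resourceResolverFactory.getResourceResolver(authInfo)) {
            String path = action.getPath();
            Resource res = resourceResolver.getResource(path);
            Node node = (res != null) ? res.adaptTo(Node.class) : null;
            // For DAM assets, list every configured rendition so the Dispatcher re-fetches them.
            // Besides the node type, extension, size, etc. could also be used to filter renditions.
            if (node != null && NT_DAM_ASSET.equals(node.getPrimaryNodeType().getName())) {
                String[] renditions = configuration.renditions();
                String[] uris = new String[renditions.length];
                for (int i = 0; i < renditions.length; i++) {
                    uris[i] = path + RENDITION_PATH + renditions[i];
                }
                return create(factory, uris);
            }
        } catch (LoginException | RepositoryException e) {
            // Log the error and fall through to VOID content.
            LOG.error("Unable to build re-fetch content for {}", action.getPath(), e);
        }
        return ReplicationContent.VOID;
    }

    private ReplicationContent create(ReplicationContentFactory factory, String[] uris)
            throws ReplicationException {
        File tmpFile;
        try {
            tmpFile = File.createTempFile("cq5", ".post");
        } catch (IOException e) {
            throw new ReplicationException("Unable to create temp file", e);
        }
        try {
            // Write one URI per line; the writer is flushed and closed
            // before the file is handed to the factory.
            try (BufferedWriter out = new BufferedWriter(new FileWriter(tmpFile))) {
                for (String uri : uris) {
                    out.write(uri);
                    out.newLine();
                }
            }
            return factory.create("text/plain", tmpFile, true);
        } catch (IOException e) {
            tmpFile.delete();
            throw new ReplicationException("Unable to create repository content", e);
        }
    }

    @Override
    public String getName() {
        return NAME;
    }

    @Override
    public String getTitle() {
        return TITLE;
    }
}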
Key Configuration Steps
- The code above registers a new serialization type, "Re-fetch Dispatcher Flush", which becomes available when configuring the flush agent.
- Configure the refetch flush agent on the Publish instance, similar to a standard Dispatcher flush agent.
- Select the custom content builder as the agent's Serialization Type.
- In the Extended tab, set the HTTP Method to POST, allowing the Dispatcher to receive a list of URIs and reโfetch them immediately.
This setup ensures that large assets such as videos are cached proactively instead of being fetched repeatedly by user requests.
A Critical Warning: Use with Caution
While refetching flush agents are powerful, they can also be dangerous if misused.
Risk of SelfโInflicted Overload
If a large volume of pages is activated simultaneously with refetch enabled, the Dispatcher can flood the Publish instance with requests in a very short period of time. In extreme cases, this can resemble a selfโinflicted DDoS attack.
This risk increases when:
- Large sections of the site are activated at once
- Page rendering is slow or resourceโheavy
- Custom AEM code is not optimized
Flushing and immediately refetching all content at the same time is especially problematic and should be avoided.
Best Practices
- Limit refetch to critical pages or assets
- Batch activations carefully
- Test rendering performance under load
- Stagger refetch operations if possible
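Batching and staggering can be sketched as follows. This is an illustrative outline rather than a ready-made tool: ActivationBatches is a hypothetical helper, and the activator callback stands in for whatever actually triggers replication in a given project. Batch size and pause length are assumptions to be tuned against measured rendering times.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that activates paths in small, spaced-out batches. */
final class ActivationBatches {

    /** Splits the paths into consecutive batches of at most batchSize entries. */
    static List<List<String>> partition(List<String> paths, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < paths.size(); from += batchSize) {
            int to = Math.min(from + batchSize, paths.size());
            batches.add(new ArrayList<>(paths.subList(from, to)));
        }
        return batches;
    }

    /** Hands each batch to the activator and pauses between batches. */
    static void activateStaggered(List<String> paths, int batchSize, Duration pause,
                                  Consumer<List<String>> activator) {
        List<List<String>> batches = partition(paths, batchSize);
        for (int i = 0; i < batches.size(); i++) {
            activator.accept(batches.get(i));
            if (i < batches.size() - 1) {
                try {
                    // Give the Publish tier time to finish rendering the previous batch.
                    Thread.sleep(pause.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // stop activating if the thread is interrupted
                }
            }
        }
    }
}
```

The pause bounds how quickly refetch requests can pile up on the Publish instance, at the cost of a longer overall activation run.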
AEM as a Cloud Service Considerations
In AEM as a Cloud Service, traditional replication agents used in onโprem or AMS setups are no longer available.
Instead:
- Content distribution is handled using Sling Content Distribution
- Dispatcher flush behavior is managed via forward distribution agents
While the underlying mechanisms differ, the same architectural principle applies: proactive cache management is essential for performance and stability.


