Refetching Dispatcher Flush Agents in AEM

Caching plays a critical role in ensuring high performance and scalability in Adobe Experience Manager (AEM) architectures. The Dispatcher sits between the Publish tier and end users, caching rendered content to reduce load on the Publish instances. However, how cache invalidation is handled can significantly impact system stability—especially during traffic spikes.

This article explains the concept of refetching Dispatcher flush agents, why they are needed, how they work, and when they should (and should not) be used.

The Problem with Plain Dispatcher Flush Agents

A standard (plain) Dispatcher flush agent invalidates cached content when a page or asset is activated. Once flushed, the content is not immediately re-cached. Instead, it is fetched again from the Publish instance only when the next user request arrives.

At first glance, this behavior seems reasonable—but it introduces an important risk.

Traffic Spike Risk

Consider this common scenario:

A large number of pages or assets are flushed during off‑peak hours (for example, at night).
In the morning, users start visiting the site as usual.
Multiple users request the same recently flushed pages at the same time.

In this case, the Dispatcher may forward multiple concurrent requests for the same content to the Publish tier before the cache is re‑populated. Instead of one render per page, the Publish instance may end up rendering the same page many times in parallel.
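
The effect can be illustrated with a small, self-contained simulation. This is only a sketch of the worst case, not Dispatcher code: the class name, the page path, and the in-memory map standing in for the Dispatcher cache are all hypothetical. A barrier forces every simulated user to check the cache before anyone repopulates it, which models requests arriving before the first render has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class FlushStampedeDemo {

  // Stand-in for an expensive Publish render; counts how often it is invoked.
  static String render(String path, AtomicInteger renderCount) {
    renderCount.incrementAndGet();
    return "<html>" + path + "</html>";
  }

  // Returns how many renders happen when `users` requests arrive together.
  static int rendersFor(int users, boolean cacheWarmed) {
    String path = "/content/site/en.html"; // hypothetical page
    Map<String, String> cache = new ConcurrentHashMap<>();
    AtomicInteger renderCount = new AtomicInteger();
    if (cacheWarmed) {
      // Models the single refetch performed right after the flush.
      cache.put(path, render(path, renderCount));
    }
    ExecutorService pool = Executors.newFixedThreadPool(users);
    CyclicBarrier allArrived = new CyclicBarrier(users);
    try {
      List<Future<Void>> pending = new ArrayList<>();
      for (int i = 0; i < users; i++) {
        pending.add(pool.submit(() -> {
          boolean miss = !cache.containsKey(path);
          // Worst case: every user checks the cache before anyone refills it.
          allArrived.await();
          if (miss) {
            cache.put(path, render(path, renderCount));
          }
          return null;
        }));
      }
      for (Future<Void> f : pending) {
        f.get();
      }
      return renderCount.get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("cold cache renders: " + rendersFor(20, false)); // 20
    System.out.println("warm cache renders: " + rendersFor(20, true));  // 1
  }
}
```

With a cold cache, every one of the 20 simulated users triggers a render; with a warmed cache, the only render is the one that filled the cache.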

High‑Risk Content Types

This issue becomes more severe when dealing with large or expensive‑to‑render resources, such as:

High‑resolution images
Large video files
Pages with complex backend logic or heavy integrations

Under peak load, this can lead to:

Increased Publish CPU and memory usage
Longer response times
Potential availability issues

What Is a Refetching Dispatcher Flush Agent?

A refetch flush agent enhances the standard flush behavior by instructing the Dispatcher to immediately re‑fetch the content from the Publish instance right after the cache invalidation happens.

In other words:

Cache is flushed
Dispatcher proactively requests the updated content
The cache is repopulated before real users hit the page

This approach greatly reduces the risk of the thundering‑herd problem, because the cache is normally warm again before concurrent user requests arrive.
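
On the wire, a refetch flush is an ordinary Dispatcher invalidation request with a body. The sketch below builds such a request with the JDK HTTP client, assuming the header names and the /dispatcher/invalidate.cache endpoint described in Adobe's Dispatcher documentation. The class name, host name, and content paths are placeholders, and the request is only constructed, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

class RefetchRequestSketch {

  // One URI per line; these are the files the Dispatcher is asked to re-fetch.
  static String body(List<String> refetchUris) {
    return String.join("\n", refetchUris) + "\n";
  }

  static HttpRequest build(String dispatcherBaseUrl, String handle, List<String> refetchUris) {
    return HttpRequest.newBuilder(URI.create(dispatcherBaseUrl + "/dispatcher/invalidate.cache"))
        .header("CQ-Action", "Activate")      // the replication action
        .header("CQ-Handle", handle)          // the path being invalidated
        .header("Content-Type", "text/plain") // the body is a plain list of URIs
        .POST(HttpRequest.BodyPublishers.ofString(body(refetchUris)))
        .build();
  }

  public static void main(String[] args) {
    HttpRequest request = build(
        "http://dispatcher.example.com",      // placeholder host
        "/content/site/en",                   // placeholder handle
        List.of("/content/site/en.html"));
    System.out.println(request.method() + " " + request.uri());
    request.headers().map().forEach((name, values) -> System.out.println(name + ": " + values));
  }
}
```

In a real setup the flush agent produces this request; the sketch is only meant to show why the agent must use POST and what the request body contains.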

Benefits of Using Refetch Flush Agents

1. Controlled Cache Warm‑Up

With refetching enabled, cache warm‑up happens as part of the publication workflow rather than being left to chance through user traffic.

This means:

The first user does not pay the cost of rendering
Subsequent users are served directly from the Dispatcher cache

2. Reduced Load on Publish Instances

Since the Dispatcher fetches the content only once per flushed item, Publish instances are protected from multiple redundant rendering requests.

This is particularly beneficial for:

Resource‑intensive pages
Media‑heavy content
Sites experiencing sharp traffic peaks

3. Predictable and Planned Rendering

Refetch agents make rendering predictable. Because refetching is tied to replication events, rendering load reaches the Publish tier when content is activated rather than whenever users happen to arrive.

A common and effective practice is:

Perform activations and refetch flushes during low‑traffic windows (e.g., overnight)
Ensure the cache is fully populated before users arrive

As a result, daytime traffic is served almost entirely from Dispatcher cache.

Live Example: Custom Refetch Dispatcher Flush for Media Content

A practical implementation of refetching is a custom content builder for the Dispatcher flush agent that sends a list of URIs to be re‑fetched immediately after the flush. The example below does this for DAM assets by building one URI per configured rendition; video assets are a common use case.

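import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.Session;

import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.apache.sling.api.resource.ResourceResolverFactory;
import org.apache.sling.jcr.resource.api.JcrResourceConstants;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Modified;
import org.osgi.service.component.annotations.Reference;
import org.osgi.service.metatype.annotations.AttributeDefinition;
import org.osgi.service.metatype.annotations.Designate;
import org.osgi.service.metatype.annotations.ObjectClassDefinition;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.day.cq.replication.ContentBuilder;
import com.day.cq.replication.ReplicationAction;
import com.day.cq.replication.ReplicationContent;
import com.day.cq.replication.ReplicationContentFactory;
import com.day.cq.replication.ReplicationException;

@Component(service = ContentBuilder.class, property = {"name=re_fetch_dispatcher_flush"})
@Designate(ocd = DispatcherFlushContentBuilder.Config.class)
public class DispatcherFlushContentBuilder implements ContentBuilder {

  // OSGi configuration listing the renditions to re-fetch for each asset.
  // The default value is an assumption; adjust it to the renditions your site serves.
  @ObjectClassDefinition(name = "Re-fetch Dispatcher Flush Content Builder")
  public @interface Config {
    @AttributeDefinition(name = "Renditions", description = "Rendition names to re-fetch")
    String[] renditions() default {"original"};
  }

  private static final Logger LOG = LoggerFactory.getLogger(DispatcherFlushContentBuilder.class);

  public static final String NAME = "re_fetch_dispatcher_flush";

  public static final String TITLE = "Re-fetch Dispatcher Flush";

  private static final String NT_DAM_ASSET = "dam:Asset";

  private static final String RENDITION_PATH = "/_jcr_content/renditions/";

  @Reference
  private ResourceResolverFactory resourceResolverFactory;

  private volatile Config configuration;

  @Activate
  @Modified
  protected void activate(Config configuration) {
    this.configuration = configuration;
  }

  // ContentBuilder also declares a three-argument create(...); delegate to the main method.
  @Override
  public ReplicationContent create(Session session, ReplicationAction action,
      ReplicationContentFactory factory) throws ReplicationException {
    return create(session, action, factory, null);
  }

  @Override
  public ReplicationContent create(Session session, ReplicationAction action,
      ReplicationContentFactory factory, Map<String, Object> parameters)
      throws ReplicationException {
    // Wrap the replication session in a resolver instead of opening a new login.
    Map<String, Object> authInfo = new HashMap<>();
    authInfo.put(JcrResourceConstants.AUTHENTICATION_INFO_SESSION, session);
    try (ResourceResolver resourceResolver = resourceResolverFactory.getResourceResolver(authInfo)) {
      String path = action.getPath();
      Resource res = resourceResolver.getResource(path);
      // adaptTo may return null, for example for resources not backed by a JCR node.
      Node node = (res != null) ? res.adaptTo(Node.class) : null;
      // For DAM assets, build one URI per configured rendition.
      // Extension, size, and similar properties could also be used to filter renditions.
      if (node != null && NT_DAM_ASSET.equals(node.getPrimaryNodeType().getName())) {
        String[] renditions = configuration.renditions();
        String[] uris = new String[renditions.length];
        for (int i = 0; i < renditions.length; i++) {
          uris[i] = path + RENDITION_PATH + renditions[i];
        }
        return create(factory, uris);
      }
    } catch (LoginException | RepositoryException e) {
      LOG.error("Unable to build re-fetch content for {}", action.getPath(), e);
    }
    // Nothing to re-fetch: send the flush request without a body.
    return ReplicationContent.VOID;
  }

  // Writes one URI per line to a temp file and wraps it as text/plain replication content.
  private ReplicationContent create(ReplicationContentFactory factory, String[] uris)
      throws ReplicationException {
    File tmpFile;
    try {
      tmpFile = File.createTempFile("cq5", ".post");
    } catch (IOException e) {
      throw new ReplicationException("Unable to create temp file", e);
    }
    try {
      // The writer is closed before the file is handed over, so every URI is on disk.
      try (BufferedWriter out = new BufferedWriter(new FileWriter(tmpFile))) {
        for (String uri : uris) {
          out.write(uri);
          out.newLine();
        }
      }
      // true: the temp file is deleted once the replication content is closed.
      return factory.create("text/plain", tmpFile, true);
    } catch (IOException e) {
      tmpFile.delete();
      throw new ReplicationException("Unable to create repository content", e);
    }
  }

  @Override
  public String getName() {
    return NAME;
  }

  @Override
  public String getTitle() {
    return TITLE;
  }
}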

Key Configuration Steps

The code above registers a new serialization type, “Re‑fetch Dispatcher Flush”, which becomes selectable when configuring a flush agent.
Create the refetch flush agent on the Publish instance in the same way as a standard Dispatcher flush agent.
In the agent settings, set the Serialization Type to the custom content builder.
In the Extended tab, set the HTTP Method to POST so that the agent can send the list of URIs in the request body; the Dispatcher then re‑fetches them immediately.

This setup ensures that large assets such as videos are cached proactively instead of being fetched repeatedly by user requests.
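
To see what the flush agent would send for a single asset, the URI construction from the content builder can be reproduced in isolation. The sketch below mirrors that loop outside of AEM; the class name, asset path, and rendition names are illustrative assumptions only.

```java
import java.util.ArrayList;
import java.util.List;

class RenditionUriSketch {

  // Same Dispatcher-friendly rendition path as in the content builder.
  static final String RENDITION_PATH = "/_jcr_content/renditions/";

  // Builds one re-fetch URI per rendition of the given asset.
  static List<String> renditionUris(String assetPath, List<String> renditions) {
    List<String> uris = new ArrayList<>();
    for (String rendition : renditions) {
      uris.add(assetPath + RENDITION_PATH + rendition);
    }
    return uris;
  }

  public static void main(String[] args) {
    // Illustrative asset path and rendition names.
    List<String> uris = renditionUris(
        "/content/dam/site/videos/intro.mp4",
        List.of("original", "cq5dam.video.hq.m4v"));
    // Prints:
    // /content/dam/site/videos/intro.mp4/_jcr_content/renditions/original
    // /content/dam/site/videos/intro.mp4/_jcr_content/renditions/cq5dam.video.hq.m4v
    uris.forEach(System.out::println);
  }
}
```

Each printed line corresponds to one line of the POST body, and therefore to one request the Dispatcher makes to the Publish instance after the flush.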

A Critical Warning: Use with Caution

While refetching flush agents are powerful, they can also be dangerous if misused.

Risk of Self‑Inflicted Overload

If a large volume of pages is activated simultaneously with refetch enabled, the Dispatcher can flood the Publish instance with requests in a very short period of time. In extreme cases, this can resemble a self‑inflicted denial‑of‑service attack.

This risk increases when:

Large sections of the site are activated at once
Page rendering is slow or resource‑heavy
Custom AEM code is not optimized

Flushing and immediately refetching all content at the same time is especially problematic and should be avoided.

Best Practices

Limit refetch to critical pages or assets
Batch activations carefully
Test rendering performance under load
Stagger refetch operations if possible
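
One way to batch and stagger activations is to split the list of paths into small groups and pause between them. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the activator callback stands in for whatever actually triggers replication in your project (for example, a wrapper around the AEM Replicator service). Batch size and pause length need to be tuned against measured render times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class StaggeredRefetch {

  // Splits items into consecutive batches of at most batchSize elements.
  static <T> List<List<T>> batches(List<T> items, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> result = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      result.add(new ArrayList<>(items.subList(start, end)));
    }
    return result;
  }

  // Hands each batch to the activator and pauses between batches,
  // giving the Publish tier time to finish the refetch renders.
  static void activateStaggered(List<String> paths, int batchSize, long pauseMillis,
      Consumer<List<String>> activator) {
    List<List<String>> all = batches(paths, batchSize);
    for (int i = 0; i < all.size(); i++) {
      activator.accept(all.get(i));
      if (i < all.size() - 1) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // stop early but keep the interrupt flag
          return;
        }
      }
    }
  }

  public static void main(String[] args) {
    List<String> paths = List.of("/content/site/a", "/content/site/b", "/content/site/c");
    // The activator here only prints; a real one would trigger replication.
    activateStaggered(paths, 2, 500L, batch -> System.out.println("activating " + batch));
  }
}
```

A fixed pause is the simplest option. Waiting until the replication queue has drained before sending the next batch would adapt better to slow renders, at the cost of more plumbing.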

AEM as a Cloud Service Considerations

In AEM as a Cloud Service, traditional replication agents used in on‑prem or AMS setups are no longer available.

Instead:

Content distribution is handled using Sling Content Distribution
Dispatcher flush behavior is managed via forward distribution agents

While the underlying mechanisms differ, the same architectural principle applies: proactive cache management is essential for performance and stability.