When Pre-Loading Beats Streaming: The Caching Advantage
Introduction
Web pages are often composed of:
- semi-static page parts which change infrequently and which are the same for all users,
- dynamic page parts which change frequently and whose content may depend on the user.
Recently released JavaScript frameworks, and the pioneering JS framework Marko, optimize page loading by progressively streaming different parts of the page to the client. Although this kind of streaming is an effective optimization, I argue in this article that we should consider it only after taking care of a more important optimization: Caching.
In my article, How to make fast web frontends, I classified optimization techniques broadly into two categories:
- Techniques which speed up page loading by reducing the work necessary, for the server and for the network, to deliver the content to the client. Of these techniques, Caching is a prime example.
- Techniques which do not necessarily reduce work but instead reduce user wait time by scheduling resource loading intelligently. Both HTTP-streaming and Pre-Loading fall into this category.
- With HTTP Response Streaming, the server can start fetching all page parts in parallel as soon as it receives the client’s request.
- With Pre-Loading, the client can start fetching page parts as soon as it receives the page’s headers or its head element.
Although HTTP response streaming lets the server start fetching resources earlier than pre-loading does, it is not a perfect solution. Streaming all page parts together is a form of bundling, and bundling hinders caching: the highly cacheable semi-static page parts cannot benefit from the HTTP cache because they are bundled with the dynamic parts under a single URL.
In this article, I compare the performance of page loading when the full page is streamed to the client versus when dynamic page parts are pre-loaded as separate resources. Using diagrams generated by simulation, I show that Full-Page Streaming and Split-Page Pre-Loading can achieve similar performance, with the latter reducing overall work more effectively thanks to its better compatibility with caching.
Table of contents
- Introduction
- Table of contents
- Simulation settings
- The baseline pages to compare: Full-Page Streaming vs Split-Page with Pre-Loading
- The effect of server-side and edge caching
- Assembling the Full-Page version on the edge for better caching
- Page loading for a returning user with a warm client cache
- Conclusion
Simulation settings
For the remainder of the article, I’ll be showing timeline (Gantt) charts of page loading for multiple versions of the same web page. These charts were generated by simulating the client, the server, the network, and the database holding the page data.
The page of interest is composed of two parts: a semi-static part, which is cacheable, and a dynamic part, which is not. The page also loads a script file. The page is considered completely loaded once both page parts are loaded, the script is loaded and executed, and both page parts are hydrated (if needed).
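The completion condition above can be sketched as a tiny helper; the milestone names here are hypothetical, chosen only for this illustration:

```javascript
// The page counts as fully loaded at the latest of these milestones.
// All values are timestamps in milliseconds since the initial request.
function fullyLoadedAt(m) {
  return Math.max(
    m.semiStaticLoaded, // semi-static HTML part received
    m.dynamicLoaded,    // dynamic part received
    m.scriptExecuted,   // script downloaded and run
    m.hydrated          // both parts hydrated (if needed)
  );
}
```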
The simulation uses the following parameters:
| Parameter | Value |
|---|---|
| Database Query Duration | 50 milliseconds |
| Database Query Response Size | 25 KB |
| Render Data To HTML Duration | 50 milliseconds |
| Render From HTML Duration | 50 milliseconds |
| Render From JSON Duration | 100 milliseconds |
| Execute Script Duration | 250 milliseconds |
| Hydration Duration | 50 milliseconds |
| Request Size | 250 Bytes |
| Head Size | 1 KB |
| Semi-Static HTML Part Size | 25 KB |
| Dynamic HTML Part Size | 25 KB |
| Script Size | 250 KB |
| Dynamic JSON Data Size | 25 KB |
| Client To Server Network Latency | 200 milliseconds |
| Client To Server Network Bandwidth | 2.5 MB/s |
| Client To Edge Network Latency | 50 milliseconds |
| Client To Edge Network Bandwidth | 2.5 MB/s |
| Edge To Server Network Latency | 150 milliseconds |
| Edge To Server Network Bandwidth | 10 MB/s |
Additionally, the simulation assumes that:
- The server and the database are completely idle when the request arrives.
- The server and the database are single-threaded (processing requests one at a time).
- The network is completely un-congested.
- The full network bandwidth is instantly available (no slow starting).
- No additional latency is created by HTTPS handshakes.
- The page’s script is async and non-render-blocking.
You can generate timeline charts with different parameters by visiting the simulation playground.
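A few quantities in the timelines below follow directly from these parameters. Here is a quick sanity-check sketch, assuming 1 MB = 1000 KB as is conventional for network bandwidth:

```javascript
// Derived quantities from the parameter table (ms, KB, KB/s).
const clientServerLatencyMs = 200;
const clientServerBandwidthKBps = 2.5 * 1000; // 2.5 MB/s
const scriptSizeKB = 250;

// Time for the 250 KB script to cross the client-to-server link,
// once latency has been paid
const scriptTransferMs = (scriptSizeKB / clientServerBandwidthKBps) * 1000;

// Earliest moment the client can see the page's head element, and thus
// start pre-loading: one full client-server round trip (the 1 KB head's
// transfer time is negligible at this bandwidth)
const preloadStartMs = 2 * clientServerLatencyMs;
```

These two values (100ms and 400ms) reappear throughout the timelines that follow.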
The baseline pages to compare: Full-Page Streaming vs Split-Page with Pre-Loading
Let’s see the page loading timeline diagrams for two versions of our web page:
- The first version streams all page content in response to a single URL, `full-page`.
- The second version delivers the semi-static page part in response to the URL `split-page`, and then the dynamic part as a response to `dynamic-page-data.json`.
In the first round of simulation, which omits caching, both full-page and split-page achieved identical First Contentful Paint latency, with the former fully loading 60ms earlier than the latter.
Full-Page Streaming version: Notice how the server sends requests to get both the semi-static and the dynamic page parts as soon as it receives the full-page request (at T=200ms). The page is fully loaded at T=1249ms.
Split-Page with Pre-loading version: When the server receives the request for the page, it only fetches the semi-static page part at T=200ms. As for the dynamic page part, it is fetched by a separate client request which the server starts processing at T=600ms (400ms later than in the streamed full-page version). That said, the full page loading finishes only 60ms later in this particular example (at T=1309ms).
Thanks to pre-loading, the client requests the dynamic page part as soon as it receives the page's head element (at T=400ms, i.e. twice the client-to-server network latency). Without pre-loading, the dynamic page part wouldn't be requested by the client until the script is loaded and executed, which delays full page loading until T=1718ms.
The effect of server-side and edge caching
Now let’s add caching at two levels:
- The server caches the semi-static page part, so it does not need to reach the database to get this part.
- An edge node, or a CDN point of presence, is placed between the client and the server, serving cacheable resources without reaching the origin server.
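Standard `Cache-Control` directives are one way the two cache layers can be told apart. A minimal sketch follows; the URL names match this article's simulation, but the specific directive values are assumptions, not a recommendation:

```javascript
// Hypothetical per-resource caching policy for this article's page.
// s-maxage targets shared caches (the edge); max-age targets the browser.
const cachePolicy = {
  "/split-page": "public, max-age=0, s-maxage=3600",   // semi-static: edge-cacheable
  "/dynamic-page-data.json": "private, no-store",      // dynamic: never cached
  "/script.js": "public, max-age=31536000, immutable", // versioned static asset
};

function cacheControlFor(url) {
  // Default to no-store for anything without an explicit policy
  return cachePolicy[url] ?? "private, no-store";
}
```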
Both the full-page and the split-page versions benefit from caching on the server and the edge. Their page load times improved by 290ms and 610ms respectively.
The split-page version benefited more from caching. Compared to the full-page version, it achieved First Contentful Paint roughly 300ms earlier and full page load 260ms earlier.
Full-Page Streaming with server and edge caching: Thanks to server-side caching of the semi-static page part, the First Contentful Paint arrives earlier than without caching (at T=461ms instead of T=569ms). And thanks to the edge, the script file is loaded with reduced latency. The page loads fully at T=959ms.
Preloaded Split-Page with server and edge caching: The semi-static page part is delivered from the edge with much lower latency, leading to a First Contentful Paint as early as T=160ms (roughly 300ms earlier than the full-page version) and a full page load at T=699ms (260ms faster than the full-page version, and 610ms faster than the uncached split-page version).
Pre-loading significantly impacts performance. Without it, the split-page takes 421ms longer to fully load (T=1120ms) even with caching enabled.
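The gains quoted in this section follow directly from the absolute load times; a quick arithmetic check:

```javascript
// Full-page-load times quoted in this section (ms)
const fullPage  = { noCache: 1249, cached: 959 };
const splitPage = { noCache: 1309, cached: 699 };

// Improvement each version gets from server + edge caching
const fullPageGain  = fullPage.noCache - fullPage.cached;   // 290 ms
const splitPageGain = splitPage.noCache - splitPage.cached; // 610 ms

// With caching enabled, how much sooner the split-page version finishes
const cachedGap = fullPage.cached - splitPage.cached;       // 260 ms
```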
Assembling the Full-Page version on the edge for better caching
As we saw in the previous section, the streamed full-page version cannot take full advantage of edge caching: each page request must reach the origin server, which re-sends the otherwise cacheable semi-static page part to the client every time.
It is possible to address this problem with edge-side page assembly, which involves caching semi-static parts at the edge and streaming them to the client as dynamic parts are fetched from the origin server.
- This approach has existed since the early 2000s with Edge Side Includes (ESI).
- Facebook implemented a similar approach, called BigPipe, in 2010.
- More recently, Next.js implemented this pattern with Partial Pre-rendering (PPR) available on Vercel.
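The core idea these systems share can be sketched as a streaming splice. In the sketch below, the placeholder token and function names are mine; real ESI uses `<esi:include>` markup rather than an HTML comment:

```javascript
// Sketch of edge-side page assembly: the edge holds the semi-static shell
// in its cache and splices in the dynamic part fetched from the origin.
// The shell's opening chunk is flushed immediately, so the client pays only
// edge-to-client latency before it can render and start pre-loading.
async function* assemblePage(cachedShell, fetchDynamicPart) {
  const [head, tail] = cachedShell.split("<!--DYNAMIC-->");
  yield head;                     // flushed at once from the edge cache
  yield await fetchDynamicPart(); // the only chunk that waits on the origin
  yield tail;
}
```

A caller would pipe these chunks into the HTTP response as they become available, so the semi-static content is never blocked behind the origin round trip.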
Edge-side page assembly has some drawbacks:
- It requires framework support and often vendor-specific code.
- It requires edge processing, which may incur additional costs.
- The origin server no longer repeatedly sends semi-static parts to the edge. However, the edge still sends them to returning clients on each request.
Full-Page Streaming with edge page assembly: Thanks to edge-side page assembly, the semi-static page part is now delivered from the edge, leading to a First Contentful Paint at T=160ms and a full page load at T=699ms (identical to the split-page version with edge caching).
Page loading for a returning user with a warm client cache
Lastly, let’s examine a less representative but still interesting scenario: how fast each page loads for a returning user whose browser cache already holds fresh copies of the cacheable resources.
Full-Page Streaming for a returning user: The client requests full-page which is not cacheable, and receives it very rapidly (because of edge-side page assembly from the previous section). First Contentful Paint is delayed until T=400ms because the script started executing before the page's semi-static part arrived. The page is fully loaded at T=621ms.
Pre-loaded Split-Page for a returning user: The semi-static page part is immediately available from the browser cache. Thanks to this, the First Contentful Paint arrives at T=50ms (350ms earlier than the full-page version) and the page is fully loaded at T=571ms (50ms earlier).
Conclusion
In this article, we explored the performance trade-offs between two approaches for loading mixed semi-static and dynamic web pages:
- Full-Page Streaming: Delivering the entire page (both semi-static and dynamic parts) as a single streamed HTTP response.
- Split-Page with Pre-loading: Delivering page parts as separate resources, with the client fetching dynamic parts early via pre-loading.
In our simulation, we observed that:
- With caching, Split-Page with Pre-Loading achieves faster load times than Full-Page Streaming, because it separates cacheable content from dynamic content, allowing each to be cached independently.
- Full-page Streaming needs edge-side processing to achieve similar caching benefits, adding cost and complexity.
Recent JavaScript frameworks make Full-Page Streaming easy; it works well for dynamic content but hinders edge and browser caching of semi-static content. Split-Page with Pre-loading avoids this problem and can be implemented without framework support. That said, framework support is needed to combine this pattern with DX-enhancing features such as server functions/actions, which abstract away endpoint creation and invocation. Among mainstream JS frameworks, Astro’s server islands implement the Split-Page approach with arguably the best developer experience.