
Fix Gatsby 4 matchPath issue

Gatsby
Fix
Appjs
Javascript

Introduction

If your Gatsby website has a lot of pre-generated pages, you may have noticed that your app.js file grows fast. That's because from the moment you add a matchPath to one of your pages, Gatsby will write every pre-generated "child page route" in the app.js file.

💡 By "child page route" I mean every page whose path is at the same level or deeper.

This can affect only part of your website if you use a fallback route or, in the worst case, every page of the site if you implement a custom 404 page.
The result is a page load that gets slower and slower on every page, with a catastrophic side effect on SEO.

Let's take an example

We have a news website where journalists publish breaking news. The website is rebuilt every 10 minutes, and each build generates every news item at the route /news/{id}. But when a journalist publishes a news item, users must be able to see it before the next build.
At any given moment, we can split our pages into two groups: the generated news and the non-generated news.
The problem: both groups share the same route, /news/{id}. How do we access news created after the last build?

The problem

This issue can be solved quickly with a matchPath. All we have to do is add this code anywhere in our createPages function:

/* ... */
createPage({
  component: newsFallbackTemplate,
  context: newsContext,
  matchPath: "/news/*",
  /* ... */
})
/* ... */

The created page is shown each time a user accesses a non-pre-generated page whose path starts with /news/. We just have to handle the API calls in the newsFallbackTemplate component to display our news.
In fact, it works really well. The only problem is the way Gatsby handles it. To better understand why, let's see, schematically, what happens when a user visits your website at the path /news/ex_6.
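Before walking through that process, here is a rough sketch of how the fallback template could work out which news item to request. The helper names (getNewsIdFromPath, buildNewsApiUrl) and the /api/news/ endpoint are assumptions made for illustration; they are not part of Gatsby or of the original project.

```javascript
// Hypothetical helper: extracts the news id from a pathname such as "/news/ex_6".
// Returns null when the pathname is not of the form /news/{id}.
function getNewsIdFromPath(pathname) {
  // Drop empty segments so a trailing slash does not matter
  const parts = pathname.split("/").filter(Boolean)
  if (parts.length !== 2 || parts[0] !== "news") {
    return null
  }
  return decodeURIComponent(parts[1])
}

// Hypothetical endpoint: replace it with the URL of your own API.
function buildNewsApiUrl(id) {
  return `/api/news/${encodeURIComponent(id)}`
}
```

Inside newsFallbackTemplate, the id could be read from window.location.pathname and the resulting URL passed to fetch once the component is mounted.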

The loadPage process

When a page loads, two functions are called.

First, the window.loadPage function is called. It looks for the component that matches the path being loaded.

💡 On an SSG page, it displays the pre-generated files and "confirms" the component in use.

The process gives priority to the most specific path, following this algorithm. To do so, loadPage uses the app.js file to get all paths and matchPaths. Once the component is loaded, the page is displayed.

💡 If you don't have any matchPath, Gatsby only uses the pre-generated data. But as soon as you have at least one matchPath, Gatsby has to verify that it displays the right component. That's why app.js contains every path and matchPath and grows proportionally with your number of pages.

Then, window.loadPageSync runs to double-check this result.


In our case, when the browser loads the path /news/ex_6, the loadPage function looks into our app.js file because we have at least one matchPath page. This file contains an array that looks like this:

[
  {"path" : "/news/ex_1", "matchPath" : "news/ex_1"},
  {"path" : "/news/ex_2", "matchPath" : "news/ex_2"},
  /* ... */
  {"path" : "/news/ex_5", "matchPath" : "news/ex_5"},
  {"path" : "/xxx", "matchPath" : "news/*"},
]

Using the algorithm linked earlier, it returns the component corresponding to the path /xxx, as the closest matchPath is news/*. That component should be newsFallbackTemplate, which handles all the API calls needed to display our news like a pre-generated one. Everything works fine, but here we only have 5 pre-generated news items, so the app.js array only has 6 entries. With 100K pre-generated news items, the array would be huge, and so would our app.js file.
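To make the lookup more concrete, here is a deliberately simplified sketch of such a resolution. It is not Gatsby's actual matching code: the findPagePath function is hypothetical, it only supports exact matches and a trailing /* wildcard, and it assumes every matchPath starts with a slash.

```javascript
// Same shape as the array above, reduced to three entries.
const routes = [
  { path: "/news/ex_1", matchPath: "/news/ex_1" },
  { path: "/news/ex_2", matchPath: "/news/ex_2" },
  { path: "/xxx", matchPath: "/news/*" },
]

// Hypothetical resolver: an exact match wins, otherwise the longest
// wildcard prefix is used. Returns null when nothing matches.
function findPagePath(routes, rawPath) {
  // Ignore trailing slashes, except for the root path
  const normalized = rawPath.length > 1 ? rawPath.replace(/\/+$/, "") : rawPath

  const exact = routes.find(route => route.matchPath === normalized)
  if (exact) {
    return exact.path
  }

  const wildcard = routes
    .filter(route => route.matchPath.endsWith("/*"))
    // "/news/*" becomes the prefix "/news/"
    .filter(route => normalized.startsWith(route.matchPath.slice(0, -1)))
    .sort((a, b) => b.matchPath.length - a.matchPath.length)[0]

  return wildcard ? wildcard.path : null
}
```

Even in this toy version, resolving a single path requires the whole route list to be present on the client, which is why the list ends up in app.js.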

How to fix it

To fix it, we need to change the way matchPaths are handled.

First, we replace each matchPath with a static path such as /__fallback/__xxx. The createPage call we wrote at the beginning to handle non-pre-generated news now looks like this:

/* ... */
createPage({
  component: newsFallbackTemplate,
  context: newsContext,
  path: "/__fallback/__news_fallback",
  /* ... */
})
/* ... */

We now have a static loading page for every old matchPath page.

Then, we will add to our newsFallbackTemplate component the following code

return (
  /* ... */
  <Helmet>
    <script>
      {`
        window.pagePath = undefined
        window.news__routeParams = {
          path : "/__fallback/__news_fallback",
          matchPath : "/news/*"
        };
      `}
    </script>
  </Helmet>
  /* ... */
)

This injects a script into the page that sets a news__routeParams property on the global window object, holding the static path and the original matchPath.

💡 Replace the news__ prefix with your app name to avoid collisions.

Finally, in the gatsby-browser file, we export an onClientEntry function to handle this data. It overrides the loadPage and loadPageSync functions to add the logic for our custom matchPath, but only if news__routeParams is defined. The onClientEntry function looks like this:

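export const onClientEntry = () => {
  const { ___loader: loader, news__routeParams } = window

  // Nothing to override if Gatsby's loader is not available
  if (!loader) {
    return
  }

  // Keep the original functions so we can call them later
  const originalLoadPage = loader.loadPage
  const originalLoadPageSync = loader.loadPageSync

  // Only override the loader when the fallback template has set its params
  if (news__routeParams) {
    loader.loadPage = async rawPath => {/* ... */}

    loader.loadPageSync = rawPath => {/* ... */}
  }
}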

loadPage

The loadPage override is the larger of the two:

loader.loadPage = async rawPath => {
  const path = news__routeParams.path
  const matchPath = news__routeParams.matchPath
  const lastActualPath = window.news__lastActualPath

  const isFallbackPage = !!path
  //As this function is executed before the url change, we still detect a
  //fallback page if we are leaving one so we need to know if we are leaving or not
  const isLeavingFallbackPage =  !!lastActualPath && lastActualPath !== rawPath

  let pageResources

  //If we detect a fallback Page and we're not leaving we override the pagePath 
  //with the loader page path and we add a matchPath corresponding to the 
  //wanted page url with the last part replaced by *
  if (isFallbackPage && !isLeavingFallbackPage) {
    pageResources = await originalLoadPage(path)
    const rawParts = rawPath.split("/")
    rawParts.splice(rawPath.slice(-1) === "/" ? -2 : -1)
    pageResources.page.matchPath = matchPath ?? [...rawParts, "*"].join("/")
  } else {
    //If we detect a non-fallback page or leaving one
    pageResources = await originalLoadPage(rawPath)
  }

  //We save some data for later
  window.news__lastActualPath = rawPath
  window.news__savedPageRessources = pageResources
  return pageResources
}

As you can see, we check whether we're loading a fallback page (one with a matchPath) or not. For a fallback page, we first load the pageResources (component, page-data.json, ...) with the original function (originalLoadPage), called with news__routeParams.path instead of the actual path (rawPath). Then, we set the matchPath of the pageResources to window.news__routeParams.matchPath.
Thanks to this, if the user loads news/ex_6 (which is not pre-generated), the loadPage function loads the pageResources of /__fallback/__news_fallback and displays them.
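The two decisions made inside loadPage can be isolated into small pure functions, which makes them easier to reason about. The names deriveMatchPath and pickPathToLoad are hypothetical and only mirror the logic of the snippet above; they are not part of Gatsby.

```javascript
// Mirrors the fallback used when no matchPath is provided:
// the last segment of the requested URL is replaced by "*".
function deriveMatchPath(rawPath) {
  const rawParts = rawPath.split("/")
  // With a trailing slash, the last element is an empty string,
  // so two elements have to be dropped instead of one
  rawParts.splice(rawPath.endsWith("/") ? -2 : -1)
  return [...rawParts, "*"].join("/")
}

// Mirrors the branch choice: load the static fallback page unless
// we are navigating away from it.
function pickPathToLoad({ fallbackPath, lastActualPath, rawPath }) {
  const isFallbackPage = Boolean(fallbackPath)
  const isLeavingFallbackPage =
    Boolean(lastActualPath) && lastActualPath !== rawPath
  return isFallbackPage && !isLeavingFallbackPage ? fallbackPath : rawPath
}
```

For example, the first load of /news/ex_6 has no lastActualPath yet, so the static fallback path is loaded; a later navigation to another URL has a different rawPath and goes through the original path instead.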

loadPageSync

At the end of the loadPage function, we saved the loaded pageResources into window.news__savedPageRessources. Thanks to this, we can simply return this variable instead of copy-pasting the same logic. Of course, if there are no saved page resources, we return the result of the original function.

loader.loadPageSync = rawPath => {
  return window.news__savedPageRessources ?? originalLoadPageSync(rawPath)
}

You may have noticed that, for this to work, the window.news__routeParams variable must already be set. But how can we have this variable, which is defined in the fallback template, when we are still trying to load that template? We add a redirect!

⚠️ That's the limit of this solution: it only works on a hosted instance, not in development. As this is an SEO issue, you can keep using matchPath in dev mode.

All you have to do is add this right after the createPage call shown earlier:

createRedirect({
  fromPath: `/news/*`,
  toPath: `/__fallback/__news_fallback`,
  redirectInBrowser: false,
  statusCode: 200,
})

That's it! Our server will now transparently redirect (redirectInBrowser: false) to the fallback page, which displays the news dynamically. Once the JavaScript is loaded, loadPage is executed and, thanks to our modification, loads the same component, as does loadPageSync.
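Regarding the dev-mode limitation mentioned above, one possible approach is to keep the matchPath only while developing. The sketch below assumes that NODE_ENV equals "development" under gatsby develop, and buildFallbackPageArgs is a hypothetical helper, not a Gatsby API.

```javascript
// Hypothetical helper: returns the routing part of the createPage arguments.
// In development the matchPath is kept, since the server-side redirect
// is not available there; in production only the static path is used.
function buildFallbackPageArgs(nodeEnv) {
  const path = "/__fallback/__news_fallback"
  if (nodeEnv === "development") {
    return { path, matchPath: "/news/*" }
  }
  return { path }
}

// Possible usage inside createPages (assumption, adapt to your own setup):
// createPage({
//   component: newsFallbackTemplate,
//   context: newsContext,
//   ...buildFallbackPageArgs(process.env.NODE_ENV),
// })
```

This keeps the large route list out of the production bundle while the fallback route stays reachable locally.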


Results

Here are the Lighthouse Treemap results before and after our modifications, on a Gatsby 4 project with more than 40K pages.

Before

[Image: Lighthouse Treemap before the modifications]

After

[Image: Lighthouse Treemap after the modifications]

As you can see, the app.js bundle is drastically smaller!

Maybe Gatsby will solve the problem itself one day 🤞🏻