Building the Future of Web Development with React, Gatsby and Static Sites

January 9th, 2019 - 21 min read

As part of this post I have recreated my blog using the latest technologies available. These include:

  • ReactJS
  • Gatsby
  • GraphQL
  • Webpack
  • Babel
  • SSR
  • Contentful/Headless CMS

React

Let's start with React. React is a JavaScript library for building user interfaces. It is around 5 years old and was created by Facebook.

React works in a similar way to other libraries such as Vue and Angular; it utilizes components, which you, as a developer, create yourself.

JSX

Writing code for React is a little unusual at first, as it uses a technology called JSX in order to output markup. A little preview of this is below.

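import React, { Component } from 'react';

// A minimal class component; render() returns JSX describing the markup.
export default class HelloWorldApp extends Component {
  render() {
    return (
      <p>Hello world!</p>
    );
  }
}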

The syntax takes a little getting used to, but quickly begins to make sense.

Bundled Components

A component in React contains everything required to display it. This includes the markup, JavaScript to control the behaviour, and CSS for styling.

Usually, in a web application, markup, code and styling/presentation are split up into separate areas of the app to make them easier to manage. However, given how tightly scoped a component is, it makes a lot more sense to bundle these together in the same file (or set of files).

Again, this takes a little getting used to, but quickly makes sense.

Push State

On this blog you'll notice that each page loads instantly, with a short transition between each page.

This is possible because I'm using pushState between each page. Rather than actually loading each page, I'm just pulling in the content I need from within my React app, and then using pushState to tell your browser what the new URL is so that the back & forward buttons on your browser work properly, and the address bar shows what's required.

I also change a few key details from page to page, such as the meta description and Open Graph tags.
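To make that concrete, here's a minimal sketch of the idea rather than the code this blog actually runs. The `navigateTo` helper, the `#content` selector and the shape of the `page` object are all hypothetical; `history.pushState` and `document.querySelector` are standard browser APIs, reached through a `win` parameter so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: swap in new content and update the URL without a full page load.
// `page` is an assumed shape: { url, title, description, html }.
function navigateTo(page, win) {
  const doc = win.document;

  // Replace the visible content with the content for the new page.
  doc.querySelector("#content").innerHTML = page.html;

  // Update the details that change from page to page.
  doc.title = page.title;
  doc
    .querySelector('meta[name="description"]')
    .setAttribute("content", page.description);

  // Add a history entry so the address bar and the back/forward buttons behave as expected.
  win.history.pushState({ url: page.url }, "", page.url);
}

// In a browser this would be called as: navigateTo(page, window);
```

A real app would also listen for the `popstate` event so the right content is restored when the back button is pressed.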

Gatsby

Gatsby is a static site generator that sits on top of React. The idea is to create your website using React and then generate a series of static HTML and asset files.

Static Site Generators

The concept of a static site generator isn't new. People have been building sites with git repositories as a backend for a while, and GitHub Pages works using Jekyll, another Static Site Generator.

What makes Gatsby so interesting though is the ability to use dynamic data as part of your static site.

The content that you're reading right now is being served from a static .html file. However, it wasn't written as a .html file. Instead, the content was written on Contentful, which is a hosted headless CMS (I'll come onto this shortly).

The data is pulled in using GraphQL (more on this below), and Gatsby uses it, in conjunction with my React app, to build out static files and assets.

Why a Static Site?

Static sites are fast, really fast. They also remove many of the dangers of exposing a website online.

Security

If all your website is made up of is read-only static files then what damage can someone really do? The data flows one-way only. There's no database, no admin login and no server-side code to attack, so the attack surface shrinks to the web server or CDN itself.

Speed

They're rapid. If you're hosting your site using a CMS then every page that you load up has to bootstrap a full CMS, which includes handling sessions (often on a separate machine), connecting to a database, pulling in data from various tables, speaking to different APIs and goodness knows what else.

If you're loading a page on a static site then the only thing that needs to happen is the page gets displayed. That's it.

Hosting

Any website requires hosting of some kind. If you're hosting a small blog or website you might be using WordPress or CraftCMS, which require hosting that has PHP enabled, and a MySQL database. This can be reasonably cheap, but when you start to get any real visitor numbers you'll need to look into dedicated hardware or a true cloud solution.

Likewise, if you're using Umbraco or another .NET CMS, you'll be looking at a similar setup but will also be incurring licensing costs for Windows Server and SQL Server.

What about a static site? Well these can be hosted pretty much anywhere. Any web host will support a static site as it just displays the same content a CMS ends up producing... HTML.

But, more importantly, it's highly scalable. You can host a static site on a CDN only, as there is no server-side processing. That means you can scale to billions of visits per day and it will cost you hardly anything, but still be highly available.

GraphQL

The data is pulled in using GraphQL, a data query language. The concept of GraphQL is a little difficult to describe; their website does a much better job of it than I can. However, the best way of thinking of GraphQL is as a query language for an API.

Set up your API, and then you can use GraphQL to pull the data you need without needing to write a specific end-point for every use case.

In the long run this makes everything quicker, too. With GraphQL the client only receives the data it asks for, rather than receiving everything from a typical API response and ignoring everything apart from what is required.

Use with Gatsby

On my blog, Gatsby uses GraphQL to pull in posts from my headless CMS. These posts have different data depending on what page they're used on. For example, this page has all of the details of the post, whereas the home page will only require a snippet.
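As a rough illustration, the two cases might look something like the queries below. The type and field names (`allContentfulBlogPost`, `contentfulBlogPost`, `excerpt`, `publishDate`, `body`) are assumptions, since the real names are generated from whatever content model is defined in the CMS. In a Gatsby project these strings would normally be wrapped in the `graphql` tag exported by `gatsby`; plain template strings are used here to keep the sketch self-contained.

```javascript
// Home page: only a snippet of each post is needed.
// All type and field names here are hypothetical.
const homePageQuery = `
  query HomePagePosts {
    allContentfulBlogPost {
      edges {
        node {
          title
          slug
          excerpt
        }
      }
    }
  }
`;

// Post page: everything about a single post, looked up by its slug.
const postPageQuery = `
  query PostBySlug($slug: String!) {
    contentfulBlogPost(slug: { eq: $slug }) {
      title
      slug
      publishDate
      body {
        body
      }
    }
  }
`;
```

Both queries hit the same data source; the only difference is which fields each page asks for.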

Headless CMS

For those who don't know, a headless CMS is just a CMS without a presentation layer. A typical CMS will have an admin area where you can manage your content, and then a frontend where visitors can view that content.

You, as a developer, can define the content format, and you can define how that is shown by way of the theme/frontend/presentation layer. Content authors can then use the admin area to write their content.

With a headless CMS there is no presentation layer. You get the backend where you can define your content types and manage the content itself, but there's no frontend. Instead, the CMS exposes an API that spits out the content.

Why Headless?

Headless CMSs are actually extremely useful. For this blog it's perfect, as I can manage my content and rebuild my static files whenever I need to. No drama.

A more real-world example is a website that displays data from multiple sources. Imagine a company that sells products to other businesses and wants a website. They have an ERP (stock/warehouse management) & PIM (catalogue/product management) system holding all of the product info that they want to sell online.

However, they also want a content-rich website which is multi-lingual & multi-territory.

Using a headless CMS means you can have all of the structured content in the CMS, where it belongs. You can then use the ERP and PIM APIs to pull in product data as you see fit. You can even mix and match the two together.
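A small sketch of that mix-and-match step, under invented assumptions: the CMS entries and PIM records below are hypothetical shapes that share a `sku` key, and `buildProductPages` is an illustrative name rather than part of any real CMS or PIM API.

```javascript
// Hypothetical data shapes: CMS entries carry editorial content per locale,
// PIM records carry product facts, and both share a `sku` key.
function buildProductPages(cmsEntries, pimProducts) {
  const productsBySku = new Map(pimProducts.map((p) => [p.sku, p]));

  return cmsEntries
    .filter((entry) => productsBySku.has(entry.sku)) // skip content with no matching product
    .map((entry) => {
      const product = productsBySku.get(entry.sku);
      return {
        sku: entry.sku,
        locale: entry.locale,
        headline: entry.headline, // from the CMS
        name: product.name,       // from the PIM
        price: product.price,     // from the PIM
      };
    });
}
```

Each system stays the source of truth for its own data, and the join only happens when the pages are built.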

This forms the premise of a microservices pattern.

Contentful

My content is all managed on Contentful. Contentful is a CMaaS platform (Content Management as a Service). Effectively, it's a hosted headless CMS that allows me to log in and update my content types/models, and to manage my content.

Contentful exposes an API that Gatsby reads from at build time. The posts are then available to query with GraphQL inside Gatsby, ready to be built into static pages.

React, Gatsby & SEO

Those of you who may be familiar with React and similar frameworks will know that they don't necessarily play nicely with search engines. React is built up of components; when you view-source on a page built in Gatsby you may see something as basic as:

<html>
  <head>[...]</head>
  <body>
    <gatsby></gatsby>
  </body>
</html>

The above code snippet actually serves the entire website, including all content across all pages. Search engines aren't going to have a great time reading that, right?

Wrong. Google's crawler bots have JavaScript engines, which means they see what you see (within reason).

However, it's not as simple as that. The JavaScript-enabled bots often take a while to crawl your pages, whereas the more basic bots will crawl far more quickly. Additionally, Google's Webmaster Guidelines (and common sense) state that your site should work without JS enabled.

SSR

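That's where Server-Side Rendering comes in. SSR takes care of this for us. With a little bit of extra coding within the React app and a bit of magic from Gatsby, each page gets its own markup with the content housed within. In Gatsby's case the rendering happens once at build time rather than on every request, which is why the output can be served as static files.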

Go ahead and view-source on this page. You should see the content in there, which is what the search engines will be able to read.
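The principle can be sketched without React at all. The code below is a simplified stand-in, not how Gatsby is implemented: `renderPost`, `prerender` and the post fields are hypothetical, and a plain template string plays the role of a component. In a real React setup, `renderToString` from `react-dom/server` does the equivalent job for a component tree.

```javascript
// Hypothetical stand-in for a component: turns a post object into markup.
function renderPost(post) {
  return `<article><h1>${escapeHtml(post.title)}</h1><p>${escapeHtml(post.body)}</p></article>`;
}

// Escape characters that would otherwise be interpreted as markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build step: produce one complete HTML document per post, keyed by output path.
function prerender(posts) {
  const pages = {};
  for (const post of posts) {
    pages[`/${post.slug}/index.html`] =
      `<!DOCTYPE html><html><head><title>${escapeHtml(post.title)}</title></head>` +
      `<body><div id="app">${renderPost(post)}</div></body></html>`;
  }
  return pages;
}
```

Because the content is already in the markup, a crawler that never runs JavaScript still sees the full post.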

Other Tech

There are various other bits of technology that help build this blog which work to form part of the toolchain. These include:

Babel

Babel can be thought of as a JavaScript translator. Babel allows me to write code in the latest version of JavaScript, parts of which aren't supported by every browser. Babel then translates that code to an older version of JavaScript that browsers can read.

React and Gatsby rely heavily on Babel: JSX isn't valid JavaScript until it has been compiled, and much of the modern syntax they encourage is not compatible with older browsers.
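Here's a hand-written illustration of the kind of translation involved. The second function is an approximation written for this example, not actual Babel output, which varies with the presets and browser targets configured.

```javascript
// Modern syntax: arrow function, default parameter, template literal.
const greetModern = (name = "world") => `Hello ${name}!`;

// Roughly equivalent ES5, similar in spirit to what a transpiler emits.
var greetLegacy = function (name) {
  if (name === undefined) {
    name = "world";
  }
  return "Hello " + name + "!";
};
```

Both functions behave the same way; only the second can be parsed by a browser that predates ES2015.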

Webpack

Webpack is the glue that brings everything together. Webpack compiles my SCSS to CSS, minifies my JavaScript, runs Babel on my code to make it compatible, and gets everything ready for production.
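For a sense of what that looks like outside of Gatsby, here's a minimal standalone `webpack.config.js` sketch. It is an assumption-laden example rather than this blog's configuration: Gatsby ships and manages its own webpack config, the entry path is hypothetical, and the loaders named (`babel-loader`, `sass-loader`, `css-loader`, `style-loader`) would each need to be installed separately.

```javascript
// webpack.config.js (illustrative only; Gatsby generates its own config)
const config = {
  // "production" mode turns on minification of the JavaScript bundle.
  mode: "production",
  // Hypothetical entry point for the app.
  entry: "./src/index.js",
  module: {
    rules: [
      // Run Babel over .js and .jsx files, skipping third-party packages.
      { test: /\.jsx?$/, exclude: /node_modules/, use: "babel-loader" },
      // Loaders run right to left: SCSS -> CSS -> injected into the page.
      { test: /\.scss$/, use: ["style-loader", "css-loader", "sass-loader"] },
    ],
  },
};

module.exports = config;
```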

Cloudflare

I'm hosting my blog on some cheap web hosting I have access to. Nothing particularly impressive about it, and it's not very fast. However, all traffic is routed through Cloudflare, who do a number of jobs for me:

  • Additional security

Cloudflare will detect would-be attackers based on their geographical location, IP and various other metrics. If Cloudflare suspects a request could be malicious, the entire request is blocked.

  • SSL certificate

I have an SSL certificate, which isn't from Let's Encrypt, but hasn't cost me anything. The whole site runs over SSL.

  • My server is anonymous

No one knows where my server is, what its IP is, or has a clue where my site is hosted. I don't have to worry about a would-be attacker port scanning or looking for vulnerable software on the server away from my website.

HTTP/2

HTTP is the protocol the web runs on. Version 2 exists, and is now supported by most browsers. However, many hosts don't support it yet.

There are a lot of differences between HTTP/1.1 and HTTP/2, but I'll just focus on one for now.

When your browser requests a website from the server it does so in one request. The server responds with an HTML document, which can contain links to a number of assets that make up the page; things like videos, images, stylesheets, JS, etc. Each asset is then requested by your browser in a separate request. Under HTTP/1.1 a connection can only handle one request at a time, so browsers open a handful of connections per host (typically around six) and the remaining assets wait their turn.

HTTP/2 works differently. Many requests and responses can share a single connection at the same time (multiplexing), so assets no longer queue behind one another. The server can also push linked assets to your browser alongside the page, before the browser has asked for them.

It's not straightforward to get this working from a web application point-of-view, but it has fairly big speed advantages when done properly. This blog uses HTTP/2, wherever possible.
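To put rough numbers on the difference, here's a deliberately crude model. It assumes every request costs exactly one round trip, ignores transfer time entirely, and treats the number of parallel HTTP/1.1 connections as a parameter; the function names are made up for this example and real-world results will differ.

```javascript
// Toy model: every request costs one round trip and transfer time is ignored.
// HTTP/1.1: each connection carries one request at a time, so assets are
// fetched in batches limited by the number of parallel connections.
function roundTripsHttp1(assetCount, connections) {
  return 1 + Math.ceil(assetCount / connections); // 1 for the HTML document itself
}

// HTTP/2: all asset requests share one connection, so in this model they
// complete together in a single additional round trip.
function roundTripsHttp2(assetCount) {
  return assetCount > 0 ? 2 : 1;
}
```

With 30 assets and six connections the model gives 6 round trips for HTTP/1.1 against 2 for HTTP/2, and the gap widens as the asset count grows.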

Other Solutions

I'm a big fan of Vue, which is an equivalent to React. If you've never used either and are looking to pick one up, I'd definitely recommend Vue. The learning curve is much shallower, and it's being picked up by some great companies.

Unfortunately, I couldn't yet find a static site generator that played nicely with Vue. The closest I could find was Gridsome, which looks great, but it was still in beta at time of writing and the documentation was still unfinished.

Deploying

Another big difference for me when working with static sites and JS was the deployment procedure. I'm used to working on larger builds that involve multiple servers, databases and various service machines so I tend to deploy using a deployment tool or from our repo tool.

Because we're dealing with a static site any time there is a change to the site or data we need to rebuild. A change can be something obvious such as a CSS tweak, or something you might expect to be automatic, such as a new blog post being added.

Unfortunately due to the nature of static sites, the deployment process includes building the full site upon each deployment, and each change of any data. This is a manual process.

For me, that's not so much of a problem. The odds are that whatever machine I'm writing a blog post on can easily be used to build the site and deploy from. But from a content editor's point of view it's a big roadblock having to get a developer involved every time a piece of content is changed or created.

Netlify

In comes Netlify. Netlify is a build service for static sites. It exists to solve this very problem.

  1. Add your repo to Netlify so it can access your code.
  2. Create a web hook from your CMS or any other data source you may have to trigger a build.
  3. Netlify will build your site based off your defined build process (usually just yarn build) and deploy it to your host for you.

Effectively you have an automated build and deployment process whenever your data sources change.
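The web hook in step 2 boils down to an HTTP POST to a Netlify build hook URL. A CMS will usually send that request for you, but a sketch of doing it by hand looks like this. The `triggerBuild` name and the hook ID are placeholders, and the `fetchFn` parameter exists only so the function can be tried with a stand-in instead of a real network call.

```javascript
// Trigger a rebuild by POSTing to a Netlify build hook.
// `hookId` is a placeholder; `fetchFn` defaults to the global fetch (browsers, Node 18+).
async function triggerBuild(hookId, fetchFn = fetch) {
  const url = `https://api.netlify.com/build_hooks/${hookId}`;
  const response = await fetchFn(url, { method: "POST", body: "{}" });
  if (!response.ok) {
    throw new Error(`Build hook failed with status ${response.status}`);
  }
  return url;
}
```

Calling `triggerBuild("your-hook-id")` after publishing content would queue a fresh build without anyone touching a terminal.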

What this isn't suitable for

This whole setup is great for me. It's fast, reliable, and bullet-proof from a security standpoint. However, you'll notice this blog has zero functionality on it beyond a blog. No contact form, no login area, nada.

As soon as you require anything with backend involvement static sites quickly lose their appeal. It is, of course, possible to integrate with a backend framework by using AJAX or even hosting a backend app and posting directly to it, but suddenly you are maintaining several code bases to do one thing. For me, that just isn't right, but that might change.

As with anything, it's always worth running through your situation now, and in the future, before you commit to building anything in a piece of technology. If it's not suitable or you don't think it will be suitable, then it's probably not for you.

Summary

In summary, React and Gatsby will work well if you've got a basic site with little backend interaction. A static site generator is even more appealing if you have several data sources you want to display.

I focussed a lot on speed and efficiencies during the build of my site, and I managed to achieve a perfect score on Lighthouse.

However, it's always important to make sure you're building with a toolset that will help you, and not hinder you, especially in the mid to long term. Speed is often a trade-off against ease of development, time constraints or features, so try not to get too hung up on your Lighthouse score, so long as the site is quick, responsive and easy to use.