Why I Love Jekyll!
I recently had the pleasure of working with the SpokenWeb team on their website, which is a static “minimal computing” website built using a program called Jekyll. Jekyll is a static site generator that allows users to build sites by encoding their content into markdown files and then embedding that content in HTML templates written in a language called Liquid. When run, Jekyll merges these markdown and Liquid files into a folder of static HTML files that can be served from any web server without any further processing. Jekyll is “minimal” in the sense that it creates websites that minimize the computer cycles needed to serve a page to a user.
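As a rough illustration of that merge step, the sketch below fills a template with page content and writes one HTML file per page. It is not how Jekyll is implemented: Python's `string.Template` stands in for Liquid, the page bodies are already HTML rather than markdown, and `LAYOUT`, `render_page`, and `build_site` are hypothetical names chosen for this example. The `_site` folder name mirrors Jekyll's default output directory.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical layout. Jekyll uses Liquid templates; string.Template is a stand-in.
LAYOUT = Template(
    "<html><head><title>$title</title></head><body>$body</body></html>"
)


def render_page(title: str, body: str) -> str:
    """Merge one page's content into the shared layout."""
    return LAYOUT.substitute(title=title, body=body)


def build_site(pages: dict[str, dict[str, str]], out_dir: Path) -> list[Path]:
    """Write one static HTML file per page and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, fields in pages.items():
        target = out_dir / f"{name}.html"
        target.write_text(render_page(fields["title"], fields["body"]), encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    pages = {"index": {"title": "Home", "body": "<p>Hello</p>"}}
    with tempfile.TemporaryDirectory() as tmp:
        for path in build_site(pages, Path(tmp) / "_site"):
            print(path.name)  # index.html
```

Once the files are written, nothing in the output depends on the program that produced them, which is the property the rest of this post relies on.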
For our project, longevity was a main driver of our desire for minimal computing. How can we build a website that will continue to exist long after the research funding that created it dries up? This is an issue that a lot of academic studies run into. The conventional wisdom when I first got into technology was that content on the internet existed forever; for example, there was the assumption that pictures I uploaded as a child would continue to exist on the internet for the rest of my life. However, I no longer believe this to be true[1]. I’ve learned that content on the internet is very ephemeral and can disappear at any time for any number of reasons. Links can break as domains change or as mediating technologies, like link shorteners, disappear. Software updates can break websites in unexpected ways, making existing content inaccessible. Search engines and other web platforms can change policies and algorithms at any time, making content inaccessible, and servers themselves can disappear for economic, security, or political reasons.
If I upload a website today and then forget about it, it probably only has a few years of life before it naturally becomes inaccessible[2]. This is a big problem for those of us who want to use websites as a form of publication. A permanent public record that a project was completed is crucial in an increasingly competitive academic environment; it demonstrates that not only did we get funding, but that we did something very cool with it. This was a primary goal for SpokenWeb at the University of Alberta. We wanted to create something that was easy to maintain, but also would continue to exist long after we stopped maintaining it.
However, creating an archival quality website also has a ton of other benefits. It turns out that all of the qualities that make a website ephemeral are also the same qualities that make it inefficient, insecure, and hard to maintain.
Minimal computing is efficient
A major contributor to the ephemeral nature of modern websites is so-called “cloud computing”; most of the interaction on the internet is mediated through cloud applications. We access websites through search engines, we organize socially through social networks, and we access our email and office applications through web browsers. Even the act of building websites themselves is now done through cloud applications like Shopify, Wix, and even WordPress. This is a problem, however, because applications aren’t static resources. They change and evolve as the environment around them changes and, as a result, they add extra processing steps to every interaction and lose efficiency in the process.
For example, WordPress, an open-source technology that powers roughly 40% of the sites on the internet, is one such inefficient application. A user wishing to build a WordPress website pays hosting fees to a provider, like WordPress.com or WP Engine, who installs the program onto a server that the user can access. The user can then build and update their entire website through an application in their browser.
This approach to website management has a number of advantages:
- It creates a strong separation between the user and the server. The server provider does not need to care how the website is configured, and the website owner does not need to care how the service is hosted. This strong division of responsibility makes web hosting easy and scalable. One hosting company can easily host hundreds or thousands of websites cheaply and without much effort.
- The user doesn’t need to download and install specialized software to build their website, making this model of website-building platform-agnostic: all you need to get started is a web browser.
- The user interface allows users to build and customize their website without needing to learn how to code. What you see is what you get (WYSIWYG).
However, these advantages of hosted websites come at a high cost. The internet in its original form is a system that lets computers access files that are being stored on another computer. When I type a URL (uniform resource locator) into my web browser, I am essentially asking my web browser to get the object located at that address and display it on my local machine. In the case of websites, the files are HTML files that contain content and directions on how to format that content. Normally, when a web page is requested, the web server merely acts as an intermediary: it verifies the remote computer, collects the resource in question, and sends it on its way.
However, cloud platforms like WordPress complicate this process, because they are applications and not static resources. When the WordPress application receives a request from the web server for a certain resource, it dynamically builds that resource using its own internal database and configuration. This happens every time a page is requested. So, if a million users request the same page, even though the page is identical to all users, that page needs to be constructed, from scratch, a million times. The difference between the WordPress model and a static website is the same as the difference between using a machine to type a letter out versus just photocopying it.
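The typing-versus-photocopying difference can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a measurement of WordPress: `render_page` is a hypothetical stand-in for the database queries, theme, and plugin work, and the sketch ignores any caching layer a real installation might add. The counter simply records how often a page is constructed.

```python
from collections import Counter

work = Counter()  # counts how many times a page is actually constructed


def render_page(slug: str) -> str:
    """Hypothetical stand-in for the expensive step (database, theme, plugins)."""
    work["renders"] += 1
    return f"<html><body><h1>{slug}</h1></body></html>"


def serve_dynamic(slug: str) -> str:
    # Application model: the page is rebuilt on every request.
    return render_page(slug)


def build_static(slugs: list[str]) -> dict[str, str]:
    # Static model: every page is built once, ahead of time.
    return {slug: render_page(slug) for slug in slugs}


def serve_static(site: dict[str, str], slug: str) -> str:
    # Serving is a lookup; nothing is constructed at request time.
    return site[slug]


if __name__ == "__main__":
    requests = 1000

    for _ in range(requests):
        serve_dynamic("about")
    dynamic_renders = work["renders"]

    work.clear()
    site = build_static(["about"])
    for _ in range(requests):
        serve_static(site, "about")
    static_renders = work["renders"]

    print(dynamic_renders, static_renders)  # 1000 1
```

The same page is delivered a thousand times in both cases; only the amount of construction work differs.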
The justification for this repetition is that computers are normally so fast and efficient that the end user won’t notice the slight delay. However, even if the increased processing time is too small for individual users to notice, it still creates a massive network-wide effect. If we multiply that processing time by every request to every WordPress website across the entire internet, the increase becomes significant. Ultimately, this results in longer load times, higher energy usage, and the need for larger and more powerful servers. As well, the dynamic nature of such web pages creates a strong link between the software and the page itself. Any change to the WordPress software could dynamically alter any page that it generates. So, behind the scenes, a webpage can alter itself without the page author intending it to change.
Minimal computing is secure
Let’s continue with the example of WordPress to contrast with Jekyll on the issue of security. WordPress users, by design, have no access to the actual website they are building, so the designers of WordPress must allow the user to modify the data through some other means. Thus, WordPress needs a login page, and WordPress users need to be verified somehow before they can be allowed to make changes. This solution does a good job of protecting the server from the website owner, but it creates an avenue for the website user to attack the website itself. The moment there is a public-facing entry point into the back end of the website, the developer has suddenly created ongoing security issues that don’t exist for static sites. Sometimes a WordPress release might have bugs in it that allow unauthorized users access to a website’s content. These bugs do get fixed in newer versions of the software, but those fixes only matter if the WordPress application is updated regularly, and whose responsibility is it to do that?
Even worse, if the WordPress application is installed incorrectly, then an extreme security issue could allow a malicious user access to the web server itself. This incentivizes hosting companies to force software updates on their clients or take down old or unmaintained sites. The very existence of security concerns on a website means that it cannot be left alone. Sooner or later, someone will decide that it is no longer worth the effort of keeping the content alive and take it down.
Security issues also increase energy usage on the network. I no longer have any WordPress websites installed on my personal server, yet hackers are still constantly probing my domains for WordPress security vulnerabilities. Even today, the day I wrote this sentence, I can find at least one weird request from Germany sending strange characters to what would be WordPress’ search page if it were still installed[3]. These probes aren’t free. Someone has to spend computer resources sending these requests, and I have to spend resources rejecting them. Someone needs to run a server, and that server must periodically make requests to my server to check if it is vulnerable to these exploits. I don’t have hard numbers, but the fact that I am being probed for these vulnerabilities likely means that many others are as well. There is no incentive to stop probing sites for these vulnerabilities because WordPress exists precisely to make websites more accessible to less-technical users, and less-technical users are exactly the type of people who won’t set things up correctly. Ultimately, this means that there is a significant increase in global internet traffic simply because WordPress has had security vulnerabilities in the past.
Jekyll doesn’t have this issue. Static websites are just posters on the internet. No matter how many strange requests a static website gets, the sender will never get access to anything more than the static page they are requesting. They cannot sneakily get access to a back end web application because there is no application to get access to. Likewise, nobody will be spamming other websites searching for known Jekyll security bugs because they will never find a vulnerable site; no such websites exist or can exist.
Minimal computing ensures control
While some may argue that WordPress makes it easy for a casual user to customize their own website, I would counter with the fact that these customizations are extremely limited. WordPress works best when a user is comfortable working within pre-built themes or plugins that the platform makes available. However, these themes can only be modified in ways that are allowed by the theme authors, who are themselves restricted by what WordPress itself allows. For example, I include a ton of footnotes[4] in my blog posts, but the version of WordPress I was using does not allow for footnotes. So, to get footnotes working, I needed to install a third-party plugin. The plugin I chose worked well for the first couple of years the website was live; however, development of that plugin stopped, and it eventually became incompatible with newer versions of WordPress. Eventually my footnotes stopped working. The only solution was for me to install a different plugin. However, the syntax the new plugin used was different, so I had to manually edit all my previous posts and update them with the new format. Had I not fixed these issues myself, those blog posts would still have broken footnotes and become less readable as a result.
Much of the development of a WordPress website works along these lines. If a user wants to make something unique, they need to hope that an actual developer has created the thing they wanted, made it customizable in the way they wanted, and will continue to update it perpetually so that it will continue to work as expected. Unfortunately, as I discovered, building around a plugin that loses support, or changes in ways that break previous formatting, can be a frustrating experience and acts as a disincentive for users to keep their older content available.
All cloud-based content ages in ways that static content does not. Choosing to update software means that a user is locked into a perpetual treadmill of content updates. New content, as well as old content, needs to be configured to use maintained features correctly, and every change in a feature could require a change in any content that uses that feature. Platform core developers, theme developers, and third-party developers all have a say in how long your content can exist. Features you use can change at any time and without warning, and it is your job to make sure your content can adapt to those changes[5].
Jekyll, on the other hand, generates static websites. This means that the files it produces do not change unless someone specifically changes them. So long as web browsers and the underlying HTML standard exist[6], these pages will continue to function. So I can be confident in the fact that the footnotes on my web page will continue to operate long after a blog post is published.
Minimal computing supports privacy
Finally, WordPress and all other cloud-based applications also introduce a number of privacy issues that do not exist for static sites. In a conventional HTML file transfer, only two computers need to be involved: the client and the server. WordPress inserts itself into this transaction because it needs to build the page for the web server. However, WordPress isn’t just one application: each plugin that a user has installed also has some say in how the page is built. So, the web server hands the request to WordPress, which in turn hands it to the plugin, and the plugin could very well hand it off to someone else. Now we have a situation where WordPress itself is mediating between any number of applications that could themselves be speaking to their own servers. An advanced user may be able to properly audit the plugins they install, but we can make no such assumption for WordPress’s intended audience: the inexperienced creator. Thus, by making WordPress “easy to use,” its developers have created an incredibly complicated ecosystem of software that introduces dependencies on larger and larger software teams, all of which manage their own technology stacks and web services, and all of which have access to your data. What was once a single request to a single web server has now become a complicated network of technologies speaking to each other in complex and expensive ways.
Jekyll, once again, does not have this issue. Not to sound like a broken record, but it is a static site. There is no need for my server to communicate with any other servers to build the web page. It already exists. The only thing my server needs to do is figure out which page the user is requesting and return it to them.
The Problem With Cloud Computing
Minimal computing, to me, isn’t really about fewer computers or less complicated technology. It means removing unnecessary complexity from technology. It means building things that depend on as few other developers as possible. If my website does not need to be dynamically generated, why would I build it to depend on such software? Why would I waste my time installing, configuring, updating, and securing a complicated piece of software when I can just build a thing and give it away? Why would I bother to learn the nuances of a software ecosystem that constructs HTML files when I can just construct HTML files myself? At some point, the solution becomes more complicated and unwieldy than the problem it is trying to solve. Minimal computing is realizing that every decision has a cost. Every tool built with the assumption that a different technology maintained by a different organization will continue to exist is ephemeral. Tools need to be considered beyond simply picking the one that makes a single individual more “productive.” WordPress is designed to make building and changing a website easy, but maintaining it hard. It is a bad solution for websites that don’t change often and care about older content. From an archival standpoint, it’s a non-starter.
I don’t hate WordPress. It is a reasonably fair compromise for a lot of people. I do, however, genuinely think that it demonstrates the trade-off between ease of use and complexity in software. The issues with WordPress primarily come from the fact that it trades static output files for a customizable application. Dynamically generated web pages are useful because they can allow a user to build content in a more intuitive manner, but they also create a dependency on the platform. Every time WordPress changes, the content it hosts changes as well. WordPress isn’t alone in this, as all the problems I have laid out are actually common to all user-oriented “cloud applications.” Any content tied to ephemeral software can only exist so long as the software exists. If a large company is maintaining the software that governs a million users’ websites, what incentive do they have to maintain a feature that is only being used by a couple of those users? Unfortunately, economies of scale also mean a systematic deprioritization of unscalable things. It doesn’t matter how cool or unique your web project is: the developer of every technology you use to build it has a de facto veto on its right to exist, and guaranteeing that a specific feature you are using keeps working requires time, energy, and money. They will always pick the larger audience over whatever you are doing.
Why Jekyll?
Jekyll shares a lot of similarities with WordPress, in that it is a program that manages the content of a website, with pre-built themes that a user can customize, and it has plugins that third-party developers can develop to extend its capabilities. However, unlike WordPress, Jekyll is not a cloud application; it runs entirely on a user’s local machine. The user installs Jekyll, installs their chosen themes and plugins, configures their website, and finally runs the program, which spits out a folder of HTML files. Site customization is done through editing project files, and the creator can use whatever text editor they feel comfortable working with to do this. The website has no need for authentication because it has no way of modifying itself. Instead, users only need to transfer the final built files to a web host, a process that can be done manually, or using any number of existing mature technologies[7] without the need to re-implement this functionality on a web server.
Once built, there is no remaining dependency on Jekyll or any of its plugins. The web server doesn’t need Jekyll installed, or even to know how the website was built. So long as the server is functional, the website will continue to function indefinitely. Jekyll itself does need software updates from time to time, but these updates in no way affect a live website. If I choose to update the version of Jekyll, I do so only because I wish to use features that the new version has implemented. Likewise, moving a Jekyll website to a new server is as simple as copying the files from one computer to another, with no special software needed. Fundamentally, Jekyll works because it repeats an older model of computing: local computing[8], where software has everything it needs to operate locally and doesn’t need to rely on external servers. It is what software used to be before everything moved to “the cloud.”
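To show how little is involved in moving a built site, here is a minimal sketch that copies a generated folder into a server's document root using only Python's standard library. It is an illustration under assumptions: the copy happens between two local folders as a stand-in for a transfer between machines, `deploy` and the `public_html` folder name are hypothetical, and `_site` is used because it is Jekyll's default output directory.

```python
import shutil
import tempfile
from pathlib import Path


def deploy(built_site: Path, web_root: Path) -> list[str]:
    """Copy the generated files into the web root and list what is now there."""
    shutil.copytree(built_site, web_root, dirs_exist_ok=True)
    return sorted(
        p.relative_to(web_root).as_posix()
        for p in web_root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Fake a tiny built site so the example is self-contained.
        site = Path(tmp) / "_site"
        (site / "posts").mkdir(parents=True)
        (site / "index.html").write_text("<h1>Home</h1>", encoding="utf-8")
        (site / "posts" / "hello.html").write_text("<p>Hello</p>", encoding="utf-8")

        print(deploy(site, Path(tmp) / "public_html"))
        # ['index.html', 'posts/hello.html']
```

Nothing about the destination needs to know how the files were produced; any tool that can copy a folder would do the same job.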
Jekyll is not perfect, far from it, but its problems are solvable. Like all tools that try to make it easier to build websites, it shifts complexity somewhere else. Liquid, the language Jekyll uses to build templates, is a Ruby[9] application, and a user needs to be able to install and configure Ruby before they can use Liquid. Liquid itself is just the next in a long and endless line of flawed web frameworks that attempt to fix the previous flawed web framework. As a templating language rather than a general-purpose one, it is missing a ton of features that technologies like PHP or React have. It also fails at being “low code” and “easy to use” because it takes a very high level of knowledge to work around what few features Liquid has. Many users’ first experience with Jekyll will be through GitHub, which allows users to host Jekyll sites for free, but the underlying technology, Git, is very complicated and is a massive hurdle when trying to win users away from “simpler” solutions like WordPress.
However, all of these issues are solvable. Liquid will continue to develop and in time become better and easier to use. Ruby is mostly invisible to Jekyll and could disappear completely with a more robust installer. Finally, Git is optional, and alternative hosting platforms could easily pick something else. A user with the desire to maintain their own content can work with and eventually overcome these challenges, which is in itself another reason why I love Jekyll: it encourages its users to learn more about how the internet and their own computers work. This is much different than using cloud technologies, like WordPress, where the user has no access to the underlying software, no route to learn about how that software works, and ultimately no choice but to accept issues with the software. These issues are baked in by design and cannot be overcome. Jekyll is “minimal” in the sense that it is a program that allows a user to create a website, instead of the usual cloud approach where the program is the website. It minimizes the number of development teams that get any say in how my websites are built and how long they get to stay on the internet. My love of Jekyll is not because it is perfect, easy, or makes me more productive (on the contrary, working with Liquid can be painful). I love Jekyll because ultimately, whatever I produce with it is mine and can exist for as long as I want it to exist.
Moving Forward
I am still experimenting with the software and, after helping with the University of Alberta’s SpokenWeb site, I rebuilt my own website (the one you are now browsing) using it and my own custom theme. That theme has already been used to make one other website that I am also hosting. However, my interest in the technology does not stop there. I am still very drawn to some of the early ideas that powered the internet. Can I use this technology to enable everybody to have their own space on the internet that they can customize as they want?
In its current form, Jekyll is still not easy enough for regular users to use. To get things usable, I’ve had to replace Git with Subversion[10] on my own server, as well as aggressively remove software on the server to keep things affordable. In the future, I am interested in finding ways to make this technology even more accessible, so that users can not only make their own websites, but continue to make those websites accessible in the inevitable eventuality that my server goes down. Ideally, there wouldn’t just be one server, but an open network of servers centered around specific communities that users can move their website to at any time. Unfortunately, hosting isn’t free, and while services like GitHub offer free Jekyll hosting, they can only do so because they are backed by large tech firms who make money in other ways. However, the cost of web hosting is greatly inflated by the weight of hosting and managing software like WordPress. A truly static website is functionally just a poster on the internet, and I believe hundreds of such sites can be served to millions of users with minimal overhead.
Fundamentally, this is all about taking control of the internet back from large “cloud” companies that use “complexity” and “expense” as excuses to take control of the network away from their users. If AngelFire, a website hosting platform built in 1996, can still exist today, we have no excuse to not be publishing content that will still exist tomorrow.
All that to say, I would like to offer my services on a contract basis to help set up and serve simple Jekyll websites to clients who are interested in creating or switching to a website that is efficient, secure, completely under user control, private, and long-lasting. I have access to an independent Canadian hosting company with servers located in Canada and a genuine interest in making the internet a better place, and I will, for a minimal fee, set clients up with them if hosting and domain names are needed. If you are interested, please check out the Work With Me page, where you can find a list of services available.
1. In fact, if it weren’t for the Internet Archive, nothing I did as a child would still exist.
2. There was absolutely no way for my younger self to know that websites built on AngelFire would still exist in 2024, but GeoCities would be shut down in 2009. At least the content on those websites was static and capable of being archived. The same can’t be said about modern content sites like Facebook or Instagram, which exist behind proprietary software and cannot be archived. When they disappear, everything inside them will be gone for good.
3. Specifically the user search page. I assume they are trying to extract a list of usernames.
4. Yes, the simple inability to do this was the single most frustrating thing about my old site.
5. For the developers of other “cloud applications”, the content treadmill is super profitable, and this fact is a feature and not a bug.
6. So to be clear, the HTML standard does change. It just happens very slowly, as it is maintained by standards bodies such as the WHATWG and the W3C. The best part about standards committees is that they don’t have “move fast and break things” as a guiding principle.
7. FTP, remote desktop, version control like Git or Subversion, or anything else really.
8. Unfortunately, tech enthusiasts have attempted to rebrand this as “edge computing” or “moving compute power physically closer to where data is generated”, but this is just a fancy way of labeling a program that runs on my own hardware instead of someone else’s.
9. Ruby is itself a programming language, which creates a tech stack where we are using one programming language that depends on another programming language.
10. Subversion is a simplified version control system designed for much smaller development teams. Instead of the infinitely branching tree structure of Git repositories, Subversion maintains one authoritative copy of the project. Users can check it out, make changes, and then check it back in. In my experience it is a lot easier to teach, and acts as a great starting point to eventually learn Git.