Using Google Sheets as a data source in Eleventy
Posted on: 13 March 2023
In part one, we used a script to setup to sync our transient Letterboxd RSS feed to Google Sheets for some permenance. In part two we'll look into reading this spreadsheet and outputting the data onto a page using Eleventy. Then we'll take a look at the underlying class doing all the work: CachedSpreadsheet
.
Bringing the data to life in Eleventy
After syncing your Letterboxd RSS feed with Google Sheets, you should have a spreadsheet full of data to output onto a page in Eleventy.
Here is how I read the film and organise data in the Google Sheet and render it using Eleventy:
// _data/films.js
const { weekNumberYear } = require('weeknumber');
const timeAgo = require('node-time-ago');
const CachedSpreadsheet = require('../../src/cached-spreadsheet');
const films = new CachedSpreadsheet('<SHEET ID>');
const reviews = await films.sheet();
return reviews.
filter(film => film.watched).
map(data => {
const date = new Date(data.watched);
return {...data, date};
}).
sort((a, b) => a.date < b.date ? 1 : -1).
reduce((obj, film) => {
film.timeAgo = timeAgo(film.date);
const { year } = weekNumberYear(film.date);
obj[year] = obj[year] || [];
obj[year].push(film);
return obj;
}, {});
I read the data in the sheet into an array using the sheet()
method on the CachedSpreadsheet
object. I then filter the array, sort the array by date, and reduce it into an hash grouped by year. This is done in a "data file" in Eleventy, meaning we can access this date from any template by calling the name of the file as a variable: films
.
Then in my Nunjucks template:
// films.njk
---
layout: layout.liquid
eleventyComputed:
title: "Films watched in {{ year }}"
metaDescription: "A collation of films I watched and reviewed in {{ year }}, pulled from my Letterboxd diary."
pagination:
data: films
size: 1
alias: year
reverse: true
permalink: "films/{{ year | slug }}/"
---
{% import 'macros/years.njk' as 'macros' %}
<h1 class="text-3xl md:text-4xl font-extrabold mb-4">{{ title }}</h1>
{{ macros.yearsNav(pagination) }}
<div class="space-y-8">
{% for film in films[year] %}
<div>
<h2 class="text-xl mb-3 font-bold">
<a href="{{ film.url }}" title="My Letterboxd review of {{ film.name }}">
{{ film.title }}
</a>
</h2>
<div class="flex gap-4 md:gap-6">
{% if film.posterImage %}
<div class="basis-1/3 md:basis-1/4 self-start">
{% image film.posterImage, film.title, loop.index0 > 0 %}
</div>
{% endif %}
<div class="basis-2/3">
<p class="text-lg mb-3 md:w-4/5 before:text- before:text-slate-500 before:mr-1 after:ml-1 after:text-slate-500 after:text- before:content-['“'] after:content-['”']">
{{- film.description -}}
</p>
<time datetime="{{ film.watched }}" class="text-sm block text-slate-700">{{ film.timeAgo }}</time>
</div>
</div>
</div>
{% endfor %}
</div>
We use "pagination" in Eleventy with a size of 1 to produce a URL for each year I have film data for, e.g. /films/2023/
, /films/2022/
, showing the films watched that particular year. This works because the films
variable is populated with a hash of films by year.
Now, let’s dig into the CachedSpreadsheet
class under the hood and see what’s going on.
Setting the scene
The CachedSpreadsheet
class is instantiated with a unique spreadsheet ID, which can be found in the Google Sheet URL (URL part after the /d/ and before the /edit):
The constructor for this class is pretty basic:
constructor(spreadsheetId) {
this.spreadsheetId = spreadsheetId;
this.cache = null;
}
The most significant method on the class is sheet
, which retrieves a cached array of the data contained in the Google Sheet. For the sake of simplicity in this blog post, we’ll assume we’re only dealing with single-sheet spreadsheets. The full definition for the sheet
method is as follows:
async sheet(customFields = null) {
this.cache = new AssetCache(`spreadsheet-${this.spreadsheetId}`);
if (this.cache.isCacheValid('1h')) return this.cache.getCachedValue();
await this.loadSpreadsheet();
const sheet = this.spreadsheet.sheetsByIndex[0];
console.log(`fresh spreadsheet cache: ${this.spreadsheet.title} (${sheet.title})`);
const rows = await sheet.getRows();
const fields = customFields || sheet.headerValues;
const data = rows.map(row => Object.fromEntries(fields.map(f => [f, row[f]])));
await this.cache.save(data, "json");
return data;
}
We make use of the AssetCache
class in the @11ty/eleventy-fetch
library for caching data retrieved from the Google Sheet. Really this just speeds up development where you want to avoid multiple read calls to a Google Sheet that hasn’t changed.
We initialise a new AssetCache
object, passing it a unique identifier, made up of the prefix spreadsheet-
, followed by the spreadsheet ID.
If the cache is valid for this key, i.e. the data has been cached previously, within the defined cache expiry duration (in this case 1 hour), we can return the cached value and exit the method early. No spreadsheet interaction required at all:
this.cache = new AssetCache(`spreadsheet-${this.spreadsheetId}`);
if (this.cache.isCacheValid('1h')) return this.cache.getCachedValue();
If the cache has expired, we load the spreadsheet into memory by calling the async loadSpreadsheet
method, and waiting for it to complete, using await
. Here’s how this method looks:
async loadSpreadsheet() {
if (this.spreadsheet) return;
this.spreadsheet = new GoogleSpreadsheet(this.spreadsheetId);
await this.spreadsheet.useServiceAccountAuth({
client_email: process.env.GOOGLE_SERVICE_ACCOUNT_EMAIL,
private_key: Buffer.from(process.env.GOOGLE_PRIVATE_KEY, 'base64'),
});
await this.spreadsheet.loadInfo();
}
Here we have our first interaction with the GoogleSpreadsheet
class, a class provided by the google-spreadsheet NPM package.
The first thing loadSpreadsheet()
does is guard against repeated calls to this method, by returning early if a GoogleSpreadsheet
instance has already been instantiated. Otherwise, we instantiate an object, passing it the Google Sheet ID.
Next, the class requires we authenticate.
Accessing the sheet
The Google Spreadsheet NPM package offers 3 different ways to authenticate. For my needs, and reportedly what most projects use, a “Service Account” with Google is the way to go. A service account connects as a specific "bot" user generated specifically for your application.
First, though, you’ll need to go to your Google Developers Console and enable the Sheets API. Select your project or create a new one. Then enable the Sheets API for your project.
You’ll then need to create a service user by going to Credentials, clicking “Create Credentials” and selecting “Service Account”. For full instructions on how to setup a service account, visit the Authentication section of the package docs.
You should come out of this process with a JSON key file, which you should download and store locally. You’ll need a way to import this file into your app in development and production. I chose to do this using environment variables.
You could store the entire JSON strong in an environment variable and parse it in your app, but Google Spreadsheet NPM package only requires the client_email and private_key values from the JSON object, so that’s all I store. In development, I use an .env file which looks like this (values removed):
GOOGLE_SERVICE_ACCOUNT_EMAIL=*******
GOOGLE_PRIVATE_KEY=*******
In Eleventy, we can read these values from process.env.GOOGLE_SERVICE_ACCOUNT_EMAIL
and process.env.GOOGLE_PRIVATE_KEY
, like so:
await this.spreadsheet.useServiceAccountAuth({
client_email: process.env.GOOGLE_SERVICE_ACCOUNT_EMAIL,
private_key: Buffer.from(process.env.GOOGLE_PRIVATE_KEY, 'base64'),
});
We then pass these credentials into the useServiceAccountAuth
method on the GoogleSpreadsheet
object. We’ll await
a response, then call loadInfo()
which loads the document properties and worksheets.
Retrieving rows and persisting the cache
With the spreadsheet loaded in memory, we can start interacting with the data in the sheet. We retrieve the first sheet by calling sheetsByIndex[0]
on the GoogleSpreadsheet
instance. On the resultant sheet, calling getRows()
will then give us an array of row data.
We’ll also do some helpful console logging to provide progress updates when building the site in Eleventy:
const sheet = this.spreadsheet.sheetsByIndex[0];
console.log(`fresh spreadsheet cache: ${this.spreadsheet.title} (${sheet.title})`);
const rows = await sheet.getRows();
The data isn’t yet in a nicely formatted key/value hash, so we have to loop over the array and build objects using Object.fromEntries
:
const fields = customFields || sheet.headerValues;
const data = rows.map(row => Object.fromEntries(fields.map(f => [f, row[f]])));
customFields
is an optional array argument we can provide when calling the sheet
method. If not provided, we’ll grab all the columns from the Google Sheet using the headerValues
property on the sheet object. It will automatically determine the columns from the values in the first row of the sheet.
Once we have the data, we’ll do 2 things with it. Persist it to the cache (and await
a response), and return it:
await this.cache.save(data, "json");
return data;
Writing data to the sheet
The second role of CachedSpreadsheet
is to write data to the Google Sheet. The definition for this is fairly straight forward:
async writeRow(data) {
await this.loadSpreadsheet();
await this.spreadsheet.sheetsByIndex[0].addRow(data);
this.clearCache(sheet);
}
First, as with reading data, we ensure the Google Sheet has been loaded into memory with loadSpreadsheet()
. We then grab the first sheet, using sheetsByIndex[0]
, and call the addRow
method, passing it a hash of the data we want to add to the sheet.
Finally, we clear the cache, so subsequent read requests will be forced to retrieve the updated sheet afresh.
Clearing the cache
At the time I wrote this, there was no way to manually purge the stored asset cache in the @11ty/eleventy-fetch
package (this may be possible now). As the cached data is stored in the file system, under .cache, we can manually purge the cache by deleting the relevant files. That’s what we do in the clearCache()
method:
clearCache() {
const cachePath = this.cache.getCachedContentsPath("json");
if (fs.existsSync(this.cache.cachePath)) {
fs.unlinkSync(cachePath);
fs.unlinkSync(this.cache.cachePath);
}
}
There are 2 files associated with a cached asset, the configuration, and the JSON data itself. We need to delete both. The former can be identified by the cachePath
property on the AssetCache
object (stored on the class in the this.cache
property). And the latter, by calling getCachedContentsPath("json")
.
We remove both these files, using the unlinkSync
method on the fs
file system package (built in to Node). We check to see if the configuration file exists first, i.e. the spreadsheet has been previously cached, using existsSync
to avoid an error.
To sum up
The CachedSpreadsheet
class has 3 main roles: retrieving data, caching data and writing new data. Encapsulating these 3 functions within a single utility class makes the interaction with a Google Sheet a nice, smooth experience in Eleventy. It also provides an abstraction to the underlying package.
On my site, I have multiple Google Sheets integrations, all individually cached, read and written to/from via the various sync scripts I have. Examples include Letterboxd film data, TV Maze episode data and Spotify podcast data.
The whole process, end-to-end, for syncing film data, looks like this:
- Check films I’ve watched in my Letterboxd RSS feed
- Sync any new ones to a Google Sheet
- Read and cache the data from the Google Sheet in a data file in Eleventy
- Render a page for each year of film data, outputting the film name, rating, review & poster
What next?
The CachedSpreadsheet
class isn’t perfect. For one, it’s not tested. Due to its tight coupling with environment data and the GoogleSpreadsheet class, it would be tricky to test.
In part 3 of this mini-series of posts, I’m going to refactor and write tests for this class. I am by no means an expert in testing in the Node world, so I’ll be learning as I go. Join me!