Automatically building a PDF of my CV
I have a web version of my CV online.
I’m currently applying for lots of jobs, so I’m making tweaks to my CV all the time. In a previous post I talked about how (and why) I’ve made the CV editable online. The problem I was facing this time is that when I make changes to the original CV I want to have the web version and PDF version stay in sync, without me having to remember to manually save the web version as a PDF.
Changing the HTML side of things is nice and easy. But regenerating the PDF is a fiddly pain, so I decided to automate it.
1. Add Puppeteer to my project
I already had a Docker-based build system for my website, so I just needed to add Puppeteer as a new dependency. Puppeteer is a Node.js library, so I needed to install Node and npm, and then use npm to install Puppeteer. All standard stuff.
What makes it fun is that my local dev machine is an M1 Mac, and I’ve been using the Linux ARM64 runners in GitHub Actions to run the build system (in order to have a little bit of parity between my dev machine and the “production” machine). Most of the world (and certainly the Puppeteer docs) assumes that you’re running x86_64, and the installation instructions often end up causing very confusing errors. In this case I had an Ubuntu Linux Docker image printing a Rosetta error message:
rosetta error: failed to open elf at /lib64/ld-linux-x86-64.so.2
But Rosetta is a macOS utility, not something you’d find installed in a Linux Docker image!
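A likely explanation (an assumption, not something confirmed in this post) is that Docker Desktop on Apple Silicon can hand x86_64 Linux binaries to Rosetta, so an error like this hints that some binary in the image was built for x86_64 rather than ARM64. A minimal sketch for checking which architecture a binary targets by reading its ELF header follows; the helper name elfArch is illustrative and not part of any library:

```javascript
// e_machine values defined by the ELF specification.
const ELF_MACHINES = { 0x3e: "x86_64", 0xb7: "arm64" };

// Hypothetical helper: report which CPU architecture an ELF binary targets,
// given (at least) the first 20 bytes of the file as a Buffer.
function elfArch(header) {
  // Every ELF file starts with the magic bytes 0x7F 'E' 'L' 'F'.
  if (header.length < 20 || header.readUInt32BE(0) !== 0x7f454c46) {
    return "not-elf";
  }
  // Byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian files.
  const littleEndian = header[5] === 1;
  // e_machine is a 16-bit field at byte offset 18.
  const machine = littleEndian
    ? header.readUInt16LE(18)
    : header.readUInt16BE(18);
  return ELF_MACHINES[machine] ?? `unknown (${machine})`;
}

// Demo with a hand-built little-endian header whose e_machine is x86_64.
const demo = Buffer.alloc(20);
demo.writeUInt32BE(0x7f454c46, 0);
demo[5] = 1;
demo.writeUInt16LE(0x3e, 18);
console.log(elfArch(demo)); // x86_64
```

Inside the container, feeding this the first bytes of a suspect binary (read with Node's fs module) and comparing the result with process.arch, which reports arm64 or x64, should show whether the two disagree. Wrapper shell scripts will come back as not-elf, so point it at the real executable.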
After a fair amount of fiddling with how to get ARM64 builds running with Chromium I’ve ended up with this setup in the Dockerfile:
FROM debian:latest
RUN apt update && apt install -y --no-install-recommends \
    chromium \
    curl \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /pdf
RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash && \
    bash -c "source /root/.nvm/nvm.sh && \
    nvm install 22.18.0 && \
    nvm use 22.18.0 && \
    npm install puppeteer"
Two things ended up being super important: a) switching from Ubuntu to Debian, because the Chromium build in Ubuntu requires snap, which seemed like a very silly/complex dependency to add to a Docker image, and b) installing Chromium manually rather than relying on Puppeteer to install it itself (which it didn’t seem to be able to do).
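Because the browser now comes from the system package rather than from Puppeteer, the script has to be told where it lives. A small sketch of picking the first browser path that exists; findBrowser and the second candidate path are assumptions made for illustration, and only /usr/bin/chromium comes from this post:

```javascript
// Hypothetical helper: return the first candidate path for which `exists`
// returns true, or null when none of them are present.
function findBrowser(candidates, exists) {
  return candidates.find((path) => exists(path)) ?? null;
}

// /usr/bin/chromium is the path used later in this post; the second entry is
// an assumed fallback for images that name the binary differently.
const CANDIDATES = ["/usr/bin/chromium", "/usr/bin/chromium-browser"];

// Demo with a stubbed filesystem check, so it runs on any machine.
const fakeExists = (path) => path === "/usr/bin/chromium-browser";
console.log(findBrowser(CANDIDATES, fakeExists)); // /usr/bin/chromium-browser
```

In a real script the `exists` argument would be fs.existsSync and the result would be passed to Puppeteer as executablePath. Puppeteer's documentation also describes a PUPPETEER_SKIP_DOWNLOAD environment variable for skipping the bundled browser download during npm install, which may be worth setting in the Dockerfile, though that has not been verified against this setup.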
!!! info "Side bar"
    The Docker system I’m talking about here is only the build system. It outputs static HTML that is hosted using GitHub Pages. So there are no Docker containers in my stack that a reader would ever interact with.
The next fiddly bit was to get the nvm-managed version of Node loaded and recognised when using docker run. Outside of Docker this is done by adding source /root/.nvm/nvm.sh to your shell profile, but that isn’t going to work very well here. You don’t normally run a full, modern interactive shell inside a Docker image; it’s wasteful, and you don’t need history, bash completions, and so on. Normally you’re using something super lightweight like sh. So I can’t easily rely on shell startup profile scripts. Instead I modified the entrypoint of the Docker image, but only when running the pdf service. Here’s what I added to my Docker Compose file:
pdf:
  build: .
  tty: true
  stop_signal: SIGINT
  working_dir: /pdf
  entrypoint: ["bash", "-c", 'source /root/.nvm/nvm.sh && exec "$@"', "--"]
  command: ["node", "/pdf/makePdf.js"]
  volumes:
    - .:/app
It’s that entrypoint part that’s doing the heavy lifting, sourcing the nvm set up script and then immediately executing whatever the original command was.
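The trailing "--" in that entrypoint is easy to overlook. With bash -c, the first word after the script string becomes $0 and only the words after that end up in "$@", so the "--" acts as a throwaway placeholder. The sketch below models that rule in plain JavaScript (it does not run bash); positionalParams is a made-up name for illustration:

```javascript
// Model how `bash -c SCRIPT NAME ARGS...` assigns positional parameters.
// Docker appends the service's `command` to its `entrypoint` to form argv.
function positionalParams(entrypoint, command) {
  const argv = [...entrypoint, ...command];
  const cIndex = argv.indexOf("-c");
  if (cIndex === -1) throw new Error("entrypoint does not use -c");
  return {
    script: argv[cIndex + 1], // the string bash runs
    dollarZero: argv[cIndex + 2], // first word after the script becomes $0
    args: argv.slice(cIndex + 3), // everything after that becomes "$@"
  };
}

const params = positionalParams(
  ["bash", "-c", 'source /root/.nvm/nvm.sh && exec "$@"', "--"],
  ["node", "/pdf/makePdf.js"],
);
console.log(params.dollarZero); // --
console.log(params.args.join(" ")); // node /pdf/makePdf.js
```

Without the placeholder, node itself would be swallowed as $0 and exec "$@" would try to run /pdf/makePdf.js directly.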
2. Write/run JS script
Next I needed to be able to load the local HTML file with my CV in it into Puppeteer and have it save the page as a PDF:
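import puppeteer from "puppeteer";

// Use the Chromium installed by apt rather than a Puppeteer-managed browser.
const browser = await puppeteer.launch({
  executablePath: "/usr/bin/chromium",
  args: [
    "--no-sandbox",
    "--headless",
    "--disable-gpu",
    "--font-render-hinting=none",
  ],
});
const page = await browser.newPage();

// Add an Origin header to every outgoing request (see the Font Awesome issue below).
await page.setRequestInterception(true);
page.on("request", (request) => {
  const headers = {
    ...request.headers(),
    Origin: "https://www.mayortech.co.uk",
  };
  request.continue({
    headers: headers,
  });
});

// Load the built CV from disk and wait for network activity to settle.
await page.goto("file:///app/dist/cv/index.html", {
  waitUntil: "networkidle0",
});

// Zero margins and printBackground keep the full-page background intact.
await page.pdf({
  path: "/app/dist/William Mayor's CV.pdf",
  format: "A4",
  margin: {
    top: "0px",
    left: "0px",
    right: "0px",
    bottom: "0px",
  },
  printBackground: true,
});
await browser.close();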
This is mostly a pretty simple setup. I’ve got a full-page background on my CV, so I had to set the margins to 0 (relying on the HTML padding in the page to push the content away from the edges), and then use the printBackground: true setting to allow the “printed” PDF to use the background colour.
The most fun issue to solve was that the Font Awesome icons wouldn’t load. At first I thought it was a timing issue; maybe the assets were loading in but the page wasn’t given enough time for Font Awesome to switch out the <i> elements with the <svg>s they actually use for the icons. But I added in some really large timeouts and it didn’t solve the problem.
So I loaded up the file in my local browser using the file:// protocol, not http:// as I normally use. This replicated the issue that Puppeteer was seeing, and I could see from the network tab that Font Awesome was returning a 403 error saying that the origin for the page wasn’t set. This makes sense: I’m loading up a local file rather than serving a page in the normal manner, so there is no origin. I looked to see if I could whitelist the file:// protocol in my Font Awesome settings, but I don’t think you can. So next I looked into spoofing the origin to get Font Awesome to be happy. You can do this really easily using curl:
$ curl -H "Origin: https://example.com" https://kit.fontawesome.com/YOUR_KIT_CODE.js
After I verified that this would work I looked at how to add the origin header in Puppeteer. You should be able to add it using page.setExtraHTTPHeaders, but this didn’t work for me; Font Awesome still returned a 403 error. So instead I used request interception to add the header to each request as it left the browser:
await page.setRequestInterception(true);
page.on("request", (request) => {
  const headers = {
    ...request.headers(),
    Origin: "https://www.mayortech.co.uk",
  };
  request.continue({
    headers: headers,
  });
});
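One detail worth keeping in mind with this approach: Puppeteer documents request.headers() as returning lower-case header names, so spreading it and then adding a capitalised Origin key could leave two origin entries if the browser already sent one. Whether that matters in practice is not something this post establishes. A defensive sketch that replaces any existing entry instead; withOrigin is a hypothetical helper, not a Puppeteer API:

```javascript
// Hypothetical helper: copy `headers`, dropping any existing Origin entry
// (whatever its capitalisation) before setting the spoofed one.
function withOrigin(headers, origin) {
  const result = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== "origin") result[name] = value;
  }
  result.origin = origin;
  return result;
}

// Demo with the kind of object request.headers() might return.
const merged = withOrigin(
  { accept: "*/*", origin: "null" },
  "https://www.mayortech.co.uk",
);
console.log(merged.origin); // https://www.mayortech.co.uk
console.log(Object.keys(merged).length); // 2
```

Inside the request handler this would be used as request.continue({ headers: withOrigin(request.headers(), "https://www.mayortech.co.uk") }).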
At the time of writing there’s an open bug ticket in the Puppeteer repo about inconsistent text rendering in headless vs non-headless modes. The fix is to add that --font-render-hinting=none parameter to the launch args.
3. Tell GitHub Actions to generate, save, and deploy the PDF version
This one was the simplest bit. Once I had the system working locally, I just needed to add a step to my GitHub Actions workflow:
- name: Build PDF version of CV
  run: docker compose run --rm pdf
That’s it! The hard work is done inside Docker, so once it’s working locally, getting it to run in GitHub is usually pretty easy (ignoring the pain that can come with trying to get GitHub to run anything at all).
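If the build ever fails quietly, a cheap sanity check before deploying is to confirm that the output really is a PDF. A minimal sketch, assuming only that PDF files begin with the ASCII bytes %PDF-; looksLikePdf is an illustrative name and this check is not part of the workflow described above:

```javascript
// Hypothetical check: a PDF file begins with the ASCII bytes "%PDF-".
function looksLikePdf(bytes) {
  return (
    bytes.length >= 5 && bytes.subarray(0, 5).toString("latin1") === "%PDF-"
  );
}

console.log(looksLikePdf(Buffer.from("%PDF-1.4\n"))); // true
console.log(looksLikePdf(Buffer.from("<!doctype html>"))); // false
```

A follow-up step could read the generated file with Node's fs module, pass the bytes to this function, and exit with a non-zero status when it returns false, so the workflow stops before publishing a broken file.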