Since my last post, I’ve been moving more things over to GitHub Actions. It has been an interesting journey. Some things are pretty straightforward to port from PythonAnywhere Tasks. Others have puzzled me a bit more.
As an example of the latter, I wanted to set up my rss-to-mastodon bot as a GitHub Action. It’s not a complicated bit of code — it posts the latest from an RSS feed to Mastodon — but one piece was tricky: the code relies on a file that contains the title of the last blog post that it has processed. How do you persist (and possibly modify) this one little piece of data across invocations?
The main challenge here is that, with GitHub Actions, you don’t really have a persistent filesystem. You do, however, have a temporary filesystem pre-populated with whatever is committed to your git repository. So with a bit of YAML and bash, you can read a text file (which you’ve stored in your GitHub repository) and pass its contents as an argument to a Python script:
- name: 'run the python script'
  run: |
    pip install -r requirements.txt
    python fetch.py "$(< last_post.txt)"
This works great. You can then access this positional argument with Python’s argparse library, and you’re off and running.
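For illustration, here is a minimal sketch of how the script might read that argument. The argument name `last_title` and the `parse_args` helper are assumptions made for this example, not the actual contents of fetch.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; `last_title` is a hypothetical argument name."""
    parser = argparse.ArgumentParser(description="Post new RSS items to Mastodon.")
    # nargs="?" makes the argument optional, so the script also runs
    # when it is called with no argument at all (for example, locally).
    parser.add_argument(
        "last_title",
        nargs="?",
        default="",
        help="title of the last blog post that was already processed",
    )
    return parser.parse_args(argv)


# In the real script you would call parse_args() with no arguments to read
# sys.argv; an explicit list is passed here so the sketch runs on its own.
args = parse_args(["My latest post"])
print(args.last_title)
```

Because the workflow wraps `$(< last_post.txt)` in double quotes, a title containing spaces still arrives as a single argument.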
But we’re not done. Once your Python code has executed, the last_post.txt file needs to be updated with the latest data. You can try writing to the file, and you can even commit to git, but that is not going to be sufficient, because the filesystem is going to disappear as soon as the job finishes. So you need to run through the whole add, commit and push workflow to push the changes to the remote repository (in this case, GitHub). The remote repository will be all that remains once the job runs to completion.
- name: 'commit if changed'
  run: |
    git add last_post.txt
    if [[ $(git diff --staged --name-only) ]]; then
      git commit -m "Saved new post [skip actions]"
      git push origin main
    else
      echo "No changes detected"
    fi
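This bit of YAML and bash commits and pushes only if last_post.txt has changed. The [skip actions] marker in the commit message tells GitHub not to start another workflow run for that push. That’s it! If you want to see the whole workflow in YAML, it is available here.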
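The commit step only has something to push if the Python script rewrites last_post.txt first. Here is a minimal sketch of that side. The helper name `save_last_post` is hypothetical, and the real script may handle this differently.

```python
from pathlib import Path


def save_last_post(title, path="last_post.txt"):
    """Write `title` to the state file; return True only if the content changed."""
    state_file = Path(path)
    previous = ""
    if state_file.exists():
        previous = state_file.read_text(encoding="utf-8").strip()
    new_title = title.strip()
    if new_title == previous:
        # Leave the file untouched so that git sees no diff.
        return False
    state_file.write_text(new_title + "\n", encoding="utf-8")
    return True
```

Leaving the file untouched when nothing is new means `git diff --staged` stays empty, so the workflow takes the "No changes detected" branch instead of creating an empty commit.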
To be clear, this is not the only possible way to solve this problem; I did consider other options (such as caching, or file storage extensions), which have their own advantages and disadvantages. I welcome your thoughts as to what you think is best!