Recommending journals programmatically

I had been keeping this project under my hat, but I’ve proposed it as a conference talk now, so maybe it’s time to share. I’ve been building a journal recommender tool. It’s aimed at faculty, and is built on data from the Directory of Open Access Journals. The idea is that a faculty member could submit a draft abstract, and the recommender tool would give back a top 5 list of journals that best match that abstract.

The point is that this might help faculty identify prospective journals for their work more effectively. It could speed up the time-consuming process of manually reading through many articles or abstracts to evaluate whether a journal is a good fit.

It has been interesting to try to figure out ways to write an effective matching script. The results need to be accurate, and the process needs to be quick enough that it can be delivered on the web. The second of these requirements turns out to be more challenging than the first. I’ve spent quite a bit of time trying to optimize the process, but it’s not there yet. Here is the code, if you are interested: https://github.com/MarkEEaton/doaj. I will follow up with more details as I work out the bugs.
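To give a sense of what a matching script like this involves, here is a minimal sketch, not the actual code from the repo linked above: it ranks journals by cosine similarity between bag-of-words vectors of the abstract and each journal's description. The journal names and descriptions are made up for illustration; a real version would use DOAJ data and a proper TF-IDF weighting.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words vector: a Counter of lowercase tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_matches(abstract, journals, k=5):
    """Rank journal descriptions by similarity to the abstract, best first."""
    query = vectorize(abstract)
    scored = [(cosine(query, vectorize(desc)), name)
              for name, desc in journals.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k]]

# Toy data, purely illustrative
journals = {
    "Journal of Data Curation": "data curation metadata repositories",
    "Open Librarianship": "library outreach open access services",
    "Marine Biology Letters": "fish ocean ecology marine species",
}
print(top_matches("metadata practices for data repositories", journals, k=2))
# → ['Journal of Data Curation', 'Open Librarianship']
```

The accuracy half of the problem lives in `vectorize` and `cosine`; the speed half is that a naive version like this re-vectorizes every journal per query, which is exactly the kind of thing that needs precomputing before it can run on the web.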

Posted in doaj, journal recommender | Leave a comment

The modern web

I’ve been learning some JavaScript recently. Mostly this is so that I can better understand modern JS frameworks. I have some catching up to do: it has been a while since I looked at JavaScript.

Anyhow, my main takeaway from learning (a very small amount) about React and Vue.js is that the DOM isn’t what it used to be. I suppose I kind of understood this already – web templating in Python gave me some hints – but I wasn’t prepared for how something like React completely rethinks the DOM. It’s been a great eye opener, and now I want to make something using a modern JS framework.

So I’m fishing for frontend projects to take on, if you have any suggestions…

Posted in javascript | Leave a comment

On podcasts

I spend more time than I’d like on the New York City subway. It’s mostly boring, punctuated by occasional dramatic failures, where getting to your destination seems almost impossible. In any case, I find that my time on the subway needs to be put to use with some productive activity, or I would totally waste it, probably by staring into space.

Like many others, I find solace in podcasts. The ones that I always keep returning to are programming podcasts: Talk Python to Me and Programming Throwdown, among others. They’re both amazing. Having these to listen to on my commute keeps me occupied and keeps my technical faculties active, even during weeks when I’m not writing any code. It’s nice to learn productive stuff during an otherwise wasted part of my day.

Posted in learning, podcasts | Comments closed

On testing code

Unit testing seems like a very sensible thing to do. Having tests that ensure that your code is doing what you expect is an obviously good idea. I’m a very cautious person by nature, so testing appeals to me very strongly. Nonetheless I’ve been slow to add it to my workflow.

In part this is because most of the projects I’ve made have leaned heavily on external APIs. Mocking out an API for testing is an intimidating way for a beginner to start testing, so I’ve hesitated repeatedly. Also most of the tools I’ve built have not been critical infrastructure, so no one is relying on them very much, which also (unfortunately) helped me justify my procrastination.

However, more recently, I’ve been automating our library’s gathering and parsing of COUNTER reports. COUNTER reports are standardized usage reports that show how much our electronic collections have been used by our patrons. It’s critical stuff for our library, so the data we draw from them needs to be correct.

This finally motivated me to figure out how to write unit tests. Because I’m comparing static files, no complicated techniques are needed. Even my very elementary understanding of pytest is enough to get me through. Now my tests give me confidence that my scripts are reporting the correct data. It’s reassuring.
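The nice thing about pytest is how little there is to it at this level: any function named `test_*` containing plain `assert` statements is a test. Here's a sketch of the pattern, with a hypothetical parser standing in for my actual scripts (the function and file names are illustrative, not from the real project):

```python
# A stand-in for the kind of parsing my scripts do: take a COUNTER-style
# CSV row ("Title,Jan,Feb,Mar") and return the title with total usage.
def parse_counter_row(row):
    """Parse a 'Title,3,5,2' style row into (title, total_uses)."""
    parts = row.split(",")
    title = parts[0]
    total = sum(int(n) for n in parts[1:])
    return title, total

# A pytest test is just a test_* function full of asserts. Running
# `pytest` in this directory discovers and runs it automatically.
def test_parse_counter_row():
    assert parse_counter_row("Journal of Tests,3,5,2") == ("Journal of Tests", 10)
    assert parse_counter_row("Quiet Journal,0,0,0") == ("Quiet Journal", 0)
```

Because I'm comparing known inputs against known outputs, there's no mocking involved; a failing assert means my parsing changed, which is exactly the safety net I wanted.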

Posted in testing | Comments closed


I’ve had a Raspberry Pi sitting around my office for quite a while. I bought it in a brief moment of enthusiasm for hardware that quickly faded, so it has mostly sat around in the box unused. I think the main reason I neglected it is that I hadn’t come up with a compelling use case. That changed recently when I read about Pi-hole.

Pi-hole is software that you can run on your Pi (or in a Docker container) that will block advertising across your network. While I’m not super militant about avoiding advertising altogether, I think that having Pi-hole on my home network can improve my experience of the web.

It was nice to learn a bit about networking while setting this up. Networking is a bit impenetrable at times, especially to the newcomer. But setting up a Pi-hole was an accessible way for me to try out some things that helped me learn how my home network is working.

Despite going through all of the setup, I haven’t yet deployed the Pi-hole on my network, although I hope to soon. It was a fun learning experience nonetheless! If you don’t want to look at ads, and you’re willing to muck around a bit with your network and some hardware, I recommend this as a weekend project.

Posted in networking, pi-hole | Comments closed

Creating a desktop application using Python

Recently, my colleague Julia and I made a Python script to support outreach to faculty. You can read about our attempts to document the project here.

Because deploying a script like this can be difficult for librarians who are sometimes unfamiliar with Python, we thought it might be useful to sidestep some of the complexity by creating a desktop app that would obviate the need for working directly with the command line.

There is more than one way to create and package desktop applications in Python. We used Gooey and PyInstaller. Gooey is useful for turning a program built on Python’s argparse into a GUI application. Since we already had a working command line application made with argparse, this was a good fit for us.

We also needed to create an executable. The Gooey documentation provides steps for doing this using PyInstaller. We basically followed these instructions, and they worked as advertised. We ended up with an .exe file that we could put on our desktops to run the application. While the .exe file won’t work across platforms (PyInstaller is not a cross-compiler), it does work for our librarians. So now there’s no need for librarians to wrangle with the command line; the desktop app takes care of that.
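The Gooey pattern is appealingly small: you keep your argparse code and add one decorator. Here's a sketch of the shape of it, not our actual script; the argument names are made up, and the Gooey lines are commented out so the example also runs as a plain CLI without the third-party package installed:

```python
import argparse
# from gooey import Gooey  # third-party: pip install Gooey

# @Gooey  # uncommenting this (and the import) wraps the CLI in a GUI window
def main(argv=None):
    # Illustrative options, not our real script's interface
    parser = argparse.ArgumentParser(description="Faculty outreach helper")
    parser.add_argument("input_file",
                        help="spreadsheet of faculty publications")
    parser.add_argument("--dry-run", action="store_true",
                        help="preview results without writing output")
    args = parser.parse_args(argv)
    return args

args = main(["publications.csv", "--dry-run"])
print(args.input_file, args.dry_run)  # → publications.csv True
```

From there, something like `pyinstaller --onefile script.py` bundles it into a single executable (the Gooey docs walk through the exact build configuration, which is a bit more involved than this one-liner).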

If you’re interested, we branched the GitHub repo for this project, so there are separate branches for the command line application and the GUI application. Let us know if you find this useful!

Posted in desktop application | Comments closed


I’ve had some adventures with DNS recently. DNS resolves domain names to IP addresses, so it, uh, mostly just looks things up. It won’t forward traffic, or rewrite the path in a URL to some other path, or anything like that. It won’t help you serve things over HTTPS. It just translates domain names to IP addresses.

That’s a bit limiting, but so it goes. For example, over the past few years, I’ve created a lot of URLs that I’d ultimately like to resolve differently than I originally intended, but DNS is not actually much help for this. On the positive side, the most useful aspect of DNS for me has been CNAMEs. A CNAME lets me point one of my domains or subdomains at a service somewhere out on the internet. For example, I can make the subdomain https://fictograph.ocert.at point to an application that lives in the cloud (in this example, on PythonAnywhere). That’s kind of neat.
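The "it just looks things up" point can be made concrete with a toy resolver. The zone data below is invented (the PythonAnywhere hostname and the IP are placeholders; 203.0.113.7 is from a range reserved for documentation), but the logic is the real shape of a CNAME: the lookup repeats with the canonical name until it hits an address record.

```python
# Made-up zone data: record type paired with its value.
ZONE = {
    "fictograph.ocert.at": ("CNAME", "webapp.pythonanywhere.example"),
    "webapp.pythonanywhere.example": ("A", "203.0.113.7"),
}

def resolve(name, zone, max_hops=10):
    """Follow CNAME records until an A record (an IP address) turns up."""
    for _ in range(max_hops):
        rtype, value = zone[name]
        if rtype == "A":
            return value
        name = value  # CNAME: redo the lookup with the canonical name
    raise RuntimeError("CNAME chain too long")

print(resolve("fictograph.ocert.at", ZONE))  # → 203.0.113.7
```

Notice what's absent: nothing here knows about paths, redirects, or HTTPS. The resolver only ever maps names to other names or to addresses, which is exactly why DNS can't fix my old URLs.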

Anyhow, all of this has given me a greater appreciation of the difficulty of keeping links working. DNS has taught me that naming stuff is hard.

Posted in dns | Comments closed

On documentation

Two weeks ago, Julia Furay and I presented a poster at ACRL 2019 about automating a library outreach initiative with Python. The presentation went well; people seemed interested in the project. But talking to librarians at the conference really got us thinking about how to make our code more reusable. If others want to use our work, there are probably some steps we could take to make that easier.

One thing we could do is to document the project better. Leading up to the conference, I put a lot of comments into the code to make it more understandable. This is a good step, but it presumes that the reader will be able to figure out the code with the help of some comments. In reality, some librarians will be able to do this, but not everyone.

Besides understanding the code, there is also some setup that needs to be done before the program will run. This includes things like installing Python, cloning a git repository, making a virtual environment, and installing dependencies. Most of this will be non-obvious to someone who does not deal with these issues regularly. Creating a README with some instructions on where to start is a good idea too.

Nevertheless, there’s a balance to be struck. Lengthy instructions on how to set up a development environment would be overkill for a small script like ours. But being a bit more thorough with documentation makes the code accessible to more people. You can see our attempt here. Feedback is welcome.

Posted in conference, documentation | Comments closed

The unpredictability of bots

I recently made a Mastodon bot that didn’t really turn out how I expected. My goal was for it to be a bit cheeky: a bot that poses as a scholar. That’s not how it comes across. Rather, it presents itself as pedantic and overconfident. I suppose I could tweak it to make it more what I wanted. But I think it is unexpectedly interesting as it is.

The bot’s posts are written by code, and the curious thing about code is that it can be at once completely deterministic and deeply unpredictable. One reason code can be so unpredictable is that, as humans, we have trouble reasoning like computers, so we’re surprised by some of the weird things they produce. That is part of the joy of bots. They say some weird stuff.
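That deterministic-yet-unpredictable quality is easy to demonstrate. This isn't my bot's code; it's a tiny sketch of the general pattern, with made-up phrase lists. Given the same seed, the output is identical every time, yet staring at the lists won't tell you which pompous combination a given seed produces:

```python
import random

# Invented fragments in the spirit of an overconfident scholar-bot
OPENERS = ["As the literature clearly shows,",
           "It is trivially obvious that",
           "Per my earlier toot,"]
TOPICS = ["epistemology", "metadata", "the semantic web"]

def compose_toot(seed):
    """Fully deterministic for a given seed, but hard to predict by hand."""
    rng = random.Random(seed)
    return f"{rng.choice(OPENERS)} {rng.choice(TOPICS)} is underrated."

assert compose_toot(42) == compose_toot(42)  # same seed, same post, always
```

Even a generator this small has nine possible posts, and I couldn't tell you which one seed 42 yields without running it. Scale that up and you get a personality you didn't quite author.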

So I don’t particularly like the personality of my bot, but I’m going to leave it as it is. It shows how hard it is to think like a computer does.

Posted in bots, mastodon, software | Comments closed


I’ve recently had the honor of contributing to an open source project called ephemetoot. It’s a project by Hugh Rundle that auto-deletes your old Mastodon posts. I’ve wanted to contribute more to open source projects for a while now, but finding the right project is surprisingly hard to do.

Hugh’s project appealed to me for a couple of reasons: (1) I think it’s an awesome use case; (2) it’s a small code base that I could wrap my head around; (3) I’m enthusiastic about contributing to the Mastodon ecosystem; (4) Hugh is doing interesting work at the intersection of libraries and code.

Contributing has taught me some valuable lessons. I learned a few things about git. More importantly, I learned about collaborating on code with someone I’ve never met. Up until now, most of the code I’ve written has been for myself. While I usually openly license my projects and put them on GitHub, they’re often written without much thought about how others could use them. This approach needed a shake-up. Writing contributions for someone else’s project is a good catalyst to refocus on collaboration.
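The core idea of a tool like ephemetoot is pleasingly simple to state in code. This sketch is mine, not ephemetoot's actual implementation, and the post structure is a made-up stand-in for what the Mastodon API returns: filter your posts down to the ones older than a cutoff, and those are the deletion candidates.

```python
from datetime import datetime, timedelta, timezone

def posts_to_delete(posts, days=30, now=None):
    """Return the IDs of posts older than the cutoff (simplified sketch)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [p["id"] for p in posts if p["created_at"] < cutoff]

# Illustrative data: one old post, one recent one
now = datetime(2020, 6, 1, tzinfo=timezone.utc)
posts = [
    {"id": 1, "created_at": datetime(2020, 3, 1, tzinfo=timezone.utc)},
    {"id": 2, "created_at": datetime(2020, 5, 20, tzinfo=timezone.utc)},
]
print(posts_to_delete(posts, days=30, now=now))  # → [1]
```

Of course, the real project's value is in everything around this kernel: talking to the Mastodon API, handling rate limits, honoring exceptions for pinned or bookmarked posts. That surrounding work is where collaborating on someone else's code base taught me the most.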

Posted in ephemetoot, mastodon, open source | Comments closed