On testing code

Unit testing seems like a very sensible thing to do. Having tests that ensure that your code is doing what you expect is an obviously good idea. I’m a very cautious person by nature, so testing appeals to me very strongly. Nonetheless I’ve been slow to add it to my workflow.

In part this is because most of the projects I’ve made have leaned heavily on external APIs. Mocking out an API for testing is an intimidating way for a beginner to start testing, so I’ve hesitated repeatedly. Also most of the tools I’ve built have not been critical infrastructure, so no one is relying on them very much, which also (unfortunately) helped me justify my procrastination.

However, more recently, I’ve been automating our library’s gathering and parsing of Counter reports. Counter reports are standardized usage reports that show how much our electronic collections have been used by our patrons. It’s critical stuff for our library, so the data we draw from them needs to be correct.

This finally motivated me to figure out how to write unit tests. Because I’m comparing static files, no complicated techniques are needed. Even my very elementary understanding of pytest is enough to get me through. Now my tests give me confidence that my scripts are reporting the correct data. It’s reassuring.
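
A minimal sketch of what a static-file test of this kind might look like in pytest. The parsing function, the column names, and the sample data are all hypothetical stand-ins; real Counter reports have a different layout, and the actual scripts may be organized differently.

```python
import csv
import io


def requests_by_title(report_text):
    """Map each title in a small CSV usage report to its request count."""
    reader = csv.DictReader(io.StringIO(report_text))
    return {row["title"]: int(row["requests"]) for row in reader}


# Hypothetical fixture standing in for a saved report file on disk.
SAMPLE_REPORT = """title,requests
Journal of Examples,12
Example Quarterly,7
"""


def test_report_totals_match_expected():
    # The expected values are written out by hand from the static fixture.
    expected = {"Journal of Examples": 12, "Example Quarterly": 7}
    assert requests_by_title(SAMPLE_REPORT) == expected
```

Because the input never changes, the expected values can be checked once by eye and then trusted. Running pytest in the directory picks up any function whose name starts with test_.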

Posted in testing | Comments closed


I’ve had a Raspberry Pi sitting around my office for quite a while. I bought it in a brief moment of enthusiasm for hardware that quickly faded, so it has mostly sat in its box unused. I think the main reason I neglected it is that I hadn’t come up with a compelling use case. That changed recently when I read about Pi-hole.

Pi-hole is software that you can run on your Pi (or in a Docker container) that will block advertising across your network. While I’m not super militant about avoiding advertising altogether, I think that having Pi-hole on my home network can improve my experience of the web.

It was nice to learn a bit about networking while setting this up. Networking is a bit impenetrable at times, especially to the newcomer. But setting up a Pi-hole was an accessible way for me to try out some things that helped me learn how my home network is working.

Despite going through all of the setup, I haven’t yet deployed the Pi-hole on my network, although I hope to soon. It was a fun learning experience nonetheless! If you don’t want to look at ads, and you’re willing to muck around a bit with your network and some hardware, I recommend this as a weekend project.

Posted in networking, pi-hole | Comments closed

Creating a desktop application using Python

Recently, my colleague Julia and I made a Python script to support outreach to faculty. You can read about our attempts to document the project here.

Because deploying a script like this can be difficult for librarians who are sometimes unfamiliar with Python, we thought it might be useful to sidestep some of the complexity by creating a desktop app that would obviate the need for working directly with the command line.

There is more than one way to create and package desktop applications in Python. We used Gooey and PyInstaller. Gooey is useful for turning a program built on Python’s argparse into a GUI application. Since we already had a working command line application made with argparse, this was a good fit for us.
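
A rough sketch of how an argparse program can be wrapped with Gooey's decorator. The argument names and the description are made up for illustration and are not taken from the actual project. The try/except fallback is also an assumption added here, so the sketch still runs as a plain command line tool when Gooey is not installed.

```python
import argparse

try:
    # Gooey is a third-party package, installed separately.
    from gooey import Gooey
except ImportError:
    # Fallback (an assumption for this sketch): a do-nothing decorator.
    def Gooey(func):
        return func


def build_parser():
    """Build an ordinary argparse parser; the arguments are hypothetical."""
    parser = argparse.ArgumentParser(description="Draft outreach emails for faculty.")
    parser.add_argument("input_file", help="CSV file listing faculty publications")
    parser.add_argument("--department", default="All", help="Limit output to one department")
    return parser


@Gooey
def main():
    # With Gooey installed, the decorator presents these arguments as a form.
    args = build_parser().parse_args()
    print(f"Would process {args.input_file} for department: {args.department}")
```

The appeal of this approach is that the parser code stays the same; only the decorator on main() is new. Calling main() from the usual `if __name__ == "__main__":` guard starts the program.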

We also needed to create an executable. The Gooey documentation provides steps for doing this using PyInstaller. We basically followed these instructions, and they worked as advertised. We ended up with an .exe file that we could put on our desktops to run the application. While the .exe file won’t work across platforms (PyInstaller is not a cross-compiler), it does work for our librarians. So now there’s no need for librarians to wrangle with the command line; the desktop app takes care of that.

If you’re interested, we branched the GitHub repo for this project, so there are separate branches for the command line application and the GUI application. Let us know if you find this useful!

Posted in desktop application | Comments closed


I’ve had some adventures with DNS recently. DNS resolves domain names to IP addresses, so it, uh, mostly just looks things up. It won’t forward traffic, or resolve the path in a URL to some other path, or anything like that. It won’t help you serve things over HTTPS. It just translates domain names to IP addresses.

That’s a bit limiting, but so it goes. For example, over the past few years, I’ve created a lot of URLs that I’d ultimately like to resolve differently than I had originally intended, but DNS is not actually much help for this. On the positive side, the most useful aspect of DNS for me has been CNAMEs. CNAMEs allow me to assign one of my domains or subdomains to a service out there somewhere on the internet. For example, I can make the subdomain https://fictograph.ocert.at point to an application that lives in the cloud (in this example, on PythonAnywhere). That’s kind of neat.
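
To make the lookup idea concrete, here is a toy model of name resolution in Python. It is not a real DNS client: the records are a hypothetical in-memory table, and the names and addresses are placeholders. It only illustrates that a CNAME aliases one name to another, and that the chain ends at an IP address.

```python
# Hypothetical zone data: placeholder names and addresses, not real records.
RECORDS = {
    "app.example.org": ("CNAME", "myapp.hosting.example.net"),
    "myapp.hosting.example.net": ("A", "192.0.2.10"),
    "www.example.org": ("A", "192.0.2.20"),
}


def resolve(name, max_hops=5):
    """Follow CNAME aliases until an A record yields an IP address."""
    for _ in range(max_hops):
        record = RECORDS.get(name)
        if record is None:
            raise LookupError(f"no record for {name}")
        record_type, value = record
        if record_type == "A":
            return value
        # CNAME: the value is another name, so look that up instead.
        name = value
    raise LookupError("too many CNAME hops")
```

Note that resolve() only ever sees a hostname. There is no scheme or path anywhere in the table, which is why DNS alone cannot redirect one URL to another.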

Anyhow, all of this has given me a greater appreciation of the difficulty of keeping links working. DNS has taught me that naming stuff is hard.

Posted in dns | Comments closed

On documentation

Two weeks ago, Julia Furay and I presented a poster at ACRL 2019 about automating a library outreach initiative with Python. The presentation went well; people seemed interested in the project. But talking to librarians at the conference really got us thinking about how to make our code more reusable. If others want to use our work, there are probably some steps we could take to make that easier.

One thing we could do is to document the project better. Leading up to the conference, I put a lot of comments into the code to make it more understandable. This is a good step, but it presumes that the reader will be able to figure out the code with the help of some comments. In reality, some librarians will be able to do this, but not everyone.

Besides understanding the code, there is also some setup that needs to be done before the program will run. This includes things like installing Python, cloning a git repository, making a virtual environment, and installing dependencies. Most of this will be non-obvious to someone who does not deal with these issues regularly. Creating a README with some instructions on where to start is a good idea too.

Nevertheless, there’s a balance to be struck. Lengthy instructions on how to set up a development environment would be overkill for a small script like ours. But being a bit more thorough with documentation makes the code accessible to more people. You can see our attempt here. Feedback is welcome.

Posted in conference, documentation | Comments closed

The unpredictability of bots

I recently made a Mastodon bot that didn’t really turn out how I expected. My goal was for it to be a bit cheeky, by being a bot who poses as a scholar. That’s not how it comes across. Rather, it presents itself as pedantic and over-confident. I suppose I could tweak it to make it more what I wanted. But I think it is unexpectedly interesting as it is.

The bot’s posts are written by code, and the curious thing about code is that it is often both very deterministic and very unpredictable. One reason that code can be so unpredictable is that, as humans, we have trouble reasoning like computers, so we’re surprised by some of the weird things they produce. That is part of the joy of bots. They say some weird stuff.

So I don’t particularly like the personality of my bot, but I’m going to leave it as it is. It shows how hard it is to think like a computer does.

Posted in bots, mastodon, software | Comments closed


I’ve recently had the honor of contributing to an open source project called ephemetoot. It’s a project by Hugh Rundle that auto-deletes your old Mastodon posts. I’ve wanted to contribute more to open source projects for a while now, but finding the right project is surprisingly hard to do.

Hugh’s project appealed to me for a couple of reasons: (1) I think it’s an awesome use case; (2) it’s a small code base that I could wrap my head around; (3) I’m enthusiastic about contributing to the Mastodon ecosystem; (4) Hugh is doing interesting work at the intersection of libraries and code.

Contributing has taught me some valuable lessons. I learned a few things about git. More importantly, I learned about collaborating on code with someone I’ve never met. Up until now, most of the code I’ve written has been for myself. While I usually openly license my projects and put them on GitHub, they’re often written without that much thought about how others could use them. This approach needed a shake-up. Writing contributions for someone else’s project is a good catalyst to refocus on collaboration.

Posted in ephemetoot, mastodon, open source | Comments closed

Learning the conceptual stuff

Because I’m a self-taught programmer, and still very much a beginner, there’s a lot of computer science theory that I’m totally unaware of. Yet I’m now beginning to see the value of classic theoretical solutions to common programming problems. When you can immediately identify a problem as being solvable with, say, a concept like a deque, or recursion, or whatever, you save a huge amount of time that would otherwise be spent groping around for a homemade (and probably inefficient) solution.
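
As one small illustration of the deque case, here is a sketch using Python's collections.deque. The task (keeping only the last few items of a stream) and the function name are hypothetical examples, chosen because a homemade list-based version tends to be clumsier.

```python
from collections import deque


def last_n(items, n):
    """Return the final n items of any iterable, in their original order."""
    # A deque with maxlen drops its oldest entry automatically once full.
    window = deque(maxlen=n)
    for item in items:
        window.append(item)
    return list(window)
```

For example, last_n(range(10), 3) keeps only 7, 8 and 9, without ever holding more than three items in memory.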

The classic solutions are really interesting too. Rather than just hacking together stuff that barely works, I now find myself thinking about how a computer might best tackle a certain problem. Learning these approaches is definitely making me a better programmer, because it means I can come up with an efficient, realizable solution to a problem more quickly. Even better, problems that seemed unsolvable to me are suddenly accessible, because I’ve learned a way to approach them.

Anyhow, this post is in part inspired by this book, which I’m finding interesting.

Posted in learning | Comments closed

Friday thoughts on greenOA

As Kingsborough’s representative for CUNY Academic Works, our university’s institutional repository, I help faculty share their publications freely and openly online. This is useful work, because it increases the visibility of their work, and allows many people around the world to access scholarship that might otherwise have been unavailable to them. The institutional repository provides a second home for many publications that may be paywalled elsewhere.

My work with the institutional repository mostly involves reading about licensing and writing emails to faculty. This is not glamorous stuff, but it is rewarding work because it is increasing access to scholarly knowledge in a very tangible way: one publication at a time. It is also very typical of the behind-the-scenes work done by academic librarians, who keep the wheels of knowledge sharing turning, so that scholars everywhere can continue working effectively.

Posted in institutional repository, open access | Comments closed

Fuzzy string matching

A few months ago, I wrote about a tool I made called the Fictograph, which graphs the awesomeness of authors’ works over time. It leans heavily on data from the Goodreads API. I expected the Goodreads API to be reliable, but it turns out it has some design problems. For example, if you query an author name with a minor spelling mistake, you sometimes get back data on a random author who is totally unrelated to your search.

This behavior is irritating to users, who get results for a different author than they intended. Plus, Fictograph users aren’t going to have much sympathy for my whining and blaming the underlying API for the problem. So I needed to find a programmatic way to compensate for this unhelpful API behavior.

I was stuck on this problem until I saw a presentation at PyGotham that touched on fuzzy string matching. This was a plausible solution, as fuzzy string matching can evaluate whether the name entered by the user is more or less the same as the name returned from the API. If they’re pretty much the same, great! If they’re not, it means the API is probably returning an unexpected result, so the Fictograph should probably return an “author not found” error.

The best part is that I didn’t have to write any string matching code myself; Python has libraries like fuzzywuzzy that will take care of fuzzy string matching for you.
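
A minimal sketch of the name check described above. To keep it dependency-free, it swaps in difflib.SequenceMatcher from the standard library rather than fuzzywuzzy, so the scores will not be identical to fuzzywuzzy's. The function name and the 0.8 threshold are assumptions for illustration, not the Fictograph's actual code.

```python
from difflib import SequenceMatcher


def names_match(entered, returned, threshold=0.8):
    """Return True when two author names are similar enough to trust."""
    a = entered.strip().lower()
    b = returned.strip().lower()
    # ratio() gives a similarity score between 0.0 and 1.0.
    return SequenceMatcher(None, a, b).ratio() >= threshold


def check_author(entered, returned):
    """Pass the API's name through only if it resembles the user's query."""
    if names_match(entered, returned):
        return returned
    # Hypothetical error value; a real application might raise instead.
    return "author not found"
```

A minor misspelling such as "Ursla Le Guin" still matches "Ursula Le Guin", while an unrelated name falls below the threshold and produces the error. The right threshold is a judgment call and would need tuning against real queries.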

Posted in api, goodreads | Comments closed