Slow down, be thorough

Since (for now) the Open Journal Matcher is built without using a proper task queue, I’ve been spending a lot of effort handling the various errors thrown by my Google Cloud Function. This is both satisfying and annoying: it is nice to catch and handle each error properly, but it takes some digging to figure out how to deal with some of them.

Mostly, when my cloud function fails, I need to send the request again. Maybe more than once. This generates a lot of API calls, but it's worth it for the sake of thoroughness. I want the OJM to cover as many journals as possible, so I'm willing to wait for a 200 response. The underlying assumption is that people will be willing to wait if the application returns the best results. This is probably not the most efficient approach, but for now it's churning out the best recommendations I can muster.
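The general shape of this retry-until-200 loop might look something like the sketch below. To be clear, this is my illustration of the idea rather than the OJM's actual code: the function name, the exponential backoff schedule, and the `make_request` callable are all assumptions of mine.

```python
import time

def call_with_retries(make_request, max_attempts=5, base_delay=1.0):
    """Keep calling `make_request` until it returns a 200 response.

    `make_request` is any zero-argument callable returning an object
    with a `status_code` attribute (such as a `requests.Response`).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            response = make_request()
            if response.status_code == 200:
                return response
        except OSError:
            pass  # treat transport errors like any other failed attempt
        if attempt < max_attempts:
            # wait a bit longer after each failure before trying again
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"no 200 response after {max_attempts} attempts")
```

A proper task queue would make this kind of hand-rolled loop unnecessary, which is why I flagged its absence above.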

Posted in api, google, journal recommender | Comments closed

On variable costs

Now that the Open Journal Matcher is live and receiving traffic, I’m wondering how much it is going to cost to keep running.

There isn’t an obvious answer. Mostly this is because Google Cloud Functions scale with your project. This is definitely good for scalability and availability, but it makes it much more difficult to budget. When I flipped the switch to turn on the OJM, I really had no idea how expensive it would be.

(Needless to say, this billing approach contrasts sharply with PythonAnywhere's, which I talked about in my last post.)

Since the project has been online for two months, the costs are becoming clearer. There have been some big spikes in traffic ($141 worth on August 10th), as well as some quiet periods. At least now I can start to calculate some meaningful averages.

I have a bit of time to figure out future funding, since my Google Cloud Platform credit grant runs until January. The looming question: how much will it cost to run the project after that?

Posted in budgeting, google, journal recommender | Comments closed

On scaling

I’ve been using PythonAnywhere to host web projects for some time now, and while I am very happy with the service, one of its weak points is scalability. This is especially problematic when trying to handle unpredictable spikes in traffic. While there are plenty of platforms that will scale seamlessly along with your traffic, PA is not one of them. With PA, you statically set the number of web workers that will serve up your pages. Not only is this number fixed, but it is shared across all of your projects. This is less than ideal. I haven’t yet been burned by having too few web workers, although it seems quite possible. On the other hand, I also don’t want to waste money on idle resources. For now – or until I can think of something better – my solution is to keep an eye on it and try to strike the right balance.

Posted in pythonanywhere | Comments closed


Recently, CUNY libraries migrated to Alma, our new library services platform. Alma is a pretty mighty piece of software. It can manage many, many library functions. Given how much it does, it amazes me that it works. I would love to look at the codebase for a bit, just to get a sense as to how it is organized.

My experience as a user has so far been pretty good. Some features confused me at first, but once I learned their purpose, they proved useful. I like how it brings disparate library workflows together and makes them work in concert. It’s going to be a journey getting it all figured out, but it seems like a positive step.

Posted in alma | Comments closed

Access to readings during remote instruction

Our students are facing a potential textbook crisis this fall. Many may not even realize it yet. But with in-person library services potentially greatly curtailed, one crucial source of textbooks – the library reserve desk – may not be readily available to our students.

The librarians are working to mitigate this crisis on several fronts. But some of the most effective remedies can come from faculty themselves. Here are some strategies we recommend:

  • Assign readings from the library’s electronic collections. These are already paid for, and are free to our students.
  • Make readings available on Blackboard. As long as you stay vigilant about copyright, this can be an effective strategy.
  • Use open educational resources (OERs) as course textbooks. This is the best long-term solution. OERs can be reused and adapted by anyone, and can be a very effective solution to prohibitive textbook costs.

Please don’t stick your students with huge textbook bills this fall! Take steps to reduce textbook costs while planning your reading list.

Posted in textbooks | Comments closed


Now that the Open Journal Matcher is more or less complete and shared with the world, I feel like it’s time for a new beginning. I like this part of my job: occasionally I arrive at a point where I get to reset and consider what’s next.

I’m going to take a different approach. In the past, I’ve sought out technologies to learn, and then once I knew enough about them, figured out ways to apply them to my library or my job. While that can be a sensible way to learn a technology, it’s a bit myopic and doesn’t necessarily provide the most benefit to the people around me. Going forward, I will aim to focus more specifically on people. Making this more explicit will hopefully push me to build more constructive projects.

Posted in Uncategorized | Comments closed

Further into Mastodon

Joining Mastodon in 2017 turned out to be a positive change for me. Twitter had been making me depressed and anxious, and I needed a way out. At that point Mastodon – an open source social network – was unfamiliar to me and seemed rough around the edges. But the people I found there were compelling, and the interactions were weirdly real and non-algorithmic feeling. I was hooked.

This week, I began administering my own server in the Mastodon network. I’m not doing the technical bits of system administration; I’ve left that to the much more skilled people at Mastohost. But I’m now actively managing my little corner of the Mastodon network. Mastodon has powerful tools for moderation and countering abuse. These allow me to 100% decide what I want to allow in my timelines and on my server. It feels very positive and constructive.

Anyhow, it’s a work in progress. I’ll report back once I’ve put some more miles on it!

Posted in mastodon | Comments closed

Open Journal Matcher is here

I’m very pleased to let you know about a project I’m releasing today, the Open Journal Matcher. It’s a resource for people looking to find a suitable journal for their work. Paste in a draft abstract and see the best matching open access journals!

Open Journal Matcher uses a natural language processing library called spaCy to compare the text of a draft abstract with the abstracts of ~5000 journals from the Directory of Open Access Journals. It ranks all of these journals by similarity, and returns the top five matches.
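The core of this comparison can be sketched roughly as follows. The actual OJM code is openly licensed and may be organized quite differently; the function name, the journal-to-abstract dictionary layout, and the pipeline name `en_core_web_md` here are illustrative assumptions on my part. spaCy's `Doc.similarity` does the heavy lifting.

```python
import heapq

def top_matches(draft_abstract, journal_abstracts, nlp, n=5):
    """Rank journals by spaCy similarity to a draft abstract.

    `journal_abstracts` maps each journal title to its abstract text;
    `nlp` is a loaded spaCy pipeline with word vectors, e.g. the
    result of spacy.load("en_core_web_md").
    """
    draft_doc = nlp(draft_abstract)
    # score every journal against the draft
    scores = (
        (draft_doc.similarity(nlp(text)), title)
        for title, text in journal_abstracts.items()
    )
    # keep only the n highest-scoring journals
    return heapq.nlargest(n, scores)
```

With ~5000 journals, each request means thousands of pairwise comparisons, which helps explain why the cloud function workload in the earlier posts is so heavy.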

While there are other journal matching services available, to my knowledge this is the only one that is fully interdisciplinary and fully open source. The code for the matcher application, the code for the matching algorithm, and the journal content are all openly licensed.

Please enjoy, and let me know what you think of this tool!

Posted in doaj, journal recommender, open access, open source | Comments closed

Backing away from GCP

I finally got a working prototype of my journal recommender project up and running; I just shared it with a few co-workers yesterday. However, it has been a bit of a journey to get here: I was only able to get the project finished by backing away from Google Cloud Platform. I won’t dwell on my complaints with GCP; instead I want to focus on how I got out of my predicament.

Basically, I solved the problems that were dogging the recommender project by returning as much as possible to Python. One of the reasons I found Google Cloud Platform so frustrating was that I had to do a lot of work outside of Python.

The solution was to run the project on PythonAnywhere instead. PA worked for me because I could solve my problems directly in Python, in the shell. For the most part, there were no additional layers of complexity. GCP is a much more comprehensive offering, but I’m fine with a simpler tool like PA if it means I can get my work done effectively.

Posted in google, python, pythonanywhere | Comments closed

Some things are harder remotely

I know there has been a lot of talk about how people will continue to work from home, even after the virus fades. This seems mostly sensible. I find I can do most of my job from home quite well, thank you. Of course teaching and reference are more difficult remotely, but the rest is quite manageable with very little disruption.

What I had not foreseen was that generating new project ideas would be so hard. I’ve been at a loss to come up with interesting new projects since WFH started. Apparently I need in-person conversations with my fellow librarians to do this. Needless to say, I look forward to eventually being back in the office, so that we can dream up new ideas together again.

Posted in projects, remote | Comments closed