Unexpected funds

I recently found out that I got funding from Google for my journal recommender project. More than I expected, too. I had hoped to receive maybe $200 or $300 of Google Cloud Platform credits, but they gave me $5000 in credits, which was quite a surprise.

This is good news for this project, because it allows me to try out some technical solutions and optimizations that would not have been possible if I were spending my own money (or CUNY’s money). Hopefully, it will allow me to create a better public-facing journal recommender tool than I could have otherwise.

I’ll try to be pragmatic about what this funding can do for this project, while keeping in mind some of the drawbacks of leaning on Google. Specifically, there are vendor lock-in risks and user privacy risks that come with using Google products. It’s best not to lose sight of these problems. With those caveats noted, the credits give me a chance to experiment with approaches to project infrastructure without having to worry obsessively about cost. That is welcome news.

Posted in doaj, google, journal recommender | Comments closed

Further into hardware

I used to really dislike computer hardware. It seemed to crop up as an inconvenience or annoyance whenever I was trying to do something with code. I preferred the abstractions and ephemerality of software to the inconvenience of physical things.

More recently, my thoughts on this have been changing. In part, I’m realizing how essential the physical stuff is to working with computers. This should be obvious but I’m a bit slow to catch on.

I’m learning from the good folks at sdf.org, who are in general way more hardware-oriented than I am. Their joy at hacking on hardware is contagious. I have a lot to learn, but I’m excited about it.

Posted in hardware, learning | Comments closed

Hacker hours

Recently, I have been occasionally guest-hosting Hacker Hours at the Brooklyn Public Library on Sunday afternoons. It’s an event where people get together to work on programming projects, and help each other out by sharing skills and knowledge. It draws some seasoned programmers, as well as people who are entirely new to coding. The informal setting and the mix of skill levels make for a good learning environment.

The event used to be at a café on Atlantic Avenue. But relocating to the library has meant that we’ve been able to reach people who may not have otherwise heard about the event. The library helpfully promotes it on their events page. So while I do miss the sandwiches and coffee at our former café, I think moving to the library has been a positive change. I recommend dropping by. Bring some code to work on.

Posted in learning, meetup | Comments closed

Feast or famine

Quite recently, I was feeling disappointed that I didn’t have any library coding projects to work on. I felt I was failing at my goal of bringing more code into my academic librarianship.

But things change quickly. I’m not sure what happened, but over the course of last week, five(!) new programming projects landed on my plate. It’s far too much to work on at once. I’m not sure if this recent influx was due to a change in mindset, or the regular turnover of the semester cycle, or maybe just randomness. Nonetheless it is very welcome.

This abrupt switch from too few projects to too many has me thinking that maybe much is beyond my control. I can try to be open, engaged and active, but that isn’t always sufficient to bring in the projects that I want to work on. Sometimes it just takes patience and good luck.

Posted in Uncategorized | Comments closed

Sentiment analysis

For almost five years now, our library has been archiving tweets about our college. I’ve posted about that here and here. Until recently, I didn’t really have an agenda for this data, other than preserving it. Last week that changed. At our college’s Data Faculty Interest Group, I mentioned the tweet archive as a potentially useful data set. Dorina Tila, the chair of the FIG, had a great suggestion: why not use the data for some sentiment analysis?

I immediately got to work coding up a prototype. I used some Python libraries to chart sentiment about Kingsborough over time: TextBlob to score the tweets based on sentiment; pandas to wrangle the data; and matplotlib to produce a nice chart:

A chart of tweet sentiment over time
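
To make the idea concrete, here is a minimal sketch of the "score each tweet, then average over time" step. It is not the prototype itself: it uses only the standard library, and the word lists, the `toy_polarity` scorer and the sample tweets are all hypothetical stand-ins (the prototype used TextBlob for scoring and pandas for grouping).

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Toy word lists: a hypothetical stand-in for a real sentiment scorer.
POSITIVE = {"great", "love", "good", "proud"}
NEGATIVE = {"bad", "hate", "awful", "boring"}


def toy_polarity(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def monthly_sentiment(tweets, score=toy_polarity):
    """Average the score of (timestamp, text) pairs per calendar month."""
    buckets = defaultdict(list)
    for timestamp, text in tweets:
        buckets[timestamp.strftime("%Y-%m")].append(score(text))
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Made-up sample tweets, for illustration only.
sample = [
    (datetime(2019, 3, 1), "I love the KBCC campus, great views"),
    (datetime(2019, 3, 9), "Registration at Kingsborough was awful"),
    (datetime(2019, 4, 2), "Proud to graduate from Kingsborough!"),
]
print(monthly_sentiment(sample))
```

The scorer is passed in as a parameter, so with TextBlob installed something like `lambda t: TextBlob(t).sentiment.polarity` could replace the toy scorer without touching the grouping code.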

While this chart is interesting, it’s not ready yet. Specifically, the archive has a lot of false positives: Tweets that mention the words Kingsborough or KBCC, yet have nothing to do with our college. These need to be removed from the dataset before the data can produce a meaningful chart. Just by eyeballing the data, my estimate is that 30-50% of the 14,210 tweets will need to be removed. However, sorting through that many tweets is a really big job. I’m not sure how to tackle this yet, but I hope to get it figured out soon. At that point we can produce a meaningful sentiment chart.

Posted in archives, python, sentiment analysis, twitter | Comments closed

Recommending journals programmatically

I had been keeping this project under my hat, but I’ve proposed it as a conference talk now, so maybe it’s time to share. I’ve been building a journal recommender tool. It’s aimed at faculty, and is built on data from the Directory of Open Access Journals. The idea is that a faculty member could submit a draft abstract, and the recommender tool would give back a top 5 list of journals that best match that abstract.

The point is that this might help faculty identify prospective journals for their work more effectively. It could shorten the time-consuming process of manually reading through many articles or abstracts in order to evaluate whether a journal is a good fit.

It has been interesting to try to figure out ways to write an effective matching script. The results need to be accurate, and the process needs to be quick enough that it can be delivered on the web. The second of these requirements turns out to be more challenging than the first. I’ve spent quite a bit of time trying to optimize the process, but it’s not there yet. Here is the code, if you are interested: https://github.com/MarkEEaton/doaj. I will follow up with more details as I work out the bugs.
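
For readers wondering what a matching script might look like, here is a rough sketch. It is not necessarily what the code in the repository does: it assumes a plain TF-IDF weighting with cosine similarity, written with the standard library only, and the journal names and descriptions are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one TF-IDF weight dict per document, plus the IDF table."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for counts in counts_per_doc for term in counts)
    # Smoothed IDF, so terms that appear in every document keep some weight.
    idf = {t: math.log((1 + n_docs) / (1 + f)) + 1 for t, f in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in counts.items()} for counts in counts_per_doc]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(abstract, journals, top_n=5):
    """Rank journals (name -> descriptive text) by similarity to the abstract."""
    names = list(journals)
    vectors, idf = tfidf_vectors([journals[name] for name in names])
    # Terms the journals never use cannot contribute to a match, so drop them.
    query = {t: c * idf[t] for t, c in Counter(tokenize(abstract)).items() if t in idf}
    scored = sorted(((cosine(query, v), name) for v, name in zip(vectors, names)), reverse=True)
    return [name for _, name in scored[:top_n]]


# Hypothetical journals and descriptions, for illustration only.
journals = {
    "Journal of Library Technology": "libraries software metadata cataloging open access",
    "Marine Biology Letters": "ocean fish coral reef ecology marine species",
    "Urban Transit Review": "subway bus transit commuting cities infrastructure",
}
print(recommend("We study coral reef fish populations", journals, top_n=2))
```

In a sketch like this, the journal vectors are rebuilt on every call. Computing them once and reusing them would be one obvious way to speed up a web-facing version.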

Posted in doaj, journal recommender | Comments closed

The modern web

I’ve been learning some JavaScript recently. Mostly this is so that I can better understand modern JS frameworks. I have some catching up to do: it has been a while since I looked at JavaScript.

Anyhow, my main takeaway from learning (a very small amount) about React and Vue.js is that the DOM isn’t what it used to be. I suppose I kind of understood this already – web templating in Python gave me some hints – but I wasn’t prepared for how something like React completely rethinks the DOM. It’s been a great eye opener, and now I want to make something using a modern JS framework.

So I’m fishing for frontend projects to take on, if you have any suggestions…

Posted in javascript | Comments closed

On podcasts

I spend more time than I’d like to on the New York City subway. It’s mostly boring, punctuated by occasional dramatic failures, where getting to your destination seems almost impossible. In any case, I find that my time on the subway needs to be put to use with some productive activity, or I would totally waste it, probably by staring into space.

Like many others, I find solace in podcasts. The ones that I always keep returning to are programming podcasts: Talk Python to Me and Programming Throwdown, among others. They’re both amazing. Having these to listen to on my commute keeps me occupied and keeps my technical faculties active, even during weeks when I’m not writing any code. It’s nice to learn productive stuff during an otherwise wasted part of my day.

Posted in learning, podcasts | Comments closed

On testing code

Unit testing seems like a very sensible thing to do. Having tests that ensure that your code is doing what you expect is an obviously good idea. I’m a very cautious person by nature, so testing appeals to me very strongly. Nonetheless I’ve been slow to add it to my workflow.

In part this is because most of the projects I’ve made have leaned heavily on external APIs. Mocking out an API for testing is an intimidating way for a beginner to start testing, so I’ve hesitated repeatedly. Also most of the tools I’ve built have not been critical infrastructure, so no one is relying on them very much, which also (unfortunately) helped me justify my procrastination.

However, more recently, I’ve been automating our library’s gathering and parsing of COUNTER reports. COUNTER reports are standardized usage reports that show how much our electronic collections have been used by our patrons. It’s critical stuff for our library, so the data we draw from them needs to be correct.

This finally motivated me to figure out how to write unit tests. Because I’m comparing static files, no complicated techniques are needed. Even my very elementary understanding of pytest is enough to get me through. Now my tests give me confidence that my scripts are reporting the correct data. It’s reassuring.
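
As an illustration of how simple this kind of test can be, here is a minimal sketch in the pytest style. The report layout, the `total_requests` and `titles_with_use` helpers and the sample data are all hypothetical and much simpler than a real COUNTER report. An in-memory string stands in for the static fixture file that a real test would read from disk.

```python
import csv
import io

# Hypothetical, simplified report layout: one row per title with a usage count.
SAMPLE_REPORT = """title,requests
Journal A,12
Journal B,30
Journal C,0
"""


def total_requests(report_file):
    """Sum the 'requests' column of a simplified usage report."""
    reader = csv.DictReader(report_file)
    return sum(int(row["requests"]) for row in reader)


def titles_with_use(report_file):
    """List the titles that have at least one request."""
    reader = csv.DictReader(report_file)
    return [row["title"] for row in reader if int(row["requests"]) > 0]


# pytest collects functions whose names start with "test_".
def test_total_requests():
    assert total_requests(io.StringIO(SAMPLE_REPORT)) == 42


def test_titles_with_use():
    assert titles_with_use(io.StringIO(SAMPLE_REPORT)) == ["Journal A", "Journal B"]
```

Saved as a file whose name starts with `test_`, this runs with a plain `pytest` command. Because the input is fixed, no mocking is needed.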

Posted in testing | Comments closed


I’ve had a Raspberry Pi sitting around my office for quite a while. I bought it in a brief moment of enthusiasm for hardware that quickly faded, so it has mostly sat around in the box unused. I think the main reason I neglected it is that I hadn’t come up with a compelling use case. That changed recently when I read about Pi-hole.

Pi-hole is software that you can run on your Pi (or in a Docker container) that will block advertising across your network. While I’m not super militant about avoiding advertising altogether, I think that having Pi-hole on my home network can improve my experience of the web.

It was nice to learn a bit about networking while setting this up. Networking is a bit impenetrable at times, especially to the newcomer. But setting up a Pi-hole was an accessible way for me to try out some things that helped me learn how my home network is working.

Despite going through all of the setup, I haven’t yet deployed the Pi-hole on my network, although I hope to soon. It was a fun learning experience nonetheless! If you don’t want to look at ads, and you’re willing to muck around a bit with your network and some hardware, I recommend this as a weekend project.

Posted in networking, pi-hole | Comments closed