
Hacker hours

Recently, I have been occasionally guest-hosting Hacker Hours at the Brooklyn Public Library on Sunday afternoons. It’s an event where people get together to work on programming projects, and help each other out by sharing skills and knowledge. It draws some seasoned programmers, as well as people who are entirely new to coding. The informal setting and the mix of skill levels make for a good learning environment.

The event used to be at a café on Atlantic Avenue. But relocating to the library has meant that we’ve been able to reach people who would not have otherwise heard about the event. The library helpfully promotes it on their events page. So while I do miss the sandwiches and coffee at our former café, I think moving to the library has been a positive change. I recommend dropping by. Bring some code to work on.

Posted in learning, meetup | Leave a comment

Feast or famine

Quite recently, I was feeling disappointed that I didn’t have any library coding projects to work on. I felt I was failing at my goal of bringing more code into my academic librarianship.

But things change quickly. I’m not sure what happened, but over the course of last week, five(!) new programming projects landed on my plate. It’s far too much to work on at once. I’m not sure if this recent influx was due to a change in mindset, or the regular turnover of the semester cycle, or maybe just randomness. Nonetheless it is very welcome.

This abrupt switch from too few projects to too many has me thinking that maybe much is beyond my control. I can try to be open, engaged and active, but that isn’t always sufficient to bring in the projects that I want to work on. Sometimes it just takes patience and good luck.

Posted in Uncategorized | Leave a comment

Sentiment analysis

For almost five years now, our library has been archiving tweets about our college. I’ve posted about that here and here. Until recently, I didn’t really have an agenda for this data, other than preserving it. Last week that changed. At our college’s Data Faculty Interest Group, I mentioned the tweet archive as a potentially useful data set. Dorina Tila, the chair of the FIG, had a great suggestion: why not use the data for some sentiment analysis?

I immediately got to work coding up a prototype. I used some Python libraries to chart sentiment about Kingsborough over time: TextBlob to score the tweets based on sentiment; pandas to wrangle the data; and matplotlib to produce a nice chart:

A chart of tweet sentiment over time
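For the curious, here is a minimal sketch of the scoring-and-averaging step. It is not my actual prototype: to keep it short I’ve swapped pandas for the standard library, the function names are made up for illustration, and the sample tweets are invented. The only third-party call is TextBlob’s `sentiment.polarity`, which scores text from -1.0 to 1.0.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Invented sample data: (date posted, tweet text) pairs.
SAMPLE_TWEETS = [
    (date(2019, 1, 5), "Loving my classes this semester"),
    (date(2019, 1, 20), "The parking situation is terrible"),
    (date(2019, 2, 1), "Great event on campus today"),
]


def textblob_polarity(text):
    """Score text from -1.0 (negative) to 1.0 (positive) using TextBlob."""
    from textblob import TextBlob  # third-party: pip install textblob

    return TextBlob(text).sentiment.polarity


def monthly_sentiment(tweets, score=textblob_polarity):
    """Average the sentiment score of (date, text) pairs per calendar month."""
    by_month = defaultdict(list)
    for posted, text in tweets:
        by_month[(posted.year, posted.month)].append(score(text))
    # Sort by (year, month) so the result is ready to plot in order.
    return {month: mean(scores) for month, scores in sorted(by_month.items())}
```

Calling `monthly_sentiment(SAMPLE_TWEETS)` gives one average per month, which can then be handed to matplotlib. Passing a different `score` function makes it easy to try another sentiment library.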

While this chart is interesting, it’s not ready yet. Specifically, the archive has a lot of false positives: Tweets that mention the words Kingsborough or KBCC, yet have nothing to do with our college. These need to be removed from the dataset before the data can produce a meaningful chart. Just by eyeballing the data, my estimate is that 30-50% of the 14,210 tweets will need to be removed. However, sorting through that many tweets is a really big job. I’m not sure how to tackle this yet, but I hope to get it figured out soon. At that point we can produce a meaningful sentiment chart.
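One possible first pass, sketched here under some big assumptions, would be a keyword heuristic: keep a tweet only if it mentions Kingsborough or KBCC and also contains at least one college-related word. The word list below is hypothetical and would need tuning against the real archive, and a filter like this will certainly throw out some relevant tweets too.

```python
import re

# Hypothetical context words; a real list would come from reading the archive.
COLLEGE_TERMS = {"college", "cuny", "campus", "student", "students",
                 "professor", "class", "classes", "semester"}

# The two names the archive was collected on.
NAME_PATTERN = re.compile(r"\b(kingsborough|kbcc)\b", re.IGNORECASE)


def looks_relevant(text):
    """Guess whether a tweet is about the college rather than a namesake."""
    if not NAME_PATTERN.search(text):
        return False
    words = set(re.findall(r"[a-z']+", text.lower()))
    # Relevant only if at least one context word appears alongside the name.
    return bool(words & COLLEGE_TERMS)
```

Even a rough filter like this could shrink the pile that needs to be checked by hand.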

Posted in archives, python, sentiment analysis, twitter | Leave a comment

Recommending journals programmatically

I had been keeping this project under my hat, but I’ve proposed it as a conference talk now, so maybe it’s time to share. I’ve been building a journal recommender tool. It’s aimed at faculty, and is built on data from the Directory of Open Access Journals. The idea is that a faculty member could submit a draft abstract, and the recommender tool would give back a top 5 list of journals that best match that abstract.

The point is that this might help faculty identify prospective journals for their work more efficiently. It could shorten the time-consuming process of manually reading through many articles or abstracts to evaluate whether a journal is a good fit.

It has been interesting to try to figure out ways to write an effective matching script. The results need to be accurate, and the process needs to be quick enough that it can be delivered on the web. The second of these requirements turns out to be more challenging than the first. I’ve spent quite a bit of time trying to optimize the process, but it’s not there yet. Here is the code, if you are interested: https://github.com/MarkEEaton/doaj. I will follow up with more details as I work out the bugs.
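To give a sense of what a matching script can look like, here is a bare-bones sketch that ranks journals by the cosine similarity of word counts. This is an illustration of one common technique, not necessarily what the code in the repo does, and the function names and journal data are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Turn text into a bag of lowercase words with counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count Counters (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def recommend(abstract, journals, top_n=5):
    """Return up to top_n journal titles that best match the abstract.

    `journals` maps a title to some descriptive text (hypothetical format).
    """
    query = tokenize(abstract)
    scored = [(cosine(query, tokenize(text)), title)
              for title, text in journals.items()]
    scored.sort(reverse=True)
    # Drop journals that share no words at all with the abstract.
    return [title for score, title in scored[:top_n] if score > 0]
```

A sketch like this tokenizes every journal on every request, which is exactly the kind of thing that gets slow on the web. Precomputing the journal word counts once would be an obvious first optimization.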

Posted in doaj, journal recommender | Leave a comment

The modern web

I’ve been learning some JavaScript recently. Mostly this is so that I can better understand modern JS frameworks. I have some catching up to do: it has been a while since I looked at JavaScript.

Anyhow, my main takeaway from learning (a very small amount) about React and Vue.js is that the DOM isn’t what it used to be. I suppose I kind of understood this already – web templating in Python gave me some hints – but I wasn’t prepared for how something like React completely rethinks the DOM. It’s been a great eye opener, and now I want to make something using a modern JS framework.

So I’m fishing for frontend projects to take on, if you have any suggestions…

Posted in javascript | Leave a comment

On podcasts

I spend more time than I’d like to on the New York City subway. It’s mostly boring, punctuated by occasional dramatic failures, where getting to your destination seems almost impossible. In any case, I find that my time on the subway needs to be put to use with some productive activity, or I would totally waste it, probably by staring into space.

Like many others, I find solace in podcasts. The ones that I always keep returning to are programming podcasts: Talk Python to Me and Programming Throwdown, among others. They’re both amazing. Having these to listen to on my commute keeps me occupied and keeps my technical faculties active, even during weeks when I’m not writing any code. It’s nice to learn productive stuff during an otherwise wasted part of my day.

Posted in learning, podcasts | Comments closed

On testing code

Unit testing seems like a very sensible thing to do. Having tests that ensure that your code is doing what you expect is an obviously good idea. I’m a very cautious person by nature, so testing appeals to me very strongly. Nonetheless I’ve been slow to add it to my workflow.

In part this is because most of the projects I’ve made have leaned heavily on external APIs. Mocking out an API for testing is an intimidating way for a beginner to start testing, so I’ve hesitated repeatedly. Also most of the tools I’ve built have not been critical infrastructure, so no one is relying on them very much, which also (unfortunately) helped me justify my procrastination.

However, more recently, I’ve been automating our library’s gathering and parsing of Counter reports. Counter reports are standardized usage reports that show how much our electronic collections have been used by our patrons. It’s critical stuff for our library, so the data we draw from them needs to be correct.

This finally motivated me to figure out how to write unit tests. Because I’m comparing static files, no complicated techniques are needed. Even my very elementary understanding of pytest is enough to get me through. Now my tests give me confidence that my scripts are reporting the correct data. It’s reassuring.
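Here is roughly what a test of this kind looks like. It is a simplified sketch: real Counter reports have more header rows and columns than this, and the `total_requests` function and the `Total` column name are invented for the example.

```python
import csv
import io

# A tiny stand-in for a static report file (real reports are much larger).
SAMPLE_REPORT = "Title,Total\nJournal A,12\nJournal B,30\n"


def total_requests(report_file):
    """Sum the 'Total' column of a CSV usage report."""
    reader = csv.DictReader(report_file)
    return sum(int(row["Total"]) for row in reader)


def test_total_requests():
    # pytest collects functions named test_* and runs their assertions.
    assert total_requests(io.StringIO(SAMPLE_REPORT)) == 42
```

Saved in a file named something like `test_reports.py`, this runs with the `pytest` command. Because the input is a fixed file, the expected number never changes and there is nothing to mock.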

Posted in testing | Comments closed

Pi-hole

I’ve had a Raspberry Pi sitting around my office for quite a while. I bought it in a brief moment of enthusiasm for hardware that quickly faded, so it has mostly sat around in the box unused. I think the main reason I neglected it is that I hadn’t come up with a compelling use case. That changed recently when I read about Pi-hole.

Pi-hole is software that you can run on your Pi (or in a Docker container) that will block advertising across your network. While I’m not super militant about avoiding advertising altogether, I think that having Pi-hole on my home network can improve my experience of the web.

It was nice to learn a bit about networking while setting this up. Networking is a bit impenetrable at times, especially to the newcomer. But setting up a Pi-hole was an accessible way for me to try out some things that helped me learn how my home network is working.

Despite going through all of the setup, I haven’t yet deployed the Pi-hole on my network, although I hope to soon. It was a fun learning experience nonetheless! If you don’t want to look at ads, and you’re willing to muck around a bit with your network and some hardware, I recommend this as a weekend project.

Posted in networking, pi-hole | Comments closed

Creating a desktop application using Python

Recently, my colleague Julia and I made a Python script to support outreach to faculty. You can read about our attempts to document the project here.

Because deploying a script like this can be difficult for librarians who are sometimes unfamiliar with Python, we thought it might be useful to sidestep some of the complexity by creating a desktop app that would obviate the need for working directly with the command line.

There is more than one way to create and package desktop applications in Python. We used Gooey and PyInstaller. Gooey is useful for turning a program built on Python’s argparse into a GUI application. Since we already had a working command line application made with argparse, this was a good fit for us.
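As a rough illustration of how little has to change, here is a sketch of an argparse program wrapped with Gooey’s decorator. The arguments and the program name are hypothetical, not the ones from our project, and the fallback at the top is only there so the sketch still works as a plain command line tool when Gooey isn’t installed.

```python
import argparse

try:
    from gooey import Gooey  # third-party: pip install Gooey
except ImportError:
    def Gooey(**options):
        """Fallback: a do-nothing decorator when Gooey is not installed."""
        return lambda func: func


def build_parser():
    """Build the ordinary argparse parser (hypothetical arguments)."""
    parser = argparse.ArgumentParser(description="Hypothetical outreach tool")
    parser.add_argument("input_file", help="CSV file to read")
    parser.add_argument("--limit", type=int, default=10,
                        help="Maximum number of results")
    return parser


@Gooey(program_name="Outreach tool")
def main():
    # With Gooey installed, parse_args() is presented as a form in a window.
    args = build_parser().parse_args()
    print(f"Would process {args.input_file} (limit {args.limit})")
```

Calling `main()` at the bottom of the script starts the program. The argparse code is untouched, which is why Gooey suits a tool that already works on the command line.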

We also needed to create an executable. The Gooey documentation provides steps for doing this using PyInstaller. We basically followed these instructions, and they worked as advertised. We ended up with an .exe file that we could put on our desktops to run the application. While the .exe file won’t work across platforms (PyInstaller is not a cross-compiler), it does work for our librarians. So now there’s no need for librarians to wrangle with the command line; the desktop app takes care of that.

If you’re interested, we branched the GitHub repo for this project, so there are separate branches for the command line application and the GUI application. Let us know if you find this useful!

Posted in desktop application | Comments closed

DNS

I’ve had some adventures with DNS recently. DNS resolves domain names to IP addresses, so it, uh, mostly just looks things up. It won’t forward traffic, or resolve the path in a URL to some other path, or anything like that. It won’t help you serve things over HTTPS. It just translates domain names to IP addresses.

That’s a bit limiting, but so it goes. For example, over the past few years, I’ve created a lot of URLs that I’d ultimately like to resolve differently than I had originally intended, but DNS is not actually much help for this. On the positive side, the most useful aspect of DNS for me has been CNAMEs. CNAMEs allow me to assign one of my domains or subdomains to a service out there somewhere on the internet. For example, I can make the subdomain https://fictograph.ocert.at point to an application that lives in the cloud (in this example, on PythonAnywhere). That’s kind of neat.
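To make the lookup idea concrete, here is a toy model of how a CNAME works. It is only a dictionary pretending to be DNS: the names are placeholders under example.org and example.net, and the address comes from a range reserved for documentation.

```python
# Toy records: each name maps to (record type, value). All values are made up.
RECORDS = {
    "app.example.org": ("CNAME", "myapp.host.example.net"),
    "myapp.host.example.net": ("A", "192.0.2.10"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME aliases until an A record yields an IP address."""
    for _ in range(max_hops):
        if name not in records:
            raise KeyError(f"no record for {name}")
        record_type, value = records[name]
        if record_type == "A":
            return value
        # A CNAME just says "look up this other name instead".
        name = value
    raise RuntimeError("too many CNAME hops")
```

Here `resolve("app.example.org")` follows the alias to the hosting service’s name and returns that name’s address. Notice that nothing in the lookup knows about paths or HTTPS, which is the limitation I keep running into.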

Anyhow, all of this has given me a greater appreciation of the difficulty of keeping links working. DNS has taught me that naming stuff is hard.

Posted in dns | Comments closed