
Preliminary thoughts on Google Cloud Platform

As I mentioned in an earlier post, I was granted a bunch of Google Cloud Platform credits. I’ve started using them, and here are my initial thoughts:

  1. There is a Cloud Platform service for most of the use cases I can imagine for my project. This means that there are a lot of options to consider.
  2. It’s already becoming clear that there’s a fair bit of platform lock-in, even if you only consider the technical debt incurred by shoehorning existing code into Google services.
  3. The costs are about in line with what I expected. I’ve been coding against the platform pretty intensively for three weeks now, and it has cost me $43.32 in credits so far. In brief, more than I’d be willing to spend from my own pocket. I’m quite willing to spend the credits they’ve given me, though.
  4. Ironically, some of these tools are very hard to google! Good luck getting precise technical help when searching terms like “google cloud function”!
Posted in google | Leave a comment

Async

I started working with Python around version 3.3, which predates the addition of asynchronous features into the language’s standard library. Asynchronous programming in Python has come a long way since then; it is now a well-established feature of the language.

I first wrestled with async on an Amtrak train, on my way home from the Code4Lib conference in 2018. I remember it being a bit daunting. For me, the transition to an asynchronous mindset was made easier by starting with trio, an async library with metaphors that are – in my opinion – much simpler than those used by the more common asyncio library.

The value of async is that it makes some programming problems tractable that would not otherwise be solvable in a reasonable amount of time. For example, do you have a lot of requests to make across the network all at once? Async lets those requests wait at the same time, which is way faster than waiting on each one to finish before starting the next. It’s a valuable tool in the Python programmer’s toolkit. If you have not had a chance to dive into this yet, I encourage you to give it a try!
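To make the "lots of requests at once" case concrete, here is a minimal sketch using only the standard library's asyncio. The `fake_request` function is a hypothetical stand-in: it simulates network latency with `asyncio.sleep` rather than making real requests, so the numbers only illustrate the pattern.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    """Hypothetical stand-in for a network call; the sleep simulates waiting on I/O."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_sequentially(names, delay):
    # Await each request before starting the next one.
    return [await fake_request(name, delay) for name in names]


async def fetch_concurrently(names, delay):
    # Start every request, then wait for all of them together.
    return await asyncio.gather(*(fake_request(name, delay) for name in names))


def timed(coro_func, names, delay):
    """Run one of the fetch functions and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_func(names, delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    names = [f"request-{i}" for i in range(10)]
    _, slow = timed(fetch_sequentially, names, 0.1)
    _, fast = timed(fetch_concurrently, names, 0.1)
    print(f"sequential: {slow:.2f}s, concurrent: {fast:.2f}s")
```

The sequential version takes roughly the sum of all the delays, while the concurrent version takes roughly as long as the slowest single request.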

Posted in async, python | Leave a comment

Unexpected funds

I recently found out that I got funding from Google for my journal recommender project. More than I expected, too. I had hoped to receive maybe $200 or $300 of Google Cloud Platform credits, but they gave me $5000 in credits, which was quite a surprise.

This is good news for this project, because it allows me to try out some technical solutions and optimizations that would not have been possible if I were spending my own money (or CUNY’s money). Hopefully, it will allow me to create a better public-facing journal recommender tool than I could have otherwise.

I’ll try to be pragmatic about what this funding can do for this project, while keeping in mind some of the drawbacks of leaning on Google. Specifically, there are vendor lock-in risks and user privacy risks that come with using Google products. It’s best not to lose sight of these problems. Keeping those in mind, the credits give me a chance to experiment with approaches to project infrastructure without having to worry too obsessively over cost. That is welcome news.

Posted in doaj, google, journal recommender | Leave a comment

Further into hardware

I used to really dislike computer hardware. It seemed like it always cropped up as an inconvenience or annoyance when I was trying to do something with code. I preferred the abstractions and ephemeralness of software to the inconvenience of physical things.

More recently, my thoughts on this have been changing. In part, I’m realizing how essential the physical stuff is to working with computers. This should be obvious but I’m a bit slow to catch on.

I’m learning from the good folks at sdf.org, who are in general way more hardware-oriented than I am. Their joy at hacking on hardware is contagious. I have a lot to learn, but I’m excited about it.

Posted in hardware, learning | Leave a comment

Hacker hours

Recently, I have been occasionally guest-hosting Hacker Hours at the Brooklyn Public Library on Sunday afternoons. It’s an event where people get together to work on programming projects, and help each other out by sharing skills and knowledge. It draws some seasoned programmers, as well as people who are entirely new to coding. The informal setting and the mix of skill levels make for a good learning environment.

The event used to be at a café on Atlantic Avenue. But relocating to the library has meant that we’ve been able to reach people who may not have otherwise heard about the event. The library helpfully promotes it on their events page. So while I do miss the sandwiches and coffee at our former café, I think moving to the library has been a positive change. I recommend dropping by. Bring some code to work on.

Posted in learning, meetup | Comments closed

Feast or famine

Quite recently, I was feeling disappointed that I didn’t have any library coding projects to work on. I felt I was failing at my goal of bringing more code into my academic librarianship.

But things change quickly. I’m not sure what happened, but over the course of last week, five(!) new programming projects landed on my plate. It’s far too much to work on at once. I’m not sure if this recent influx was due to a change in mindset, or the regular turnover of the semester cycle, or maybe just randomness. Nonetheless it is very welcome.

This abrupt switch from too few projects to too many has me thinking that maybe much is beyond my control. I can try to be open, engaged and active, but that isn’t always sufficient to bring in the projects that I want to work on. Sometimes it just takes patience and good luck.

Posted in Uncategorized | Comments closed

Sentiment analysis

For almost five years now, our library has been archiving tweets about our college. I’ve posted about that here and here. Until recently, I didn’t really have an agenda for this data, other than preserving it. Last week that changed. At our college’s Data Faculty Interest Group, I mentioned the tweet archive as a potentially useful data set. Dorina Tila, the chair of the FIG, had a great suggestion: why not use the data for some sentiment analysis?

I immediately got to work coding up a prototype. I used some Python libraries to chart sentiment about Kingsborough over time: TextBlob to score the tweets based on sentiment; pandas to wrangle the data; and matplotlib to produce a nice chart:
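For anyone curious about the shape of that pipeline, here is a rough, standard-library-only sketch of the score-then-aggregate step. It is not the prototype itself: the tiny word lists are a hypothetical stand-in for a real scorer like TextBlob, the grouping is done by hand rather than with pandas, and the sample tweets are made up for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Toy word lists: a hypothetical stand-in for a real sentiment scorer.
POSITIVE = {"great", "love", "proud", "excellent"}
NEGATIVE = {"bad", "hate", "terrible", "awful"}


def score(text: str) -> float:
    """Return a polarity in [-1, 1]: (positive - negative) / matched words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def monthly_sentiment(tweets):
    """Average the scores of (date, text) pairs for each (year, month)."""
    buckets = defaultdict(list)
    for day, text in tweets:
        buckets[(day.year, day.month)].append(score(text))
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Made-up sample tweets, for illustration only.
sample = [
    (date(2019, 1, 5), "Proud to be on campus today!"),
    (date(2019, 1, 20), "Terrible weather on campus today"),
    (date(2019, 2, 3), "Love the view from campus"),
]

if __name__ == "__main__":
    print(monthly_sentiment(sample))
```

The resulting month-to-average mapping is the kind of series that can then be handed to a plotting library.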

A chart of tweet sentiment over time

While this chart is interesting, it’s not ready yet. Specifically, the archive has a lot of false positives: Tweets that mention the words Kingsborough or KBCC, yet have nothing to do with our college. These need to be removed from the dataset before the data can produce a meaningful chart. Just by eyeballing the data, my estimate is that 30-50% of the 14,210 tweets will need to be removed. However, sorting through that many tweets is a really big job. I’m not sure how to tackle this yet, but I hope to get it figured out soon. At that point we can produce a meaningful sentiment chart.

Posted in archives, python, sentiment analysis, twitter | Comments closed

Recommending journals programmatically

I had been keeping this project under my hat, but I’ve proposed it as a conference talk now, so maybe it’s time to share. I’ve been building a journal recommender tool. It’s aimed at faculty, and is built on data from the Directory of Open Access Journals. The idea is that a faculty member could submit a draft abstract, and the recommender tool would give back a top 5 list of journals that best match that abstract.

The point is to help faculty identify prospective journals for their work more effectively. This could speed up the time-consuming process of manually reading through many articles or abstracts to evaluate whether a journal is a good fit.

It has been interesting to try to figure out ways to write an effective matching script. The results need to be accurate, and the process needs to be quick enough that it can be delivered on the web. The second of these requirements turns out to be more challenging than the first. I’ve spent quite a bit of time trying to optimize the process, but it’s not there yet. Here is the code, if you are interested: https://github.com/MarkEEaton/doaj. I will follow up with more details as I work out the bugs.
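To give a feel for the matching problem, here is a small sketch of one common approach: TF-IDF weighting with cosine similarity, written with only the standard library. This is an assumption for illustration and not necessarily what the code in the repository does. The journal names and descriptions are hypothetical placeholders, not DOAJ data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def weigh(counts, idf):
    """Turn raw term counts into TF-IDF weights; unknown terms are ignored."""
    return {term: tf * idf[term] for term, tf in counts.items() if term in idf}


def build_index(journals):
    """journals: dict of name -> descriptive text. Returns (idf, vectors)."""
    docs = {name: Counter(tokenize(text)) for name, text in journals.items()}
    n = len(docs)
    # Document frequency: in how many journal texts does each term appear?
    df = Counter(term for counts in docs.values() for term in counts)
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = {name: weigh(counts, idf) for name, counts in docs.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(abstract, idf, vectors, top_n=5):
    """Return the names of the top_n journals most similar to the abstract."""
    query = weigh(Counter(tokenize(abstract)), idf)
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
    return ranked[:top_n]


# Hypothetical placeholder journals, for illustration only.
journals = {
    "journal-a": "library cataloging metadata archives",
    "journal-b": "ocean fish coral ecology",
    "journal-c": "software programming python code",
}

if __name__ == "__main__":
    idf, vectors = build_index(journals)
    print(recommend("metadata practices in library archives", idf, vectors, top_n=2))
```

Building the index once up front and only weighting the incoming abstract at request time is one way to keep the per-request work small, which matters if the results have to come back quickly on the web.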

Posted in doaj, journal recommender | Comments closed

The modern web

I’ve been learning some JavaScript recently. Mostly this is so that I can better understand modern JS frameworks. I have some catching up to do: it has been a while since I looked at JavaScript.

Anyhow, my main takeaway from learning (a very small amount) about React and Vue.js is that the DOM isn’t what it used to be. I suppose I kind of understood this already – web templating in Python gave me some hints – but I wasn’t prepared for how something like React completely rethinks the DOM. It’s been a great eye opener, and now I want to make something using a modern JS framework.

So I’m fishing for frontend projects to take on, if you have any suggestions…

Posted in javascript | Comments closed

On podcasts

I spend more time than I’d like to on the New York City subway. It’s mostly boring, punctuated by occasional dramatic failures, where getting to your destination seems almost impossible. In any case, I find that my time on the subway needs to be put to use with some productive activity, or I would totally waste it, probably by staring into space.

Like many others, I find solace in podcasts. The ones that I always keep returning to are programming podcasts: Talk Python to Me and Programming Throwdown, among others. They’re both amazing. Having these to listen to on my commute keeps me occupied and keeps my technical faculties active, even during weeks when I’m not writing any code. It’s nice to learn productive stuff during an otherwise wasted part of my day.

Posted in learning, podcasts | Comments closed