Iceland is on track for summer

Discussions are ongoing about the viability of summer travel given the pandemic. However, as my colleague Charlie has blogged recently, we are “acting as if”. As such, we are trying to maintain our original calendar.

Lo and behold, we have:

Imagine a “You are here” arrow by Spring 1

Here’s the full breakdown of that schedule and our progress:

Our plan for the fall was to find and test alternative UAVs. This proved prudent, as the federal government banned DJI craft late last year. We are happy with both the Parrot and the Skydio craft, for different reasons, which we’ll undoubtedly cover here on this blog in the future.

December and January, which were effectively a long winter break for a subset of us, were dedicated to testing the craft, capturing initial video, and possibly beginning development. This was a success as well. Additionally, we have begun spinning up a more sophisticated web presence for the stories we’re telling – changes we will be prepared to publish soon.

We’ve now started the calendar for the spring, term 1 of 2. We are moving into scaling up our operation of the craft and developing software to automate that work. It’s a tough problem but one we can solve in the time we have.

We’re optimistic about our ability to meet the moment. If the world continues to make progress on COVID-19, we should be in shape to have a successful research trip.

2020 in review

You don’t need me to tell you 2020 was a bad year. Others will write about the details that apply nationally and globally, so I’m going to jump right into my own retrospective.

The 2020 wallowing

I was planning on a trip to Iceland followed by a new job with room for advancement in 2020. Instead I stayed at my current job (a good job!), made an attempt at a side hustle that has so far largely fizzled, and was obviously not able to go to Iceland. I didn’t visit family in Montana for Thanksgiving or Christmas. At work, I made a few dumb mistakes (we did rebound in each case, happily). It was, on net, a rough year.

That’s about all I have to say about that. I don’t want to wallow too much, but I also don’t want to go further without acknowledging the struggle.

The better stuff

All that said, the rest of this post summarizes my accomplishments for the year. I write this to remind myself that even though it didn’t generate those external signals, I still did a lot. I advanced my skillset, did my job well, and patched through the year.

Accomplishments:

  • Got out of bed every day and went back to sleep every night
  • Kept Earlham CS running through the pandemic, student dispersal, lockdown, and restricted return
  • Modernized our systems engineering infrastructure with better monitoring, solid backups, improved responsiveness to inquiries, and higher availability – still a long way to go, but we’re so much better than we were a year ago
  • For each error I made, rebounded and learned a lesson
  • Migrated our cluster infrastructure from Torque to Slurm successfully
  • Learned a bunch of low-level details about filesystems, SELinux, and more en route to improving overall quality
  • Took 10,000 steps most days and got outside frequently
  • Provided a lot of internal tech support, engineering, feedback, and project contributions to the projects associated with the Iceland program
  • Made checking LinkedIn a regular part of my routine, though I should use it more socially in the new year
  • Purely as a hobby, learned a ton about video and audio production

Casual observation: I posted a lot in February, and my tech achievements primarily happened over the summer.

I also want to dedicate a section to expanding my horizons. I couldn’t do it with travel, but there were a lot of new things I got the chance to explore this year:

  • On Spotify, listened to 1,408 new artists and 366 new genres (“genre” seems like a nebulous category, but I am taking the win)
  • Watched a lot of new movies, including the complete Hayao Miyazaki filmography
  • Learned to make curry! (this winter squash red Thai curry recipe is great)
  • Baked a pie – apple – for the first time, at Christmas
  • Grew my hair long for the first time in my life (it’s still growing actually – not getting a professional haircut during a pandemic)
  • Visited and walked new hiking trails

That was 2020 for me.

What’s next?

I can’t guarantee 2021 will be better than this year. However, I do have some broad intentions around a theme for the new year. I will do all that is in my control to make the next year better, and I hope you join me.

Tech improves pandemic life

I can’t imagine going through the COVID-19 pandemic without computers. Tech improves pandemic life, and it makes it easier for us to make good decisions.

For reasons of both personal caution and what I see as a moral duty, I am probably in the 80th percentile for cautious behavior during the pandemic. I live alone, and my job lends itself to remote work for almost everything. What’s more, my workplace is a socially conscious liberal arts college. As a result, I interact with very few people (those I do see are always masked up).

That lifestyle is only sustainable because of computer technology. I buy and pick up groceries through an app. Meetings take place over video chats. Songs or podcasts play in the background while I cook. I can stream almost anything I want to see. I’ve continued to learn and to work using some excellent rectangles.

Ron Swanson "This is an excellent rectangle." Tech life.

There are tradeoffs, of course, but I have basically lived this way since March. Doing so, I have weathered the pandemic as well as I could hope (so far).

The national dialogue now includes a lot of chatter about how to stay safe for the holidays. I’m cautious and want to model good behavior. That means I’ll be on FaceTime for Thanksgiving, Christmas, and New Year’s Eve. That’s not great, and it’ll be sad not to be physically visiting family.

But for people like me, the alternative to a FaceTime holiday isn’t an in-person holiday, but a canceled holiday, spent in isolation. Thanks to the people in my industry, I don’t have to do that. Technology brings people together. It’s one reason I remain idealistic about the work I do.

Amidst the tragedies and terrors of 2020, pause to appreciate the age we live in and the cool things we’ve invented. Tech improves pandemic life – and improves life in general. There’s lots to worry about if you want (conspiracy theories, AI risk, etc.), but I’m happy to live in a technologically advanced society.

Jupyterhub user issues: a 90% improvement

[Image: photo of Jupiter the planet]
Jupyter errors are not to be confused with Jupiter errors.

At Earlham Computer Science we have to support a couple dozen intro CS students per semester (or, in COVID times, per 7-week term). We teach Python, and we want to make sure everyone has the right tools to succeed. To do that, we use the Jupyterhub notebook environment, and we periodically respond to user issues related to running notebooks there.

A couple of dozen people running Python code on a server can gobble up resources and induce problems. Jupyter has historically been our toughest service to support, but we’ve vastly improved. In fact, as I’ll show, we have reduced the frequency of incidents by about 90 percent over time.

Note: we only recently began automatic tracking of uptime, so that data is almost useless for comparisons over time. This is the best approximation we have. If new information surfaces to discredit any of my methods, I’ll change it, but my colleagues have confirmed to me that this analysis is at least plausible.

Retrieving the raw data

I started my job at Earlham in June 2018. In November 2018, we resolved an archiving issue with our help desk/admin mailing list that gives us our first dataset.

I ran a grep for the “Messages:” string in the thread archives:

grep 'Messages:' */thread.html # super complicated

I did a little text processing to generate the dataset: regular expression find-and-replace in an editor. That reduced the data to a column of YYYY-Month values and a column of message counts.

Then I went and searched for all lines with subject matching “{J,j}upyter” in the subject.html files:

grep -i jupyter {2018,2019,2020}*/subject.html 

I saved it to jupyter-messages-18-20.dat. I did some text processing – again regexes, find and replace – and then decided that followup messages are not what we care about and ran uniq against that file. A few quick wc -l commands later and we find:

  • 21 Jupyter requests in 2018
  • 17 Jupyter requests in 2019
  • 19 Jupyter requests in 2020
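The dedupe-and-count step described above can be sketched in Python. This is a rough illustration, not the commands actually run: it assumes each grep output line looks like `YYYY-Month/subject.html:<subject text>`, it swaps `uniq` for a set (so duplicates need not be adjacent), and the function name and sample lines are made up for the example.

```python
import re
from collections import Counter


def count_unique_subjects(lines):
    """Count distinct Jupyter subjects per year, ignoring "Re:" follow-ups."""
    seen = set()
    for line in lines:
        # Assumed line shape: "<YYYY>-<Month>/subject.html:<subject text>"
        match = re.match(r"(\d{4})-[^/]+/subject\.html:(.*)", line)
        if match is None:
            continue
        year, subject = match.groups()
        # Treat "Re: foo" and "foo" as the same request.
        subject = re.sub(r"^\s*(re:\s*)+", "", subject.strip(), flags=re.IGNORECASE)
        seen.add((year, subject.lower()))
    return Counter(year for year, _ in seen)


# Invented sample lines, only to show the shape of the input.
sample = [
    "2018-November/subject.html:Jupyter kernel dies",
    "2018-November/subject.html:Re: Jupyter kernel dies",
    "2019-March/subject.html:jupyter password reset",
]
counts = count_unique_subjects(sample)
print(sorted(counts.items()))
```

Deduplicating on the subject text is a blunt instrument: two unrelated requests with the same subject line in the same year would be counted once, which is one reason to read the yearly totals as approximations.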

One caveat is that in 2020 we moved a lot of communication to Slack. This adds some uncertainty to the data. However, I know from context that Jupyter requests have continued to flow through the mailing list disproportionately. As such, Slack messages are likely to be the sort of redundant information already obscured using uniq in the text processing.

Another qualifier is that a year or so ago we began using GitLab’s Issues as a ticket tracking system. I searched that. It found 11 more Jupyter issues, all from 2020. Fortunately, only 1 of those was a problem that did not overlap with a mailing list entry.

Still, I think those raw numbers are a good baseline. At one level, it looks bad. The 2020 number has barely budged from 2018, and in fact it’s worse than 2019. That’s misleading, though.

Digging deeper into the data

Buried in that tiny dataset is some good news about the trends.

For one thing, those 21 Jupyter requests were in only 4 months out of the year – in other words, we were wildly misconfigured and putting out a lot of unnecessary technical fires. (That’s nobody’s fault – it’s primarily due to the fact that my position did not exist for about a year before I arrived at it, so we atrophied.)

What’s more, by inspection, half of this year’s 19 are password or feature requests rather than problems, whereas the 17 we saw in 2019 were, I think, real problems.

So in terms of Jupyter problems in the admin list, I find:

  • around 20 in the latter third of 2018
  • 17 in ALL OF 2019
  • only two (granted one was a BIG problem but still only 2) in 2020

That’s a 90% reduction in Jupyterhub user issues over three years, by my account.
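For anyone who wants to check the arithmetic, here is a small Python sketch using the counts listed above. The second calculation is my own extra reading rather than a figure from the analysis: it assumes the 2018 count covers 4 months and the 2020 count covers 12, and compares monthly rates instead of raw counts.

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return 100.0 * (before - after) / before


# Raw counts from the post: about 20 problems in late 2018, 2 in 2020.
raw = percent_reduction(20, 2)

# Same counts as monthly rates, assuming the 2018 figure covers 4 months
# and the 2020 figure covers 12 (an assumption, not a measured value).
per_month = percent_reduction(20 / 4, 2 / 12)

print(f"raw counts: {raw:.0f}%")
print(f"monthly rates: {per_month:.1f}%")
```

On raw counts this gives the 90% figure; normalizing by months would make the drop look larger, so 90% is the conservative number.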

“That’s amazing, how’d you do it?”

Number one: thank you, imaginary reader, you’re too kind.

Number two: a lot of ways.

In no particular order:

  1. We migrated off of a VM, which given our hardware constraints was not conducive to a resource-intensive service like Jupyterhub.
  2. Over time, we’ve upgraded our storage hardware, as some of it was old and (turns out) failing.
  3. We added RAM. When it comes to RAM, some is good, more is better, and too much is just enough.
  4. We manage user directories better. We export these over NFS but have done all we can to reduce network dependencies. That significantly reduces the amount of time the CPU spends twiddling its thumbs.

What’s more, we’re not stopping here. We’re currently exploring load-balancing options – for example, running Jupyter notebooks through a batch scheduler like Slurm, or potentially a containerized environment like Kubernetes. There are several solutions, but we haven’t yet determined which is best for our use case.

This is the work of a team of people, not just me, but I wanted to share it as an example of growth and progress over time. It’s incremental but it really does make a difference. Jupyterhub user issues, like so many issues, are usually solvable.

An inspirational place

I graduated from Earlham in December 2016. I returned to work for the Computer Science Department here in June 2018. Like so many in the community, I relate to it as more than an alma mater or an employer: it’s an institution and a community I (and so many others) hold in high esteem.

For all that, I don’t think I’ve ever been more inspired by this place than I was by this:

It’s an incredible display. I wasn’t able to attend this event myself but I watched some of its organization unfold on social media in the hours beforehand. It was a breathtaking, awe-inspiring achievement.

This is a time of a lot of fear, heartbreak, and frustration. To briefly lapse into politics, it is horrifying to check the news and to see the President of the United States so thoroughly, spectacularly, and dangerously fail in guiding this nation through the crisis. The effects of COVID-19 may hover over our heads for a long time to come.

But this is also a moment of profound social solidarity. I need look no further than this small liberal arts college to see it. It’s wonderful to be part of a community where this could materialize.

We’re dispersing geographically for the rest of the semester, but we carry this spirit with us wherever we go. I can only hope the people of America and the world rise to the occasion as this community did.

Small resolutions for 2020

I have a lot coming up in 2020, so I don’t want to make any major resolutions. But I do see a few obvious, relatively simple places for improvement in the new year:

  • Use social media better. I’ve cut back quite a bit, but Twitter, Facebook, and LinkedIn each have benefits. I want to put them to good use.
  • Listen to more variety in music. I’ve expanded my taste in movies significantly in the last couple of years and want to nurture my musical taste as well.
  • Read fewer articles, more books.
  • More intense workouts. I’ve been coasting on light-to-moderate walking and jogging, and I’d like to push myself more. HIIT and strength training are on my mind currently.

This is all in addition to continuing toward the next steps in my career and skills growth.

Happy New Year, friends!

Christmas trees and trip cost vs item cost

When building software for large datasets or HPC workflows, we talk a lot about the trip cost versus the item cost.

The item cost is the expense (almost always measured in time) to run an operation on a single unit of data – one member of a set, for example. The trip cost is the total expense of running a series of operations on some subset (possibly the whole set) of the data. The trip cost incorporates overhead, so it’s not just N times the item cost.

This is a key reason that computers, algorithms, and data structures that support high-performance computing are so important: by analyzing as many items in one trip as is feasible, you can often minimize time wasted on unnecessary setup and teardown.

Thus trip cost versus item cost is an invaluable simplifying distinction. It can clarify how to make many systems perform better.
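One simple way to picture the distinction is a toy cost model in Python. This is a sketch under stated assumptions, not a benchmark: it treats the overhead as a fixed cost paid once per trip, and the function name and all the numbers are hypothetical.

```python
import math


def total_time(n_items, item_cost, overhead, batch_size):
    """Total time to process n_items when every trip pays a fixed overhead."""
    trips = math.ceil(n_items / batch_size)
    return trips * overhead + n_items * item_cost


# Hypothetical numbers: 1,000 items, 0.01 s each, 2 s of setup/teardown per trip.
one_at_a_time = total_time(1000, 0.01, 2.0, batch_size=1)
single_trip = total_time(1000, 0.01, 2.0, batch_size=1000)
print(f"one item per trip: {one_at_a_time:.0f} s")
print(f"all items in one trip: {single_trip:.0f} s")
```

The per-item work is identical in both cases; only the number of trips changes, and in this toy model that alone accounts for the difference between roughly half an hour and a few seconds.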

Yes, Virginia, there is a trip cost

Christmas tree

Christmas trees provide a good and familiar example.

Let’s stipulate that you celebrate Christmas and that you have a tree. You’ve put up lights. Now you want to hang the ornaments.

The item cost for each of the ornaments is very small: unbox and hang the ornament. It takes a couple of seconds, max – not a lot, for humans. It also parallelizes extremely well, so everyone in the family gets to hang one or more ornaments.

The trip cost is at least an order of magnitude (minutes rather than seconds) more expensive, so you only want to do it once:

  • Find the ornament box
  • Bring the box into the same room as the tree
  • Open the box
  • Unbox and hang N ornaments
  • Close the box
  • Put the box back

Those overhead steps don’t parallelize well, either: we see no performance improvement and possibly a performance decline if two or more people try to move the box in and out of the room instead of just one.

It’s plain to see that you want to hang as many ornaments as possible before putting away the ornament box. This matches our intuition (“let’s decorate the tree” is treated as a discrete task typically completed all in one go), which is nice.
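The tree example can be put into numbers with a short Python sketch. The figures are assumptions chosen to match the description above: 2 seconds per ornament ("a couple of seconds"), 300 seconds of box-fetching overhead ("minutes rather than seconds"), and 60 ornaments. The overhead is treated as strictly serial, while the hanging is split evenly across helpers.

```python
import math


def decorating_time(n_ornaments, helpers, item_seconds=2, overhead_seconds=300):
    """Serial overhead plus ornament hanging shared evenly among helpers."""
    rounds = math.ceil(n_ornaments / helpers)
    return overhead_seconds + rounds * item_seconds


for helpers in (1, 2, 4):
    print(helpers, decorating_time(60, helpers))
# prints:
# 1 420
# 2 360
# 4 330
```

Quadrupling the helpers only trims the total from 420 to 330 seconds, because the box-related overhead never shrinks. That is the same reason to do it all in one trip.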

Whether Christmas is your holiday or not, I wish you the best as the year draws to a close.

Going to Iceland

The title is the tl;dr. (!)

Barring a catastrophe (a non-trivial assumption!), I’m traveling to Iceland with the field science program at Earlham in June 2020. I’ll be one of the faculty members on the student-faculty research team for the annual expedition. It’s a thrilling opportunity.

I’ve been working toward this for a while. I’ve acted as “ground control” for several summers, both as a student and in my current role as a member of the CS faculty. Between trips, I’ve been part of the team that’s engineering and coding software to do fascinating things:

  • DNA analysis
  • in-field mobile data collection
  • drone flight planning
  • image analysis

But for a variety of reasons, it’s never been feasible for me to take the trip. Finally, the path seems clear.

Of course, anything could happen. This could fall through, or it could turn out amazingly well. Wherever it ultimately falls on that spectrum, I’ll spend the next few months wrangling various software projects, mentoring students, and assisting the official leaders of the trip as we prepare to go do science. My focus will be on building software, but that will be just one task among many.

Text is not a perfect medium for communicating emotion, but I’m quite excited. I wanted to flag this as a personal and professional milestone. I’m certain to be posting more about it over time.

Some twilight Americana

My alma mater-turned-employer Earlham College has a back-campus area with some trails and buildings, and it’s where I usually go to exercise. I took a long walk yesterday evening to savor the first day of cooler weather here, and I took a few photos there (iPhone 7 camera). These are some of my favorites.