Learning on my own

One of the CS servers choked this week. That made for a stressful recovery, assessment, and cleanup process.

To cut the stress, I devoted about an hour each night to some autodidactic computing education. That included a mix of reading and exercises.

In particular, I walked through this talk on compilers.

As I noted in my GitHub repo’s README, I did fine in my Programming Languages course as a student, but I was never fully confident with interpreters and compilers in practice. People (accurately!) talk about building such programs as “metaprogramming”, but as a student I found the idea always came across as more handwavey or tautological than meta.

This exercise – which, I’d emphasize, consists of code built not by me but by Tom Stuart (posted at his site Codon in 2013 for demo purposes) – was clarifying. Meticulously walking through it gave me a better intuition for interpreters and compilers – which are not, in fact, handwavey or tautological. 🙂 The Futamura projections at the end of the article were particularly illuminating to discover and think through.

I also read some articles.

  • “Teach Yourself Programming in Ten Years” (Peter Norvig; re-read)
  • Purported origin of “never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway”. (This was evidently considered an extremely funny joke among 1970s computer geeks.)
  • The performance links in this post.

The next language I’d like to explore is Go, and I started this week.

Computing lets you learn on your own. It’s not unique as a field in this way, but I appreciate that aspect of it.

Improve software performance, both sooner and later

This week I read “How To Be A Programmer”. It’s part of my work to shore up my fundamental computing skills. From a section in “Beginners” called “How to Fix Performance Problems” (emphasis added):

The key to improving the performance of a very complicated system is to analyse it well enough to find the bottlenecks, or places where most of the resources are consumed. There is not much sense in optimizing a function that accounts for only 1% of the computation time. As a rule of thumb you should think carefully before doing anything unless you think it is going to make the system or a significant part of it at least twice as fast.
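Python’s built-in profiler makes that “find the bottlenecks first” step concrete. A minimal sketch, with toy functions standing in for real work:

```python
import cProfile
import pstats

def cheap():
    # Accounts for a tiny slice of the runtime: not worth optimizing.
    return sum(range(100))

def expensive():
    # The actual bottleneck in this toy workload.
    return sum(i * i for i in range(200_000))

def workload():
    for _ in range(50):
        cheap()
        expensive()

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by cumulative time to see where the time actually goes;
# expensive() should dominate the listing.
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(5)
```

Measuring first, as the quote advises, tells you whether a function is the 1% case or the one that could make the system twice as fast.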

That struck me for two reasons: One: I’ve reflected in the past on high-performance computing showing exceptions to rules we learn as beginners. Two: just an hour earlier, I’d read Nelson Elhage’s excellent blog post “Reflections on software performance” (emphasis added):

I think [“Make it work, then make it right, then make it fast”] may indeed be decent default advice, but I’ve also learned that it is really important to recognize its limitations, and to be able to reach for other paradigms when it matters. In particular, I’ve come to believe that the “performance last” model will rarely, if ever, produce truly fast software (and, as discussed above, I believe truly-fast software is a worthwhile target). 

One of my favorite performance anecdotes is the SQLite 3.8.7 release, which was 50% faster than the previous release in total, all by way of numerous stacked performance improvements, each gaining less than 1% individually. This example speaks to the benefit of worrying about small performance costs across the entire codebase; even if they are individually insignificant, they do add up. And while the SQLite developers were able to do this work after the fact, the more 1% regressions you can avoid in the first place, the easier this work is.

Software development advice: land of contrasts!

Both approaches have merit. However, from my admittedly limited experience, I’m partial to the latter.

The traditional advice – build it, make it work, then make it fast – works in many cases. It’s a pleasantly simple entry point if you’re just learning to build software. I learned to code that way, and so do many of our students. Both text selections give it credit – “rule of thumb”, “decent default”. But I think its placement in the “Beginner” section is appropriate.

I’m not even at a tech company, but I work on projects where performance matters from start to finish. I’ve also worked on projects where bad performance made the user experience pretty miserable. As Elhage emphasizes in his post, “Performance is a feature”. CS majors learn “big-O” notation for a reason. Everyone likes fast software, and that requires both good design and ongoing optimization.


Small resolutions for 2020

I have a lot coming up in 2020, so I don’t want to make any major resolutions. But I do see a few obvious, relatively simple places for improvement in the new year:

  • Use social media better. I’ve cut back quite a bit, but Twitter, Facebook, and LinkedIn each have benefits. I want to put them to good use.
  • Listen to more variety in music. I’ve expanded my taste in movies significantly in the last couple of years and want to nurture my musical taste as well.
  • Read fewer articles, more books.
  • More intense workouts. I’ve been coasting on light-to-moderate walking and jogging, and I’d like to push myself more. HIIT and strength training are in my mind currently.

This is all in addition to continuing to take the next steps in my career and skills growth.

Happy New Year, friends!

Christmas trees and trip cost vs item cost

When building software for large datasets or HPC workflows, we talk a lot about the trip cost versus the item cost.

The item cost is the expense (almost always measured in time) to run an operation on a single unit of data – one member of a set, for example. The trip cost is the total expense of running a series of operations on some subset (possibly the whole set) of the data. The trip cost incorporates overhead, so it’s not just N times the item cost.

This is a key reason that computers, algorithms, and data structures that support high-performance computing are so important: by analyzing as many items in one trip as is feasible, you can often minimize time wasted on unnecessary setup and teardown.

Thus trip cost versus item cost is an invaluable simplifying distinction. It can clarify how to make many systems perform better.

Yes, Virginia, there is a trip cost

Christmas tree

Christmas trees provide a good and familiar example.

Let’s stipulate that you celebrate Christmas and that you have a tree. You’ve put up lights. Now you want to hang the ornaments.

The item cost for each of the ornaments is very small: unbox and hang the ornament. It takes a couple of seconds, max – not a lot, for humans. It also parallelizes extremely well, so everyone in the family gets to hang one or more ornaments.

The trip cost is at least an order of magnitude (minutes rather than seconds) more expensive, so you only want to do it once:

  • Find the ornament box
  • Bring the box into the same room as the tree
  • Open the box
  • Unbox and hang N ornaments
  • Close the box
  • Put the box back

Those overhead steps don’t parallelize well, either: we see no performance improvement and possibly a performance decline if two or more people try to move the box in and out of the room instead of just one.

It’s plain to see that you want to hang as many ornaments as possible before putting away the ornament box. This matches our intuition (“let’s decorate the tree” is treated as a discrete task typically completed all in one go), which is nice.
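The ornament exercise maps directly onto code. Here’s a minimal sketch, where all the costs are made-up numbers purely for illustration:

```python
# Hypothetical costs, in seconds.
TRIP_OVERHEAD_S = 300.0  # find, fetch, open, close, and return the box
ITEM_COST_S = 2.0        # unbox and hang one ornament

def total_time(num_ornaments: int, ornaments_per_trip: int) -> float:
    """One overhead charge per trip, plus constant per-item work."""
    trips = -(-num_ornaments // ornaments_per_trip)  # ceiling division
    return trips * TRIP_OVERHEAD_S + num_ornaments * ITEM_COST_S

one_big_trip = total_time(100, 100)  # pays the overhead once
many_trips = total_time(100, 1)      # pays the overhead 100 times
```

With 100 ornaments, one big trip costs 300 + 200 = 500 seconds, while one trip per ornament costs 30,000 + 200 = 30,200 seconds. The per-item work is identical; only the number of overhead charges changes.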

Whether Christmas is your holiday or not, I wish you the best as the year draws to a close.

My new laptop

Now that Apple fixed the keyboard, I finally upgraded my Mac.

I am a non-combatant in the OS wars. I have my Mac laptop and a Windows desktop. I have an iPhone and build for Android at work. I run Linux servers professionally. I love everybody.

My little 2015 MacBook Air is a delightful machine. But it wasn’t keeping up with my developer workloads, in particular Android app builds. I bought the $2799 base model MacBook Pro 16″.

I started from scratch with a fresh install. Generally I try to avoid customizing my environments too much, on the principle of simplicity, so I didn’t bother migrating most of my old configs (one exception: .ssh/config).

Instead I’ve left the default apps and added my own one by one. I migrated my data manually – not as daunting as it sounds, given that the old laptop was only 128GB and much of it was consumed by the OS. I closed with an initial Time Machine backup to my (aging) external hard drive.

Now I’ve had a couple of weeks to actually use the MacBook Pro. Scattered observations:

  • WOW this screen.
  • WOW these speakers.
  • WOW the time I’m going to save building apps (more on that later).
  • I’m learning zsh now that it’s the Mac’s default shell.
  • Switching from MagSafe to USB-C for charging was ultimately worth the tradeoff.
  • I was worried about the footprint of this laptop (my old laptop is only 11-inch!), but I quite like it. Once I return to working at my office, I think it will be even better.
  • I am running Catalina. It’s fine. I haven’t seen some of the bad bugs people have discussed – at least not yet.
  • I’m holding on to my old Mac as a more passive machine or as a fallback if something happens to this one.

Only one of those really matters, though.

Much better for building software

The thing that makes this laptop more than a $2799 toy is the boon to my development work. I wanted to benchmark it, not in a strictly scientific way (there are websites that will do that) but in a comparative way in my actual use case: building Android apps.

The first thing I noticed: a big cut in the time it takes to launch Android Studio. It’s an immediate lifestyle improvement.

I invalidated caches and restarted Android Studio on both machines, with both apps open at the same time (not optimal performance-wise, but not uncommon when I’m working on these apps intensively).

I then ran and recorded times for three events, on each machine, for both of the apps I regularly build:

  • Initial Gradle sync and build
  • Build but don’t install (common for testing)
  • Build and install

Shock of all shocks, the 2019 pro computer is much better than the 2015 budget-by-Apple’s-standards computer (graphs generated with a Jupyter notebook on the new laptop, smaller bars are better; code here):
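The notebook behind those graphs looked something like the sketch below. The event labels match my benchmark list, but every timing here is a placeholder, not my actual measurements:

```python
# Sketch of the benchmark-graph notebook; all timings below are
# placeholders, not real measurements.
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
import numpy as np

events = ["Initial sync + build", "Build only", "Build + install"]
air_2015 = [300, 120, 150]  # seconds, hypothetical
pro_2019 = [90, 35, 45]     # seconds, hypothetical

x = np.arange(len(events))
width = 0.35

fig, ax = plt.subplots()
ax.bar(x - width / 2, air_2015, width, label="2015 MacBook Air")
ax.bar(x + width / 2, pro_2019, width, label="2019 MacBook Pro 16\"")
ax.set_ylabel("Seconds (smaller is better)")
ax.set_xticks(x)
ax.set_xticklabels(events, rotation=15)
ax.legend()
fig.savefig("build-times.png")
```

Grouped bars per event make the old-versus-new comparison readable at a glance.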

Yeah. I needed a new computer. 🙂

I expect 2020 to be a big year for me for a number of reasons I’ll share over time, and my old laptop just couldn’t keep up. This one will, and I’m happy with it.

I sure hope there’s enough to do…

I begin the month-long winter break this weekend. Students and teaching faculty finish about a week later. When classes reconvene in January, we’ll start spending a lot of time on Iceland and its spinoff scientific computing projects.

To lay the groundwork, we’ve spent the last few weeks clearing brush:

  • Updating operating systems and apps on a dozen laptops, handsets, and tablets
  • Syncing accounts with Apple and Google
  • Sitting in on planning/logistics meetings
  • Coordinating with the students who will do most of the actual research and development
  • Producing the list of software improvements we need to make

The last item is the most substantial from the perspective of my colleagues and students in CS. We build a lot of software in-house to collect and visualize data, capture many gigabytes of drone photos, and run datasets through complex workflows in the field.

It takes a lot of work locally to make this succeed on-site. Students have to learn how to both use and develop the software in a short time. Since the entire Iceland team (not just CS students) depends on everything we build, these projects provide real and meaningful stakes.

All of this has come together in the last few weeks in a satisfying way. We’re up to 62 GitLab issues based on our experiences using the software. That’s a good enough list to fill a lot of time in the spring, for both students and faculty.

We’ll hit the ground running in January, when the clock officially begins ticking.

Going to Iceland

The title is the tl;dr. (!)

Barring a catastrophe (a non-trivial assumption!), I’m traveling to Iceland with the field science program at Earlham in June 2020. I’ll be one of the faculty members on the student-faculty research team for the annual expedition. It’s a thrilling opportunity.

I’ve been working toward this for a while. I’ve acted as “ground control” for several summers, both as a student and in my current role as a member of the CS faculty. Between trips, I’ve been part of the team that’s engineering and coding software to do fascinating things:

  • DNA analysis
  • in-field mobile data collection
  • drone flight planning
  • image analysis

But for a variety of reasons, it’s never been feasible for me to take the trip. Finally, the path seems clear.

Of course, anything could happen. This could fall through, or it could turn out amazingly well. Wherever it ultimately falls on that spectrum, I’ll spend the next few months wrangling various software projects, mentoring students, and assisting the official leaders of the trip as we prepare to go do science. My focus will be on building software, but that will be just one task among many.

Text is not a perfect medium for communicating emotion, but I’m quite excited. I wanted to flag this as a personal and professional milestone. I’m certain to be posting more about it over time.

Computing lessons from DNA analysis experiments

I’ve been working with my colleagues in Earlham’s Icelandic Field Science program on a workflow for DNA analysis, about which I hope to have other content to share later. (I’ve previously shared my work with them on the Field Day Android app.)

My focus has been heavily experimental and computational: run one workflow using one dataset, check the result, adjust a few “dials”, and run it again. When we’re successful, we can often automate the work through a series of scripts.

At the same time, we’ve been trying to get our new “phat node” working to handle jobs like this faster in the future.

Definitions vary by location, context, etc., but we define a “phat node” or “fat node” as a server with a very high ratio of (storage + RAM)/(CPU). In other words, we want to load a lot of data into RAM and plow through it on however many cores we have. A lot of the bioinformatics work we do lends itself to such a workflow.

All this work should ultimately redound to the research and educational benefit of the college.

It’s also been invaluable for me as a learning experience in software engineering and systems architecture. Here are a few of the deep patterns that experience illustrated most clearly to me:

  • Hardware is good: If you have more RAM and processing power, you can run a job in less time! Who knew?
  • Work locally: Locality is an important principle of computer science – basically, keep your data as close to your processing power as you can given system constraints. In this case, we got a 36% performance improvement just by moving data from NFS mounts to local storage.
  • Abstractions can get you far: To wit, define a variable once and reuse it. We have several related scripts that refer to the same files, for example, and for a while we had to update each script with every iteration to keep them consistent. We took a few hours to build and test a config file, which resolved a lot of silly errors like that. This doesn’t help time for any one job, but it vastly simplifies scaling and replicability.
  • Work just takes a while: The actual time Torque (our choice of scheduler) spends running our job is a small percentage of the overall time we spend shaping the problem:
    • buying and provisioning machines
    • learning the science
    • figuring out what questions to ask
    • consulting with colleagues
    • designing the workflow
    • developing the data dictionary
    • fiddling with configs
    • testing – over, and over, and over again
    • if running a job at a bigger supercomputing facility, you may also have to consider things like waiting for CPU cycles to become available; we are generally our systems’ only users, so this wasn’t a constraint for us
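The shared-config idea in particular is cheap to adopt. A minimal sketch of what such a module might look like – every name and path here is a hypothetical stand-in, not our actual layout:

```python
# pipeline_config.py (hypothetical): a single source of truth that
# every analysis script imports, instead of hard-coding its own copies
# of paths and settings.
from dataclasses import dataclass

@dataclass(frozen=True)
class PipelineConfig:
    data_dir: str = "/scratch/dna-run"  # local disk, not the NFS mount
    threads: int = 24

    @property
    def reads_file(self) -> str:
        return f"{self.data_dir}/reads.fastq"

cfg = PipelineConfig()
# In each script: from pipeline_config import cfg
```

Change `data_dir` once and every script picks it up, which is exactly the class of silly inconsistency errors the config file resolved for us.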

A lot of this is (for computer scientists, software engineers, etc.) common sense, but taking care to apply that common sense can be critical for doing big interesting work.

The punchline of it all? We managed to reduce the time – walltime, for fellow HPC geeks – required to run this example workflow from a little over 8 hours to 3.5 hours. Just as importantly, we developed a bunch of new knowledge in the process. (I’ve said almost nothing here about microbiology, for example, and learning a snippet of that has been critical to this work.) That lays a strong foundation for the next several steps in this project.

If you read all this, here’s a nice picture of some trees as a token of my thanks (click for higher-resolution version):

Image of trees starting to show fall color
Relevance: a tree is a confirmed DNA-based organism.

Some twilight Americana

My alma mater-turned-employer Earlham College has a back-campus area with some trails and buildings, and it’s where I usually go to exercise. I took a long walk yesterday evening to savor the first day of cooler weather here, and I took a few photos there (iPhone 7 camera). These are some of my favorites.

A tale of two large-ish app updates

This week I spent some time working on Earlham CS’s Field Day Android application. It’s the app used by our student-faculty field science researchers to collect data on trips to, say, a glacier in Iceland. I made two substantial changes.

The first was updating our system dependencies. At the start of the summer, Field Day wasn’t a fully modern application. That’s mostly because its development is contingent on the interest levels of students and faculty who (correctly!) have other priorities during the academic year. We experience our only consistent spikes in development during preparation for a trip to Iceland. Even then, we tend to focus on adding or fixing features, rather than major design choices or boring updates. Whatever their benefits, such changes always risk eating up precious time in the short run.

As a result, we had long neglected to update SDK versions, themes, and other app fundamentals. I wanted to fix that before classes resumed this month.

Not being an Android expert (yet?), I relied on a mix of automated tools in Android Studio, manual code tweaks, and careful testing to push the update process forward. Here’s how I described it in my merge request:

I wanted to make us a “grownup” application, by which I mean that I wanted to move us away from as many deprecated tools and dependencies as possible, as far in advance of a field trip as possible. (EDIT: With one exception: these changes do not attempt to resolve the [looming] Google Drive [API] deprecation.)

To that end, this merge request involves substantial changes to build fundamentals like the Gradle version, as well as some Lint cleanup and general tidying. Much of it was done following a simple pattern:

– run a built-in Android Studio update tool (e.g. “Update to AppCompat”)

– change a bunch of details in the code so it builds

– test on the device

– lather, rinse, repeat

Field Day merge request 9

After some testing by a colleague and me, I approved the merge.

To reward myself for accomplishing that admittedly tedious process (which followed a long, slow battery testing process), I did something more fun.

For a long time I’d wanted to improve Field Day’s UI to streamline the navigation. I made a batch of changes, then submitted the following merge request:

[Field Day’s original creative developers] created a great design palette for Field Day: fun fonts, bright colors, intuitive icons.

I wanted to keep that but update the navigation to reflect the current understanding of our usage model. To that end, this merge centralizes everything onto one screen, miniaturizes our less-used buttons, and puts database and sensors at the forefront.

No specific activities or fragments other than the main screen (and the deletion of the obsolesced sensor screen) have been changed.

I can foresee a future where we do more data analysis and aggregation through the lab notebook, so I’ve preserved the notebook icon for future use.

Field Day merge request 10

The changes in that request took us from this set of two main screens:

Previous main screen (“Sampling” takes user to the second screen)
Previous second screen, containing our sensor and database features

… to this one screen:

Our most commonly-used buttons are on the main screen and fill the entire screen width.

I again checked with my colleague and then approved the request. I’m now working on other issues and have already found the changes to be substantial boosts to the user experience.

This is a sample of my own personal work, but of course building software is a team sport. And it relies on iteration. The original designers of Field Day – current and former colleagues of mine – did a lot of the heavy lifting over a few years building the core logic and aesthetic of the app. As I made my changes in the last few months, I’ve worked to maintain their original design palette while improving usability, performance, and the underlying data model. It’s a useful, specialized, and dare I say fun application, and I want it to keep getting better.

As a closing note about process, I find it sharpens my skills development when I have to summarize my work into prose, as in these merge requests. Writing them requires more precision than a quick chat in a hallway. That’s to say nothing of possible benefits to future developers trying to retrace changes and intentions.