A few more thoughts on family photos

For context, see this post and this one.

I have a few more observations about my family photo project. They are below, and I reserve the right to update this list in the future rather than creating a new post on the subject.

  • Use fdupes or something similar to find and remove duplicate photos. Doing so kept me from uploading over 9,000 duplicates. I wish I’d discovered it much sooner.
  • Something I glossed over in my earlier posts: do not throw away photos. I’ve tossed a handful, almost exclusively duplicates, but the default should be to keep the original pictures. It’s one extra layer of redundancy.
  • If you are cropping multiple pictures out of a single scan, you might at some point crop an original scan by accident and lose some of the pictures. To prevent the incredible annoyance – or outright data loss – of an event like this, make sure to have local backups. I’m a believer in the classic 3-2-1 backup system.
  • It is delightful to hear from your loved ones about their excitement at viewing their old photos.
  • You will never be done. 🙂
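For the curious, here is a minimal Python sketch of the idea behind duplicate finders like fdupes: group files by size, then by a hash of their contents. It is an illustration only, not fdupes itself, and the function names (`file_digest`, `find_duplicates`) are made up for this example. It only reports duplicates and never deletes anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list:
    """Return groups of files under root whose contents are byte-identical."""
    # Files of different sizes cannot be identical, so bucket by size first
    # and only hash the files that share a size with at least one other file.
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates(Path("photos"))` (the folder name is a placeholder) returns a list of groups; review each group by eye before removing anything, and keep a backup.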

To me there’s often a cutoff where something goes from being a “project” to just something I occasionally refresh. This venture is at that point. I’m quite happy with how it’s turned out.

Family photos by decade

I crunched a bit of data about my ongoing photo-scanning project, which now includes some more scans as well as a heap of JPEG’s harvested from SD cards from our digital cameras.

Surprising no one, the rate of photos taken has grown dramatically over the last century:

Linear-scale graph showing the increase in number of photos taken by my family from the 1940s to the 2010s

That’s the linear scale. Here’s a prettier graph on a log scale, showing steep growth every decade:

Log-scale graph showing decade-by-decade jumps in number of photos taken by my family from the 1940s to the 2010s

These charts actually understate the case, of course. The era labeled “2010s” is really only 2010-2015 or so, peaking in 2012. After that, analog and standalone digital cameras finally gave way (in our family) to phone cameras. In the decade or so since then, I’ve taken more than this entire graph’s worth of pictures combined – which is to say nothing of the number of photos my parents, grandparents, and siblings have taken.

UPDATE: It turns out the dataset also includes lots of duplicates, but the order of magnitude by decade remains accurate. Sanitize the data first, people!
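A tally like this is easy to sketch in Python. The version below assumes every photo's filename starts with a YYYYMMDD date followed by a hyphen (the naming scheme from my digitization notes); files that don't match are skipped. The function names are hypothetical, and this is a rough sketch rather than the exact script behind the graphs.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Assumed naming scheme: YYYYMMDD-description.ext
DATE_PREFIX = re.compile(r"^(\d{4})\d{4}-")


def decade_of(filename: str) -> Optional[str]:
    """Return a label like '1980s' for a dated filename, or None if undated."""
    match = DATE_PREFIX.match(filename)
    if match is None:
        return None
    year = int(match.group(1))
    return f"{year // 10 * 10}s"


def photos_by_decade(filenames: Iterable[str]) -> dict:
    """Count dated filenames per decade, ordered from oldest to newest."""
    counts = Counter()
    for name in filenames:
        decade = decade_of(name)
        if decade is not None:
            counts[decade] += 1
    return dict(sorted(counts.items()))
```

Per the update above, it pays to remove duplicates before counting, or the totals will be inflated.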

Looking back on my 2022

This has been a bit of an odd year for me.

Go back in time. The year 2020 was bad, one of my worst years. Then in 2021, I made up a lot of lost ground. I got vaccinated, started on antidepressants, went to Iceland, got a new job, and moved to Colorado. It turned into one of my best years.

I think I had forgotten what it’s like for a year to be “just fine”, but that’s what I got in 2022 – and when you’re used to extremes, maybe normal feels weird.

All that to say: Not much changed for me this year, which is good. I have a job I love and that pays well. My role at work has shifted but aligns with the same goals I had when I started. I got to visit my parents and siblings several times, and since we’re a close family this was a refreshing change from the last few years. I visited my good friends in the Rio Grande Valley for the first time since before COVID, but otherwise I was relatively grounded in one place.

Particularly beneficial was an improved standard of living. My theme for 2022 was “Year of Upgrades”, and I embraced it — largely in the sense of improving my creature comforts. My desktop PC is vastly better than it was a year ago, now equipped with a shiny new graphics card, processor, and motherboard. The apartment is well-furnished and nicely decorated. I just got a new car. I’m doing all of this while building a reasonable savings balance and a retirement account.

In the material sense, this has unquestionably been a good year. I also recognize the profound luck, privilege, and blessings in my life that have paved the way here. As such, I’ve recently been scaling back my personal consumer buys and boosting my charitable giving. I intend to continue that in the next year.

Separate from buying cool stuff, I also worked on some great projects this year:

  • Family photo digitization. See my notes in this post. There’s more left to do, but I have a system in place now that can readily process new photos as we find them.
  • Family history document digitization. A closely-related effort to my work with the family photos. Way back in junior high, I did a genealogy project. I gathered a plastic tub full of documents and pictures from my family history. This year, at long last, I scanned and shared those files. This is another project I will probably continue on-and-off for a while.
  • Cleaning and transcribing a taped interview with a WWII veteran. This is one I would absolutely love to share more detail about someday, but for now it’s not mine to share. I’ll just say that if you have old audio tapes, it’s generally possible to read them into a computer using Audacity and then transcribe them using Whisper. Pretty amazing project.
  • Migrating from Twitter to Mastodon. I created accounts on a few Mastodon servers and hooked up my blog to a little instance for easy sharing in the Fediverse. Twitter is basically read-only to me at this point and Mastodon is where I post social content now. Maybe one day that will change — who knows?
  • Dabbled with Stable Diffusion to do some AI image generation. Not much to say here and I don’t have any good outputs to share. But I do think AI technology like this is an impressive new tool for creating digital art.

Projects are never completed, they’re just abandoned, but these are at least honorably abandoned. Several produced seeds for new projects that I’ll tend to next year.

While 2022 felt oddly subdued compared to its predecessors, it was a good year for me by any reasonable metric. There is more to do in 2023, but I ended up in a pretty good place.

I wish you and your loved ones a happy new year and a lovely 2023.

Notes on photo digitization

Disclaimer: I do not endorse any of the products mentioned in this post, nor does my employer the University of Colorado Boulder. I will provide links to programs and resources that are free and especially those that are FOSS, but not to those that cost money. I include the various tools here for clarity only.

Two shoeboxes of 3x2 photos, some other bits of paper, and one pink photo album with silly-looking cartoon ducks on it
Love those ducks

As part of a project now running multiple years, I recently digitized over 1,100 analog photos for my parents. Like a lot of families, our photo storage and cloud apps safely store most of the last decade of pictures and videos. Memories from around 2012 and earlier, though, were effectively hidden in a series of shoeboxes and envelopes. I wanted to change that.

Here are some things I’ve noted.

I used a Canon LiDE 400, though Epson and other manufacturers also have good scanners that work for this purpose. For me, the specific scanner doesn’t really matter above a certain threshold of resolution and reliability.

I scanned the images at 600dpi and stored them as uncompressed TIF to preserve quality. For final sharing, I compressed them into JPEG’s. That shrank them by a factor of about 10. (Naturally I’ll keep the original TIF’s as well as the source photographs.)
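As a rough sketch of the TIF-to-JPEG step, the Python below shells out to ImageMagick's `convert` for each TIF in a folder and leaves the originals untouched. The quality value of 85, the folder layout, and the function names are assumptions for illustration, not the settings I actually used. On ImageMagick 7 the command is typically `magick` rather than `convert`.

```python
import subprocess
from pathlib import Path


def jpeg_command(tif_path: Path, out_dir: Path, quality: int = 85) -> list:
    """Build the ImageMagick command that writes a JPEG copy of one TIF."""
    jpeg_path = out_dir / (tif_path.stem + ".jpg")
    # -quality sets the JPEG compression level (85 is an assumed default).
    return ["convert", str(tif_path), "-quality", str(quality), str(jpeg_path)]


def export_jpegs(tif_dir: Path, out_dir: Path, quality: int = 85) -> int:
    """Convert every .tif in tif_dir; return how many files were converted."""
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = 0
    for tif_path in sorted(tif_dir.glob("*.tif")):
        # check=True raises CalledProcessError if ImageMagick reports a failure.
        subprocess.run(jpeg_command(tif_path, out_dir, quality), check=True)
        converted += 1
    return converted
```

Writing the JPEGs to a separate folder keeps the shareable copies apart from the archival TIFs.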

I used my workhorse, a Windows PC with supplemental work in a Debian environment in the Windows Subsystem for Linux. I cobbled together some scripts based on a few core pieces of software:

  • ImageMagick – the one and the only
  • Bulk Rename Utility – an absolutely core part of the workflow once images are properly scanned
  • multicrop2 – it saves a lot of time to scan multiple images at once, and multicrop2 is a handy script that does a reasonably good job breaking them into separate files again

Sometimes a photo will have important metadata on the back. This is a pain if you have a one-sided flatbed scanner (as I do). I scanned a batch of fronts, then flipped them to scan the backs in the same relative positions. I saved them in a special folder for later processing. At the end, I used multicrop2 to break them apart and used the original images as a guide for which front to pair with which back when appending with convert.
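Once the fronts and backs are matched up by hand, joining them can be sketched like this. ImageMagick's `+append` places images side by side (`-append` would stack them vertically). The `-with-back` suffix and the function names are hypothetical, and the pairs are supplied explicitly because I would not trust an automatic pairing of flipped photos.

```python
import subprocess
from pathlib import Path


def append_command(front: Path, back: Path, out_path: Path) -> list:
    """Build the ImageMagick command that joins a front and back side by side."""
    # +append lays the images out left to right in the order given.
    return ["convert", str(front), str(back), "+append", str(out_path)]


def join_pairs(pairs, out_dir: Path) -> list:
    """Join each (front, back) pair into one image; return the output paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for front, back in pairs:
        # Name the result after the front image (hypothetical convention).
        out_path = out_dir / f"{front.stem}-with-back{front.suffix}"
        subprocess.run(append_command(front, back, out_path), check=True)
        outputs.append(out_path)
    return outputs
```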

I used the genealogist-recommended YYYYMMDD-description format for naming purposes, which makes for easier metadata encoding and sorting. To support my nerdy automation workflows, I used hyphens as the delimiter between words in the description, rather than the more commonly-used spaces.
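Here is a small Python sketch of that naming scheme. The lowercasing and the stripping of punctuation are my own assumptions for the example (the format itself only requires the date prefix and hyphen-separated words), and `photo_filename` is a hypothetical helper.

```python
import re
from datetime import date


def photo_filename(taken: date, description: str, extension: str = "tif") -> str:
    """Build a YYYYMMDD-description filename with hyphen-separated words."""
    # Lowercase the description, then collapse every run of characters other
    # than a-z and 0-9 into a single hyphen. Note that this also drops
    # accented letters, which may not be what you want for place names.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{taken:%Y%m%d}-{slug}.{extension}"
```

For example, `photo_filename(date(1985, 7, 4), "Lake trip, Grandma & Grandpa")` gives `19850704-lake-trip-grandma-grandpa.tif`. Because the date comes first, a plain alphabetical sort is also a chronological one.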

I have a lot left to do, but those are a few of my notes so far.

The obvious question here is “why bother?” There are good digitization services in the U.S. so why spend the time and work on this? (I have in fact used such services for other media, like 8mm video and VHS tapes.) Honestly I did it because I wanted to do it. It was a fun series of low-stakes puzzles that were rewarding to solve, and in the end I had lots of digitized family memories. YMMV but it was worth it to me.

Alpine updates and RMACC 2022

This week I had the opportunity to speak at the 2022 RMACC Symposium, hosted by my own institution, about the Alpine supercomputer. My presentation and the others from my CU colleagues are available here.

In summary, Alpine has been in production since our launch event in May. After some supply chain issues (the same that have affected the entire computing sector), we are preparing to bring another round of nodes online within weeks. That will put Alpine’s total available resources (about 16,000 cores) on par with those of the retiring Summit system. It’s an exciting step for us at CURC.

As for RMACC: I’ve never attended the symposium before. After three days, I came away with a lot of new information, new contacts, and ideas for how to support our researchers better. A few topics in particular I paid attention to:

  • Better and more scalable methods of deploying HPC systems and software
  • How the community will navigate the transition from XSEDE to ACCESS
  • The companies, organizations, and universities (like mine!) building the future of this space
  • Changes in business models for the vendors and commercial developers we work with

Academic HPC is a small niche in the computing world, and gatherings like this can be valuable as spaces to connect and share our best ideas.

New supercomputer just dropped

These certainly are server cabinets alright…

Today marks the launch of CU Boulder’s shiny new research supercomputer, Alpine. Text of the university press release:

The celebratory event signals the official launch of CU Boulder’s third-generation high performance computing infrastructure, which is provisioned and available to campus researchers immediately.

On May 18, numerous leaders from on- and off-campus will gather to celebrate, introduce and officially launch the campus’s new high-performance computing infrastructure, dubbed “Alpine.”

Alpine replaces “RMACC Summit,” the previous infrastructure, which has been in use since 2017. Comparable to systems now in use at top peer institutions across the country, Alpine will improve upon RMACC Summit by providing cutting-edge hardware that enhances traditional High Performance Computing workloads, enables Artificial Intelligence/Machine Learning workloads, and provides user-friendly access through tools such as Open OnDemand.

“Alpine is a modular system designed to meet the growing and rapidly evolving needs of our researchers,” said Assistant Vice Chancellor and Director of Research Computing Shelley Knuth. “Alpine addresses our users’ requests for faster compute and more robust options for machine learning.”

Notable among the technical specifications that will make Alpine an invaluable tool in research computing for researchers, industry partners and others, Alpine boasts: 3rd generation AMD EPYC CPUs, which provide enhanced energy efficiency per cycle compared to the Intel Xeon E5-2680 CPUs on RMACC Summit; Nvidia A100 GPUs; AMD MI100 GPUs; HDR InfiniBand; and 25 Gb Ethernet.

The kick-off event on May 18 will celebrate the Alpine infrastructure being fully operational and allow the community to enjoy a 20-minute tour, including snacks, an introduction to Research Computing, and a tour of the supercomputer container. The opportunity is open to the public and free of charge, and CU Boulder Research Computing staff will be on site to answer questions. CU Boulder Chief Information Officer Marin Stanek, Chief Operating Officer Patrick O’Rourke, and Acting Vice Chancellor for Research and Innovation Massimo Ruzzene will offer remarks at 1:30 p.m.

In addition to the main launch event, Research Computing is offering a full slate of training and informational events the week of May 16—20.

Researchers seeking to use Research Computing resources, which includes not only the Alpine supercomputer, but also large scale data storage, cloud computing and secure research computing, are invited to visit the Research Computing website to learn about more training offerings, the community discussion forum, office hours and general contact information.

Alpine is funded by the Financial Futures strategic initiative.

This is the biggest project I have ever worked on. It was in the works months before I arrived but has consumed most of my professional time since September. It’s exciting that we can finally welcome our researchers to use it.

What’s next

Some personal news… 🙂

I have been at Earlham College for almost seven years, including my time as a student and as CS faculty. Today is my last day there.

It’s been an incredible place to grow as a person, deepen my skills, collaborate with talented people from all walks of life, and try to make the world a little bit better. I’ve seen a few generations of the community cycle through and watched us withstand everything up to and including a literal pandemic. I capped it with the trip of a lifetime, spending a month doing research in Iceland – on a project I hope to continue working on in the future.

To the Earlham Computer Science community in particular I owe a big thanks. I have had a supportive environment in which to learn and grow for virtually the entirety of those years. The value they’ve added to my life can’t be quantified. I am deeply grateful.

What’s next?

I am elated to announce that in mid-September I will go to work as a Research Computing HPC Cluster Administrator at the University of Colorado Boulder! I’m excited to take the skills I’ve built at Earlham and apply them at the scale of CU Boulder. Thanks to the many people who’ve helped make this opportunity possible.

Highlights of an amazing trip

Today is the last day most of us are in Iceland for this trip. As I started this post, we were completing a tour of the Golden Circle after a few days in beautiful Reykjavik. Now we are preparing for departure.

Our view of the volcano

I wanted to post some of the highlights of our trip. There’s a rough order to them, but don’t take the numbering too seriously – it’s been a great experience all-around. Without further ado:

  1. The volcano is truly incredible. It was not uncommon for people to spontaneously shout “Wow!” and “Oh my god!” as the lava burst up from the ground.
  2. We woke up every day for a few weeks with a view of a fjord.
  3. We did a glacier hike on Sólheimajökull, with two awesome guides.
  4. This was a historically successful round of data collection, both on the drone side and on the biology side. We’ll write and share a lot more about this in the next few months.
  5. We shared space with a group of phenomenal students from the University of Glasgow. We also collaborated with them on multiple occasions, learning a lot about different ways to study wildlife and local sites.
  6. THE FOOD – you probably don’t associate Iceland with food culture (I certainly didn’t), but our meals were delicious.
  7. The architecture and decorations are so distinctly Icelandic.
  8. Amazing photography and video – in high quality and high quantity.
  9. Walking along the boundary between the North American and Eurasian plates.
  10. Guided tour from our Skalanes hosts – who incidentally are awesome people – of a stretch of eastern Iceland.
Getting the rundown about glaciers at Solo

Some of my personal honorable mentions include:

  • Trail running at Skalanes is breathtaking.
  • Blue glacier ice is real neat.
  • The National Museum of Iceland is fascinating and well-done.
  • Rainbow roads in both Seyðisfjörður and Reykjavik highlight what a welcoming place this country is – also perfect reminders of Pride Month in the U.S.!
  • My first-in-my-lifetime tour of a beautiful country happened alongside people I admire who teach me things every single day. What more could I ask for?
A drone photo of the coast by the fjord

If you haven’t already, check out this interview with Charlie and Emmett, conducted by Cincinnati Public Radio.

Davit and Tamara flying

In addition to our success this year, we’ve also set up some great new opportunities for future years. With our long-time friend and collaborator Rannveig Þórhallsdóttir, we’ve added the cemetery in Seyðisfjörður to our list of sites to survey. We believe there may be historically-significant artifacts to be found there, and our drone work lends itself well to finding out.

The fjord at Skalanes

Finally, here’s the trip by the numbers:

  • 7 Earlhamites
  • 26 days
  • 183 GB of initial drone images and initial assemblies
  • 2 great hosts at Skalanes
  • 6 outstanding co-dwellers
  • 4 guides at 2 sites
  • 1 perfect dog
  • N angry terns
  • 1 amazing experience
Admiring the view

And that’s a wrap. Hope to see you again soon, Iceland!

Cross-posted at the Earlham Field Science blog.

Flying cameras are good

Update: We have learned! And we no longer agree with this post! GCP’s remain critical for deriving elevation. The cameras are not yet ready to replace that kind of precision. Always something you didn’t realize at first glance. Post preserved for posterity and because lots of it is still perfectly valid.

We recently chose not to use ground control points (GCP’s) as part of our surveying work. This is a departure from standards and conventions in the near-Earth surveying space. However, we believe we have made a sound decision that will support research that is equally effective while costing less time and money. In this post, I’ll explain that decision.

The short version: drone imagery and open-source assembly software (e.g. OpenDroneMap) are now so good that, for our purposes, GCP’s have no marginal benefit.

We have high-quality information about our trial area from an established authority – the Cultural Heritage Agency of Iceland. Their 2007 report of finds is the basis of our trial runs here at Skalanes. Surveying these predefined areas, we’ve now flown multiple flights, gathered images, and then run three assemblies with OpenDroneMap.

Here’s a simple run over the area with no GCP’s:

Here’s a run over the area with GCP’s, adding no location metadata other than the craft’s built-in GPS coordinates (you’ll note that the ground footprint is slightly different, but the roundhouse in the middle is the key feature):

We also manually geocoded the GCP’s for one run.

In the end, we observed no meaningful difference between an assembly with GCP’s and an assembly without them. Adding the images as raster layers to a QGIS project confirmed this to our satisfaction:

With GCP:

Without GCP:

In summary, ground control points don’t help us much compared to simply taking a bunch of good photos and using high-quality software to assemble them. They also cost us in portability: even four GCP’s are difficult to carry, occupying significant space in airport luggage and weighing down walks in the field. For scientists interested in doing work over a large area, potentially multiple times, that inconvenience is not a trivial cost.

The ODM assemblies are outstanding by themselves. We have good technology and build on the work of a lot of brilliant people. That frees us to be more nimble than we might have been before.

It wouldn’t be a post by me if it didn’t end with a cool picture. Here’s a drone image from a cliff near the house where we’re staying:

Cross-posted at the Earlham Field Science blog.

Awe in Iceland

The Greater Good Science Center at Berkeley considers awe one of the keys to well-being:

Awe is the feeling we get in the presence of something vast that challenges our understanding of the world, like looking up at millions of stars in the night sky or marveling at the birth of a child. When people feel awe, they may use other words to describe the experience, such as wonder, amazement, surprise, or transcendence.

That’s the feeling I have at least once a day, every day, here in Iceland.

And it’s difficult to write a blog post about awe. Almost by definition, it’s an emotion that defies easy explanation. It has a mystique that risks being lost in the translation to plain language.

But if I can’t describe the feeling, I can describe why I’m having it.

Alone among my traveling companions, I am on my first-ever trip out of my country of origin (🇺🇸). The sliver of gray in this image is the first thing I ever saw of a country not my own:

When we arrived, I got a passport stamp and exchanged currency – both brand new experiences. However mundane, they were novel for me and began waking me up to the new world I’d entered.

Our first few days were chilly, windy, and rainy. I was much happier about this than were my traveling companions. If our weather wasn’t pleasant, it was nonetheless exactly the immersive experience I was hoping for when I signed up for this trip.

In those first few days, I got to see this amazing waterfall:

I got to participate in collecting soil samples at a glacier —


— and in howling wind on the side of a moraine:

The right side of the moraine was calm and quiet. The left was much less so.

For good measure, I saw floating blue ice for the first time:

All of this was great, and to me it made this trip worth the months of planning and days of travel difficulties it took to get here.

Then we got to Skalanes, where I’m writing this post, and its landscapes exist on a whole other level. Here are ten views from here, drawn almost at random from my photos:

This is a country that absolutely runs up the score on natural beauty.

I’ve taken hundreds of pictures here and they’re all amazing – but none does justice to actually being here. That combination is the signature of an awe-inspiring experience.

Awe puts us in touch with something above and beyond our daily worldly experience – call it the divine, the sublime, whatever speaks to you. It’s an experience you can reproduce if you try, but I believe it connects most deeply when it emerges organically from the world you enter. That’s what’s happened to me here.

It is remarkable that this is what we get to do for work, and I am so glad we have some more time to spend here in this awesome country.

Cross-posted at the Earlham Field Science blog.