Notes on photo digitization

Disclaimer: I do not endorse any of the products mentioned in this post, nor does my employer, the University of Colorado Boulder. I will provide links to programs and resources that are free, especially those that are FOSS, but not to those that cost money. I include the various tools here for clarity only.

Two shoeboxes of 3x2 photos, some other bits of paper, and one pink photo album with silly-looking cartoon ducks on it
Love those ducks

As part of a project now running multiple years, I recently digitized over 1,100 analog photos for my parents. Like a lot of families, our photo storage and cloud apps safely store most of the last decade of pictures and videos. Memories from around 2012 and earlier, though, were effectively hidden in a series of shoeboxes and envelopes. I wanted to change that.

Here are some things I’ve noted.

I used a Canon LiDE 400, though Epson and other manufacturers also have good scanners that work for this purpose. For me, the specific scanner doesn’t really matter above a certain threshold of resolution and reliability.

I scanned the images at 600 dpi and stored them as uncompressed TIF to preserve quality. For final sharing, I compressed them into JPEGs, which shrank them by a factor of about 10. (Naturally I’ll keep the original TIFs as well as the source photographs.)
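
A minimal sketch of that TIF-to-JPEG step, assuming ImageMagick's `convert` command is on the PATH (on ImageMagick 7 the same arguments can be passed to `magick` instead). The quality value of 90, the folder names, and the helper names `jpeg_command` and `convert_folder` are illustrative assumptions, not the settings actually used for this project.

```python
import subprocess
from pathlib import Path


def jpeg_command(tif_path, out_dir, quality=90):
    """Build the ImageMagick command that converts one TIF to a JPEG."""
    tif_path = Path(tif_path)
    jpg_path = Path(out_dir) / (tif_path.stem + ".jpg")
    # -quality is ImageMagick's JPEG compression setting (1-100).
    return ["convert", str(tif_path), "-quality", str(quality), str(jpg_path)]


def convert_folder(src_dir, out_dir, quality=90):
    """Convert every .tif in src_dir; the original TIFs are left untouched."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for tif_path in sorted(Path(src_dir).glob("*.tif")):
        subprocess.run(jpeg_command(tif_path, out_dir, quality), check=True)


# Example call with hypothetical folder names:
# convert_folder("scans-tif", "share-jpg")
```

Writing the JPEGs to a separate folder keeps the archival TIFs and the shareable copies from getting mixed up.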

I used my workhorse, a Windows PC, with supplemental work in a Debian environment under the Windows Subsystem for Linux. I cobbled together some scripts based on a few core pieces of software:

  • ImageMagick – the one and the only
  • Bulk Rename Utility – an absolutely core part of the workflow once images are properly scanned
  • multicrop2 – it saves a lot of time to scan multiple images at once, and multicrop2 is a handy script that does a reasonably good job breaking them into separate files again

Sometimes a photo will have important metadata on the back. This is a pain if you have a one-sided flatbed scanner (as I do). I scanned a batch of fronts, then flipped them to scan the backs in the same relative positions. I saved them in a special folder for later processing. At the end, I used multicrop2 to break them apart and used the original images as a guide for which front to pair with which back when appending them with ImageMagick’s convert.
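
The final appending step might look roughly like the sketch below. It only builds the `convert` commands; the front-to-back pairing is still decided by eye and written out by hand. The file names, the `paired` output folder, and the `-with-back` suffix are hypothetical, and stacking the back below the front with `-append` is one possible layout (`+append` would place them side by side).

```python
from pathlib import Path


def append_command(front, back, out_path):
    """Build the command that stacks a front scan above its back scan."""
    # -append joins images top-to-bottom; +append would join left-to-right.
    return ["convert", str(front), str(back), "-append", str(out_path)]


def build_commands(pairs, out_dir):
    """pairs maps each front file to the back file matched to it by eye."""
    commands = []
    for front, back in pairs.items():
        out_path = Path(out_dir) / (Path(front).stem + "-with-back.png")
        commands.append(append_command(front, back, out_path))
    return commands


# Hypothetical file names; the cropped backs need not come out in the
# same order as the fronts, which is why the mapping is explicit.
pairs = {
    "fronts/batch01-000.png": "backs/batch01-002.png",
    "fronts/batch01-001.png": "backs/batch01-000.png",
}
for command in build_commands(pairs, "paired"):
    # Pass each list to subprocess.run(command, check=True) to execute it.
    print(" ".join(command))
```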

I used the genealogist-recommended YYYYMMDD-description format for naming purposes, which makes for easier metadata encoding and sorting. To support my nerdy automation workflows, I used hyphens as the delimiter between words in the description, rather than the more commonly used spaces.
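
As an illustration of that naming scheme, here is a small sketch that builds such a file name from a date and a free-text description. The function name `photo_filename`, the lowercasing, and the stripping of punctuation are assumptions made for the example; the post only specifies the YYYYMMDD prefix and the hyphen delimiter.

```python
import re
from datetime import date


def photo_filename(taken, description, extension="tif"):
    """Return a YYYYMMDD-description file name with hyphen-delimited words."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{taken:%Y%m%d}-{slug}.{extension}"


print(photo_filename(date(1995, 7, 4), "Beach trip, Lake Erie"))
# 19950704-beach-trip-lake-erie.tif
```

Because the date comes first and is zero-padded, a plain alphabetical sort of the file names is also a chronological sort, and the absence of spaces means the names need no quoting in shell scripts.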

I have a lot left to do, but those are a few of my notes so far.

The obvious question here is “why bother?” There are good digitization services in the U.S., so why spend the time and work on this? (I have in fact used such services for other media, like 8mm video and VHS tapes.) Honestly, I did it because I wanted to do it. It was a fun series of low-stakes puzzles that were rewarding to solve, and in the end I had lots of digitized family memories. YMMV, but it was worth it to me.
