I’m interested in digital minimalism right now, so I wanted to examine my browser history. Google Chrome on my desktop is the place I spend most of my time online, and it’s also a black box to me. Unlike iOS with its Screen Time feature, I have no obvious window into my browser activity over time. All chrome://history shows is a stream of links you’ve clicked in reverse-chronological order, with no aggregation options. I have a rough idea of where my time goes, but that’s not much to go on.
This weekend I decided I wanted to investigate. At first I thought I’d keep it simple: get my history as a CSV file and open it in Google Sheets. That didn’t work: 15,000 lines is apparently a lot for a web-connected browser-based tool, and it crashed my tab. I could have used macOS’s Numbers, but I realized quickly that my task lent itself to programming better than to a spreadsheet.
As a rough cut, which I’m presenting to you now, I made a Python program (code here) that, given a history file in a particular format, produces a graph of your most-visited websites. It makes use of seaborn, among other libraries. The earliest date in my dataset is October 12, 2018. The program produced this graph:
The first thing I noticed in the image was that I clicked into Reddit a lot. I’ve had a Reddit account for less than a year, so I knew I could live without it and I swiftly deleted my account.
What was left fell into a few categories:
- search/reference: I was surprised and then immediately unsurprised by Google’s supremacy on this list; Wikipedia and StackOverflow are also in this category
- news: Instapaper, Feedly, Twitter
- professional tools: GitLab, GitHub, Wiki, Google Drive, and WordPress
- entertainment: Netflix, TVTropes, Amazon, Facebook, YouTube – all of which I regulate using Freedom
- Esquire scored surprisingly high, I think because viewing a slideshow there requires a click per slide and I’ve visited a few of them.
A few caveats about this approach:
- I’d like something more dynamic, maybe an improved version of some old browser extensions I found during my initial research into this idea. This program got me the very specific information I wanted, but now I want more.
- I’ve separated the code that obtains the data (which I didn’t write) from the code that processes it. This way when Google inevitably changes how it manages history data, I don’t have to disturb the processing code.
- I used this tool to decide my Reddit account should be axed, but it’s arguably unfair to Reddit: I actually read a lot more tweets than Reddit posts, but expanding a Reddit post means clicking it, which changes the URL and so registers as another visit, while scrolling through tweets does not. (One minor change I may make is to aggregate twitter dot com, t dot co, and tweetdeck into one row.)
- This analyzes page visits, not time spent. Measuring time spent, I imagine, would be a much stickier problem. I’d need an indicator of when the tab and window were both active, and the result would be distorted by the frequent distractions of my office. It would also be a much more useful thing to display. Maybe Google can get with the “digital wellness” moment on this.
- Future work: group by time. I have a much better idea of when I’m on the Internet than of what sites I’m visiting most frequently over time, so this wasn’t my priority. That said, it’s possible I could learn something interesting.
- Sites visited in Incognito Mode don’t appear in the history so they also don’t appear on the chart.
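Two of the ideas in the list above, merging the Twitter-related hosts into one row and grouping visits by time, can be sketched with the standard library alone. This is not the program described earlier, just a minimal illustration under some assumptions: the column names `url` and `visit_time`, the ISO 8601 timestamp format, the `tweetdeck.twitter.com` hostname, and the sample rows are all made up for the example, and `SITE_ALIASES`, `site_for`, and `count_visits` are hypothetical names.

```python
import csv
import io
from collections import Counter
from datetime import datetime
from urllib.parse import urlparse

# Hypothetical alias table: hostnames that should be counted as one site.
SITE_ALIASES = {
    "t.co": "twitter.com",
    "tweetdeck.twitter.com": "twitter.com",
}


def site_for(url):
    """Return the site a URL is counted under: hostname, minus 'www.', after aliasing."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SITE_ALIASES.get(host, host)


def count_visits(rows):
    """Count visits per site and per hour of day.

    `rows` is an iterable of dicts with the assumed keys 'url' and
    'visit_time' (ISO 8601, e.g. '2018-10-12T09:15:00').
    """
    by_site = Counter()
    by_hour = Counter()
    for row in rows:
        site = site_for(row["url"])
        if not site:
            continue  # skip rows with no hostname, e.g. file:// URLs
        by_site[site] += 1
        by_hour[datetime.fromisoformat(row["visit_time"]).hour] += 1
    return by_site, by_hour


# Made-up sample data standing in for an exported history file.
SAMPLE_CSV = """url,visit_time
https://twitter.com/home,2018-10-12T09:15:00
https://t.co/abc123,2018-10-12T09:16:30
https://tweetdeck.twitter.com/,2018-10-12T21:02:00
https://www.reddit.com/r/python/,2018-10-13T09:40:00
"""

by_site, by_hour = count_visits(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(by_site.most_common())   # sites ordered from most to least visited
print(sorted(by_hour.items())) # visit counts keyed by hour of day (0-23)
```

With the sample rows, the three Twitter-related visits collapse into a single twitter.com entry. Keeping the alias table separate from the counting logic means new groupings can be added without touching the rest of the code.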
Finally, through the lens of digital minimalism, that graph is better than I had expected. There’s not a lot of cruft, the cruft that does exist can be removed pretty easily, and most of the sites provide real value to me. This has been a useful exercise.