Lessons learned from botching an OS upgrade

Last weekend, I used a brief lull after the Joint Mathematics Meetings to attempt a long-delayed upgrade to my work machine.  This machine, which I use to store data and run quantitative analyses, is a Linux laptop that has been running Ubuntu for the last six years.

I have more than enough experience with Linux distributions to know to expect the unexpected when it comes to upgrades, but the Ubuntu project has invested a lot of effort in making installs and upgrades as painless as possible.  The last time I upgraded my machine (12.04 LTS to 14.04 LTS), the process was pleasantly and surprisingly straightforward, so I thought that this upgrade from 14.04 LTS to 16.04 LTS would go just as well.  It did not, and I chronicled what happened in a Twitter thread.

Fortunately, I’ve learned my lesson from previous hard drive failures and data loss disasters, and I am now much more disciplined about backing up my most valuable documents and data.  I used TestDisk/PhotoRec to extract photos and other documents from the hard drive (and I sincerely thank Daniel Rossbach (@da_rossbach) for suggesting those tools), and they were effective at extracting a big dump of salvaged data.  I’ve since reformatted the drive and reinstalled Ubuntu on my laptop, and I’ve restored my backups to the machine as well.  Some files are gone for good, and the databases were lost and have to be rebuilt, but it seems that the main cost has been a week of aggravation.  That’s far preferable to the alternative.

So what lessons have I learned from this event?

If you use Linux, set up a home partition. In the past, it was hard for me to estimate how much space I needed for personal and work files and how much I needed to reserve for system files.  So I placed everything in a single partition, which removed the need for mental arithmetic.  Unfortunately, it also put my data at risk if an upgrade to the operating system went wrong.  And if backups aren’t consistent, the risk of loss becomes very real.  A separate home partition reduces that risk significantly by isolating personal files from the system files, so if the worst-case scenario occurs it won’t be as devastating. I now reserve about 40-50 GB for the root partition and leave the rest (more than 100 GB) to the home partition, which is plenty and manageable.
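To check whether a machine already has this layout, a small script can compare the filesystems behind / and /home. This is only a minimal sketch: the function names are my own, and it assumes that a different device ID means a separate partition, which may not hold for every filesystem setup.

```python
import os
import shutil


def on_separate_filesystems(path_a: str, path_b: str) -> bool:
    """Return True when the two paths live on different mounted filesystems."""
    # st_dev identifies the device (filesystem) that holds each path.
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


def usage_summary(path: str) -> str:
    """Describe total and free space, in GiB, for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return f"{path}: {usage.total / gib:.1f} GiB total, {usage.free / gib:.1f} GiB free"


if __name__ == "__main__":
    if on_separate_filesystems("/", "/home"):
        print("/home is on its own filesystem")
    else:
        print("/home shares a filesystem with /")
    print(usage_summary("/"))
    print(usage_summary("/home"))
```

Running it before an upgrade shows at a glance whether personal files would be exposed if the root filesystem had to be wiped.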

Back up, back up often, and back up everything. All it takes is one major data loss event to become very passionate about regular backups.  I know from personal experience that it is a painful, ruinous, and potentially career-changing event.  I had backed up my most important files (source code, analysis results, publications, legal and economic documents), but until recently I hadn’t backed up everything in my home folder.  There were a couple of reasons for that, such as a lack of backup storage space and a reluctance to put personal documents in a third-party environment that can get hacked.  Now that we live in an era where storage is so cheap (1 TB hard drives are now being sold for less than $80, and cloud storage of that size costs less than $10/month!), there is really no reason not to back up all of your files.  How often you need to back up is a matter of how much loss you’re willing to tolerate.  I like the continual backup that cloud storage provides, but some people only need to back up once a week. Pick a regular schedule and stick to it, and choose backup services and programs that require minimal setup.
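For a do-it-yourself version of a scheduled backup, something like the following could work. It is a minimal sketch rather than a replacement for a proper backup tool: the function name and the archive naming scheme are my own assumptions, and it simply writes a timestamped compressed archive of a folder.

```python
import tarfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_directory(source: str, dest_dir: str, now: Optional[datetime] = None) -> Path:
    """Write source into a timestamped .tar.gz inside dest_dir and return its path."""
    src = Path(source).resolve()
    dest = Path(dest_dir).resolve()
    # Refuse to write the archive into the folder it is archiving.
    if dest == src or src in dest.parents:
        raise ValueError("destination must not be inside the folder being backed up")
    dest.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps the paths inside the archive relative to the folder name.
        tar.add(str(src), arcname=src.name)
    return archive
```

Calling backup_directory on a home folder from a scheduler such as cron, at whatever interval matches the loss you can tolerate, would produce one archive per run.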

Use the cloud, but use external hard drives, too. This is all about increasing redundancy in a data backup scheme.  Storage is cheap, so purchase cloud storage, purchase an external hard drive, and use both regularly and consistently in your backups.
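As an illustration of that redundancy, the sketch below copies one backup file to several destinations and verifies each copy with a checksum. The function names are my own, and it assumes each destination is just a directory, for example a mounted external drive and a folder that a cloud client keeps in sync.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(archive: str, destinations: Iterable[str]) -> List[Path]:
    """Copy archive into every destination directory and verify each copy."""
    source = Path(archive)
    expected = sha256_of(source)
    copies = []
    for destination in destinations:
        target_dir = Path(destination)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(str(source), str(target))
        # A mismatch means the copy is corrupt and should not be trusted.
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Verifying the checksum after each copy is a cheap way to catch a failing drive before the backup is actually needed.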

Don’t forget your databases. It’s easy to forget to back up your databases, but if you lost them, how hard would it be to recreate them?  If you have the raw data and ingestion code to automate things (you did do that, right?), rebuilding is easier, but it will still take a while depending on how much data you have.  Set up a regular schedule where you dump the current state of your databases to a database backup/restore folder, and make sure those files are backed up.
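A scheduled dump could look something like this. It is a sketch under a loud assumption: it uses SQLite from the Python standard library as a stand-in so the example is self-contained, whereas a server database such as PostgreSQL or MySQL would normally be dumped with its own tool (pg_dump or mysqldump). The function name and file naming are hypothetical.

```python
import sqlite3
from datetime import datetime
from pathlib import Path
from typing import Optional


def dump_sqlite(db_path: str, backup_dir: str, now: Optional[datetime] = None) -> Path:
    """Write a timestamped SQL text dump of a SQLite database and return its path."""
    out_dir = Path(backup_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    dump_file = out_dir / f"{Path(db_path).stem}-{stamp}.sql"
    connection = sqlite3.connect(db_path)
    try:
        with open(dump_file, "w", encoding="utf-8") as handle:
            # iterdump yields the SQL statements needed to rebuild the database.
            for statement in connection.iterdump():
                handle.write(statement + "\n")
    finally:
        connection.close()
    return dump_file
```

Pointing backup_dir at a folder that is already covered by the regular backups means the dumps get copied along with everything else.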

Losing data due to malfunctioning software or operating systems is terrible, and it can cost you time, money, memories, or even careers.  It will happen at some point in your life, but you can prepare for it with a well-thought-out plan for protecting your data, from drive partitioning to frequent backups across multiple storage media.

So in conclusion, I’m back!  More analytics content to come soon.