October 1, 2015

How to create a data backup strategy using multi-period snapshots

What are Snapshots?

A snapshot is the point-in-time state of a share (folder). Snapshot technology in general, and the Copy-on-Write (CoW) snapshot implementation in the BTRFS filesystem (Linux) in particular, offers users simple tools for data backup, data protection and data recovery.

Why do snapshots matter? What are some real-world scenarios that could use snapshot technology?

Imagine you are working on an important project: writing a book or a grant proposal, editing a video you shot during your vacation, recording a podcast, editing your photos, creating a presentation, writing code, or anything else of significance. A number of problems could occur:

  1. You create your work (say 4 paragraphs of a proposal or a video clip) and save it. Now you edit and save it again (say you remove a paragraph or some video footage). Later you want to compare versions and get back the paragraph or footage you removed. For your document, you may have maintained an inefficient manual version history listing every change, or set up an automatic backup, but most likely you did neither. Retrieving an earlier state of your video editing project could be even more difficult.
  2. You are working on your project and your computer's battery suddenly dies, or your computer freezes and needs a reboot. You could lose work, and if you forgot to save, you may have to start all over again.

There are a number of other problems that could cause you to lose your work, resulting in wasted time and effort, and in frustration. So, what are your options? Storage systems such as Rockstor offer snapshot technology that is designed to solve problems like the ones described above. It offers not just a solution, but also lets you create a strategy to manage these issues proactively, before they arise.

With the Rockstor snapshot feature, you can create automatic point-in-time backups of your work, irrespective of its nature (docs, videos, music, code, presentations, etc.), and also build a strategy to manage those backups.

And setting up snapshots and the management strategy on the Rockstor storage platform is easy! Here I am assuming familiarity with Rockstor installation and set-up. Please refer to this document for instructions on Rockstor installation and set-up: http://rockstor.com/docs/quickstart.html.

To set up automatic snapshots, go to "Scheduled Tasks" under "System" (and click "Schedule a Task") or "Snapshots" under "Storage" (and click "Schedule") on your Rockstor Web UI, as shown in the images below (these are two ways of getting to the same place). You can configure Rockstor to take snapshots every 30 minutes, hour, day, week, month or year, or across multiple time periods. This way, you can recover the copy of your work as it was saved at any of those points in time. This video also shows the process of setting up hourly snapshots. In a similar way, you can set up snapshots for every 30 minutes, day, week, month or year.

Going back to our example above, if you want to retrieve the paragraph you removed and compare it to your current work, you can simply retrieve your backup from the last 30 minutes or the past hour.

Snapshots therefore ensure that you have not just the current version of your work, but also a trail of prior versions saved over time.

The screenshots below show the set-up screens for snapshots 

Create Snapshots using "Scheduled Task" - Step 1
Create Snapshots using "Scheduled Task" - Step 2.
Create Snapshots from "Storage" menu.

How to set-up and manage your backup using snapshots?

Backups can be managed by using "multi-period" snapshots.  But first, what are multi-period snapshots and why are they important?

Multi-period snapshots are a technique or strategy that allows you to schedule snapshots across multiple time periods. Suppose you are actively working on a project for a week; let us again take the example proposal document mentioned above. You may want to back up every 30 minutes or every hour while you are actively working on it during that week. However, once you are done, you may want to back up every month, and then every year for archival purposes. Multi-period snapshots help you achieve this 'self-thinning' and manage your backups and storage space.
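
To make the 'self-thinning' idea concrete, here is a minimal Python sketch. It is only an illustration and not how Rockstor implements anything: the function name `thin`, the tier boundaries (7, 31 and 365 days) and the rule of keeping the newest snapshot in each week, month or year are all assumptions chosen for this example.

```python
from datetime import datetime, timedelta


def thin(snapshots, now):
    """Pick the snapshots to keep under a hypothetical thinning rule:
    everything from the last 7 days, then one per ISO week up to 31 days,
    one per calendar month up to 365 days, and one per year after that."""
    keep = []
    seen = set()
    for ts in sorted(snapshots, reverse=True):  # newest first
        age = now - ts
        if age <= timedelta(days=7):
            bucket = ("all", ts)  # every recent snapshot is its own bucket
        elif age <= timedelta(days=31):
            bucket = ("week",) + tuple(ts.isocalendar()[:2])
        elif age <= timedelta(days=365):
            bucket = ("month", ts.year, ts.month)
        else:
            bucket = ("year", ts.year)
        if bucket not in seen:  # the newest snapshot in each bucket wins
            seen.add(bucket)
            keep.append(ts)
    return sorted(keep)


# Example: 30 days of hourly snapshots, thinned on a hypothetical date.
now = datetime(2015, 10, 1, 0, 0)
hourly = [now - timedelta(hours=h) for h in range(24 * 30)]
kept = thin(hourly, now)
print(len(hourly), "snapshots thinned down to", len(kept))
```

Recent work stays at full hourly resolution, while older snapshots collapse to one per week, month or year, which is the trade-off between detail and storage space that the strategy aims for.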

Here I describe one multi-period snapshot and 'self-thinning' backup management strategy for the proposal document. You can modify this strategy or build your own that fits your needs better.

  1. While setting up a snapshot task, there is a field called 'maximum count' of snapshots where you can specify the maximum number of snapshots you want (see above images). Say you are setting up an hourly backup and want to keep a snapshot for every hour of a 7-day week; you can then set the maximum count to 168 (24 x 7) snapshots. Once the snapshot count reaches 168, newer snapshots start overwriting the oldest ones. Suppose your project was a week long, and for now you are done, but may make minor edits later. So, you want to retain the 168th snapshot and carry it forward. Say the hourly task takes its first snapshot at 10pm on a Sunday; the 168th snapshot is then taken 167 hours later, at 9pm the following Sunday. You can set up a weekly snapshot task for 9pm on Sunday of that week and move off the hourly backups. Set it up as shown in this video for the hourly snapshots and this video for the weekly set-up.
  2. You could also choose to carry this weekly snapshot forward to a month: choose the snapshot for a particular day (say, 9pm on the 3rd Sunday of September) and delete the rest. Your snapshots will now be taken every month. This allows you to make minor changes to your work and still save a copy every month. See this video.
  3. Assuming that you no longer need a monthly backup, and now want to retain the snapshot from one month of each year for archival purposes and delete the rest, you can set it up as shown in this video. Also, see Fig 1 and Fig 2 below.
  4. This way, you can start with a lot of snapshots (168 a week in this example) while you are actively working on a project, and end up saving just one copy of the final version for archiving.
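
The 'maximum count' arithmetic in step 1 can be checked with a small Python simulation. This is a sketch under stated assumptions, not Rockstor code: the function `take_hourly_snapshots` is hypothetical, the start time is an arbitrary example, and the first snapshot is assumed to be taken at the start time itself.

```python
from collections import deque
from datetime import datetime, timedelta

MAX_COUNT = 24 * 7  # 168 hourly snapshots cover one week


def take_hourly_snapshots(start, how_many, max_count=MAX_COUNT):
    """Simulate an hourly task that retains at most max_count snapshots."""
    retained = deque(maxlen=max_count)  # the oldest entry drops out when full
    for i in range(how_many):
        retained.append(start + timedelta(hours=i))
    return retained


start = datetime(2015, 9, 13, 22, 0)  # hypothetical time of the first snapshot

# Exactly one week of snapshots: the 168th lands 167 hours after the first.
week = take_hourly_snapshots(start, 168)
print(len(week), "snapshots, first:", week[0], "last:", week[-1])

# If the task keeps running, the count stays at 168 and the oldest roll off.
longer = take_hourly_snapshots(start, 200)
print(len(longer), "snapshots retained, oldest is now:", longer[0])
```

The last of the 168 snapshots falls one hour short of a full week after the first, which is why the weekly task in step 1 is scheduled an hour earlier in the day than the time the hourly task started.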

Similarly, you could create your own strategy for daily, weekly, monthly and yearly snapshots and backups. Also, refer to the how-tos on Multi-period Snapshots and Scheduling a Snapshot task.

Self-thinning backups using multi-period snapshots
Fig 1: Self-thinning Backups for above example

 

 Multi-period snapshots
Fig 2: Multi-period snapshots carried over time periods