Linux.com

Home News Enterprise Computing Linux Storage How to recover lost files after you accidentally wipe your hard drive

How to recover lost files after you accidentally wipe your hard drive

By Shawn Hermans

After the overwhelming feeling of dread passed, I started to look into file recovery options. I demoed a variety of commercial products to see if any of them could find my lost files or partitions. Nothing seemed to work. Finally, I discovered TestDisk and PhotoRec, and was able to use the latter to recover my lost files.

TestDisk can recover lost partitions of virtually any filesystem. PhotoRec can recover files of most types, including most picture and video formats. PhotoRec can be used on existing partitions, or can be used to recover files on deleted partitions without having to recover the underlying partitions. Both PhotoRec and TestDisk can be run on DOS, Windows (9x, NT, 2000, XP, 2003), Linux, FreeBSD, NetBSD, OpenBSD, Sun Solaris, and Mac OS X, and, their developers claim, can be compiled and run on most Unix systems.

The recovery

I began my attempt at recovery by using TestDisk run from a Knoppix CD. Unfortunately, I had already overwritten the partition table, and an exhaustive search of the hard drive for lost partitions yielded too many results. I decided to use PhotoRec instead to recover my lost files.

PhotoRec recovers files by finding deleted files and copying them to disk. This means that files should not be recovered to the same disk partition on which the deleted files reside (unless you're recovering from a disk image file), because that could lead to the deleted data being permanently overwritten.

Another important thing to remember is that PhotoRec will most likely recover a lot of files. This means that the partition on which the recovered files are to be stored should have at least as much free space as the size of the partition on which PhotoRec is searching for recovered files.

Possible setups for recovery include:

  1. Recover the files to a separate hard drive.
  2. Recover the files to a networked storage drive.
  3. Recover the files to a separate partition on the same hard drive.
  4. Image the hard drive using a tool like ddrescue and recover files using only one partition.

As I had completed erased my partitions, I could not use the third option. The second option introduces problems associated with network speed and latency. The fourth option is worth considering in the case of an incident response where the image of the hard drive is used as evidence.

I chose the first option, and installed two hard drives in a single computer. I divided the hard drive used to recover files into two major partitions; the first partition held the operating system (CentOS 4), while the second partition was set up to hold the recovered files. Partitioning in this manner is an extra precaution to prevent PhotoRec from halting the system by writing more files than the storage space allows. Another option is to run the operating system off a live CD such as Knoppix, which contains the TestDisk and PhotoRec utilities.

You can download both PhotoRec and TestDisk in a single archive file. The files photorec_static and testdisk_static are the executable files, and can be executed from the command line.

Make sure that the recovery partition is mounted (I mounted it at /var/recovery). Don't mount the hard drive that contains the deleted files; if the partition remains unmounted, you can't overwrite the data it contains.

Recovery steps

PhotoRec recovers files to the directory from which it is run. Therefore, I changed into the /var/recovery directory and ran photorec_static. If the PhotoRec executable does not run with this command, make sure that you either copy the executable to the /usr/bin directory or type in the full path where the program resides.

The PhotoRec interface is easy to understand. At the initial screen, you select the hard drive you wish to recover. In my case, it was /dev/hdb.

Select a media (use Arrow keys, then press Enter):
Disk /dev/hda - 200 GB / 186 GiB (RO)
Disk /dev/hdb - 160 GB / 149 GiB (RO)
Disk /dev/hdc - 120 GB / 111 GiB (RO)
Disk /dev/hdd - 296 MB / 282 MiB (RO)
[Proceed ]  [  Quit  ]

Next, you select the partition type. In my case, I selected an Intel/PC partition.

Disk /dev/hdb - 160 GB / 149 GiB (RO)
Please select the partition table type, press Enter when done.
[Intel  ]  Intel/PC partition
[Mac    ]  Apple partition map
[None   ]  Non partitioned media
[Sun    ]  Sun Solaris partition
[XBox   ]  XBox partition
[Return ]  Return to disk selection
Note: Do NOT select 'None' for media with only a single partition. 
It's veryrare for a drive to be 'Non-partitioned'.

The next screen listed the partitions on the hard drive. I wanted to recover partitions on the whole hard drive, so I selected the first option. However, before selecting this option, I needed to go to the [File Opt] menu to select which type of files I wanted to recover.

Disk /dev/hdb - 160 GB / 149 GiB (RO)
     Partition                  Start        End    Size in sectors
   D empty                    0   0  1 19456 254 63  312576705
 [Whole disk] 1 * Linux LVM                0   0  2 19457  80 63  312581807
[ Search ]  [Options ]  [File Opt]  [  Quit  ] 
                             Start file recovery

PhotoRec can recover a variety of files, but I only wanted to recovery Word documents, AVI video files, JPG picture files, and MPEG video files. I selected the appropriate boxes.

PhotoRec will try to locate the following files
[ ] dbf  DBase 3, prone to false positive
[X]      FAT subdirectory
[X] doc  Microsoft Office Document (doc/xls/ppt/vis/...)
[X] dsc  Nikon dsc
[X] eps  Encapsulated PostScript
[ ] exe  MS executable
[X]      EXT2/EXT3 Superblock
[X] gif  Graphic Interchange Format
[X] gz   gzip compressed data
[X] jpg  JPG picture
[X] mdb  Access Data Base
[X] mov  MOV video
[X] mp3  MP3 audio (MPEG ADTS, layer III, v1)
[X] mpg  Moving Picture Experts Group video
[X] mrw  Minolta Raw picture
[  Quit  ]
                              Return to main menu 

After you select the file types, go back to the previous screen and begin the scan of the hard drive. The scanning process is automated; on my machine it took a few hours to complete. Once PhotoRec is finished, the recovered files will be in multiple directories of the form recup_dir.x where x is the number of the directory. The files within these directories will not contain the names of the original files; instead, they are numbered to indicate the order in which the file was recovered, and an extension that indicates the file type. For example, f89.avi is the 89th file recovered and is an AVI file.

Post-recovery cleanup

While all of my files were recovered, I had many files on my hard drive. Manually examining each file would be time-consuming and tiresome. I created three folders within the /var/recovery directory named VID/, DOC/, and JPG/, into which I sorted the files using the commands:

find/var/recovery/ -name "*.avi" | xargs -i mv {}/var/recovery/VID/

find /var/recovery/ -name "*.mpg" | xargs -i mv {}/var/recovery/VID/

find /var/recovery/ -name "*.jpg" | xargs -i mv {}/var/recovery/JPG/

Although all the files are sorted into folders of the same type, the sorting was far from over. Before my accident, my hard drive contained more than 10,000 pictures, each around 2MB in size. During the recovery process, PhotoRec recovered all the pictures it could find -- including picture files from the Web browser cache. This meant it brought back a lot of unwanted files. To eliminate most of the picture files from miscellaneous sources, I moved files smaller than 1MB to a folder called SMALL, which I kept until I was satisfied that none were of interest. I moved the files to the folder using the command:

find /var/recovery/JPG/ -name "*.jpg" -size -1024k
   |xargs -i mv {} /var/recovery/SMALL/

PhotoRec does not recover the file names of recovered files, but luckily my recovered picture files contained EXIF metadata such as the time and date the picture was taken and the camera make and model. I used the Jhead command-line utility to extract this metadata. In the JPG folder I ran the command:

jhead -n%Y%m%d-%H%M%S *.jpg


This command renames all files with the jpg extension with its time/date stamp in the format YYYYMMDD-HHMMSS.jpg. Any files with the same time and date stamp are named in the format of YYYYMMDD-HHMMSSx.jpg, where x is a lower-case letter that increments for each duplicate time/date stamp found. Given that these pictures were all taken on the same digital camera, any pictures with the same time/date stamp should be the same picture. I moved duplicates to a foldercalled DUPS using the command:

find /var/recovery/JPG/ -name "*a.jpg" | xargs -i mv {}/var/recovery/JPG/DUPS/

Once I had the files labeled with the time/date stamp, I could sort them into folders according to their year and month.

If I had included keywords or comments in the picture files, I could have used libextractor to extract keywords from the JPEG files and sort the files into folders using those keywords. Alas, this was not the case, so I had to spend hours sorting the pictures manually into folders after the recovery. I did however use libextractor on my AVI files to determine information regarding the codec, frame-rate, and resolution of those videos.

How to prevent recovery

I was glad that I was able to recovery my files with such "ease," but I also realized how easy it would be for my sensitive data to be recovered if I ever got rid of an old computer or hard drive. Luckily, you can wipe data from a hard drive in such a way as to prevent files from ever being recovered. Whitedust Security gives the following as options for secure data removal.

  1. Writing over existing data with "junk" data.
  2. Giving the hard drive an acid bath.
  3. Degaussing the hard drive with a degausser.
  4. Damaging the disk with fire.
Assuming you do not want to render the hard drive unusable, the best options are either writing over existing data or degaussing the hard drive. If you don't have access to degaussing equipment, use a program like Wipe that writes over data with patterns known to cause data to become unrecoverable. Based upon your paranoia level, you can wipe your data with as many passes as you see fit. Some recommend 22 passes, while others say 99 is needed for absolute security. If you are only worried about the casual snooper who only has access to tools like PhotoRec, then three or four passes should suffice.
 

Comments

Subscribe to Comments Feed
  • Mieke Dee Said:

    Hi! My question is - does this recovery program work for computers in which the original hard drive (which contained the now lost files) has been completely replaced/rebooted with a new hard drive? (Same computer, new hard drive). I am exploring all options as to how (and if) it would be possible to salvage some or all of the files lost when that hard drive was wiped out, and I want to know if this program (ar any other method) would work for that. Thanks!

  • Glen Stark Said:

    What you are describing is not possible. The data, i.e. the sequence of bits which represents the informatin you want acess to is stored on a hard drive. You can recover it from that hard drive. If the data never resided on a particular hard drive, how should you "recover" it? It was never there.

  • vishal Said:

    hi! thanks a lot for this! am gonna try this out tonight.. i accidentally formatted my phone's memory card while trying to install linuxmint on my laptop..(the phone was connected to the laptop for charging) hope i get all my files back!

  • Tangeena Said:

    Hey, sorry I'm very new to this stuff, do you mind explaining it in a much simpler way? Thing is, I accidentally wiped my hard drive in order to get my computer, that wasn't turning on, to reset everything so that any viruses that were affecting it would be eliminated. I admit that was a rather rash and stupid thing to do and I regret it now but I really need to get those files back. Is their any simple way to get it done and have them all back in one piece?

  • KOBRORO Said:

    @Tangeena MOST IMPORTANT: Put HDD in another computer and do a complete image. Then you can apply the techniques described here. Just read carefully twice or four times, rest a day and then go to work. Good luck!

  • Infected Said:

    Thanks a lot for this! I recently erased a full 2 TB hdd by accident, so I'm extremely glad to find out the there's a way to get back all my data :) Just started recovering and all is going well! Great tutorial!

  • kashif Said:

    I accidentally formatted my hard disk during ubuntu intallation. I run testDisk and it recovered files from one partition, but was unable to recover files from other 2 partitions. is there any solution which can recover files like testdisk did (TestDisk placed files in same directory structure as was original but photorec place all files in same place). I have different programs and source code files and directory structure is also as important as the files and same file may have been placed in different directories and different contents.

  • David Morton Said:

    Hi, Appreciate this post is rather old, however, hope springs eternal. I have a Synology DS411 NAS on my iMac. It contained/contains some 5TB of .mkv and DVD files on 4 3TB seagate drives. I am told the operating system is Ubuntu. I unknowingly selected a reset button in the DS's Control Panel software. I thought it was to reset the machine back to factory defaults, but it actually starts to reset everything by erasing the hard drives. Tech support finally told me I would have to go to a data recovery service. Seeing your post gave me some hope to recover these 4500 odd files myself. I have ordered an enclosure to contain the 4 DS411 drives connecting to my iMac via USB 3 cable (unfortunately USB 2 is on my iMac so it will be slow going) I have also purchased 2 4TB drives to install in the DS411 and format as Raid 0 giving me something less than 8TB of new storage. Is there anything special I should do using PhotoRec regarding recovery from a RAID system that is slightly tarnished! (Previous RAID was Synology special 12TGB gives 8TB storage) Of course, the main question is will PhotoRec work on my particular problem anyway? Any help very much appreciated.

Upcoming Linux Foundation Courses

  1. LFD331 Developing Linux Device Drivers
    25 Aug » 29 Aug - Virtual
    Details
  2. LFD411 Embedded Linux Development
    25 Aug » 29 Aug - Santa Clara, CA
    Details
  3. LFS422 High Availability Linux Architecture
    08 Sep » 11 Sep - Raleigh, NC
    Details

View All Upcoming Courses

Become an Individual Member
Check out the Friday Funnies

Sign Up For the Linux.com Newsletter


Who we are ?

The Linux Foundation is a non-profit consortium dedicated to the growth of Linux.

More About the foundation...

Frequent Questions

Join / Linux Training / Board