Securely removing files and folders

Mehrad Mahmoudian published on
12 min, 2207 words

Categories: Lovely Linux

Abstract

This is a general post to explain why and how to remove/delete files on a Linux system in a secure way. As usual, I will add some explanation on why and then get to how. Although I encourage you to read the explanation part, you of course have the freedom to jump right into the solution section.

A little bit of explanation

Reason to do it

There are many scenarios in which one would like to, or even should, remove the file from a computer in a secure way. But first, let's briefly define "secure" in this context. According to the Merriam-Webster dictionary (my favorite and go-to dictionary), the word secure (/si-ˈkyu̇r/) as an adjective is defined as:

  1. a. free from danger
     b. affording safety
     c. trustworthy, dependable
     d. free from risk of loss
  2. a. easy in mind : confident
     b. assured in opinion or expectation : having no doubt
     c. unwisely free from fear or distrust : overconfident
  3. assured

Here, in this short article, I take meanings 1a, 1b, 1d, 2a, and 2c.

I think the legitimate reasons to securely wipe a file would be to:

  1. protect yourself and/or your data from others
  2. remove a file to improve workflow

The first reason is the one most people associate with secure deletion, and it is probably the most common reason people and companies securely delete files. As an easy example, imagine that a company or a hospital wants to renew its computers and hardware and has to decommission and sell the old machines, but does not want others to access its confidential information. This is a very practical and plausible reason, and IT departments with competent sysadmins do this before decommissioning any type of storage.

The second reason is why I just securely deleted about a terabyte of data from my desktop machine, and it is perhaps the less intuitive one as well. That is partly why I'm writing this article: to demonstrate that this reason is also legitimate and that people should consider secure deletion in these circumstances.

Now, why did I do it? It is actually very simple. I had about a terabyte of data on my computer that was a backup of several external disks, since one of those disks was showing signs of dying. Before the disk went fully unresponsive, and because we didn't have enough NAS quota available at the time, I used my desktop computer to back up the data until we could do the paperwork and get more NAS quota (bureaucracy kills performance). Anyway, I later rsynced the files to the NAS and triple-checked that everything had gone well and copied properly. Now that the files are on a proper NAS with proper RAID, it is time for me to clear my disk. So why not use the classic rm -rf /path/to/backup/ to get rid of the files? The answer is actually simple: if you use the rm command or simply delete the files from a GUI file manager, you only remove the records in the table that connect each file name to its inode, which basically means you have deleted the link between the file name (as a string) and the location on your partition where the actual data (zeros and ones) lives. The file content is still sitting on your disk; you have merely given up the ability to see it and stopped caring whether it gets overwritten when you create new files. This is exactly why recovery software can recover your data.
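If you want to see this for yourself, here is a minimal sketch using a throwaway loopback image (the paths and the marker string are placeholders I made up; do not point any of this at a real partition):

# create a tiny ext4 image and mount it (a throwaway file, nothing real is touched)
dd if=/dev/zero of=/tmp/demo.img bs=1M count=64
mkfs.ext4 -q -F /tmp/demo.img
mkdir -p /tmp/demo_mnt
sudo mount -o loop /tmp/demo.img /tmp/demo_mnt

# write a file with a unique marker, then "delete" it the classic way
echo "SECRET-MARKER-12345" | sudo tee /tmp/demo_mnt/secret.txt > /dev/null
sync
sudo rm /tmp/demo_mnt/secret.txt
sync
sudo umount /tmp/demo_mnt

# the marker is still sitting in the raw image, even though the file is "gone"
grep -a "SECRET-MARKER-12345" /tmp/demo.img && echo "data still on disk"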

Now, back to my question: Why is it not suitable for me to remove the files with rm? The answer is perhaps more clear to you at this point: I don't want to be able to recover these files anymore.

You might now ask why I would want to deny myself the luxury of recovery. My answer is simple: I don't want to clog my recovery results in case I have to recover some other file in the relatively near future. If you have ever used recovery tools, you know that many of them do not return the folder structure and file path, and sometimes they cannot even recover the file name. For instance, if you run recovery software on your camera's microSD card, you will see that it recovers many PNG (or whatever format) files with generic names like IMG00001.png. Sifting through a bunch of garbage files just to find the precious files you deleted can be very time-consuming and very hard. Once I wipe the backup files beyond recovery, I don't need to worry about them, and whatever the recovery tool finds will be other files.

How it is done in reality

There are many different pieces of software developed to address this very problem, but they all do practically the same thing, just in a different order or in slightly different ways; the basics are practically the same. All these tools overwrite every bit multiple times. Some do only 2 passes, some do 38 by default, some write only zeros, some only ones, some write random bits, and some alternate between these on every pass for every file. But the end goal is simple: eliminate the possibility of recovery.

Risks and costs

Every bit of storage hardware can reliably change its state (from 0 to 1 or vice versa) only a finite number of times. In SSDs this is measured in program-erase (PE or P/E) cycles, which indicate how many times a NAND flash memory cell can, on average, be overwritten. So running this kind of software, or any form of secure deletion, wears down your hardware, which means you should treat it as a calculated decision.
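If you want a rough idea of how worn your drive already is, smartmontools can report the vendor's wear counters. A sketch (the device paths and the exact attribute names are assumptions; they differ between NVMe and SATA drives and between vendors):

sudo smartctl -a /dev/nvme0n1 | grep -i -E "percentage used|wear"
sudo smartctl -a /dev/sda | grep -i -E "wear_leveling|media_wearout"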

Also, considering that these tools overwrite everything numerous times, the process can be very disk-IO intensive and can take a long time for large files or folders. The high IO means your computer might get sluggish, laggy, or even unresponsive, especially if your /home, /etc, /var, and /opt are on the same disk you are wiping. As a rule of thumb, wiping a file with n passes takes about n times longer than it took to write the file to disk in the first place. For instance, if you have a 1 GiB file (as much as I like the metric system, the IT world is binary, hence gibibyte instead of gigabyte :) ) and your disk write speed is 100 MiB/s, writing this file to disk would ideally take 10.24 seconds. So if you want to securely delete that file with 38 passes, it would ideally take at least 38 times longer, which means 389.12 seconds, or about 6.49 minutes. Your 10-second small file suddenly takes about 6.5 minutes to wipe!
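The same back-of-the-envelope arithmetic in shell form (assuming bc is installed; the numbers are the hypothetical ones from above):

echo "scale=2; 1024 / 100" | bc            # one pass over 1 GiB at 100 MiB/s: ~10.24 s
echo "scale=2; (1024 / 100) * 38" | bc     # 38 passes: ~389.12 s
echo "scale=2; (1024 / 100) * 38 / 60" | bc   # ...which is roughly 6.5 minutes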

The solution

There are various tools one can use to securely delete files and folders, and they all come with their own style and their own limitations. I will start with the one that ships with virtually every Linux distro and then move on to software that needs installation. At the end, I'll also introduce you to a GUI tool.

But before everything, I would like to address the elephant in the room, which is that you can "almost" do this manually:

dd if=/dev/zero of=file_1GiB.txt bs=1M count=1024 conv=notrunc

The command above overwrites file_1GiB.txt with exactly 1 GiB of zeros (conv=notrunc tells dd not to truncate the file first, so the existing content is written over rather than freed). This can, of course, be automated, but you should not do it, for two reasons:

  1. the dd command is very dangerous, and it can wipe out your entire disk if you make a mistake. Very powerful and very dangerous.
  2. this assumes that the new zeros land on exactly the same bytes as the original file's data. That is not guaranteed: the filesystem may allocate a different location on disk and point dd at that. You would be better off reading where your file is actually stored on disk and overwriting exactly those blocks (see the sketch below).
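For the curious, a rough sketch of what "reading the location" would involve (filefrag is part of e2fsprogs and reports a file's physical extents; actually overwriting those extents on the block device is deliberately left out, because that is exactly the dangerous part):

# show which physical extents on the disk hold the file's data
filefrag -v file_1GiB.txt
# you would then have to overwrite exactly those extents on the underlying
# block device -- error-prone, dangerous work that the tools below do for you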

But why bother with dd when excellent tools already exist?

shred

This software is part of GNU coreutils and is almost certainly already installed on your machine. The only downside of shred is that it works only with files, not folders. For instance, to delete one or more files you can do:

shred --force --remove="wipesync" --verbose --iterations=5 --zero "/the/file/path/file1.txt" "/the/file/path/file2.txt"

Make sure to check man shred for information about these arguments.

As we said, shred only works for files and not folders, and unfortunately, there is no argument to shred folders recursively. But we can use the find command to generate the list of files and then pass them all at once to shred:

find "/path/to/folder" -type f -print0 | xargs -0 shred --force --remove="wipesync" --verbose --iterations=5 --zero

Note that -print0 and -0 are used to make sure spaces in file names do not mess up our command.

shred does not remove any folders, so you will end up with an empty folder structure. You can remove it with a simple rm -rf "/path/to/folder".
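If you want both steps in one go, a minimal wrapper could look like the sketch below (the script name shred_dir.sh and its behaviour are my own invention, not part of shred):

#!/usr/bin/env bash
# shred_dir.sh -- shred every file under a folder, then remove the empty tree
set -euo pipefail

target="${1:?usage: shred_dir.sh /path/to/folder}"

# wipe every regular file (5 passes plus a final pass of zeros), handling spaces safely
find "$target" -type f -print0 | xargs -0 --no-run-if-empty \
    shred --force --remove="wipesync" --verbose --iterations=5 --zero

# remove the now-empty directory structure
rm -rf "$target"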

secure-delete

This software is not actively maintained anymore, but it does the job, and it does the job very well and reliably. However, it is not in the official repository of some Linux distros (e.g. Arch, Manjaro) because it is unmaintained, so you either have to compile it yourself or install it through the AUR. I believe it should be available in some Ubuntu versions, as the manpage is available on the Ubuntu website [link1] [link2]
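As a sketch of how installation might look (the package names are assumptions; check your distro's repositories or the AUR before running these):

yay -S secure-delete              # Arch/Manjaro, via an AUR helper such as yay
sudo apt install secure-delete    # Debian/Ubuntu, if the package is in your release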

This fantastic software was written by van Hauser, but for reasons I failed to find online, he has stopped actively maintaining it. At the moment, the person behind the GIJack GitHub account is maintaining a working fork, but at the time of writing this article (2022-04-11), the last commit was made on Dec 8, 2019 (per the repository's Git stats).

One advantage of secure-delete is that apart from the srm command, which wipes files, it also ships an sfill command which, as far as I understand, creates a large file that completely fills up the empty space on your disk and then wipes it like a regular file. This is useful when you have already deleted files using rm or a file manager and want to wipe the free disk space. I imagine this is super useful for companies and hospitals.

To recursively and verbosely wipe a file or folder with srm, you can use:

srm -d -r -v "/path/to/folder_or_file"

and for sfill you can use:

sfill -v "/path/to/mounted/device"

You can speed up the wiping process for both srm and sfill by adding -l for lower security (meaning two passes) or -ll for even lower security (only one random pass). As a last resort to gain speed by compromising on security, you can add -f for "fast" mode, but according to the man page it is insecure because it means "no /dev/urandom, no synchronize mode".
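For example, the faster (and weaker) variants would look like this:

srm -d -r -v -l "/path/to/folder_or_file"     # two passes
srm -d -r -v -ll "/path/to/folder_or_file"    # a single random pass
sfill -v -ll "/path/to/mounted/device"        # quick single-pass fill of the free space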

wipe

This is a piece of software I have not used and have no experience with, but it is available in the Arch Linux extra repository. Wipe is based on work by Peter Gutmann, according to the SourceForge repository, and is "maintained" by Tom Vier. Based on its SourceForge repository, the last release was on 2009-11-01, which puts it in the ballpark of 12.5 years of inactivity as far as official releases go! According to SourceForge, the last activity goes back to 2013-04-15, when Tom Vier commented under a ticket. In other words, it seems like good software, but it definitely is not actively maintained, so use it at your own risk. As a matter of fact, use all the software, tools, and code you see on my website at your own risk!

Anyways, using wipe seems to be very easy:

wipe -r "/path/to/folder_or_file"

BleachBit

This software made a bit of political news some years ago, and that is when and where I learned it exists. I have never worked with it, so I have no experience with its speed or reliability. The one thing BleachBit has that the others lack is a GUI.

You can download it for many different operating systems and various Linux distros. It is also available in the Arch Linux community repository. Of course, you can also build it from source and get the code from its GitHub repository (instructions for building are available in the GitHub README).
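For instance, installation from the repositories might look like this (I assume the package name is bleachbit; check your distro's package search to be sure):

sudo pacman -S bleachbit      # Arch/Manjaro
sudo apt install bleachbit    # Debian/Ubuntu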

Since it is a GUI application, there is no wiping command to show here ¯\_(ツ)_/¯ .