Computer Security Article

Data Storage Basics
(what you don't know can hurt you)


Author: Chey Cobb

Exclusive to the Web Site
Copyright Chey Cobb


Data Storage
You probably know that computers store information as binary data, a string of ones and zeroes. These can be magnetic or electric ons and offs, or optical holes and bumps, depending on the storage media. When the storage media is re-writable (including floppy disks, hard drives, CD-RW, and DVD-RW) it is possible to delete or overwrite the data. When this happens the read/write mechanism in your computer alters the pattern of ones and zeroes to match the new data. This data starts out being stored in contiguous sections, sort of like the grooves on a vinyl record. Over time, however, the deletion of files leaves gaps on the storage media. New files get split up to fit into those gaps, so the data ends up scattered all over the media. This is known as fragmentation. A heavily fragmented disk is slower to read than one with contiguous segments of data, and there are many utilities available that will defragment (defrag) the data so it is more efficient for your computer to read.
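
To make fragmentation concrete, here is a minimal sketch of a toy disk that hands out the first free blocks it finds. It is only an illustration: the names (allocate, delete, count_fragments) and the 12-block disk are made up for this example, and real file systems use far more sophisticated allocation.

```python
def allocate(disk, name, size):
    """Give `name` the first `size` free blocks, wherever they are."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < size:
        raise ValueError("not enough free blocks")
    used = free[:size]
    for i in used:
        disk[i] = name
    return used


def delete(disk, name):
    """Mark every block owned by `name` as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def count_fragments(blocks):
    """Count the contiguous runs in a sorted list of block numbers."""
    runs = 0
    previous = None
    for block in blocks:
        if previous is None or block != previous + 1:
            runs += 1
        previous = block
    return runs


disk = [None] * 12          # hypothetical 12-block disk, all free
allocate(disk, "A", 3)      # blocks 0-2
allocate(disk, "B", 2)      # blocks 3-4
allocate(disk, "C", 3)      # blocks 5-7
delete(disk, "B")           # leaves a 2-block gap between A and C

d_blocks = allocate(disk, "D", 4)
print(d_blocks)                   # [3, 4, 8, 9]
print(count_fragments(d_blocks))  # 2
```

File D needs four blocks but the gap left by B only holds two, so D ends up in two separate pieces.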

Whether your media is fragmented or not, there are still small areas of the disk that contain data you don't see or know about. This data is generally hidden from casual viewing, but can be seen and reassembled with special software utilities or tools. These tools are the basis for a forensic evaluation of any media.

Unallocated Space
A file is not really deleted when you issue the delete command on your computer. The data still resides on a physical section of the disk; the operating system has simply been told that the space can be occupied by another file as soon as it is needed. In that way, new data is actually overlaid on top of older data. Because of this, it's possible to resurrect old, deleted files. We've all made the mistake of deleting the wrong file at one time or another, so vendors have made available many commercial utilities that help you to find and recover a file you deleted by mistake. As long as the file hasn't already been overwritten by another file, you can usually restore the entire file. Thank goodness for these utilities!

When a file is deleted or removed, its "address" is removed from the file directory. That address is just a pointer to where the data had resided on the physical media. Just because the address has been taken away, though, doesn't mean the data isn't still there. It's like a renter who has to move by the first of the month, but hasn't quite finished getting all of his stuff out of the apartment. The landlord shows the property as empty and available to rent, but there's still some debris lying about. This unrented but available space on storage media is called unallocated space. It's not really empty but can be occupied by another file. As long as the space hasn't been occupied by another file, the old data is still there. As soon as the new file is written in that space, the old data is gone forever.
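
The landlord-and-renter idea can be sketched in a few lines of code. This is a toy model, not how any real file system is implemented: the ToyDisk class, its 16-byte blocks, and the file names are all assumptions made up for illustration.

```python
class ToyDisk:
    """A pretend disk: a list of fixed-size blocks plus a file directory."""

    def __init__(self, n_blocks, block_size=16):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * n_blocks
        self.directory = {}  # file name -> list of block numbers

    def _free_blocks(self):
        used = {b for blocks in self.directory.values() for b in blocks}
        return [i for i in range(len(self.blocks)) if i not in used]

    def write(self, name, data):
        size = self.block_size
        needed = max(1, -(-len(data) // size))  # round up to whole blocks
        free = self._free_blocks()
        if len(free) < needed:
            raise ValueError("disk full")
        chosen = free[:needed]
        for n, block in enumerate(chosen):
            chunk = data[n * size:(n + 1) * size]
            self.blocks[block] = chunk.ljust(size, b"\x00")
        self.directory[name] = chosen

    def delete(self, name):
        # Only the pointer goes away; the blocks themselves are untouched.
        del self.directory[name]

    def raw(self):
        """Everything physically on the disk, ignoring the directory."""
        return b"".join(self.blocks)


disk = ToyDisk(n_blocks=4)
disk.write("memo.txt", b"my boss is a jerk")
disk.delete("memo.txt")

print("memo.txt" in disk.directory)        # False: the file looks gone
print(b"my boss is a jerk" in disk.raw())  # True: the data is still there

disk.write("new.txt", b"X" * 32)           # reuses the same blocks
print(b"jerk" in disk.raw())               # False: now it is overwritten
```

Deleting only empties the directory entry. The old bytes sit in unallocated space until a new file happens to land on the same blocks.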

The software tools that the forensic experts use can examine the physical arrangement of the ones and zeroes on the media without regard to the file format, address pointer, or data type (such as binary or ASCII). The software can display the data on the screen or store it as a text file. What the forensic expert sees is lots of words, letters, and special characters that are all run together. It's kind of like reading a book with few spaces and no punctuation between words. The software also lists the physical location on the disk where the data was found. What's really amazing is that some applications store the keystrokes used to create the data inside the file. This means that corrections, deletions, changes, etc. can be seen with these tools. For example, if you typed, "my boss is a jerk" and then changed it to "my boss is the best", what you may find on the disk is "my%boss%is%a%jerk^C^C^C^C%the%best".
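
A rough idea of how such a tool pulls readable text out of raw bytes can be sketched as follows. This is a simplified illustration in the spirit of the Unix strings utility, not the method used by any particular forensic product; the function name printable_runs, the four-character minimum, and the sample bytes are assumptions for the example.

```python
import re


def printable_runs(raw, min_len=4):
    """Return (offset, text) for each run of printable ASCII in `raw`.

    The bytes are scanned as-is, with no knowledge of files or formats.
    """
    pattern = re.compile(b"[ -~]{" + str(min_len).encode("ascii") + b",}")
    return [(m.start(), m.group().decode("ascii"))
            for m in pattern.finditer(raw)]


# Hypothetical leftover bytes, as they might sit in unallocated space.
leftover = b"\x00\x01my boss is a jerk\x00\xff\x10ab\x00user=admin\x00"

for offset, text in printable_runs(leftover):
    print(offset, text)
# 2 my boss is a jerk
# 25 user=admin
```

The offset reported with each run plays the role of the physical location that the forensic software lists. The two-character run "ab" is skipped because it is shorter than the minimum length.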

When a disk or other media has been used for a long time, the amount of data left in the unallocated space really adds up. I've seen disks that had as much as 75MB of data left over in the unallocated space. The actual amount of unallocated space is only limited by the size of the disk. This can be a real boon for forensic examiners who are looking for a pattern of behavior or abuse. This can also work against them if more than one person has been using the computer, as it becomes very hard to attribute data found on a disk to its original owner.

You might think that overwriting or reformatting a disk a number of times would get rid of all the residual data. However, this isn't necessarily true. Reformatting a disk means that the ones and zeroes are realigned and the address pointers are all removed, but even after reformatting or overwriting multiple times, some data can still be left over. There is anecdotal evidence of data being recovered from disks that have been reformatted up to nine times. For this reason, government agencies are required to physically destroy media that has been used to store Secret and Top Secret information. You simply cannot trust that the data is gone unless the media has been turned into dust.

Slack Space
Files can also contain data that has no relation to the data you saved. In Windows machines especially, the operating system tries to make the data fit into predetermined sized "boxes" - sort of like pigeon holes. If the data doesn't fit exactly into the box, the operating system needs to pad out the extra space, sort of like putting a shim in a door frame or bubble wrap in a package.

Windows stores data in "boxes" or chunks that are a minimum of 512 bytes in size. So, if your file is only 500 bytes long, there are 12 bytes to be padded out. This extra space is called slack space and it's located everywhere on a drive or a disk.
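
The arithmetic is easy to check. The small helper below is a sketch only; the name slack_bytes is made up for this example, and the 512-byte default simply follows the figure used above (real systems may use larger boxes).

```python
def slack_bytes(file_size, block_size=512):
    """Bytes of padding left in the last block that a file occupies."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("sizes must be non-negative and block_size positive")
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder


print(slack_bytes(500))    # 12  (the example above)
print(slack_bytes(512))    # 0   (an exact fit leaves no slack)
print(slack_bytes(1300))   # 236 (three boxes, the last one partly full)
```

Only the last box of a file has slack, but since nearly every file ends partway through a box, the leftover bytes add up quickly across a whole drive.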

The Windows operating system takes superfluous data from wherever it can find it in memory, and stuffs it into the slack space. Often, the extra data comes from the keyboard buffer and other stuff temporarily stored in RAM. Things like passwords, user names, and parts of email messages can actually be stored in the same file space that you thought was just your resume! That's scary! Usually, the data stored in the slack space when the file was created will stay as is unless or until the original file is changed or moved. Imagine all the "secret" stuff stored in your files!

Again, the forensic software tools can find the data stored in the slack space on a disk. I've found URLs of web sites, email addresses, parts of personal correspondence, printer names, passwords, and file directories all stored in slack space. It's a potential gold mine for forensic examiners because this data was not intentionally stored. It captures just a small snapshot of what was being done with the computer at a given time. However, because the data stored there is not an actual file, there are no time stamps or dates associated with it. It can be hard to determine when exactly that data was stored.

Swap Space
When your computer is multi-tasking, it is swapping portions of data from random access memory (RAM) to your hard drive and back to RAM. This data can be temporarily stored in either slack space or unallocated space. It doesn't matter to the operating system just as long as it can find it when it needs it. It's like the Post-It Notes commercial where the squirrel uses the notes to remember where he buried his nuts. The operating system is storing little notes to itself all over your drives. This space is constantly being overwritten until the system finally dumps everything that is in memory. The system always has data being held in memory until it is shut down. When a Windows system is shut down (with the Shut Down command), some data is sent to a permanent swap file.

Whether or not to shut down a Windows machine is an important consideration for forensics because the shutdown processes are written to the swap file during this time. This could potentially overwrite other information in the swap file that is needed for evidence. And, when Windows is restarted, more swap files are created and others are deleted. I'll discuss what you should do with Windows machines later in this chapter.

Caches
Caches are a bit like swap files except they tend to be associated with specific applications - like your web browser, for example. These files store temporary data in order to maximize performance. It's easier for the program to temporarily store the data on your disk drive than to hold it in memory. These caches can sometimes be really huge. I've seen browser caches as large as 100MB in size. This is what the browser uses to store the history of all the URLs you typed and all the links you clicked on and the associated files that have been downloaded. All of this can be a wealth of information. When porn rings are discovered, it's usually in the browser cache files that the evidence was found.

Temp Files
All systems contain temporary (temp) files. Some of them are temporary temp files and others are sort of permanent temp files. These are not usually hidden from view and you won't need special software to see them, but they are very useful to forensic examiners. Often a temp file will hold information about when and where a program was installed and by whom.



Updated Spring, 2002 by webloke © Stephen Cobb
Some article content reprinted by permission.
Article content copyright named author(s).