K&F Consulting Inc.

Electronic Data Discovery Unleashed
(page 2)

 

File Directories

The FAT is only one component of the file management system. Another component is the directory. As mentioned earlier, the directory contains the file name and its starting cluster number, but it also contains additional data.

In all of the Microsoft operating systems, both DOS and Windows, the directory entry is 32 bytes long. The contents of the directory entry have evolved over time across these operating systems, and the entry currently contains all of the information shown in the following diagram.

For short file names, the file name and extension together are eleven bytes long: eight bytes for the name and three for the extension. Long file names use the same 32-byte directory record; however, a single long name is spread across several directory entries.

The first byte of the file name can provide some special information. If the first byte is E5h, the file has been deleted from the directory and the values of its cluster chain in the FAT have been changed to 0. If the first byte is 2Eh, this is either the “dot” directory entry or the “dot-dot” directory entry. The determination can be made based on the cluster number: if it points to the directory itself, it is the “dot” entry; if it points to the parent directory, it is the “dot-dot” entry. In the case of long file names, the first byte of the entry is coded as a sequence identifier that gives the entry’s position within the name. The operating system knows whether an entry holds a long or a short file name from the value of the Attributes byte.
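
The first-byte rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a forensic tool: the function name classify_entry is hypothetical, the Attributes byte is assumed to sit at offset 11 of the 32-byte entry (the standard FAT layout), and the 00h case for a never-used entry is an addition that the text does not discuss.

```python
def classify_entry(entry: bytes) -> str:
    """Describe a 32-byte FAT directory entry from its first byte and attribute byte."""
    first_byte = entry[0]
    attributes = entry[11]          # assumed offset of the Attributes byte
    if first_byte == 0x00:
        return "never used"         # addition: not covered in the text above
    if first_byte == 0xE5:
        return "deleted"
    if attributes == 0x0F:
        return "long file name"     # first byte is a sequence identifier
    if first_byte == 0x2E:
        return "dot or dot-dot"     # compare the cluster number to tell them apart
    return "in use"


# A deleted short-name entry: E5h followed by the rest of "DRAWING WPD".
sample = b"\xe5RAWING WPD" + bytes([0x20]) + bytes(20)
print(classify_entry(sample))  # deleted
```

The deletion test comes before the long-name test so that deleted long-name records are also reported as deleted.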

Figure: Windows Directory

So, with long file names, the file name is actually composed of several 32-byte directory entries. One record follows the short file name format; each additional record, stored on disk immediately ahead of the short-name record, can use most of its 32 bytes to hold the additional characters. After all, there is no need to keep repeating the file’s various time and date stamps or file size.
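
As a rough sketch of how the pieces of a long file name fit back together, the Python below reassembles a name from its 32-byte records. It assumes the usual VFAT layout, which the text does not spell out: the sequence identifier in byte 0 (low six bits), 16-bit characters at offsets 1-10, 14-25 and 28-31, and an unused tail padded with a 0000h terminator followed by FFFFh. The function name is hypothetical.

```python
def long_name_from_entries(entries: list) -> str:
    """Rebuild a long file name from its 32-byte long-name directory records."""
    pieces = {}
    for raw in entries:
        if raw[11] != 0x0F:                 # skip anything that is not a long-name record
            continue
        sequence = raw[0] & 0x3F            # position of this piece within the name
        chars = raw[1:11] + raw[14:26] + raw[28:32]   # 5 + 6 + 2 characters
        pieces[sequence] = chars.decode("utf-16-le")
    name = "".join(pieces[number] for number in sorted(pieces))
    return name.split("\x00", 1)[0]         # drop the terminator and padding
```

Because each record carries its own sequence identifier, the records can be passed in any order; a name of up to 13 characters needs one record, up to 26 characters needs two, and so on.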

The attributes are a single byte that is bit mapped; in other words, each bit has its own meaning. Six of the eight bits are used to identify whether the entry is Read-Only, a Hidden file, a System file, a Disk Drive volume name, a Subdirectory name, or marked Archive. A particular combination of these bit values, 0Fh, indicates that the directory entry is part of a long file name.
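
A bit-mapped byte like this is easy to decode with bitwise AND. The sketch below uses the standard FAT bit assignments (01h through 20h); the dictionary and function names are illustrative only.

```python
ATTRIBUTE_BITS = {
    0x01: "Read-Only",
    0x02: "Hidden",
    0x04: "System",
    0x08: "Volume Name",
    0x10: "Subdirectory",
    0x20: "Archive",
}
LONG_FILE_NAME = 0x0F   # Read-Only + Hidden + System + Volume Name together


def decode_attributes(value: int) -> list:
    """Return the names of the attribute flags set in a directory entry's attribute byte."""
    if value & 0x3F == LONG_FILE_NAME:
        return ["Long File Name"]
    return [name for bit, name in ATTRIBUTE_BITS.items() if value & bit]


print(decode_attributes(0x21))  # ['Read-Only', 'Archive']
print(decode_attributes(0x0F))  # ['Long File Name']
```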

The next byte is reserved and not generally used. It is used, however, by the Windows NT and Novell operating systems to hold the first character of the deleted file name–the one replaced with E5h.

The next five bytes are for the file creation date and time. This feature exists only in Windows 98 and later systems. Prior to that, these bytes were reserved and had no meaning. The creation date and time identify when the file was created on the drive. If the file was copied from some other drive, then its actual creation date and time could be much earlier.

Five bytes are required for the file creation date and time, versus only four bytes for the file update date and time, because the creation time is recorded to a resolution of 10 milliseconds. The extra byte holds that fine-grained count; the update time lacks this level of precision and is stored only to the nearest two seconds.
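
To make the five-byte versus four-byte distinction concrete, here is a hedged Python sketch of decoding these stamps. It assumes the standard FAT packing: a 16-bit date word (year since 1980, month, day), a 16-bit time word (hours, minutes, seconds divided by two), and, for the creation stamp only, one extra byte counting 10-millisecond units. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def decode_fat_datetime(date_word: int, time_word: int, ten_ms_units: int = 0) -> datetime:
    """Convert packed FAT date and time words into a datetime.

    ten_ms_units is the extra creation-time byte; leave it at 0 for the
    update (last write) stamp, which has no such byte.
    """
    year = 1980 + (date_word >> 9)
    month = (date_word >> 5) & 0x0F
    day = date_word & 0x1F
    hour = time_word >> 11
    minute = (time_word >> 5) & 0x3F
    second = (time_word & 0x1F) * 2         # stored in two-second steps
    stamp = datetime(year, month, day, hour, minute, second)
    return stamp + timedelta(milliseconds=10 * ten_ms_units)
```

An entry whose date word is zero (no stamp recorded) has month and day of 0 and will raise ValueError, so a real tool would need to check for that case first.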

The last access date is also a feature found in Windows 98 and later systems. Notice that only the date value is captured; no time value is recorded. The access date is changed whenever the file is used or its directory entry is viewed in applications like Windows Explorer.

The high cluster number was a feature added for 32-bit Windows systems, Windows 95 and later. Notice that both the high cluster number and the cluster number are 16-bit values. The two are combined, with the high cluster number as the upper 16 bits, to form the 32-bit starting cluster number used on these systems.
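
Combining the two 16-bit values is a single shift and OR. In this small Python sketch the function name is illustrative, and the sample values were chosen so that the result matches cluster 534,638, the directory cluster mentioned later in this article.

```python
def starting_cluster(high_word: int, low_word: int) -> int:
    """Join the high and low 16-bit cluster fields into one cluster number."""
    return (high_word << 16) | low_word


print(starting_cluster(0x0008, 0x286E))  # 534638
```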

The update date and time is also known as the last write date and time. These values capture the date and time when the file was last modified. Unlike the creation date and time, these values remain unchanged when a file is copied from one drive to another.

The remaining data element is the file size in bytes. With only 4 bytes to work with, however, the maximum file size that a directory entry can record is 4 gigabytes.
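
Putting the fields together, a whole 32-byte short-name entry can be unpacked in one call. The Python sketch below assumes the standard FAT32 field order and offsets, since the diagram is not reproduced here; the dictionary keys are made-up labels, and the date and time words are left undecoded.

```python
import struct

# name(8) ext(3) attr reserved create-10ms, then seven 16-bit words, then size(4)
ENTRY_FORMAT = "<8s3sBBBHHHHHHHI"   # little-endian, 32 bytes in total


def parse_entry(raw: bytes) -> dict:
    """Split a 32-byte short-name directory entry into its fields."""
    (name, ext, attributes, reserved, create_10ms, create_time, create_date,
     access_date, cluster_high, write_time, write_date, cluster_low,
     size) = struct.unpack(ENTRY_FORMAT, raw)
    return {
        "name": name.decode("latin-1").rstrip(),   # E5h decodes as the å symbol
        "extension": ext.decode("latin-1").rstrip(),
        "attributes": attributes,
        "reserved": reserved,
        "created": (create_date, create_time, create_10ms),
        "accessed": access_date,
        "written": (write_date, write_time),
        "start_cluster": (cluster_high << 16) | cluster_low,
        "size": size,
    }
```

The 4-byte size field is unpacked as an unsigned 32-bit integer, which is where the 4 gigabyte ceiling comes from.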

Deleting Files

As indicated previously, when files are deleted they are not actually erased from disk. Rather, only the directory entry and the file’s entries in the FAT are changed. With respect to the directory entry, all that happens is that the first character in the file name is changed to E5h. With respect to the FAT, all the values in the file’s cluster chain are changed to zero. Notice that the first cluster number in the file’s directory entry remains unchanged.

Since only the directory and FAT entries are affected by a deletion, the original file data is still on disk. All that needs to be done in order to recover a file is to change the first character in the file name back to some character other than the E5h deletion character and reload the FAT with the appropriate cluster chain entries. If the file was not stored in contiguous clusters, it could be difficult to recover the entire file, but if the disk is not terribly fragmented, the chances are good that the file clusters will be contiguous.
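
The recovery steps just described can be sketched in Python against in-memory copies of the directory entry and the FAT. This is an illustration under stated assumptions, not a recovery utility: it assumes the file occupied contiguous clusters, that the FAT is held as a simple list of integers, and that 0FFFFFFFh (the FAT32 end-of-chain value) marks the last cluster. All names are hypothetical.

```python
END_OF_CHAIN = 0x0FFFFFFF   # assumed FAT32 end-of-chain marker


def undelete(entry: bytearray, fat: list, first_char: str,
             start_cluster: int, file_size: int, cluster_size: int) -> list:
    """Restore a deleted entry's first character and rebuild a contiguous chain."""
    if entry[0] != 0xE5:
        raise ValueError("entry is not marked as deleted")
    count = -(-file_size // cluster_size)            # clusters needed, rounded up
    clusters = list(range(start_cluster, start_cluster + count))
    if any(fat[cluster] != 0 for cluster in clusters):
        raise ValueError("a needed cluster has been reallocated")
    entry[0] = ord(first_char)                       # undo the E5h marker
    for current, following in zip(clusters, clusters[1:]):
        fat[current] = following                     # each cluster points to the next
    if clusters:
        fat[clusters[-1]] = END_OF_CHAIN
    return clusters
```

The check that every needed FAT entry is still zero is a guard against the case where another file has since been written into the freed clusters.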

It is interesting that not even the directory entry is deleted. So, whether or not a file used to exist on the disk can be determined simply by reviewing the directory lists. The figure illustrating the directory index contents shows the first two sectors of a directory named RECOVERY. That directory contains one subdirectory named ORIGINALS and a number of 32-byte directory entries, all recorded in cluster 534,638.

 

Figure: Directory Index

The figure is composed of three windows. There is the big window showing the contents of the RECOVERY directory and the offset for each entry. There is another window on top that shows the decoding of the directory entry for a file named “Drawing.WPD”. In between these windows is another smaller window that shows the directory name, subdirectory list and the directory’s cluster chain.

Notice that most of the directory entries are for file names starting with the å (E5h) symbol, which means that those files have been deleted. On the next-to-last line is a file name without the å symbol, named “Drawing WPD”.

I use the file Drawing.WPD to demonstrate file deletion and recovery. It is easy to see how many times the file has been deleted and recovered. Each time the file was recovered a new directory entry was created.
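
The same observation can be made programmatically by walking a directory’s raw bytes in 32-byte steps and collecting the entries that begin with E5h. The following Python is a minimal sketch; the function name is hypothetical, and long-name records (attribute 0Fh at offset 11) are skipped so that only short-name entries are listed.

```python
def deleted_entries(directory: bytes) -> list:
    """Return (offset, name) pairs for deleted short-name entries in a raw directory."""
    found = []
    for offset in range(0, len(directory) - 31, 32):
        entry = directory[offset:offset + 32]
        if entry[0] == 0xE5 and entry[11] != 0x0F:
            # latin-1 turns the E5h marker into the å symbol seen in the figure
            found.append((offset, entry[:11].decode("latin-1")))
    return found
```

Counting how many returned names share the same last ten characters gives a rough tally of how often a file such as Drawing.WPD was deleted.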

This interesting feature applies to all kinds of computer objects. Shortcut files, which have a LNK extension, are just one example. By examining directory entries for shortcut files, the examiner may be able to spot machine configurations that were not apparent on a physical examination of the machine, or even links to files or other programs that have been deleted or uninstalled.

Moving files has a similar effect. The file name entry is deleted in the original directory and a new entry is created in the new directory. If the movement is on the same drive volume then the other parameters such as file creation date and time are unchanged. If the file is moved to a new drive volume then a new creation date and time is established.

Performing Disk Analysis

With the right tool, disk analysis is easier than many might think. All that is needed is a disk editor that will allow the user to examine the contents of the disk at a level below the file level. It is at this lower level where one can inspect the contents of boot records, directory tables, FATs, allocated space, slack space and free space.

All it would take to search the boot records, directory tables and FATs would be to navigate to their locations and view their contents. By knowing their layouts it is easy to decode their contents as illustrated in the previous figure of the Directory Index.

Allocated, slack and free space are as easily searched. Text saved in files is typically stored as ASCII byte values, which a disk editor displays in hexadecimal. Consequently, simply performing text-based word searches across the disk for items of interest can quickly yield profitable results. The figure illustrating a text-based search across the disk shows the result of a search for the word “drawing”.

The offset column shown in the figure indicates the starting byte number of each row when the disk is viewed in 16-byte increments. The first letter of the word “drawing” occurs at byte 2,745,119,923 on the disk. The hexadecimal representation of the value at that location is “44”, the ASCII code for an uppercase “D”. The far right column shows the ASCII representation of the stored value at each byte’s location. This particular collection of text is located in disk cluster number 333,659, as indicated in the lower left of the figure. This entry is also part of unallocated space (a deleted file), as indicated by the “?” in the lower left. Had the entry been part of a file in allocated space, the file name would appear in place of the question mark.
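
A text search of this kind can be approximated in Python over the raw bytes of a disk or image. The sketch below is illustrative only: the function and key names are made up, the data-area start and cluster size are values the caller must supply, and the cluster calculation assumes the FAT convention that the first data cluster is numbered 2. The search ignores case, which is why a stored “D” (44h) matches the search word “drawing”.

```python
import re


def search_image(image: bytes, word: str, data_start: int, cluster_size: int) -> list:
    """Find every occurrence of an ASCII word in raw disk bytes, ignoring case."""
    pattern = re.compile(re.escape(word.encode("ascii")), re.IGNORECASE)
    hits = []
    for match in pattern.finditer(image):
        offset = match.start()
        hits.append({
            "offset": offset,                          # byte position of the first letter
            "row": offset - offset % 16,               # start of its 16-byte display row
            "hex": image[offset:offset + 1].hex().upper(),
            "cluster": (2 + (offset - data_start) // cluster_size
                        if offset >= data_start else None),
        })
    return hits
```

For a real drive the bytes would have to be read in chunks or memory-mapped rather than held in a single bytes object.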

In addition to just searching the disk contents, some of the other places that one would want to investigate on a Microsoft Windows based system are Thumbs.DB, Index.DAT, the registry, printer spool files and the recycle bin.

Figure: Text Based Search

The Thumbs.DB file is a hidden system file that contains a reduced-size copy of every image in the folder so that the images can be shown as thumbnails when the folder is viewed. Even after images have been deleted from the folder, their thumbnails could still exist in the Thumbs.DB file along with their modification dates. So, if graphics are your interest, this will definitely be one place that you will want to look.

Index.DAT is the file used by Internet Explorer to keep track of the web content it has cached. The Index.DAT file captures the URL, the date that the page was last modified by the server and the date that the URL was last accessed by the user.
The registry is a hierarchical database used on all 32 bit Windows systems. The registry is used to store settings and parameters for program operation. When programs are installed their existence is recorded in the registry. If the program has user settings those, too, are often recorded in the registry. The registry can be a valuable tool for comparing current machine configuration to its prior configurations, since it is possible that not all of the registry keys for previously installed programs would be removed upon the program’s removal.

Printer spool files are identified by either the SPL or SHD file extensions. SHD files will contain information about a print job, including the owner, the printer, the name of the file printed and the printing method. SPL files can contain the data to be printed along with the name of the file, and a list of files that contain the data to be printed.

Within the Recycle Bin is a file named INFO or INFO2 depending on the version of Windows. The INFO file contains the information about a deleted file such as its original location and its deletion date. Since system file deletions do not travel through the Recycle Bin its contents are limited to those files having been deleted by a user. When the Recycle Bin is emptied the INFO file is also deleted. It can be recovered as with any other deleted file as long as the file has not been overwritten.

Preserving Evidence

Analysis of disk drives requires significantly more care than analysis of other forms of electronic data. This is because important evidence can be destroyed simply by turning on the machine. Remember that when a Windows based machine boots up, files are executed, the last accessed date stamp is changed on those files, and a recycle bin is created if one did not previously exist. In addition, swap file or page file contents are overwritten.

So, the computer should never be turned on, and the drive should be accessed only as a slave drive in a system that is designed to prevent changes being made to it. To make copies of the drive so that analysis can be performed, either a clone should be made or the drive should be imaged.

A clone is a bit-for-bit copy of the original drive. Some programs like Norton Ghost do not perform complete clones. Instead they only clone the active files on the drive. Hence, they do not clone free space. So, it is essential that the cloning mechanism perform a complete clone of the original evidence drive.

Also, when the clone is created it is necessary to ensure that the duplicate drive is free from any old data. Many forensic grade cloning tools will overwrite any additional sectors with zeros to make sure that there is no chance of contamination from errant data on the new drive. Even if the new drive is straight from the box it could have data left from the final production or testing process that while innocuous to a normal user could be devastating for the forensic investigator.

When drives are imaged their bit-for-bit representations are captured in a data file. The file can then be used to restore duplicate copies of the original hard drive to other drives. In addition, imaging systems usually provide analysis tools that allow the user to analyze the contents of the image file.
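
In its simplest form, imaging amounts to reading the source from the first byte to the last and writing those bytes into a file. The Python sketch below illustrates the idea only; it is not a substitute for a forensic imaging system and does nothing to write-protect the source. The function name and paths are hypothetical, and the SHA-256 digest is an addition not described in the text, included as one way to confirm later that the image file has not changed.

```python
import hashlib


def image_drive(source_path: str, image_path: str, chunk_size: int = 1024 * 1024) -> str:
    """Copy every byte of the source into an image file and return its SHA-256 digest."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:               # end of the source reached
                break
            image.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

On most systems the source path would be a raw device name rather than an ordinary file, and opening it typically requires administrative rights.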

There are several advantages to imaging a drive instead of cloning a drive. First, drive images are more portable, since the user can capture the image file on a drive of any size. It is not necessary that the capture drive match the geometry of the source drive. Second, drive images are more durable, since it is less likely that their contents can be altered simply by operating the computer on which the imaging system resides. The images are also easily moved from system to system, since each is simply a data file.

 


 

When Every Move Matters

2550 Northwinds Parkway, Suite 275, Alpharetta, Georgia 30004
Copyright 2008 K&F Consulting Inc. This site is for informational purposes only. For technical advice please contact a representative.