K&F Consulting Inc.

What Evidence is Obtained Through Computer Forensics?
 By Todd L. Dietrich and Gregory Fordham
(Page 4)
 

    • Stay in Your Own Backyard

Many litigators have previously tried acquiring their opposition’s electronic data.  Typically, they have failed for one of two reasons.  First, they were not able to exploit the data once they had it.  That problem is discussed in the section on exploitation and analysis.

The second reason that they failed was they did not specify that the data be delivered in a usable form.  As a result, the data was delivered in its native format.  To use the data in its native format the litigator would have to have the software used by the opposition to run the data.

The solution to this latter problem is to go with what you know and play in your own backyard.  This is easily accomplished by requesting the data in a particular, yet standard, format.  That way the litigator can take the data and use it in any system that he might want to use to examine the data.

ASCII [American Standard Code for Information Interchange] format, either fixed length or delimited, is the most widely accepted data format.  Every database system should be able to save its data in ASCII format and any system used by the litigator for analyzing the data should be able to import data in this format.
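The two ASCII layouts mentioned above can be read with very little code.  The following is a minimal sketch in Python; the sample records, field names and column positions are hypothetical and exist only to illustrate the difference between a delimited file and a fixed length file.

```python
import csv
import io

# Hypothetical sample data; the field names and widths are assumptions.
DELIMITED = "emp_id,name,hours\n1001,Smith,40\n1002,Jones,45\n"
FIXED = (
    "1001" + "Smith".ljust(10) + "040" + "\n"
    + "1002" + "Jones".ljust(10) + "045" + "\n"
)
# (field name, start, end) character positions for the fixed length layout.
FIXED_LAYOUT = [("emp_id", 0, 4), ("name", 4, 14), ("hours", 14, 17)]


def read_delimited(text):
    """Parse comma-delimited text whose first line names the fields."""
    return list(csv.DictReader(io.StringIO(text)))


def read_fixed(text, layout):
    """Slice each line into fields by character position."""
    rows = []
    for line in text.splitlines():
        rows.append({name: line[start:end].strip() for name, start, end in layout})
    return rows


delimited_rows = read_delimited(DELIMITED)
fixed_rows = read_fixed(FIXED, FIXED_LAYOUT)
```

Note that a fixed length file carries no field names of its own, so the layout must come from the producing party's documentation.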

Besides ASCII, many systems are designed to download data into a number of other popular data formats.  These include database formats such as dBase, Microsoft Access and Corel Paradox, as well as the formats of popular spreadsheet programs like Lotus 1-2-3 or Microsoft Excel.

Of course getting the data and using it can still be two different things.  For example, just because the data is produced in a spreadsheet format like Microsoft Excel does not mean that it can be analyzed in a spreadsheet.  Typically, there are constraints regarding how many rows a spreadsheet can contain even though there are no constraints about how many rows can be exported to a file in Excel format.  So, the litigator should be sure that his analysis tools can match his data delivery.
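One way to check this before committing to a tool is to count the records in the delivery and compare the count to the worksheet row limit.  The sketch below assumes the limits Microsoft documents for Excel worksheets (65,536 rows for the older .xls format and 1,048,576 rows for .xlsx); the function names are hypothetical.

```python
# Worksheet row limits as documented by Microsoft for Excel.
ROW_LIMITS = {"xls": 65_536, "xlsx": 1_048_576}


def count_records(lines, has_header=True):
    """Count non-blank lines, excluding the header line if there is one."""
    n = sum(1 for line in lines if line.strip())
    return n - 1 if has_header and n else n


def fits_in_sheet(record_count, fmt, has_header=True):
    """Return True if the records (plus any header row) fit in one worksheet."""
    return record_count + (1 if has_header else 0) <= ROW_LIMITS[fmt]


sample = ["emp_id,hours", "1001,40", "1002,45", ""]
sample_count = count_records(sample)
```

If the count exceeds the limit, a database tool is likely the better choice for the analysis.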

    •  Be Sure Not to Pack

A number of database systems are something like hard drives–deleting a record does not actually delete the record.  Instead, deleting a record tags the record for deletion but the actual deletion does not occur until the database has been “packed”.  When those types of databases are being used the litigator should consider getting those records tagged for deletion before the table has been packed.

In these types of databases the records tagged for deletion can be recovered anytime before the database table is packed.  Litigators can identify which records have been tagged for deletion if they ask that the opponent’s database records be provided two different ways.  First, ask for the extract from unrecovered database tables and then obtain the same extraction from the recovered database tables.  The difference between the two data sets would be the records marked for deletion but not yet packed.
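The comparison of the two extracts is a simple set difference.  The following sketch assumes each record carries a unique key field, here hypothetically named "id"; the sample rows are invented for illustration.

```python
def tagged_for_deletion(unrecovered_rows, recovered_rows, key="id"):
    """Return rows present in the recovered extract but absent from the
    unrecovered extract, matched on a unique key field."""
    live_keys = {row[key] for row in unrecovered_rows}
    return [row for row in recovered_rows if row[key] not in live_keys]


# Hypothetical extracts: the recovered table includes one extra record.
unrecovered = [{"id": 1, "memo": "invoice"}, {"id": 3, "memo": "credit"}]
recovered = [
    {"id": 1, "memo": "invoice"},
    {"id": 2, "memo": "side letter"},
    {"id": 3, "memo": "credit"},
]
deleted = tagged_for_deletion(unrecovered, recovered)
```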

    •  No Limits

Remember that there is typically more data in the electronic data than is visible in the paper version.  For example, there could be system generated input dates, record tags, and various audit trail information.  So, it is a mistake for the litigator to try and list the individual data elements to be delivered.  Instead, the litigator should ask for all the data fields (columns) within a table and all the records (rows) within a table matching the relevant case criteria.

Most databases used in business applications are relational.  When dealing with relational databases, remember that they are not composed of one data table but rather of numerous data tables that can be related to one another in order to produce an outcome.  When the tables are combined, or “related”, they produce the complete transaction cycle.  So, be sure to ask for all the tables including any lookup and data validation tables.
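To see why the lookup tables matter, consider the following sketch, which uses Python's built-in sqlite3 module.  The table names, field names and codes are hypothetical.  Without the lookup table the codes in the transaction table are meaningless; relating the two tables restores the meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table and transaction table.
    CREATE TABLE pay_codes (code TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE timesheets (
        entry_id INTEGER PRIMARY KEY, emp_id INTEGER, code TEXT, hours REAL
    );
    INSERT INTO pay_codes VALUES ('R', 'Regular hours'), ('O', 'Overtime hours');
    INSERT INTO timesheets VALUES (1, 1001, 'R', 40), (2, 1001, 'O', 5);
""")

# Relate each timesheet entry to the description of its pay code.
related = conn.execute("""
    SELECT t.entry_id, t.emp_id, p.description, t.hours
    FROM timesheets AS t
    JOIN pay_codes AS p ON p.code = t.code
    ORDER BY t.entry_id
""").fetchall()
```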

It is also best to ask for the tables in normalized form.  With respect to relational databases, “normal” is a term of art.  Relational database tables are “normalized” as part of their design process to optimize a number of different attributes: minimizing space, maximizing speed, facilitating analysis and facilitating maintenance.  By asking for normalized tables the litigator is asking that the data be provided in the same optimum configuration as was designed for its use.  This should facilitate the litigator’s subsequent analysis and also prevent the opposition from combining or denormalizing tables.

    •  Mark the Trail

Relational database systems, particularly when they model complex business processes, can be complex themselves.  The litigator’s analysis of these complex models can be expedited if the trail is well marked.  There are three types of documentation that the litigator will ideally want to acquire in order to mark the trail.

Since the discovery data is produced by the opposition, the first piece of documentation that the litigator will want is their production procedures.  He can get these by asking for the SQL scripts used to prepare the data downloads.  Of particular interest is whether the opposition specifically identified the particular fields to be extracted or whether they used the “*” character, which is the SQL equivalent of all fields.  If the opposition specifically identified the fields to be produced they could be hiding important data elements.

Also, the litigator is interested in any selection criteria that the opposition may have used in its SQL scripts to limit the result set.  These limitations will be obvious in the “WHERE” or “HAVING” sections of the SQL script, as well as in the “FROM” section if other than full table joins were used.
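A first pass over a production script can be automated.  The following is a crude sketch, not a SQL parser: it uses regular expressions to flag whether the first SELECT uses “*”, whether WHERE or HAVING appears anywhere, and which kinds of joins appear.  It will be fooled by keywords inside quoted strings, by subqueries, and by forms such as “t.*”, so it is a screening aid rather than a substitute for reading the script.  The function name and sample script are hypothetical.

```python
import re

JOIN_PATTERN = re.compile(
    r"\b((?:INNER|LEFT|RIGHT|FULL|CROSS)(?:\s+OUTER)?\s+JOIN|JOIN)\b", re.I
)


def review_extract_script(sql):
    """Flag features of an extraction script that can limit the result set."""
    text = re.sub(r"--[^\n]*", " ", sql)      # drop line comments
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    match = re.search(r"\bSELECT\s+(?:DISTINCT\s+)?(.*?)\s+FROM\b", text, re.I)
    select_list = match.group(1).strip() if match else ""
    return {
        "selects_all_fields": select_list == "*",
        "has_where": bool(re.search(r"\bWHERE\b", text, re.I)),
        "has_having": bool(re.search(r"\bHAVING\b", text, re.I)),
        "join_types": [j.upper() for j in JOIN_PATTERN.findall(text)],
    }


# Hypothetical production script.
script = """
SELECT emp_id, hours          -- named fields, not *
FROM timesheets t
LEFT JOIN pay_codes p ON p.code = t.code
WHERE t.hours > 0
"""
findings = review_extract_script(script)
```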

The second thing that the litigator will want the opposition to produce is the set of database table schemas, also known as dataset descriptions.  These descriptions identify all of the fields and their attributes in each table provided.  The litigator can use these to confirm receipt of all data fields and to document how the data is actually arranged when he starts the analysis.
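Confirming receipt of all data fields amounts to comparing two lists of field names.  A minimal sketch follows; the field names are hypothetical, and in practice the documented list would be transcribed from the schema and the delivered list read from the file header or table definition.

```python
def compare_fields(documented, delivered):
    """Compare the fields named in the schema to the fields actually delivered."""
    documented_set, delivered_set = set(documented), set(delivered)
    return {
        "missing_from_delivery": sorted(documented_set - delivered_set),
        "undocumented_in_delivery": sorted(delivered_set - documented_set),
    }


# Hypothetical schema listing versus the columns found in the delivered table.
schema_fields = ["emp_id", "code", "hours", "entered_by", "entered_on"]
delivered_fields = ["emp_id", "code", "hours", "batch_no"]
field_report = compare_fields(schema_fields, delivered_fields)
```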

The schemas may also identify the significance of any codes that have been used as semaphores.  For example, in a payroll system the difference between regular hours and overtime hours may be designated by a particular code.  If the universe of such codes is really small the database designer may have chosen not to use another table to identify the significance of these codes.  Rather, they may be documented in the schemas themselves.

The final type of documentation that the litigator should obtain is any data diagrams that illustrate the relationships of the tables, as well as any other database documentation.  The preceding example of a relational database design is the type of additional database diagram that the litigator should try to acquire.

If the litigator is not able to obtain any of the above documentation, his excursion into the electronic data of the opposition’s databases is not foiled.  An experienced database designer can still re-engineer the design and make the data useful, particularly if the data has been delivered in normalized form.  After all, the experienced hacker does not need the password to a computer.  It just takes him a little longer to gain access without it.

    •  Database Analysis

More than likely everyone’s database data will be very different, not only in content but in structure as well.  Just because two companies use the same accounting software does not mean that their financial data or their accounting system will be similar.  So, it is impossible to devise any standardized analysis for databases other than some tests to determine that the data delivery was complete and reliable.

As mentioned earlier, one of the things that the litigator can do to confirm a complete database delivery is to analyze the SQL scripts to determine how the data was selected and whether the selection criteria are compliant with the production request.  Next the litigator can examine the schema and confirm that the data delivered matches the schemas.

Next, by examining the database structure the litigator can check for widows and orphans, unmatched foreign keys and fields with null values, and can study the key fields and table indexes.  The existence of widows and orphans means that the produced data contains records that cannot be related to anything else.  This could be a sign that there was an error in the logic of the selection criteria or that otherwise responsive data has been omitted.

Foreign keys are the table fields that are used to relate records in one table to the records in another table.  An unmatched foreign key is a value in such a field that has no corresponding record in the related table.  The fact that there are unmatched foreign keys means that either an entire table was not delivered or that selected records were not delivered that otherwise should have been included in order to deliver the database in normalized form.
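Unmatched foreign keys can be found with an outer join that keeps every record in the referencing table and then selects those with no partner in the related table.  The sketch below uses Python's built-in sqlite3 module; the tables and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical delivery: invoice 11 refers to a customer that was not produced.
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (inv_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Baker');
    INSERT INTO invoices VALUES (10, 1, 250.0), (11, 3, 75.0), (12, 2, 40.0);
""")

# Keep every invoice, then select those whose customer record is absent.
orphans = conn.execute("""
    SELECT i.inv_id, i.cust_id
    FROM invoices AS i
    LEFT JOIN customers AS c ON c.cust_id = i.cust_id
    WHERE i.cust_id IS NOT NULL AND c.cust_id IS NULL
    ORDER BY i.inv_id
""").fetchall()
```

Running the same test in the other direction, from customers to invoices, would show customers with no invoices, which may or may not be meaningful depending on the production request.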

Null values are fields in the database records with no value.  Null fields are different from fields with zero value.  Zero means zero.  Null means nothing.  The twelve rules for relational database systems devised by E.F. Codd, one of the developers of the relational database model, include a rule that null values must be handled systematically and kept distinct from zero or blank values, and the relational model does not permit null values in primary key fields.  Although many database designers allow null values freely, null values can cause problems for relational databases.  When the litigator finds tables with null values he must determine whether these are another signal that something is amiss with the data produced or whether they are simply the result of a permissive design.  If the null values occur in a foreign key field then there is very likely a problem.  Also, if the null values occur in important system fields such as userids, transaction dates, etc. then that is another signal that a problem is likely present with the data delivery.  If a null value just happens to occur in a field for a cell phone number then perhaps there simply is no cell phone number to be captured.
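A quick profile of null values by field shows where to look first.  The following sketch uses Python's built-in sqlite3 module and a hypothetical table; the table and column names are interpolated directly into the SQL text, so it should only be used with names the examiner trusts.

```python
import sqlite3


def null_counts(conn, table):
    """Return {column name: number of NULL values} for one table."""
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    counts = {}
    for col in columns:
        (n,) = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" IS NULL'
        ).fetchone()
        counts[col] = n
    return counts


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: a missing cell phone is plausible, a missing userid is not.
    CREATE TABLE entries (entry_id INTEGER, userid TEXT, cell_phone TEXT);
    INSERT INTO entries VALUES (1, 'jsmith', NULL), (2, NULL, NULL), (3, 'bjones', '555-0100');
""")
profile = null_counts(conn, "entries")
```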

Next, the study of index and key values can reveal whether records have been omitted and whether they were contemporaneously entered or entered at a later time.  For example, suppose that a particular record in a database table had a field for transaction date.  On the surface the date matches other records having similar dates for that particular transaction event.  A study of the key fields and table indexes might reveal that those values are inconsistent with a transaction having a date equal to the one in that particular record.  While this might not signal an omission of data, it could signal that the opposition entered the transaction into their database system after the fact and gave it a false date.
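One such test can be sketched as follows.  It assumes, and this is only an assumption, that the key field is assigned sequentially as records are entered, so that transaction dates should generally not move backward as the key increases.  A record whose date is earlier than a date already seen on a lower key is flagged for review.  Legitimate late entries will also be flagged, so the output is a list of candidates and not a finding.

```python
from datetime import date


def backdated_candidates(records):
    """records: iterable of (key, transaction_date) pairs.
    Returns keys whose date precedes the latest date seen on a lower key."""
    flagged = []
    latest = None
    for key, txn_date in sorted(records):
        if latest is not None and txn_date < latest:
            flagged.append(key)
        else:
            latest = txn_date
    return flagged


# Hypothetical records: key 4 carries a date older than its neighbors.
sample_records = [
    (1, date(2007, 1, 5)),
    (2, date(2007, 1, 6)),
    (3, date(2007, 1, 6)),
    (4, date(2006, 12, 1)),
    (5, date(2007, 1, 7)),
]
candidates = backdated_candidates(sample_records)
```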

Another method that can be employed to test the veracity of the database production is to compare the data structure of the database to the data elements on system input forms and output reports.  If data is being captured or produced that does not appear in the database production then that is certainly something that should be subsequently investigated.

4. Deleted Data, Temporary Files, Hidden Files and Encrypted Files

Certain relevant data may not be easily visible, but that does not mean that it is not easily accessible.  A typical extraction process conducted by an e-discovery vendor will not capture deleted data, and may not capture temporary, hidden or encrypted files.  Even if they can be captured, the e-discovery vendor may not be prepared to process and analyze them.  Thus, the only sure way to capture these types of data is by having them acquired and analyzed in a forensic examination.

    • Deleted Data

      Deleted files can often be recovered easily, provided their contents have not been overwritten, because the file system entries for them remain intact and include a flag which notes their status.  Numerous utilities, both forensic and non-forensic, exist to recover such files.
    • Temporary Files

      Some applications, Microsoft Word for instance, create temporary (temp) files while an existing file is being edited.  One reason this occurs is so that all changes are written to the temp files, and not to the original file.  When the file is saved, the temp file becomes the file, and the original file is deleted.  Its position in the file system is replaced by the new version.  The prior version still exists on the media, but since there is no file system information noting its location on the media, it can only be recovered by data carving.

      As a user works on a Word file over a period of time, numerous temp files will be created.  When the file is saved and then closed, all the temp files are deleted.  Using a data carving tool, these temp files can be recovered and the changes made to the file can become visible.
    • Hidden Files

      There are files and/or folders relating to the functioning of the operating system, which are normally hidden from view.  It is also possible for a user to set the hidden attribute on any file they wish.  Such a file will not be visible unless the “Show Hidden Files and Folders” option is selected.  Many e-discovery extraction tools will not see or capture hidden files.  Forensic tools will show all files on the media no matter what attribute is set.
    • Encrypted Files

      Files can be encrypted in a few different ways.  It may be something simple like password protection, or it may be a complete encryption.  If the file is simply password protected, numerous tools exist which crack passwords and open such files.  If, however, the file has actually been encrypted with some sort of separate tool, breaking the encryption could be an extremely time and resource consuming task that may never complete in this lifetime.  Forensic examiners employ numerous methods to look for passwords stored on a hard drive in an attempt to avoid having to conduct time consuming brute-force attacks.  However, if the password of the encryption program cannot be located on the hard drive, any files encrypted by the program will effectively be unrecoverable.
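The arithmetic behind a brute-force attack shows why a stored password is so valuable.  The sketch below computes the number of possible passwords for a given alphabet size and length, and the time needed to try them all at a given guessing rate.  The rate used in the example is a hypothetical figure chosen only for illustration, not a measurement of any tool.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def keyspace(alphabet_size, length):
    """Number of distinct passwords of exactly the given length."""
    return alphabet_size ** length


def years_to_exhaust(alphabet_size, length, guesses_per_second):
    """Years needed to try every password at a constant guessing rate."""
    seconds = keyspace(alphabet_size, length) / guesses_per_second
    return seconds / SECONDS_PER_YEAR


# Four lowercase letters versus twelve printable ASCII characters,
# at a hypothetical rate of one billion guesses per second.
short_space = keyspace(26, 4)
long_years = years_to_exhaust(95, 12, 1_000_000_000)
```

Each additional character multiplies the work by the size of the alphabet, which is why longer passwords quickly move out of reach.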

 



 

When Every Move Matters

2550 Northwinds Parkway, Suite 275, Alpharetta, Georgia 30004
Copyright 2008 K&F Consulting Inc. This site is for informational purposes only. For technical advice please contact a representative.