Canadian Lawyer


OPINION: TECH SUPPORT
By Dera J. Nevin

If you can't look properly, you may not find

Ensure your e-discovery is searchable by understanding your database's search properties.

One of the many useful attributes of e-discovery is the searchability of the data. For some, searchability is the whole point of e-discovery. Let's look at how searchability arises in electronic data sets.

Almost by definition, e-discovery involves an electronic "data set," which refers to the information collected (i.e. the discovered material). The characteristics of an electronic data set are different from those of paper data sets, and searchability is one of the most useful of them for lawyers. At the most basic level, this is obvious when you compare the searchability of paper with that of electronic data. For example, if you have several boxes of documents, it can take a while to determine how many and which documents contain the words "mortgage" or "transferor." However, if the documents are in an electronic database, you can quickly find the relevant ones.

Electronic data sets usually consist of electronic images and/or native data, which have different properties that affect not only what is searchable but also what steps you need to take to ensure the data set is searchable.

Electronic images are scanned copies of paper documents. In the litigation support industry, paper is usually scanned into .tiff format (tagged image file format). Tiffs are image files that are not searchable unless made so by the dual processes of "coding" (adding information about the image to a database, i.e. metadata, or data about data) and "OCR" (optical character recognition, a technology used with scanners so that paper information does not need to be retyped into the computer). In coding, the data added can be "objective" (readily ascertained, such as a date) or "subjective" (requiring legal or factual analysis, such as a privilege or relevance call).
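To make the contrast with paper concrete, here is a minimal sketch of the "mortgage" and "transferor" search described above. It assumes OCR has already produced plain text for each scanned image; the document IDs, the sample text, and the `find_documents` helper are all hypothetical and do not reflect any particular review platform.

```python
import re

# Hypothetical OCR output: document ID -> text recognized from the scanned image.
ocr_text = {
    "DOC0001": "This mortgage is granted by the transferor to the bank.",
    "DOC0002": "Minutes of the annual general meeting.",
    "DOC0003": "The Transferor shall deliver vacant possession on closing.",
}


def find_documents(texts, terms):
    """Map each term to the sorted IDs of documents containing it as a whole word."""
    hits = {}
    for term in terms:
        # Whole-word, case-insensitive match; re.escape guards against
        # punctuation in the search term being read as regex syntax.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        hits[term] = sorted(
            doc_id for doc_id, text in texts.items() if pattern.search(text)
        )
    return hits


results = find_documents(ocr_text, ["mortgage", "transferor"])
for term, doc_ids in results.items():
    # Answers "how many and which documents" for each word.
    print(term, len(doc_ids), doc_ids)
```

A search like this only sees what OCR recognized, so a poorly scanned page can be missed even though the word is printed on it.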
In most jurisdictions, it's now fairly standard, or required by practice direction, to code the following metadata fields: document date, author, recipient, document type, and document title. Other common fields to code are number of pages, cc, bcc, and parent-child relationship. Now that e-mails are frequently printed, scanned, and coded (more on this practice in a later article), it's also common to see to, from, sent date, and subject line coded, or rolled into the other columns.

When these metadata fields are included in a database that contains imaged documents, even one as rudimentary as a Word table with hyperlinked image files, they can facilitate searches of the associated images. In Summation, one of the most commonly used document review platforms in Canada, this metadata is placed into a kind of Excel-like table, called a data file. A "Summation load file" is a database file transfer format that creates a concordance of images to the data file, usually through a document ID number unique to each image with an associated row in the data file.

Coding and OCR are both necessary to ensure the ability to search across any words that may be found within the image. Together, they facilitate search of electronic images when these various elements are linked together within a review database. As litigation support technology has evolved, so has searchability: you can now concurrently or separately search text [...]

Image credit: Enrico Varrasso
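The link between a data file and its images can be pictured with a short sketch. This is not the actual Summation load file format: the CSV layout, the column names, the file paths, and the helper functions below are assumptions made purely for illustration. The only idea taken from the text is that a unique document ID ties each image to one row of coded metadata.

```python
import csv
import io

# Hypothetical data file: one row of coded metadata per document.
# "privileged" stands in for a subjective field; the rest are objective.
DATA_FILE = """doc_id,doc_date,author,recipient,doc_type,title,privileged
DOC0001,2009-03-14,J. Smith,A. Lee,Letter,Mortgage renewal terms,no
DOC0002,2009-04-02,A. Lee,J. Smith,Memo,Advice on transfer,yes
DOC0003,2010-01-20,J. Smith,B. Wong,Agreement,Transfer of land,no
"""

# Hypothetical load file: ties each document ID to its scanned image.
LOAD_FILE = """doc_id,image_path
DOC0001,images/DOC0001.tif
DOC0002,images/DOC0002.tif
DOC0003,images/DOC0003.tif
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def build_concordance(load_rows):
    """Map each document ID to the path of its image."""
    return {row["doc_id"]: row["image_path"] for row in load_rows}


def search(data_rows, concordance, field, value):
    """Return (doc_id, image_path) pairs whose coded field equals value."""
    wanted = value.strip().lower()
    return [
        (row["doc_id"], concordance.get(row["doc_id"]))
        for row in data_rows
        if row[field].strip().lower() == wanted
    ]


rows = read_rows(DATA_FILE)
concordance = build_concordance(read_rows(LOAD_FILE))
matches = search(rows, concordance, "author", "j. smith")
for doc_id, image_path in matches:
    print(doc_id, image_path)
```

Searching a coded field finds the matching rows, and the document ID then leads to the image a reviewer would open. If an ID has no entry in the load file, the sketch returns `None` for the image path rather than failing.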


March 2011
