Canadian Lawyer

March 2012

The most widely read magazine for Canadian lawyers

Issue link: https://digital.canadianlawyermag.com/i/56644

Page 23 of 47

TECH SUPPORT

... keyword, the computer looks first to the index and then uses the index to locate and highlight the indexed words inside the documents. If a word is not in the index, the computer may not be able to find it in the documents. That is why a well-built index is essential to search.

Indexes are built using index engines, and different document review tools contain different index engines. Not all index engines behave the same way, or include the same words or characters in their lists. It is therefore important to check how the program is building that index if you want to really understand how and whether your searching will generate the results you want.

Start by asking what the index engine is in your tool, and what it can index. Are there some words or figures it will not index? Can parts of words or numbers be indexed? What about punctuation, special characters, or elements of foreign languages?

Second, determine whether all the documents in your database that contain text are searchable. A document that contains text in the image may not, in fact, have that text available to the index engine, and that text is therefore not searchable. For example, PDF documents that are created as images, or converted to images, are not necessarily searchable. I encounter this often with PDFs that are attachments to e-mails. Often, these PDFs have been created on an office scanner that has not had its settings configured to create searchable text within the scanned document.

When there are unsearchable files in your document population, you may wish to run a secondary process across those documents to make them searchable.
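The two checks described above can be illustrated with a small Python sketch. It is not how any particular review tool works. The two tokenizers are hypothetical stand-ins for two different index engines, and the file names and sample text are invented for illustration. The sketch shows that a keyword search consults only the index, so a term one engine declines to index cannot be found, and that a document with no extractable text contributes nothing to the index at all.

```python
import re
from collections import defaultdict

# Two hypothetical tokenizers standing in for two different index engines.
def tokenize_letters_only(text):
    """Index runs of letters only; digits and punctuation are dropped."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def tokenize_keep_specials(text):
    """Index whitespace-separated tokens, keeping digits, '#', '-', '@'."""
    tokens = (t.lower().strip(".,;:!?()\"'") for t in text.split())
    return [t for t in tokens if t]

def build_index(documents, tokenize):
    """Map each indexed term to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for term in tokenize(text):
            index[term].add(name)
    return index

def search(index, keyword):
    """Look the keyword up in the index only; documents are not rescanned."""
    return sorted(index.get(keyword.lower(), set()))

def unsearchable(documents):
    """Documents that supplied no text to the indexer, e.g. image-only scans."""
    return sorted(name for name, text in documents.items() if not text.strip())

# Invented sample data (assumption): extracted text keyed by file name.
documents = {
    "memo.txt": "Invoice #2012-03 was sent to j.smith@example.com.",
    "email.txt": "Please review the invoice before Friday.",
    "scan.pdf": "",  # image-only scan: no text reached the index engine
}

letters_index = build_index(documents, tokenize_letters_only)
specials_index = build_index(documents, tokenize_keep_specials)

print(search(letters_index, "invoice"))    # ['email.txt', 'memo.txt']
print(search(letters_index, "#2012-03"))   # []
print(search(specials_index, "#2012-03"))  # ['memo.txt']
print(unsearchable(documents))             # ['scan.pdf']
```

The same search term returns different results depending on which tokenizer built the index, which is why it is worth asking what the index engine in your tool will and will not index.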
Two related processes, optical character recognition and optical word recognition, can make the letters and words in those images available to the index engine, and therefore searchable. Most litigation support systems that process files can produce a report that identifies files and file types that have not been indexed. Where files of a type that could contain text, or that do contain text, have not been indexed, you can send those files for additional processing, and rerun the index engine across those documents (or across the extracted text
