Investigating agencies noted that e-mails at Andersen and the other companies involved, which asked recipients to destroy documents, were unwarranted. On the circumstantial evidence, there was a clear intent to conceal a crime.
Following the detection of fraud, the US Congress made it obligatory for companies to put internal compliance measures in place. It set standards on which documents companies must retain, and for how long. In an atmosphere that encouraged corporate whistle-blowing, the Securities and Exchange Commission and the U.S. Sentencing Commission made it attractive to take compliance seriously. The federal bodies developed policies that, when a dispute arises between companies or between a company and the state, give credit to companies that implement reasonably effective internal controls.
As no company wants to attract punishment for corporate malpractices, good business has now become equated with good compliance.
The measures therefore combine credit for compliance with punishment for its absence, both statutory and by market forces. But crimes can be, and are likely to be, committed for various reasons. How does one find them when there are millions of documents on a corporate network?
With corporate scandals hitting the headlines almost daily, responding to the constant threat of litigation is a prime concern for corporate counsel and law firms alike. This is why discovery and risk assessment are playing an important role in large-scale litigation.
This has made data mining a critical application. Search-and-detect software engines such as Stratify Legal Discovery help with this.
According to general manager Parveen Mittal, while normal search algorithms, such as Google Search, are keyword-based, the Stratify engine discovers relationships. eDiscovery can ferret out dubious data and actions that might have been pushed under the carpet by perpetrators of corporate crime.
eDiscovery uses Bayesian statistical methods to look for associations among words and brings together groups of words that have the closest association in a given context.
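The idea of scoring associations among words can be illustrated with a small sketch. This is not Stratify's algorithm, whose details are not given here; it is a minimal illustration that assumes association is measured as the smoothed probability that one word appears in a document given that another does. The function names and the sample messages are invented for the example.

```python
from collections import Counter
from itertools import permutations

def cooccurrence_counts(documents):
    """Count how many documents contain each word and each ordered word pair."""
    doc_freq = Counter()
    pair_freq = Counter()
    for text in documents:
        words = set(text.lower().split())  # presence per document, not raw frequency
        doc_freq.update(words)
        pair_freq.update(permutations(words, 2))
    return doc_freq, pair_freq

def association(given, other, doc_freq, pair_freq, alpha=1.0):
    """Posterior mean of P(other in doc | given in doc) under a Beta(alpha, alpha) prior."""
    return (pair_freq[(given, other)] + alpha) / (doc_freq[given] + 2 * alpha)

def closest_words(word, doc_freq, pair_freq, top_n=3):
    """Return the words most strongly associated with `word`, ties broken alphabetically."""
    others = [w for w in doc_freq if w != word]
    others.sort(key=lambda w: (-association(word, w, doc_freq, pair_freq), w))
    return others[:top_n]

# Invented sample messages, for illustration only.
SAMPLE_DOCS = [
    "shred the audit files",
    "shred the audit memos",
    "lunch menu for friday",
    "audit schedule for friday",
]

doc_freq, pair_freq = cooccurrence_counts(SAMPLE_DOCS)
print(closest_words("shred", doc_freq, pair_freq, top_n=2))
```

On these sample messages, "audit" ranks among the words most closely tied to "shred", but so does the common word "the". A practical system would presumably filter out such frequent words before scoring.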
In a Bayesian distribution, a datum is plotted in a relevancy map. For example, if Tendulkar opened a hotel, this information will appear close to both cricket and hotels. Given the other categories in the data set being mapped, data that are closest in relevancy of association coalesce into associated drops, each of which acts as a filter of the best relevancy for a specific datum.
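The Tendulkar example can be made concrete with a small naive Bayes sketch. It assumes, purely for illustration, that "closeness" to a category is the posterior probability of that category given the words of the text; the categories, training snippets and function names are all invented, and the product's actual relevancy map may be built quite differently.

```python
import math
from collections import Counter

def train(labelled_docs):
    """Collect per-category word counts, document counts and the vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for label, text in labelled_docs:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def posterior(text, word_counts, doc_counts, vocab):
    """Return P(category | text) using multinomial naive Bayes with add-one smoothing."""
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # prior
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise in log space to avoid underflow on long texts.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}

# Invented training snippets, for illustration only.
TRAINING = [
    ("cricket", "tendulkar scored a century in the test match"),
    ("cricket", "the batsman hit a six in the match"),
    ("hotels", "the hotel opened a new restaurant and rooms"),
    ("hotels", "guests booked rooms at the hotel"),
    ("finance", "the bank raised interest rates on loans"),
    ("finance", "shares fell as the bank reported losses"),
]

model = train(TRAINING)
scores = posterior("tendulkar opened a hotel", *model)
for label, p in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{label}: {p:.2f}")
```

With this toy data, the sentence about Tendulkar's hotel receives a noticeably higher probability for both "cricket" and "hotels" than for the unrelated "finance" category, which is the sense in which it sits close to both.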
The methodology can be seen in action in finding anomalous documents among millions of stored, unstructured ones. The method translates the bubbles into a taxonomy. Because it has nothing to do with semantics, it is independent of language and can work with any language.

