There is a widespread belief among lawyers and other professionals that investigators, armed only with special proprietary databases, can solve all kinds of problems other professionals cannot.
While certain databases are a help, we often tell our clients that even if we gave them the output of all the databases our firm uses, they would probably still not be able to come to most of the conclusions we do. That is because databases have incomplete, conflicting outputs. You need a knowledgeable person to weigh those outputs and come up with a “right” answer that is always, at first, a best guess that requires verification.
Commercial databases are also hobbled by legal and commercial restrictions that other large information resources are not. This is one of the reasons I have argued for several years now that increases in the use of artificial intelligence will increase – not diminish – jobs for humans.
Forget what you know about SQL and relational databases. Forget about the large universes of even millions of documents put into various artificial intelligence engines for expedited document review. When users populate and control a database, they can alter the content. They can control the programming to make the databases talk to each other.
Even in hospitals, with more than a dozen different information systems struggling to interconnect, it’s possible to imagine a smoothly-running network of networks given enough computing power and programming time.
The commercial databases that depend on credit-header information, utility bills, commercial mailing lists and more are different. Commercial databases used in investigations do not play nicely with other databases for two main reasons:
- Competitive Barriers
There is only one New York Secretary of State, or one recorder of deeds in a county. They are presumed to be correct as a matter of law as long as you get a certified copy of their documents.
The commercial databases are different in that they compete with one another. Databases do not share results so that we may sort out conflicts automatically. They do not suggest, for example, that if a John R. Smith of Houston is this John R. Smith on Walnut Avenue, then this John R. Smith owns the following companies in Nevada.
Instead, one database will tell you about the man on Walnut Avenue, and a different database may give you the Nevada company information and suggest that its owner lives not on Walnut Avenue but at another address in San Diego. A third database may tell you the same person who lived on Walnut Avenue in 2014 now lives in San Diego.
You will have to stitch it all together yourself. By “you,” I mean you, a human being, and not you using some kind of database version of kayak.com that assembles travel sites and hands you results in one convenient place.
Note that for someone searching for completeness, Kayak’s no model either. If you assume that every airline and hotel in the world is listed on Kayak, you are wrong. That is because Kayak is a business that makes money from the places it lists (it’s part of a profit-making company called Booking Holdings, which also owns Booking.com). If you go to another for-profit site, Travelocity.com, and enter flights leaving from New York for Dallas, you do not get all possible options. You can fly on Southwest between those two cities, but the other day Travelocity didn’t give you that choice. Tomorrow may be different if the market for airline pricing changes.
Commercial databases are like airlines and hotels: they too are profit-making enterprises. Just as American won’t let you board with a Delta ticket, databases don’t like to share either.
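To make the John R. Smith example concrete, here is a minimal Python sketch of what the "stitching" looks like if you try to automate it. Everything in it is hypothetical: the source labels `db_a`, `db_b` and `db_c`, the `Record` fields, and the decision to match people on name alone are assumptions made for illustration, not how any real commercial database delivers its output.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    source: str                  # hypothetical database label
    name: str                    # subject name as reported by that source
    field: str                   # kind of claim, e.g. "address" or "company"
    value: str                   # what the source reports
    year: Optional[int] = None   # year of the claim, if the source gives one


def group_claims(records):
    """Map (name, field) to {value: [sources that report it]}."""
    claims = defaultdict(lambda: defaultdict(list))
    for r in records:
        # Matching on name alone is a deliberate oversimplification:
        # two different John R. Smiths would be merged here.
        claims[(r.name, r.field)][r.value].append(r.source)
    return claims


def needs_review(records):
    """Return the (name, field) pairs where sources disagree on the value."""
    grouped = group_claims(records)
    return sorted(key for key, values in grouped.items() if len(values) > 1)


# Illustrative output from three competing databases (invented data).
RECORDS = [
    Record("db_a", "John R. Smith", "address", "Walnut Avenue"),
    Record("db_b", "John R. Smith", "company", "Nevada companies"),
    Record("db_b", "John R. Smith", "address", "San Diego"),
    Record("db_c", "John R. Smith", "address", "Walnut Avenue", 2014),
    Record("db_c", "John R. Smith", "address", "San Diego"),
]

if __name__ == "__main__":
    for name, field in needs_review(RECORDS):
        print(f"{name}: conflicting {field} values")
        for value, sources in group_claims(RECORDS)[(name, field)].items():
            print(f"  {value!r} reported by {', '.join(sources)}")
```

The script can flag that the sources disagree about the address. It cannot tell you whether the man on Walnut Avenue and the owner of the Nevada companies are the same John R. Smith, or whether he moved. That judgment, and the verification that follows it, is still the investigator's job.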
- Legal Impediments to Sharing
Even if databases wanted to share information, doing so would be fraught with legal difficulty. That is because the information they offer is accessible to licensees only. These users need a permissible purpose under federal law that governs credit reports, and the varying permissible purposes yield varying amounts of information.
Each database has to review the permissible uses you enter, and each has to make sure you are a paying customer. Databases could theoretically subcontract that job to a central entity that handles their competitors as well, but with just a handful of big competitors, there’s less incentive to outsource.
The other, larger problem an information “Kayak” would have is that the real travel Kayak just gives you a small number of data points: price, when you leave and when you arrive. Database output about a person, his known residences, phone numbers, associates and more is voluminous. How would you sort the differing outputs of different competitors? “Rank by accuracy” is a laughable non-starter.
All in all, if you make your living sorting through database output and then use that to check against public filings, litigation, licenses, news stories, blogs, videos and social media, be of good cheer.
The robots haven’t even come close to usurping your duties.