Clickstream


Homeland Security's Poor Data Quality May Land You in Jail

Security laws like the USA Patriot Act show a darker side of data quality. My prior post covered data quality and how bad data makes its way through a chain of computer systems. Now imagine the impact of flawed data when that data is used for law enforcement. This is an area where an error can destroy someone's life.

Because Homeland Security and the TSA (and the FBI and the CIA and local and state police departments) are all collecting data - and thanks to the Patriot Act are free to spy on citizens who are not suspects in any criminal investigation - there is real danger when inaccuracies in one agency's data get widely dispersed. Imagine bad data propagating from one system to other databases, and from there to even more databases. Even if the originating agency corrects its data, the corrections may not make their way to all the replicated copies, or may show up too late to do any good. This gives me the screaming heebie-jeebies.
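To make the propagation problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Database` and `Record` classes, the version counter, and the database names are illustration only and say nothing about how any agency's systems actually work. It shows a record copied down a chain of databases, a correction made at the origin, and the copies that never hear about it.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    value: str
    version: int = 1  # hypothetical counter, bumped on every correction


@dataclass
class Database:
    name: str
    records: dict = field(default_factory=dict)

    def copy_from(self, other: "Database") -> None:
        """Snapshot another database's records (a one-time bulk copy)."""
        for key, rec in other.records.items():
            self.records[key] = Record(rec.value, rec.version)

    def correct(self, key: str, value: str) -> None:
        """Fix a record locally. Nothing notifies the downstream copies."""
        rec = self.records[key]
        rec.value = value
        rec.version += 1


def stale_copies(source: Database, copies: list, key: str) -> list:
    """Names of the copies still holding an older version of the record."""
    current = source.records[key].version
    return [db.name for db in copies
            if key in db.records and db.records[key].version < current]


# Bad data enters at the origin and is copied twice down the chain.
origin = Database("origin")
origin.records["rec-1"] = Record("flagged")
regional = Database("regional")
regional.copy_from(origin)
local = Database("local")
local.copy_from(regional)

# The origin corrects its record, but both copies still say "flagged".
origin.correct("rec-1", "cleared")
stale_before = stale_copies(origin, [regional, local], "rec-1")

# One copy re-syncs; the copy of the copy is still wrong.
regional.copy_from(origin)
stale_after = stale_copies(origin, [regional, local], "rec-1")
```

Even in this toy version, finding the stale copies requires knowing every place the data went and being able to compare versions. With secret lists and disconnected systems, neither is likely to hold.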

The case of the David Nelsons is a perfect example of what can happen when bad data gets into a system. Anyone named David Nelson was targeted for special scrutiny, to the point of missed flights and hours-long detention, until the enforcers realized this was not the David Nelson they were interested in. Amazingly, the database did not include identifying specifics like a middle name, physical address or driver's license number. This meant that anyone named David Nelson from Oregon was suspect.
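The difference between matching on a name alone and matching on a name plus identifying details can be sketched in a few lines of Python. The record layout, the field names and the sample people below are all made up for illustration. How the real list is structured is not public.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    first: str
    last: str
    middle: Optional[str] = None      # hypothetical extra identifier
    license_no: Optional[str] = None  # hypothetical extra identifier


def name_only_match(traveler: Person, entry: Person) -> bool:
    """Flag anyone whose first and last name equal the list entry's."""
    return (traveler.first.lower(), traveler.last.lower()) == (
        entry.first.lower(), entry.last.lower())


def qualified_match(traveler: Person, entry: Person) -> bool:
    """Name must match, and any identifier present on both must agree."""
    if not name_only_match(traveler, entry):
        return False
    for attr in ("middle", "license_no"):
        a, b = getattr(traveler, attr), getattr(entry, attr)
        if a is not None and b is not None and a.lower() != b.lower():
            return False
    return True


# A made-up list entry and two made-up travelers who share only the name.
entry = Person("David", "Nelson", middle="Q", license_no="X-0001")
travelers = [
    Person("David", "Nelson", middle="A", license_no="X-5550"),
    Person("David", "Nelson", middle="R", license_no="X-7771"),
]

flagged_by_name = [t for t in travelers if name_only_match(t, entry)]
flagged_qualified = [t for t in travelers if qualified_match(t, entry)]
```

With name-only matching both travelers are flagged. With even one extra identifier on the list entry, neither is. The qualified match is only as good as the identifiers recorded, which is exactly what the list was missing.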

To make matters worse, there was no way to alert airports further down the line that the person in front of them was not an "interesting" David Nelson and had already been screened. Some people were subject to repeated detentions as they changed planes on subsequent legs of the same trip.

This highlights one of the key problems: the multiple databases and the disconnected nature of the systems. Even if one airport cleared David Nelson, the next airport had no way of knowing it, apart from the obvious fact that he had just gotten off a plane and was booked on another segment. Each airport is an island, and each island has its own copy of the bad data.

We have multiple airports all using data from the same source: the TSA. Nobody knows where the TSA got its data because it keeps its "can't fly" criteria and lists secret, even though this practice makes flying less secure by making it easier for someone to subvert the system. This is typical of most government attempts to increase security over the past two years.

If this happens to you, and assuming you can get the information corrected at all, you will probably still be fighting the bad data sitting in the backwaters of some regional TSA office, and you will find yourself unexpectedly detained.

Now magnify this annoying but minor incident by some of the other federal efforts, like the "Total Information Awareness" program that was renamed the "Terrorist Information Awareness" program in an effort to make everyone feel better about its Orwellian goals. The feds will build a database on everyone in the US, just in case the data might be useful, and share it with other shadowy federal organizations. Now the bad data could be feeding into police surveillance programs, suspect questioning or even detentions. And there is no mechanism in place to review the data for accuracy, or to correct it and any downstream uses of it.

Each error in the data has the potential for a huge cost in misdirected law enforcement, diverting security efforts and making us less secure. And bad data is a very difficult, almost intractable problem, particularly when secretive government agencies are involved. Let's hope these massive surveillance data warehouse projects are stopped before too many people are sent to Cuba because of bad data.



Data warehousing, business intelligence, IT strategy and architecture, and occasional interesting bits.


Creative Commons License
This work is licensed under this Creative Commons License except where indicated.