Clickstream


Why You Should Avoid Enterprise Metadata Projects

Going along with the post on enterprise data modeling earlier this month, I should include metadata projects. It's a rare metadata project that is done properly and makes sense. Properly =
1. clear goals
2. achievable goals
3. driven by a distinct need
4. resources and the ability to maintain it once the project is complete

Most projects attempt to capture metadata in a repository where it will be useful to all. It usually ends up being useful to none. This is because it's not actively linked with the applications in the organization. If you rely on developers to notify the repository of every schema or rule change, you'll have metadata that's as good as the documentation on most internally-developed IT applications.

The only really useful repositories I've seen are linked to ETL or data integration applications. Projects that focus on just a metadata repository, without active integration with infrastructure or applications, end in failure.
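To make "actively linked" concrete, here is a minimal sketch of a repository that reads schema metadata straight from the database catalog instead of waiting for a developer to report a change. It assumes a SQLite database purely so the example is self-contained; the function names `harvest_schema` and `diff_schema` and the sample tables are hypothetical, not part of any product mentioned here.

```python
import sqlite3


def harvest_schema(conn):
    """Read table and column metadata directly from the database catalog."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {col[1]: col[2] for col in columns}
    return schema


def diff_schema(recorded, live):
    """List what changed between the recorded metadata and the live schema."""
    changes = []
    for table in sorted(set(recorded) | set(live)):
        old, new = recorded.get(table), live.get(table)
        if old is None:
            changes.append(f"table added: {table}")
        elif new is None:
            changes.append(f"table dropped: {table}")
        else:
            for column in sorted(set(old) | set(new)):
                if column not in old:
                    changes.append(f"column added: {table}.{column}")
                elif column not in new:
                    changes.append(f"column dropped: {table}.{column}")
                elif old[column] != new[column]:
                    changes.append(
                        f"type changed: {table}.{column} "
                        f"{old[column]} -> {new[column]}"
                    )
    return changes


# Hypothetical example: snapshot the schema, change it, then detect the drift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
recorded = harvest_schema(conn)

conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")

changes = diff_schema(recorded, harvest_schema(conn))
for change in changes:
    print(change)
```

Run on a schedule or as a step in an ETL job, a check like this keeps the repository current without depending on anyone remembering to update it.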

I came across a great metadata rant the other day that discusses many of these problems, even though the metadata being described is mostly web-page metadata. Here's a snippet:
People are lazy

You and me are engaged in the incredibly serious business of creating information. Here in the Info-Ivory-Tower, we understand the importance of creating and maintaining excellent metadata for our information.

But info-civilians are remarkably cavalier about their information. Your clueless aunt sends you email with no subject line, half the pages on Geocities are called "Please title this page" and your boss stores all of his files on his desktop with helpful titles like "UNTITLED.DOC."

This laziness is bottomless. No amount of ease-of-use will end it. To understand the true depths of meta-laziness, download ten random MP3 files from Napster. Chances are, at least one will have no title, artist or track information -- this despite the fact that adding in this info merely requires clicking the "Fetch Track Info from CDDB" button on every MP3-ripping application.

Short of breaking fingers or sending out squads of vengeful info-ninjas to add metadata to the average user's files, we're never gonna get there.

This Is Not Me Dancing

He needs to change his name.
It's embarrassing when people mix us up.
[Mark Madsen dancing]

Academics Say Enterprise Data Modeling Sucks Big Rocks

I was browsing MIS Quarterly archives and stumbled across an academic paper that confirmed a long-held belief - Strategic Data Planning, aka Enterprise Data Modeling, aka many other TLAs, is an expensive waste of time and, ahem, "not the best way to develop a data architecture." From their abstract:
"In spite of strong conceptual arguments for the value of strategic data planning as a means to increase data integration in large organizations, empirical research has found more evidence of problems than of success." [emphasis mine]
A lot of data warehouse projects used to start off with a big data-model-in-the-sky effort. Most people now understand that data warehouse and data integration efforts are use-based more than they are data-model-driven.

Sadly, companies like IBM were foisting this crap on organizations as recently as three years ago. I have proof in the form of three big, useless binders that were supposed to help a project I was working on. I suspect IBM and many other companies continue to foist away today.

I have to include some choice quotes from the paper. How many times have we heard "try harder" or its relative "work smarter, not harder" in our work? In the case of SDP, do both!
"Although some previous researchers have questioned the basic efficacy of the SDP method, there has usually been the implicit assumption that if problems in the implementation of the approach are identified and addressed, the expected benefits will follow. In a sense, practitioners have been encouraged to "try harder." This paper suggests that "trying harder" may not be the answer in many situations."
This suggestion of how these types of projects are staffed rings true. I call it the "Island of Misfit Toys" staffing model.
"Proposition #13: The time required by individuals may self-select the wrong participants.
Prior research and the cases in this article suggest that the SDP methodology requires a major time commitment from its participants, often as much as half a year of effort. Because the time of the most insightful and capable individuals is usually in high demand, the methodology may self-select the wrong people--those who can be spared because they are not involved in other critical business issues. The result could be less insightful and less creative solutions." [there's an understatement!]
The conclusion is a real zinger:
"The assumption is that given the right participants and cooperation from the rest of the organization, the resulting architecture will be correct enough to serve as a blueprint for more detailed data and systems design. Several findings from these nine cases seem to challenge that basic assumption. First, the participants themselves express uneasiness about the correctness of their architectures. Second, systems developers complain that the architectures are too high-level to be useful in designing systems, or they contain errors that are only apparent when a more detailed analysis is conducted. And third, other organizations using quicker, easier methods (such as "stealing" an architecture, or putting six IS professionals in a room for a week to create one), are able to develop similar architectures at a fraction of the cost." [6 people, 1 week? Ouch!]
Admittedly, Strategic Data Planning has evolved since this article was written. It's too bad that for many practitioners it is still largely an academic exercise with little practical use, like the IBM project I mentioned.

The problem is that 'strategic' and 'planning' were left behind by most practitioners as they focused on 'data'. With proper focus on results (the strategic and planning parts) rather than work products (data) some usefulness can be salvaged from SDP.

I made "Strategic data planning: Lessons from the field" available so you can enjoy it as much as I did, but you should still support your local library.

Government Provides Official Count of Pointy-Haired Bosses

[Pointy Haired Boss]
I came across an analysis of data from the Bureau of Labor Statistics which demonstrates the ubiquity of PHBs.

"It's pretty easy to tell from looking at it that lots of people have jobs that don't pay a lot of money, and a few people have jobs that pay a lot of money. Nothing really unexpected here, I guess.

But what is that dot... that single, attractive dot that employs nearly 2 million people and pays nearly $90,000? Surely, if there was a job worth having, it would be that one. Lots of people do it, so it must not require a lot of skill, yet it pays better than the vast majority of other jobs out there."
A better way to display the data is a scatterplot with dots sized in proportion to the number of jobs. Unfortunately I ran into the same problem with Excel's crappy graphing that he did, and I don't have my Teradata data mining application handy. Check out "Why Your Pointy Haired Boss Is A Mathematical Certainty."
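For anyone who wants to try the proportional-dot idea without Excel, here is a rough sketch that builds the chart as an SVG string using only the Python standard library. The occupation rows are invented for illustration and are not Bureau of Labor Statistics figures; the axis choice (salary across, job count up) and the names `bubble_radius` and `bubble_chart_svg` are my assumptions. The one detail that matters is scaling the radius by the square root of the job count, so that the area of each dot, which is what the eye compares, tracks the number of jobs.

```python
import math

# Hypothetical rows of (label, number of jobs, average salary).
# Invented for illustration only; these are NOT real BLS figures.
occupations = [
    ("Occupation A", 2_000_000, 90_000),
    ("Occupation B", 500_000, 30_000),
    ("Occupation C", 125_000, 55_000),
]


def bubble_radius(jobs, max_jobs, max_radius=40.0):
    """Scale radius by sqrt(jobs) so that circle AREA is proportional to jobs."""
    return max_radius * math.sqrt(jobs / max_jobs)


def bubble_chart_svg(rows, width=600, height=400, margin=50):
    """Return an SVG document with one translucent circle per row."""
    max_jobs = max(jobs for _, jobs, _ in rows)
    max_salary = max(salary for _, _, salary in rows)
    circles = []
    for name, jobs, salary in rows:
        # x encodes salary, y encodes job count (SVG y grows downward).
        x = margin + (width - 2 * margin) * salary / max_salary
        y = height - margin - (height - 2 * margin) * jobs / max_jobs
        r = bubble_radius(jobs, max_jobs)
        circles.append(
            f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{r:.1f}" '
            f'fill="steelblue" fill-opacity="0.5"><title>{name}</title></circle>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">' + "".join(circles) + "</svg>"
    )


svg = bubble_chart_svg(occupations)
print(len(svg), "characters of SVG")
```

Write `svg` to a file with an `.svg` extension and open it in a browser to see the result. Scaling the radius linearly instead would make the biggest occupation look far larger than it really is.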


Data warehousing, business intelligence, IT strategy and architecture, and occasional interesting bits.


Creative Commons License
This work is licensed under this Creative Commons License except where indicated.