Clickstream

XP Exaggerated

[Agile Head] XP Exaggerated is a very funny send-up of XP. While away some time with articles like this:
Mr. Mark Flopberry was initially excited when his team decided to transition to XP. "Now I can sit next to Angelica five hours a week, and sometimes she looks at me," he said when interviewed. However, all is not 'peaches and cream' on Mr. Flopberry's team. Last Friday, several members of the team accused Mr. Flopberry of breaking the build on three occasions during the week. Mr. Flopberry appeared stunned by the accusation and claimed that a bearded alien, who called himself 'Chuck', had come while his partner went to the restroom and forced him to break the build.
It's The Onion for XP.


XP: Not for Data Warehouses

I've always been skeptical of development methodologies. They ignore the fact that different organizations develop software under different constraints and in different ways. What matters is the development process itself: that it addresses the needs of the organization and is managed well enough to produce useful, reasonable-quality goods.

Methodologies occasionally work if they are focused on a specific problem domain, such as the data warehouse development methodology espoused in The Data Warehouse Lifecycle Toolkit. Methodologies should be like training wheels: you use them to learn the problem domain and ways to go about solving the problems in that domain, then you take the wheels off and evolve your process. More often, methodologies are top-down efforts to fix problems that are caused by ham-handed management.

That said, I agreed to let a development manager try out XP on portions of the ETL development for a data warehouse project. The first problem was that the "user" was really me: the actual users had already specified the requirements, and we had built the dimensional models from them. That meant the user focus of XP came down to following the data mappings that populated the dimension tables with the proper data.

The result? Simple tasks took three times as long as they would have taken in a more standard fashion. The total lack of documentation meant that when data anomalies popped up, the developers had to read code to figure out how to deal with them. Some simple documentation could have specified that transactions X, Y and Z were handled, so that when transaction A showed up we would have known something had been missed in the data feed. User stories were basically "get data from these places to those places, and do these things to make sure it's clean", which is not much different from the data mapping rules and flowcharts.

The worst part was what should be XP's strength: testing. This was a joke, because getting the expected results for a test requires that you pull production data and process it to get the correct output. Without the ETL program to generate that output, you have to work out the results manually based on your understanding of the source data. That's fine as long as you can eyeball the data and work out the desired results, but it does not account for data quality problems. A few bad values in a column can cause a join to fail for those rows, so that data goes missing, and the test will never catch that.
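To make the join problem concrete, here is a minimal sketch in Python. The tables, column names and values are all invented for illustration and are not from the project; the only point is that an inner join drops a row with a bad key without raising any error, so you only see the loss if you go looking for unmatched rows.

```python
# Hypothetical source rows; names and values are invented for illustration.
orders = [
    {"order_id": 1, "cust_code": "C001"},
    {"order_id": 2, "cust_code": "C002"},
    {"order_id": 3, "cust_code": "c002 "},  # bad value: wrong case, trailing space
]
customers = {"C001": "Acme", "C002": "Globex"}


def inner_join(rows, lookup):
    """Keep only rows whose key matches the lookup, as an SQL inner join would."""
    return [
        {**row, "cust_name": lookup[row["cust_code"]]}
        for row in rows
        if row["cust_code"] in lookup
    ]


def unmatched(rows, lookup):
    """Return the rows that the inner join silently drops."""
    return [row for row in rows if row["cust_code"] not in lookup]


joined = inner_join(orders, customers)
dropped = unmatched(orders, customers)

# Three rows go in, two come out, and nothing fails along the way.
print(len(orders), len(joined), [row["order_id"] for row in dropped])
```

A test written against hand-picked clean rows would pass here. Only a check on the unmatched rows, run against the real feed, shows that order 3 went missing.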

The developers couldn't develop every possible test case for full-coverage testing because a simple dimension extract might pull 15 columns from 6 different tables, joining via 8 different columns, with hundreds of potential data values for each column. The combinatorial explosion of data values and relationships makes this extremely difficult. On top of that, many of the test cases would be for error conditions that are so unlikely in production data that it's hardly worth trying to catch them.
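A rough back-of-the-envelope calculation shows the scale. This sketch assumes exactly 100 distinct values per column (the low end of "hundreds"), treats the 15 columns as independent, and assumes a made-up rate of one million test cases per second. Real columns are correlated, so this is an upper bound and not a measurement.

```python
# Assumptions for illustration only: 100 values per column, independent columns.
values_per_column = 100
columns = 15

# Every combination of values across the extracted columns.
combinations = values_per_column ** columns

# Hypothetical test rate, chosen only to give a sense of scale.
tests_per_second = 1_000_000
seconds_per_year = 60 * 60 * 24 * 365
years_to_run = combinations / tests_per_second / seconds_per_year

print(f"{combinations:.1e} combinations, {years_to_run:.1e} years to run them all")
```

Under those assumptions there are 10^30 combinations, which is why nobody tries to enumerate them.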

Lastly, real production data changes over time, so there's no way to guarantee that today's test cases cover tomorrow's production data. ETL programs don't have the luxury of controlling the inputs, only the outputs. The primary goal is winnowing out bad data so that only the good gets through, with the bad flagged with a reason for rejection so it can be corrected at the source.
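As a sketch of what that winnowing looks like in code, here is a minimal Python example. The column names, the validation rules and the reject_reason field are hypothetical, chosen only to show the pattern of passing good rows through and tagging bad ones with a reason; they are not the rules from the project.

```python
def validate(row):
    """Return None if the row is clean, otherwise a rejection reason."""
    if not row.get("cust_code"):
        return "missing cust_code"
    try:
        amount = float(row["amount"])
    except (KeyError, TypeError, ValueError):
        return "amount is not numeric"
    if amount < 0:
        return "negative amount"
    return None


def winnow(rows):
    """Split rows into clean rows and rejected rows tagged with a reason."""
    good, rejected = [], []
    for row in rows:
        reason = validate(row)
        if reason is None:
            good.append(row)
        else:
            rejected.append({**row, "reject_reason": reason})
    return good, rejected


# Hypothetical feed: one clean row and three different kinds of bad row.
feed = [
    {"cust_code": "C001", "amount": "19.95"},
    {"cust_code": "", "amount": "5.00"},
    {"cust_code": "C002", "amount": "n/a"},
    {"cust_code": "C003", "amount": "-3"},
]

good, rejected = winnow(feed)
for row in rejected:
    print(row["cust_code"] or "(blank)", "->", row["reject_reason"])
```

The rules only cover the problems someone has already thought of. When the feed changes, the reject list is where the new problem shows up, which is the part a fixed set of test cases can't give you.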

The non-XP half of the development team had all the core scheduling, dependency checking and logging code and three dimensions done before the first dimension under the XP process saw the light of day. That one dimension passed all its tests, but it failed the first time it hit production data because of a data quality problem.

We stopped using XP at this point, much to the relief of the developers. The kicker? These developers were all trained on location by Kent Beck and one of his associates for another project, but we picked them up while they were idle.
There are some domains for which XP does not work, and systems integration - at least of the type done in data warehousing - appears to be one of them.


Creative Commons License
This work is licensed under this Creative Commons License except where indicated.