Clickstream


Writing on Water

What a neat trick. This is the kind of thing researchers do when nobody is watching. In this case someone was watching, though.

Good Infographic

It's seldom that I come across an infographic in a newspaper that works. Usually full of chart junk and failing to communicate the underlying statistics, they are rarely worth printing. The Independent has a good one showing who backs a ceasefire in Lebanon and who doesn't.

Very effective at telling a story of numbers and meaning with a single image. [Image links to a larger version] Via kottke

Upcoming Speaking Events

It's the busy time of year. Here are my latest webcasts and speaking events:

Business Objects Webcast: Data Integration Best Practices, July 19, 2006
Join Business Objects and a panel of industry experts for an interactive discussion on data integration best practices. During this online event, audience members will have the opportunity to ask questions and dictate the flow and topics covered during the presentation.

Enterprise Integration Series: Evaluation Criteria for Selecting ETL Tools, August 2, 2006
A proof of concept is key to getting the ETL product that fits best in any architecture. Comparing the various available products can be challenging because the tools look similar, there is a wide range of costs, and differentiation between “low end” and “high end” products is often hidden in the features or in how the product works. A POC forces you to think about what you’re doing, thus helping to ensure that you appreciate the critical details of your environment and what you need to achieve.
[in reality, I'll spend about 15 minutes on POC and such, and the rest talking about evaluating DI platforms]

TDWI Summer World Conference, August 20-25, 2006
Monday: "What's Hot, What's Hype, What Now?" panel session for executives. We'll be talking about emerging topics in the market, including open source, BI software as a service, and what's happening with BI/DW technology.

Wednesday: Evaluating ETL Tools and Technologies
See... real software crashing!
Hear... cries of anguish as the client disconnects from the server!
Feel... the flames of overheating laptops as they try to run ETL jobs!

A full day covering how to evaluate the tools, the criteria to use, and a market overview. The entire afternoon is a live mini-proof of concept with Informatica, Business Objects and Microsoft. It's much more fun live, if you like watching un-canned demos.

Friday: Open Source Adoption in Data Warehousing and Business Intelligence
Open source software (OSS) has become a force in the commercial software industry. DW is not immune to the impact of open source, with developments in the past year affecting a range of different market segments. IT organizations are challenged with sorting through OSS to measure the risks, measure the rewards, and to decide what is worth implementing. This session will review technology adoption theory to frame the discussion, discuss what early adopters are doing, and review the state of relevant OSS projects for the BI and DW market.
Unfortunately I couldn't get any early adopters to come, so that part might be kinda short.

May make a side trip to a UC field research station east of San Diego to look at some amazingly advanced environmental monitoring and visualization. Or I might just go surfing. I'll be at the conference for the entire week, so if you're bored and want to do something, I have lots of downtime...

Ten Mistakes to Avoid When Selecting and Deploying ETL Tools

TDWI published my ETL "Ten Mistakes" paper and it's now up on their web site. Unfortunately, site registration is required. A PDF is on the way and I'll post a link to it as soon as I have it. In the meantime, I'll pretend I'm Charles Dickens and serialize the mistakes here. A note to get things started:

You'd be surprised by how many companies just look at the two or three tools Gartner listed as "up and to the right", watch some demos, then spend a few hundred K$. Following a formal process is vital. I also think that companies spend too much time looking for the best tool, rather than the one that best fits their specific needs. A lot of time is wasted evaluating features that will likely never be used.


Data integration and extraction form the foundation for business intelligence and data warehousing projects. Making the right decisions when selecting and deploying this technology is key to a successful long-term implementation. We’ll take a look at 10 of the most common mistakes organizations make when selecting and deploying extract, transform, and load (ETL) products.

The proper frame for evaluating ETL products is “best fit” rather than finding a single “best product.” All products have their strengths and weaknesses; the goal of an evaluation is to identify those strengths and weaknesses and match them up with what is important for your specific project.

To do this requires that you follow a formal process of qualifying, comparing, and selecting vendors, and that you take into account both the immediate needs and the future evolution of your systems. If you avoid the following 10 mistakes, you should be on your way to finding the technologies that best fit your organization’s needs.

1. Failing to Follow a Formal Process

One of the most common errors that organizations make when selecting ETL products is not following a formal evaluation process. A list of criteria isn’t enough; it’s vital to take a series of steps to make sure you find the product that best fits your needs. The following process works well for most companies:

  1. Identify the selection committee members who will determine the evaluation criteria and rate the vendors.
  2. Determine the requirements that all products must meet in order to be considered.
  3. Research a list of vendors that meet your minimum set of requirements.
  4. Based on your research, narrow the list to three to five vendors that meet the minimum requirements. This is the shortlist that you’ll evaluate more thoroughly.
  5. Create a list of detailed evaluation criteria, including qualitative criteria such as the look and feel of the interface or how easy the product is to use. Define your method for rating product performance on these criteria, as well as a method for ranking the results.
  6. Meet with vendors and rate the products against your list of criteria based on their presentations and demonstrations. At this point, you may further eliminate products from consideration.

    Be flexible. After meeting with vendors and seeing demonstrations, you may discover criteria you had not thought to include in your evaluation. If everyone agrees that the criteria are reasonable, add them to your list and check back with the other vendors. You’ll find that evaluation is an iterative process, particularly if you haven’t seen most of the products before.

  7. Conduct a proof of concept and rate the products, or re-rate the products now that you have more detailed information.
  8. Rank the vendors and make your first and second choices. Choosing a second-place vendor can save you time if you fail to negotiate a deal with the first vendor.

Following a formal process ensures that your criteria are well thought out and prioritized before you begin looking at products. The process also ensures that products are examined at the same level of detail and compared equally. By creating a short project plan for the process, you’ll be able to keep the evaluation focused and complete it quickly.
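Steps 5 through 8 above amount to a weighted scoring matrix. Here is a minimal sketch of one way to do the rating and ranking, assuming a shared 1-5 rating scale and committee-assigned weights. The criteria names, weights, product names, and ratings are all hypothetical placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    weight: float  # relative importance agreed on by the selection committee


def weighted_score(criteria, ratings):
    """Return the weight-normalized score for one product.

    ratings maps criterion name -> rating on the shared scale (1-5 here).
    """
    total_weight = sum(c.weight for c in criteria)
    return sum(c.weight * ratings[c.name] for c in criteria) / total_weight


def rank_products(criteria, ratings_by_product):
    """Rank products from best fit to worst fit by weighted score."""
    scores = {
        product: weighted_score(criteria, ratings)
        for product, ratings in ratings_by_product.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical criteria, weights, and ratings, for illustration only.
criteria = [
    Criterion("connectivity", 5),
    Criterion("ease_of_use", 3),
    Criterion("cost", 2),
]
ratings_by_product = {
    "Product A": {"connectivity": 4, "ease_of_use": 3, "cost": 2},
    "Product B": {"connectivity": 3, "ease_of_use": 5, "cost": 4},
    "Product C": {"connectivity": 5, "ease_of_use": 2, "cost": 3},
}

ranking = rank_products(criteria, ratings_by_product)
first_choice, second_choice = ranking[0][0], ranking[1][0]
for product, score in ranking:
    print(f"{product}: {score:.2f}")
```

Re-running the same function after the proof of concept, with updated ratings or newly added criteria, covers the "re-rate" step without changing the method.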

Link to TDWI page

Far From Zero: NSA Chances of Mistakenly Identifying You as a Terrorist Via Mass Surveillance

Excellent short article on the lack of a statistical basis for finding terrorists in a total population by using mass surveillance. Conclusion: the chance of finding a small number of individuals in a large population via large-scale surveillance is approximately zero, with the corollary that the rate of false identifications will be very high. link to article

The good news is that a much higher rate of suspects means that such surveillance will work, albeit not for the original goal. Turns out the systems they are building (à la TIA, which still lives on) will function well for things like political espionage. Or maybe they are expecting a massive fifth column to infiltrate the US in the next year or so and haven't mentioned anything yet...
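The base-rate arithmetic behind both points can be sketched with Bayes' rule. Every number below (population, number of real targets, detection rate, false-positive rate) is a made-up assumption chosen only to show the shape of the problem; none of them come from the linked article.

```python
def flag_outcomes(population, true_targets, sensitivity, false_positive_rate):
    """Return (P(target | flagged), expected number of innocent people flagged)."""
    innocents = population - true_targets
    true_positives = true_targets * sensitivity
    false_positives = innocents * false_positive_rate
    precision = true_positives / (true_positives + false_positives)
    return precision, false_positives


# Illustrative assumptions only.
POPULATION = 300_000_000
SENSITIVITY = 0.99          # assumed share of real targets the system flags
FALSE_POSITIVE_RATE = 0.01  # assumed share of innocent people wrongly flagged

# Case 1: a small number of real targets hidden in the whole population.
rare_precision, rare_false_alarms = flag_outcomes(
    POPULATION, 1_000, SENSITIVITY, FALSE_POSITIVE_RATE
)

# Case 2: a much higher rate of "suspects" (one person in ten).
common_precision, common_false_alarms = flag_outcomes(
    POPULATION, 30_000_000, SENSITIVITY, FALSE_POSITIVE_RATE
)

print(f"rare targets:   P(target | flagged) = {rare_precision:.5f}, "
      f"innocent people flagged = {rare_false_alarms:,.0f}")
print(f"common targets: P(target | flagged) = {common_precision:.5f}, "
      f"innocent people flagged = {common_false_alarms:,.0f}")
```

With these made-up inputs, the rare-target case flags roughly three million innocent people to catch fewer than a thousand targets, so well under 0.1% of flags are correct. When one person in ten counts as a suspect, over 90% of flags are correct.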

BPM and SOA: New Life for Desktop App Programmers

Publishers keep churning out articles about BPM, SOA and web services and how new and astoundingly different all this is. The basis of BPM is an SOA and services that you can call to accomplish the tasks you need to accomplish. In general, the services are loosely coupled (independent), and the system you are building orchestrates the services into a user-presentable application. The application probably has both user-driven events and system-driven events.

When I worked in a robotics lab (way back in the days of VAXen and Sun 3s), I used the X-Windows event dispatcher to handle sensor data and hardware events in mobile robots, since the problem was reacting to many, sometimes simultaneous, events as quickly as possible. This looked a lot like building a GUI application, and someone had already gone to the trouble of hacking out all that fast, efficient event-handling code.

Fast forward to all the Business Process Management and SOA talk, and the underlying architecture looks a lot like a desktop app model built around an event dispatcher. Instead of dealing with a single user, we have to track events and state for hundreds of users, but otherwise it's similar. So rejoice, desktop programmers: the wheel is turning, and all that knowledge about how to program this model will find another home in mainstream IT shops, albeit adjusted to fit a larger multi-user model.
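To make the comparison concrete, here is a toy sketch of that model: a single-threaded event dispatcher that orchestrates two loosely coupled "services" and keeps state per user session. The class, the event names, and the order/credit-check process are all hypothetical, and a real BPM engine would add persistence, concurrency, and error handling.

```python
from collections import defaultdict, deque


class EventDispatcher:
    """Minimal single-threaded event loop: handlers register per event type."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def post(self, event_type, session_id, payload=None):
        # Events can come from users or from other handlers (system-driven).
        self._queue.append((event_type, session_id, payload))

    def run(self):
        # Drain the queue in arrival order, like a GUI main loop.
        while self._queue:
            event_type, session_id, payload = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(self, session_id, payload)


# Per-user process state, keyed by session instead of a single desktop user.
state = defaultdict(list)


def on_order_submitted(dispatcher, session_id, payload):
    state[session_id].append("submitted")
    # Stand-in for calling a credit-check service; the 1000 limit is made up.
    approved = payload["amount"] < 1000
    dispatcher.post("credit_checked", session_id, {"approved": approved})


def on_credit_checked(dispatcher, session_id, payload):
    state[session_id].append("approved" if payload["approved"] else "rejected")


dispatcher = EventDispatcher()
dispatcher.subscribe("order_submitted", on_order_submitted)
dispatcher.subscribe("credit_checked", on_credit_checked)

# Two users' events interleave on the same queue.
dispatcher.post("order_submitted", "alice", {"amount": 250})
dispatcher.post("order_submitted", "bob", {"amount": 5000})
dispatcher.run()
```

The handlers know nothing about each other; the dispatcher and the event names are the only coupling, which is the same property the SOA articles keep calling new.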

As an FYI, the robots I was working on were giant metal robots, just like the ones Survival Research Labs makes. Back then, flames and explosions were accidental and snails could outpace the robots. I was born a generation too early to really enjoy robotics.


Data warehousing, business intelligence, IT strategy and architecture, and occasional interesting bits.

Popular Posts
Primate programming.
Why development in crunch mode doesn't work.
Enterprise data modeling sucks big rocks.
XP Exaggerated.
Ping-pong in the matrix.
Time management for anarchists.
Is Ab Initio worth evaluating?
Job posting: omniscient architect.
Why hiring more sales people won't grow revenues faster.
Some resources for Open Source CMS.

Reading List
The Cruise of the Snark
Blue Latitudes
Everyone in Silico
The Klamath Knot
Swarm Intelligence (Bonabeau)
A three year backlog of F&SF

Listening List
Toots and the Maytals
The Buena Vista Social Club
American Idiot

Watching List
Winged Migration Quicktime trailer
Genghis Blues
Howl's Moving Castle
A Bronx Tale

Daily KOS
Due Diligence
Boing Boing
Kevin Kelly (Recomendo)
Not Geniuses
3 Quarks Daily

War in Context
Valmiki's Ramayana
Choose the Blue
Third Nature
Mark Madsen
The Data Warehouse Institute
James Howard Kunstler
Clickstream Data Warehousing
Technorati Profile


Creative Commons License
This work is licensed under this Creative Commons License except where indicated.