Lecture in November: Establishing a Data Quality Program
I'll be giving a talk on establishing a data quality program for FirstLogic at their iSummit Live conference in San Francisco on November 17. This differs from most talks I give: it's only an hour long, whereas I usually do four- to eight-hour in-depth technical tutorials. It should be good because data quality problems make for interesting stories, and I plan to hand out product samples. Here's the abstract:
Case Study: Not Just Pears and Plants: Data Helps Bear Creek Grow
Delivering high quality products from Harry and David and Jackson & Perkins depends on more than just a hearty appetite or a green thumb; data also plays a key role. Hear first hand how Bear Creek is identifying the data problems impacting their organization, what those problems are costing, and their early efforts to address data quality.
Bear Creek Corporation is one of the nation's premier direct marketing and e-commerce companies. Their brands, Harry and David and Jackson & Perkins, have become a part of American life: through their catalogs, retail stores and the Internet, Americans have trusted Bear Creek's brands and products for decades.
Posted by Mark Thursday, October 28, 2004 11:14:00 AM |
ETL Evaluation Criteria is Available for Download
I posted a document containing the list of ETL product eval criteria mentioned in my last DM Review article. You can download a Microsoft Word version now. I'll post an update when the PDF version is available.
When reading through the list you will find that there are redundant criteria in different evaluation categories. If you want to use this list, first determine what areas are important in your evaluation, then take the criteria from the relevant categories. If you try to use this list as-is, you'll spend a lot of time finding answers to redundant questions. It's also a lot of work writing out and ranking criteria, so you are better off with the smallest number possible to get your evaluation done.
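As a rough illustration of that trimming step, here is a minimal Python sketch. The category names and criteria are made up for the example and are not taken from the downloadable list; the idea is simply to keep the categories you care about and drop any criterion that has already appeared in an earlier category.

```python
def select_criteria(criteria_by_category, relevant_categories):
    """Return (category, criterion) pairs from the relevant categories,
    keeping only the first occurrence of each redundant criterion."""
    seen = set()
    selected = []
    for category in relevant_categories:
        for criterion in criteria_by_category.get(category, []):
            # Normalize case and whitespace so near-identical wording matches.
            key = " ".join(criterion.lower().split())
            if key not in seen:
                seen.add(key)
                selected.append((category, criterion))
    return selected


# Hypothetical categories and criteria, for illustration only.
criteria_by_category = {
    "connectivity": ["Reads flat files", "Supports bulk load utilities"],
    "performance": ["Supports bulk load utilities", "Runs jobs in parallel"],
    "administration": ["Schedules jobs", "Role-based security"],
}

shortlist = select_criteria(criteria_by_category, ["connectivity", "performance"])
for category, criterion in shortlist:
    print(f"{category}: {criterion}")
```

With these inputs the shortlist has three entries rather than four, because the bulk load criterion is only kept the first time it appears.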
Posted by Mark Tuesday, October 12, 2004 10:07:00 AM |
My Latest Article on ETL Eval Criteria is Available
My latest article, "Criteria for ETL Product Selection", is now out in the DM Direct newsletter. This is a loosely related follow-on to the article in this month's issue of DM Review, which I posted a few entries earlier.
I will shortly be posting a PDF of a criteria list I used to maintain. Check back here in a week.
To make tool selection easier, it is best to develop criteria in categories of functionality. This will make it easier to compare the tools in different areas and is more effective than trying to come up with a single number or score to indicate that one product is best. All ETL tools have their strong and weak points. The goal of an evaluation should be to identify those strengths and weaknesses and match them up to what is important to your organization. It doesn't make much sense to rate products based on features you may never use, so look at the things that are important to you and ignore the rest. Fewer criteria will also help to speed up your evaluation. The remainder of this article discusses the basic categories you might want to use to develop your detailed evaluation criteria.
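One way to picture the category approach is the small Python sketch below. The tool names, categories, criteria and 1-to-5 ratings are all invented for illustration; they are not results from any real evaluation. It produces a per-category profile for each tool, limited to the categories you consider important, instead of collapsing everything into a single score.

```python
def category_profile(tool_ratings, important_categories):
    """Average the ratings within each important category for one tool."""
    profile = {}
    for category in important_categories:
        scores = tool_ratings.get(category, {})
        if scores:  # skip categories with no rated criteria
            profile[category] = sum(scores.values()) / len(scores)
    return profile


# Hypothetical ratings on a 1-5 scale, for illustration only.
ratings = {
    "Tool A": {
        "connectivity": {"reads flat files": 5, "reads relational sources": 4},
        "performance": {"parallel loads": 2},
        "reporting": {"built-in reports": 5},
    },
    "Tool B": {
        "connectivity": {"reads flat files": 3, "reads relational sources": 3},
        "performance": {"parallel loads": 5},
        "reporting": {"built-in reports": 1},
    },
}

# Only the categories that matter to this (hypothetical) organization.
important = ["connectivity", "performance"]

profiles = {tool: category_profile(r, important) for tool, r in ratings.items()}
for tool, profile in profiles.items():
    print(tool, profile)
```

The reporting category is ignored because it is not in the important list, and the output keeps each tool's strong and weak areas visible side by side rather than hiding them inside one total.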
Posted by Mark Monday, October 04, 2004 4:00:00 AM |
Is Ab Initio Worth Evaluating?
Ab Initio has been around for long enough to make a name for themselves in the ETL space. Their marketing approach appears to be one of mystique: maintain secrecy around the product while allowing some information out about the high-end customers, creating interest because of the tantalizing tidbits they provide. Their web site is more typical of an advertising firm with nothing meaningful to say than a technology company.
I've seen other software companies do this and it works up to a point. Once the company has been around long enough, the approach stops working so well. Enough people have used the product that it becomes easier to find out its good and bad points. Ab Initio uses non-disclosure agreements to try to stifle public discussion as much as possible, maintaining secrecy.
Another reason the secretive approach becomes less effective is that it doesn't scale. There's a point at which the market is aware of the company and interested, but the secrecy limits further exposure. It also keeps the market supporters (analysts and consulting firms) from working with the company. If I have to sign an NDA just to see a presentation about the product, and there are several other companies clamoring for attention, I probably won't bother. As an analyst or consultant, what good is seeing the product and company if you can't talk about it?
In Ab Initio's case, the market perception is of a high-performance ETL product for large data volumes, similar to Torrent (acquired by Ascential). With both Ascential and Informatica releasing product versions that can scale to large data volumes across a workgroup of servers, the question is one of relative performance and cost. For Ab Initio this is hard to judge, because they won't say how much the product costs, and the only word is that it is high-cost relative to other vendors.
The lack of information about product features also means that it is difficult to see what you trade in ETL features for that performance, and whether the performance is really that much better than that of the other vendors. The secretive approach is smart when you are small and can't afford a marketing budget. It is maybe not so smart once you have more market exposure. Eventually people begin to think that maybe you are simply protecting margins and that the product is OK, but not so much better than the competition that it merits the exorbitant cost.
In a discussion forum I read a consultant's story of trying to get training on the product. Ab Initio would not provide it because they only allow training for companies that already own the product (exception: Knightsbridge, from what I have learned). The consultant was unable to work on the data warehouse project because Ab Initio wouldn't allow him to get training, so they've created a critic.
I believe Ab Initio is reaching the point of maximum returns, and it is starting to hurt them. I was asked if I would be adding them to my ETL evaluation course, and the answer is "no" because I can't get enough information to make it useful, at least not without more work than it's worth. My talks with other analysts have turned up the same thing. In some cases there is outright hostility, and both analysts and consultants are telling their customers that they should not consider the company because they are hard to work with and only provide content-free marketing unless you go through a ridiculous NDA process.
That's why Ab Initio is not included in any evaluations I've been doing. I can only offer an opinion based on what I've seen, which is a typical ETL tool, not as easy to develop with as many others, but with an apparent edge in performance at the very high end. That leads me to conclude that if you aren't trying to process a billion rows a day, it's probably not going to be worth the trouble or expense of including them.
Posted by Mark Sunday, October 03, 2004 11:02:00 AM |