Web Data Sourcing Tools We Used for the Mashup Contest
We just finished recording a podcast for IBM DeveloperWorks, which will be up in the next few days, so I was looking to see what else happened at Mashup Camp while we were writing code. We got a mention at Programmable Web (the first place I go to see what's new in the mashup space). I wish we had been able to spend more time in sessions, but being heads-down in new tools was still worthwhile.
Apart from the possibility of winning prizes, this is the best learning environment I've found for this space. Unless you spend a lot of time reading through blogs, you aren't going to find many resources. Besides, nothing beats learning from other people while doing.
Here's the rundown of tools Renat and I worked with:
- Dapper
- Pipes
- Apatar
- QEDwiki (not an official product or downloadable, yet)
- and lots of Google sources
One thing that hasn't been mentioned in most of the news is that all these companies had people at Camp. To be honest, if it weren't for Dan Gisolfi and Meg Sorber from IBM, we never would have finished work by the time the event closed. They stayed up late to help us with problems, bugs and techniques using QEDwiki.
I used Dapper a lot to scrape pages and turn them into RSS feeds. I ran into problems when Dapper hit the bad HTML practices of some web sites. Fortunately, Eran Shir and Jon Aizen (Dapper's CEO and CTO) were there to help out.
Out of all the things I've worked with, theirs is the most impressive because of its simplicity. Unlike Pipes, which manipulates RSS feeds, Dapper scrapes pages and turns them into feeds in many different formats. Dapper + Pipes is a great combination. Kapow is an industrial-strength scraper; Dapper is not as powerful as Kapow's tools, but it's a lot easier to use for something quick and dirty.
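Dapper's internals aren't public, but the basic scrape-to-feed idea is easy to sketch with nothing but the Python standard library. The `LinkScraper` and `links_to_rss` names below are my own illustration, not Dapper's API; the parser deliberately tolerates the kind of sloppy HTML (unclosed tags) that tripped me up at Camp.

```python
from html.parser import HTMLParser
from xml.etree import ElementTree as ET

class LinkScraper(HTMLParser):
    """Collect (title, href) pairs from anchor tags, tolerating unclosed tags."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def _flush(self):
        # Record the link in progress, if any; handles <a> tags never closed.
        if self._href:
            self.links.append(("".join(self._text).strip(), self._href))
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._flush()  # a new <a> before the old one closed
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def close(self):
        super().close()
        self._flush()  # last link may be dangling at end of input

def links_to_rss(links, feed_title="Scraped feed"):
    """Build a minimal RSS 2.0 document from (title, url) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    for title, url in links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")

# Bad HTML: neither <li> nor the second <a> is ever closed.
page = ('<ul><li><a href="http://example.com/a">First post'
        '<li><a href="http://example.com/b">Second post</ul>')
scraper = LinkScraper()
scraper.feed(page)
scraper.close()
print(links_to_rss(scraper.links))
```

From there, the resulting feed is exactly the kind of thing you'd pipe into Pipes for filtering and merging.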
We used Apatar because it was the only way, short of coding directly to the APIs, to get data from Salesforce.com. And it's open source. And Renat knows how to use it. It's not a page scraper; it's a data integration tool, so it does things these other tools can't. Overall, the combination of a DI tool, a scraper, and a manipulation and delivery formatting tool is what you need to get data for mashups if you're doing it inside an IT shop.
QEDwiki is the assembly hub, so it doesn't provide data sourcing or manipulation features. IBM is going to include another tool in the kit for that. They did a demo of this during Mashup U, but didn't have it available for us.
Labels: apatar, dapper, ibm, kapow, mashup, mashup camp, pipes, qedwiki
Posted by Mark, Tuesday, July 24, 2007, 9:02:00 AM