A Personal Data Warehouse?
Can you have a personal data warehouse? One man has built it, but you won’t find it in the store, at least not yet.
- By Ted Cuzzillo
- April 25, 2007
If you're all too frustrated with the long, long road to BI--starting with meetings and crawling through the usual bottlenecks and detours--say hello to Al Cherdak's Personal Data Warehouse. It would take you off-road--if you could have it.
"I recognized immediately that this solves a problem nothing else solves," said Mark Goebel, director of data administration, from his office at Mt. Sinai Medical Center in New York City. He's one of only a few hundred people who have seen it work, and he became so enthused that he has helped Cherdak promote it.
Personal Data Warehouse, sometimes called Corporate Image Inquiry, runs on a PC and in effect hot-wires operational data sources. It runs reports and exports, and its semi-automated functions help users join tables, clean data, and reuse past queries.
In PDW, operational data shows up as "apples to apples. It aligns the data," said Goebel. Traditional BI companies say they can do that, "but it's real hard, and it takes a lot of programming."
Cherdak's demo starts out proving that anyone able to run Excel without adult supervision can run PDW. He points and clicks into two operational databases to produce a report drawing from both. The data looks like it came from just one source.
To join tables, the software matches fields based on past matches stored in the knowledgebase. If it finds none, it compares fields and presents the user with likely choices.
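The article does not show how PDW does this internally, so the following is only a minimal sketch of the general idea: look up a remembered match first and, failing that, rank candidate fields for the user to choose from. Every name here (`knowledgebase`, `suggest_join_field`, `record_choice`) is hypothetical, and ranking by name similarity with Python's `difflib` is an assumption standing in for whatever comparison PDW actually performs.

```python
from difflib import SequenceMatcher

# Hypothetical knowledgebase: field pairings remembered from earlier joins,
# keyed by (source field, target table).
knowledgebase = {("orders.cust_id", "customers"): "customers.customer_id"}


def suggest_join_field(source_field, target_table, target_fields, kb, top_n=3):
    """Return (candidates, from_kb).

    If the knowledgebase holds a past match, return it alone.
    Otherwise rank the target fields by name similarity (an assumed
    heuristic) and return the most likely choices for the user to pick.
    """
    remembered = kb.get((source_field, target_table))
    if remembered is not None:
        return [remembered], True

    name = source_field.split(".")[-1].lower()
    ranked = sorted(
        target_fields,
        key=lambda f: SequenceMatcher(None, name, f.split(".")[-1].lower()).ratio(),
        reverse=True,
    )
    return ranked[:top_n], False


def record_choice(source_field, target_table, chosen, kb):
    """Store the user's decision so the next join can reuse it."""
    kb[(source_field, target_table)] = chosen
```

In this sketch the user still makes the final call: the function only proposes candidates, and `record_choice` saves whichever one the user picks so the same question is not asked twice.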
All the time, PDW takes notes. Any inquiry can be named for easy reuse, either alone or strung together into complex routines.
"I've heard people say, 'You mean you're going to take people who don't know too much about the software or database and expect them to design applications?'" he said. To that, he points to the knowledgebase. Beginners can reuse inquiries set up by experienced users.
Off-road travel, of course, entails dirty data as users bypass the checkpoints found on the road to traditional data warehouses. His solution? Built-in functions that help find, clean, and propagate data.
Tracker, which Cherdak calls the "data integrity, acquisition and distribution engine," keeps an eye open for changes in its domain. When it catches one, "It says, 'Aha!'" and kicks in cleaning routines, which compare fields and prompt the user to pick the right one.
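How Tracker is built is not described, but the behavior Cherdak outlines (notice a change, compare fields, let the user choose) can be illustrated with a small sketch. It assumes records are held as dictionaries keyed by an ID and compared against an earlier snapshot; the function names `detect_changes` and `conflicting_fields` are invented for illustration and are not PDW's.

```python
def detect_changes(previous, current):
    """Return {key: (old_row, new_row)} for rows added or modified
    since the previous snapshot. old_row is None for new rows."""
    changes = {}
    for key, new_row in current.items():
        old_row = previous.get(key)
        if old_row != new_row:
            changes[key] = (old_row, new_row)
    return changes


def conflicting_fields(row_a, row_b):
    """Return {field: (value_a, value_b)} for every field whose value
    differs between two copies of the same record."""
    all_fields = sorted(set(row_a) | set(row_b))
    return {
        f: (row_a.get(f), row_b.get(f))
        for f in all_fields
        if row_a.get(f) != row_b.get(f)
    }
```

The output of `conflicting_fields` is the kind of side-by-side list a tool could present so the user decides which value is right, rather than the software deciding on its own.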
The function can also synchronize data among databases it can write to, says Cherdak. "The idea of having bi-directional capability," he said, his voice rising with excitement, "the ability to update and manage and to do all this across multiple platforms, that is totally unique."
Whether matching fields, cleaning and propagating data, or making reports, PDW presents choices and the user decides. "Users know their data."
Not so fast, warns Business Intelligence Roadmap co-author Larissa T. Moss. Do users really know their data? "I don't believe they do. And even if they did, do all those users agree? I say they don't."
She concedes that for interim needs, such as urgently needed reports, PDW might be good enough. After all, redundant and dirty data is used already.
In the big picture, however, such methods will fail, she said. They can't clean up the data chaos or manage data as a sharable, unique asset of the company. "Data warehousing is not about silo solutions! For strategic decision-making that can make or break the company," she said, "I simply do not see this as an acceptable solution."
Cherdak dismisses this as "top down" thinking from an "old model."
In fact, he says he's thinking bottom-up. "How else can all the little groups in a corporation manage their business in the same way the corporation manages itself?" he asks.
It's his version of thinking globally and acting locally. Or just acting locally.
Most lower-level planning is still done with Excel and Access, he said. "They run into terrible problems because they use fixed bundles of information with little connectivity. It's a pain in the neck."
Why not use data from a data warehouse? "There is a great deal of private data," he said. "Many managers don't want the higher-ups to see what [they're] basing [their] numbers on."
Whether PDW is a good thing or bad, you won't see it ready to download for at least a little while longer. Cherdak, still the sole owner, is looking for a partner to polish it up and market it.
His ally Mark Goebel has tried to interest several companies in PDW. "Nobody knows what to do with a tool in this condition," he said. "This is not shrink-wrapped. It's somewhere between an idea and complete. Nobody knows what to do with that. It's mind-boggling."
Another ally, software engineer Mike Busak of Denver, Colorado, said yet another problem is reluctance to try it first. "I've tried to introduce it to companies. No one wants to go first."
Cherdak says there's a big future for Personal Data Warehouse. "I think people smarter than I am will be able to move this into areas further than anyone could have envisioned," he said. "At least that's the hope of one crazy inventor."
Ted Cuzzillo, CBIP, is a freelance business and technology writer based in Point Richmond, CA. He can be reached at firstname.lastname@example.org.