
TDWI Blog

Data Warehousing Blog Posts

See the most recent Data Warehousing related items below.


Zen BI: The Wisdom of Letting Go

One of my takeaways from last week’s BI Executive Summit in Las Vegas is that veteran BI directors are worried about the pace of change at the departmental level. More specifically, they are worried about how to support the business’ desire for new tools and data stores without undermining the data warehousing architecture and single version of truth they have worked so hard to deliver.

At the same time, many have recognized that the corporate BI team they manage has become a bottleneck. They know that if they don’t deliver solutions faster and reduce their project backlog, departments will circumvent them and develop renegade BI solutions that undermine the architectural integrity of the data warehousing environment.

The Wisdom of Letting Go

In terms of TDWI’s BI Maturity Model, these DW veterans have achieved adulthood (i.e. centralized development and EDW) and are on the cusp of landing in the Sage stage. However, to achieve true BI wisdom (i.e. Sage stage), they must do something that is both counterintuitive and terrifying: they must let go. They must empower departments and business units to build their own DW and BI solutions.

Entrusting departments to do the right thing is a terrifying prospect for most BI veterans. They fear that the departments will create islands of analytical information and undermine the data consistency they have worked so hard to achieve. The thought of empowering departments makes them grip the proverbial BI steering wheel tighter. But asserting control at this stage usually backfires. The only option is to adopt a Zen-like attitude and let go.

Trust in Standards

I'm reminded of the advice that Yoda gives his Jedi warriors-in-training in the movie “Star Wars”: “Let go and trust the force.” In this case, though, DW veterans need to trust their standards: the BI standards they’ve developed in the BI Competency Center, including definitions for business objects (i.e. business entities and metrics), processes for managing BI projects, techniques for developing BI software, and processes and procedures for managing ETL jobs and handling errors, among other things.

Some DW veterans who have gone down this path add the caveat: “trust but verify.” Although educating and training departmental IT personnel about proper BI development is critical, it’s also important to create validation routines where possible to ensure business units conform to standards.

Engage Departmental Analysts

The cagiest veterans also recognize that the key to making distributed BI development work is to recruit key analysts in each department to serve on a BI Working Committee. The Working Committee defines DW and BI standards, technologies, and architectures and essentially drives the BI effort, reporting its recommendations for approval to a BI Steering Committee composed of business sponsors. Engaging the analysts who are most apt to create renegade BI systems ensures the DW serves their needs and helps secure their buy-in and support.

By adopting a Zen-like approach to BI, veteran DW managers can eliminate project backlogs, ensure a high level of customer satisfaction, and achieve BI nirvana.

Posted by Wayne Eckerson on February 28, 2010


Reflections from TDWI's BI Executive Summit

More than 150 BI directors and sponsors from small, medium, and large companies, plus a dozen or so vendor sponsors, attended TDWI’s BI Executive Summit last week, a record turnout.

Here are a few of the things I learned:

- Veteran BI directors are worried about the pace of change at the departmental level. More specifically, they are worried about how to support the business’ desire for new tools and data stores without undermining the data warehousing architecture and single version of truth they have worked so hard to deliver. (See “Zen BI: The Wisdom of Letting Go.”)

- Jill Dyche explained that corporate BI teams can either be Data Providers or Solutions Providers. Framing this as a choice was a new concept for me. However, after some thought, I believe that unless BI teams help deliver solutions, the data they provision will be underutilized. Unless the BI team helps solve business problems by delivering business solutions, it can never be viewed as a strategic partner.

- Most BI teams have project backlogs and don’t have a great way to get in front of them. Self-service BI can help eliminate a lot of the onesy and twosey requests for custom reports. BI portfolios and roadmaps can help prioritize deliverables, but executives always override their own priorities. Many veteran BI managers are looking to push more development back into the departments as a way to accelerate projects.

- There is a lot of interest in predictive analytics, dashboards, and the cloud. Those were the top three vote-getters in response to the question, “Which technologies will have the most impact on your BI program in three years?”

- Most of the case studies at the Summit described real-time data delivery environments, often coupled with analytics. GE Rails applied statistical models to real-time data to help customer service agents figure out the optimal repair facility to send railroad cars to for repairs; Linkshare captures and displays Web activity and commissions to external customers (publishers and advertisers); and Seattle Teachers’ Credit Union delivers real-time recommendations to customer service agents.

- There was a lot of interest in how to launch an analytics practice and Aldo Mancini provided some great tips from his experiences at Discover Financial Services. To get SAS analysts to start using the data warehouse as a way to accelerate model development, he had them help design the subject areas and variables that should go into it. Then, he taught them how to use SQL so they could transform data into their desired format.

We’re already gearing up for our next Summit, which will be held August 16-18 in San Diego. Hope to see you there!

Posted by Wayne Eckerson on February 28, 2010


Proactive Analytics That Drive the Business

“I love the chart, but what am I supposed to do about it?” With that simple question, Ken Rudin is schooling analysts at Zynga in how to deliver information that makes a difference in the way the wildly successful gaming company creates and enhances games for customers.

“My mantra these days is ‘It’s gotta be actionable,’” says Rudin, former CEO of the early BI SaaS vendor LucidEra who now runs analytics at Zynga, creators of Farmville, Mafia Wars, and other popular applications for Facebook, iPhone, and other networks. “Just showing that revenue is down doesn’t help our product managers improve the games. But if we can show the lifecycle with which a subgroup uses the game, we can open their eyes to things they never realized before.”

It’s surprising that Rudin has to do any analytics tutoring at Zynga. Its data warehouse is a critical piece of its gaming infrastructure, providing recommendations to players based on profiles compiled daily in the data warehouse and cached to memory. With over 40 million players and 3TB of new data a day, Zynga’s 200-node, columnar data warehouse from Vertica is no analytical windup toy. If it goes down for a minute, all hell breaks loose because product managers have no visibility into game traffic and trends.
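To make that daily-profile-to-cache pattern concrete, here is a hypothetical sketch in Python. It is not Zynga’s actual system; the profile fields, cache structure, and recommendation rules are invented purely for illustration.

```python
# Hypothetical sketch of the pattern described above (not Zynga's system):
# player profiles are summarized in the warehouse once a day, pushed into an
# in-memory cache, and looked up at game time to pick a recommendation.
daily_profiles = [  # stands in for the daily batch pulled from the warehouse
    {"player_id": 1, "favorite_feature": "crops", "days_active": 40},
    {"player_id": 2, "favorite_feature": "animals", "days_active": 3},
]

profile_cache = {}  # refreshed once per day from the warehouse batch

def refresh_cache(profiles):
    """Replace the cache contents with the latest daily profiles."""
    profile_cache.clear()
    for p in profiles:
        profile_cache[p["player_id"]] = p

def recommend(player_id):
    """Low-latency lookup against the cache, not the warehouse itself."""
    profile = profile_cache.get(player_id)
    if profile is None:
        return "starter_pack"                         # default for unknown players
    if profile["days_active"] < 7:
        return "tutorial_bonus"                       # keep new players engaged
    return f"new_{profile['favorite_feature']}_item"  # play to known preferences

refresh_cache(daily_profiles)
print(recommend(1), recommend(2), recommend(99))
```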

Moreover, the company applies A/B testing to every new feature before deploying and has a bevy of statisticians who continually dream up ways that product managers can enhance games to improve retention and collaboration among gaming users. “I’ve never seen a company that is so analytically driven. Sometimes I think we are an analytics company masquerading as a gaming company. Everything is run by the numbers,” says Rudin.

Anticipating Questions

Yet, when Rudin came to Zynga in early 2009, he discovered the analytics team was mostly in reaction mode, taking orders from product managers for custom reports. So, he split the team into two groups: 1) a reporting team that creates reports for product managers and 2) an analytics team that tests hypotheses and creates models using statistical and analytical methods. A third part of his team runs the real-time, streaming data warehousing environment.

The reporting team currently uses a homegrown SQL-based tool for creating parameterized reports. Rudin hopes to migrate them to a richer, self-service dashboard environment that delivers most of the routine information product managers need and lets them generate ad hoc views without the help of a SQL professional.

Rudin is encouraging the analytics team to be more proactive. Instead of waiting for product managers to submit requests for hypotheses to test, analysts should suggest gaming enhancements that increase a game's "stickiness" and customer satisfaction. “It’s one thing to get answers to questions and it’s another to know what questions to ask in the first place. We need to show them novel ways that they can enhance the games to increase customer retention.”

Zynga is already an analytics powerhouse, but it sees an infinite opportunity to leverage the terabytes of data it collects daily to enhance the gaming experience of its customers. “My goal for the year is to use analytics to come up with new product innovations,” says Rudin. By proactively working with the business to improve core products, the analytics team is fast becoming an ideas factory to improve Zynga’s profitability.

Editor's Note: By the way, the Zynga Analytics team is growing as fast as the company, so if you’re interested in talking to them, please contact Ken at [email protected].

Posted by Wayne Eckerson on February 2, 2010


Succeeding with SaaS

I’m hearing a lot of success stories about deploying BI in the cloud. I believe these stories are just the tip of the iceberg.

Last week, I delivered a Webcast with Ken Harris, CIO of Shaklee Corp., a 50-year-old natural nutrition company, which has run its data warehouse in the cloud for the past four years. (Click here for the archived Webcast.) And ShareThis and RBC Wealth Management discussed their successful cloud-based BI solutions at TDWI’s BI Executive Summits this past year.

Adoption Trends

The adoption of the cloud for BI is where e-commerce was in the late 1990s: people had heard of e-commerce but laughed at the notion that a serious volume of transactions would ever take place over the wire. They said consumers would never embrace e-commerce for security reasons: people with network “sniffers” might steal their credit card numbers.

We all know how that story turned out. Today, e-commerce accounts for more than $130 billion in annual transactions, about 3.5% of total retail sales, according to the U.S. Census Bureau.

According to TDWI Research, a majority of BI professionals (51%) are either “somewhat familiar” or “very familiar” with cloud computing. About 15% of BI programs have deployed some aspect of their BI environment in the cloud today, but that percentage jumps to 53% in three years. And 7% said that either “half” or “most” of their BI solutions will run in the cloud within three years. That’s according to 183 respondents to a survey TDWI conducted at its November, 2009 conference in Orlando.

Shaklee Success

At Shaklee, the decision to move the BI environment to the cloud was a no-brainer. Says CIO Harris: “It was an easy decision then, and it’s still a good one today.” Formerly a CIO at the Gap and Nike, Harris has many years of experience delivering IT and BI solutions, so his word carries a lot of clout.

Harris said his team evaluated both on-premises and cloud-based solutions to replace a legacy data warehouse. They opted for a cloud-based solution from PivotLink when the vendor was able to run three of the team’s toughest queries in a three-week proof of concept. Once commissioned, PivotLink deployed the new global sales data warehouse in three months “versus 18 months” for an on-premises system, said Harris.

PivotLink “bore the brunt” of integrating data from Shaklee’s operational systems. Although Harris wouldn’t say how much data Shaklee moves to its data warehouse daily, he said the solution has spread organically and now encompasses all sales, cost, and marketing data for Shaklee, a medium-sized home-based retailer. And since Shaklee has no internal resources supporting the solution, “the cost savings are big” compared to an on-premises solution, Harris says.

SaaS Vendors. Harris was quick to point out that not all Software-as-a-Service vendors are created equal. He has used a number of SaaS vendors for various applications, but not all have worked out, and he has pulled some SaaS applications back in house. “You need a vendor that really understands SaaS, and doesn’t just put a veneer on an on-premise piece of software and business model.”

Harris added that the challenge of implementing a SaaS solution pales in comparison to the challenge of implementing BI. “SaaS is easy compared to the challenge of delivering an effective BI solution.”

Posted by Wayne Eckerson on November 23, 2009


Crossing the Chasm II - Departmental to Enterprise BI

In a prior blog, I discussed strategies for crossing the “Chasm” in TDWI’s five-stage BI Maturity Model. The Chasm represents challenges that afflict later-stage BI programs. There, I showed how later-stage programs must justify their existence based on strategic value rather than the cost-savings efficiency that is a hallmark of early-stage BI programs.

But perhaps a more serious challenge facing BI programs that want to cross the Chasm is migrating from a departmental BI solution to one with an enterprise scope.

Departmental Solutions. In my experience, some of the most successful BI solutions in terms of user satisfaction are developed at the department level. One reason is that the scope of the project is manageable. A department is small enough that the technical developers might sit side by side with the business analysts and subject matter experts. They may see themselves as belonging to a single team and may even report to the same boss. The BI project is smaller and probably draws from just one or two sources that are used regularly and well known by people in the department. Moreover, hammering out semantics is straightforward: everyone uses the same language and rules to define things because they all live and work in the same “departmental bubble.”

Chasm Challenges

Politics. Conversely, developing an enterprise BI solution or migrating a successful departmental one to the enterprise level is fraught with peril. The politics alone can torpedo even the most promising applications. The departmental “owners” are afraid corporate IT will suck the lifeblood out of their project by over-architecting the solution. Corporate sponsors jockey for position to benefit from the new application but are reluctant to pony up hard dollars to lay the infrastructure for an enterprise application that benefits the entire company, not just them.

Semantics. In addition, semantics are a nightmare to resolve in an enterprise environment. What sales calls a customer is often different than how finance or marketing defines a customer. Ditto for sale, margin, and any other common term used across the enterprise. In fact, it’s always the most common terms that are the hardest to pin down when crossing departmental boundaries.

Scope. Finally, scope is an issue. Enterprise applications often try to tackle too much and get stalled in the process. The number of data sources rises, the size of the team increases, and the number of meetings, architectural reviews, and sign offs mount inexorably. The team and its processes eventually get in the way of progress. Simply, the BI program has become a bottleneck.

Hybrid Organizational Model

To cross the Chasm, BI programs need to combine the agility of departmental-scale efforts with the economies of scale and integration delivered by enterprise solutions. To do this, teams must adopt a hybrid organizational model, in which the central or corporate BI team manages the repository of shared data, a common semantic model, and a set of documented standards and best practices for delivering BI solutions.

Departmental IT. The corporate team then works closely with departmental IT staff responsible for developing BI solutions. The departmental developers leverage the enterprise data warehouse, common semantic model, and ETL and BI tools to create subject-specific data marts and standard reports tailored to departmental users. The corporate team oversees their development, ensuring adherence to standards where possible.

Super Users. If no IT staff exists in the department, the BI team cultivates “super users”—technically savvy business users in each department—to be their liaison or “feet on the ground” there. The corporate BI team creates subject-specific data marts and standard reports for each department using input from the Super Users. It then empowers the Super Users to create ad hoc reports around the boundaries of the standard reports but not overlapping them.

BICC. A fully functioning BI Competency Center adopts this hybrid model, where the corporate BI staff work cooperatively with departmental IT professionals and super users. The glue that holds this hybrid model together is the set of standards and best practices that empowers departmental experts to quickly build solutions that leverage enterprise resources.

This approach ensures both agility and alignment, a combination rarely seen in BI programs. The fear of giving control back to departments and proliferating spreadmarts is why many companies are stuck in the Chasm. But unless the corporate BI team cedes development work via well-documented standards and practices, it becomes a bottleneck to development and causes spreadmarts to grow anyway.

Posted by Wayne Eckerson on November 16, 2009


Operational BI: To DW or Not?

Operational business intelligence (BI) means many things to many people. But the nub is that it delivers information to decision makers in near real time, usually within seconds, minutes, or hours. The purpose is to empower front-line workers and managers with timely information so they can work proactively to improve performance.

Key Architectural Decision. This sounds easy but it’s hard to do. Low latency or operational BI systems have a lot of moving parts and there is not much time to recover from errors, especially in high-volume environments. The key decision you need to make when architecting a low-latency system is whether to use the data warehouse (DW) or not. The ramifications of this decision are significant.

On one hand, the DW will ensure the quality of low-latency data; but doing so may disrupt existing processes, add undue complexity, and adversely impact performance. On the other hand, creating a stand-alone operational BI system may be simpler and provide tailored functionality and higher performance, but potentially creates redundant copies of data that compromise data consistency and quality.

So take your pick: either add complexity by rearchitecting the DW or undermine data consistency by deploying a separate operational BI system. It’s kind of a Faustian bargain, and neither option is quick or cheap.

Within the DW

If you choose to deliver low-latency data within your existing DW, you have three options:

1. Mini Batch. One option is simply to accelerate ETL jobs by running them more frequently. If your DW supports mixed workloads (e.g. simultaneous queries and updates), this approach allows you to run the DW 24x7. Many start by loading the DW hourly and then move to 15-minute loads, if needed. Of course, your operational systems may not be designed to support continuous data extracts, so this is a consideration. Many companies start with this option since it uses existing processes and tools, but just runs them faster.

2. Change Data Capture. Another option is to apply change data capture (CDC) and replication tools, which extract new records from system logs and move them in real time to an ETL tool, staging table, flat file, or message queue so they can be loaded into the DW. This approach minimizes the impact on both your operational systems and DW since you are only updating records that have changed instead of wiping the slate clean with each load. Some companies combine both mini-batch and CDC to streamline processes even further. (A rough sketch of this incremental-load idea follows this list.)

3. Trickle Feed. A final option is to trickle feed records into the DW directly from an enterprise service bus (ESB), if your company has one. Here, the DW subscribes to selected events which flow through a staging area and into the DW. Most ETL vendors sell specialized ESB connectors to trickle feed data or you can program a custom interface. This is the most complex of the three approaches since there is no time to recover from a failure, but it provides the most up-to-date data possible.
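To make options 1 and 2 concrete, here is a minimal sketch of an incremental load driven by a high-water mark, written in Python with SQLite standing in for both the operational system and the warehouse. The table names, columns, and 15-minute schedule are assumptions for illustration, not a reference implementation.

```python
import sqlite3

# SQLite stands in for both systems here; in practice they would be
# separate platforms reached through an ETL or replication tool.
src = sqlite3.connect(":memory:")   # operational system
dw = sqlite3.connect(":memory:")    # data warehouse

src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, created_at TEXT)")
src.executemany("INSERT INTO orders (amount, created_at) VALUES (?, ?)",
                [(10.0, "2009-11-06 09:00"), (25.5, "2009-11-06 09:05")])

dw.execute("CREATE TABLE fact_orders (id INTEGER, amount REAL, created_at TEXT)")
dw.execute("CREATE TABLE etl_watermark (last_id INTEGER)")
dw.execute("INSERT INTO etl_watermark VALUES (0)")

def mini_batch_load():
    """Run on a short schedule (say, every 15 minutes) instead of nightly.

    Only rows beyond the high-water mark are pulled, which is the essence
    of both the mini-batch and CDC approaches described above.
    """
    last_id = dw.execute("SELECT last_id FROM etl_watermark").fetchone()[0]
    new_rows = src.execute(
        "SELECT id, amount, created_at FROM orders WHERE id > ? ORDER BY id",
        (last_id,)).fetchall()
    if new_rows:
        dw.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", new_rows)
        dw.execute("UPDATE etl_watermark SET last_id = ?", (new_rows[-1][0],))
        dw.commit()

mini_batch_load()
print(dw.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0])  # -> 2
```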

Working Outside the DW

If your existing DW doesn’t lend itself to operational BI for architectural, political, or philosophical reasons, then you need to consider building or buying a complementary low-latency decision engine. There are three options here: 1) data federation, 2) operational data stores, and 3) event-driven analytic engines.

1. Data federation. Data federation tools query and join data from multiple source systems on the fly. These tools create a virtual data mart that can combine historical and real-time data without the expense of creating a real-time DW infrastructure. Data federation tools are ideal when the number of data sources, volume of data, and complexity of queries are low.

2. ODS. Companies often use operational data stores (ODS) when they want to create operational reports that combine data from multiple systems and don’t want users to query source systems directly. An ODS extracts data from each source system in a timely fashion to create a repository of lightly integrated, current transaction data. To avoid creating redundant ETL routines and duplicate copies of data, many organizations load their DW from the ODS.

3. Event-driven engines. Event-driven analytic engines apply analytics to event data from an ESB as well as static data in a DW and other applications. The engine filters events, applies calculations and rules in memory, and triggers alerts when thresholds have been exceeded. Although tailored to meet high-volume, real-time requirements, these systems can also support general-purpose BI applications. (A rough sketch of this pattern follows this list.)
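As a rough illustration of option 3, the sketch below filters a simulated event stream, maintains a rolling aggregate in memory, and raises an alert when a threshold is crossed. The event fields, window size, and threshold are hypothetical; a real deployment would subscribe to a message bus rather than loop over a Python list.

```python
from collections import deque

WINDOW = 5            # number of recent events to aggregate
THRESHOLD = 1000.0    # alert when the rolling sum of amounts exceeds this

recent = deque(maxlen=WINDOW)   # in-memory state, no database round trip

def handle_event(event):
    """Called for each event arriving from the bus (e.g. an ESB subscription)."""
    if event.get("type") != "order":   # filter: only order events matter here
        return
    recent.append(event["amount"])
    rolling_sum = sum(recent)
    if rolling_sum > THRESHOLD:        # rule evaluated in memory
        print(f"ALERT: rolling order volume {rolling_sum:.2f} exceeds {THRESHOLD}")

# Simulated event stream standing in for a real message bus.
for amount in (300, 450, 200, 150, 500):
    handle_event({"type": "order", "amount": float(amount)})
```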

In summary, you can architect operational BI systems in multiple ways. The key decision is whether to support operational BI inside or outside the DW. Operational BI within a DW maintains a single version of truth and ensures high quality data. But not all organizations can afford to rearchitect a DW for low latency data and must look to alternatives.

Posted by Wayne Eckerson on November 6, 2009


Teradata the Wise

One sign of wisdom is that you become less dogmatic about things. Years of experience show you (often the hard way) that there is no one right way to think or act or be. People, cultures, tastes, and beliefs come in all shapes and sizes. And so do data warehouses.

It’s good to see Teradata acknowledge this after years of preaching the enterprise data warehouse über alles. President and CEO Mike Koehler said in his keynote this week at the Teradata Partners conference, “We like data marts.” And he could have added “operational data stores,” “appliances,” “the cloud,” and so on. Rather than pitching a single platform and architectural approach, Teradata now offers an “ecosystem” of products (i.e. its appliances), enabling customers to pick and choose the offerings that best meet their needs and budget. Kudos!

Empowering Analysts

Teradata Agile Analytics for the Cloud. I was particularly captivated by the newly announced Teradata Agile Analytics for the Cloud, which enables individual business analysts to automatically carve out temporary sandboxes within a central data warehouse. Business analysts can upload data to the sandbox and run queries against their data and data in the warehouse they have permission to access. This empowers business analysts to explore data without tempting them to create renegade data marts under their desks.
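The sandbox pattern itself is simple to picture. The sketch below is a generic illustration in Python, using SQLite as a stand-in for the warehouse; it is not Teradata’s product or API, and every name in it is hypothetical. The idea: carve out a temporary per-analyst table, load the uploaded data, let the analyst join it against warehouse data they can access, then drop it.

```python
import sqlite3

dw = sqlite3.connect(":memory:")   # stands in for the central data warehouse
dw.execute("CREATE TABLE sales (customer_id INTEGER, revenue REAL)")
dw.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 120.0), (2, 80.0), (3, 45.0)])

def create_sandbox(analyst, uploaded_rows):
    """Carve out a temporary per-analyst table and load the uploaded data."""
    table = f"sandbox_{analyst}"
    dw.execute(f"CREATE TABLE {table} (customer_id INTEGER, segment TEXT)")
    dw.executemany(f"INSERT INTO {table} VALUES (?, ?)", uploaded_rows)
    return table

sandbox = create_sandbox("jdoe", [(1, "gold"), (3, "silver")])

# The analyst can now blend uploaded data with warehouse data.
rows = dw.execute(
    f"SELECT s.segment, SUM(f.revenue) FROM {sandbox} s "
    "JOIN sales f ON f.customer_id = s.customer_id GROUP BY s.segment").fetchall()
print(rows)                          # e.g. [('gold', 120.0), ('silver', 45.0)]

dw.execute(f"DROP TABLE {sandbox}")  # sandboxes are temporary by design
```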

Like any good software vendor, Teradata is following the lead of some of its more advanced customers, such as eBay, which have been doing this for years. Oliver Ratzeberger, eBay’s senior director of architecture and operations, discussed his use of these types of sandboxes at last year’s TDWI Executive Summit and again this week at Teradata Partners. eBay has now moved beyond what Teradata announced (which is not yet officially available) and can now dynamically reallocate resources to departmental data marts (not just personal data marts) based on usage. eBay calls these virtual data marts. Hopefully, Teradata will add this functionality in the near future, and then we’ll truly know it likes data marts!

I think Teradata’s marketing team took some liberties with the product name by adding the word “cloud” to it. Although perhaps technically correct, the name is confusing to the general data warehousing community. I’d rather see them call it something more straightforward, like Teradata’s Personal Sandbox Service.

Posted by Wayne Eckerson on October 22, 2009


From Data Warehousing to Performance Management

Next year marks the 15th anniversary of TDWI, an association of data warehousing and business intelligence (BI) professionals that has grown nearly as fast as the industry it serves, which now tops $9 billion according to Forrester Research.

In 1995, when data warehousing was just another emerging information technology, most people—including some at TDWI—thought it was just another tech fad that would fade away in a few short years like others before it (e.g. artificial intelligence, computer-aided software engineering, object-relational databases). But data warehousing’s light never dimmed, and it has evolved rapidly to become an indispensable management tool in well-run businesses. (See Figure 1.)

Figure 1. As data warehousing has morphed into business intelligence and performance management, its purpose has evolved from historical reporting to self-service analysis to actionable insights. At each step along the way, its business value has increased.

In the beginning…

In the early days, data warehousing served a huge pent-up need within organizations for a single version of corporate truth for strategic and tactical decision making and planning. It also provided a much-needed, dedicated repository for reporting and analysis that wouldn’t interfere with core business systems, such as order entry and fulfillment.

What few people understood then was that data warehousing was the perfect, analytical complement to the transaction systems that dominated the business landscape. While transaction systems were great at “getting data in,” data warehouses were great at “getting data out” and doing something productive with it, like analyzing historical trends to optimize processes, monitoring and managing performance, and predicting the future. Transaction systems could not (and still can’t) do these vitally important tasks.

Phase Two: Business Intelligence

As it turns out, data warehousing was just the beginning. A data warehouse is a repository of data that is structured to conform to the way business users think and how they want to ask questions of data. (“I’d like to see sales by product, region and month for the past three years.”)

But without good tools to access, manipulate, and analyze data, users can’t derive much value from the data warehouse. So organizations began investing in easy-to-use reporting and analysis tools that would empower business users to ask questions of the data warehouse in business terms and get answers back right away without having to ask the IT department to create a custom report. Empowering business users to derive insight from data using self-service reporting and analysis tools became known as “business intelligence.”

Phase Three: Performance Management

Today, organizations value insights but they want results. Having an “intelligent” business (via business intelligence) doesn’t do much good if the insights don’t help it achieve its strategic objectives, such as growing revenues or increasing profits. In other words, organizations want to equip users with proactive information to optimize performance and achieve goals.

Accordingly, business intelligence is now morphing into performance management where organizations harness information to improve manageability, accountability, and productivity. The vehicles of choice here are dashboards and scorecards that graphically depict performance versus plan for companies, divisions, departments, workgroups and even individuals. In some cases, the graphical “key performance indicators” are updated hourly so employees have the most timely and accurate information with which to optimize the processes for which they are responsible.

Organizations have discovered that publishing performance among peer groups engenders friendly competition that turbocharges productivity. But more powerfully, these tools empower individuals and groups to work proactively to fix problems and exploit opportunities before it is too late. In this way, performance management is actionable business intelligence built on a single version of truth delivered by a data warehousing environment.
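For readers who like to see the mechanics, here is a tiny, hypothetical sketch of the scorecard logic described above: compare actuals to plan and translate the ratio into a traffic-light status. The metrics, figures, and thresholds are invented for illustration.

```python
def kpi_status(actual, plan, warn=0.9):
    """Green if at/above plan, yellow if within 10% of plan, red otherwise."""
    ratio = actual / plan
    if ratio >= 1.0:
        return "green"
    return "yellow" if ratio >= warn else "red"

# (actual, plan) pairs for metrics where higher is better
kpis = {
    "revenue":       (9_800_000, 10_000_000),
    "new_customers": (1_150, 1_000),
}

for name, (actual, plan) in kpis.items():
    print(name, kpi_status(actual, plan))   # revenue -> yellow, new_customers -> green
```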

The Future

As Timbuk3 once sang, “The future looks so bright I have to wear shades.” The same could be said of business intelligence as it increasingly becomes a powerful tool for business executives to measure, monitor, and manage the health of their organizations and keep them on track toward achieving strategic goals.

Posted by Wayne Eckerson on September 18, 2009


Bastardizing Analytics

If you think that semantics is a huge problem for data warehousing initiatives—and it is—it’s an even bigger problem for our industry at large.

Take the word analytics, for example. It’s a popular term right now: reporting is passé; everyone wants to do “analytics.” And like most popular terms, it’s been completely bastardized. This is largely because vendors want to exploit any term that is popular and bend it to match their current or future product portfolios. But it’s also because we’re too complacent, uninformed, or busy to quibble with exact meanings—that is, until we have to plunk down cold, hard cash or risk our reputation on a new product. Then we care about semantics.

Most vendors use the term analytics to describe what I would call “interactive reporting.” Evidently, any tool that lets users filter data on a page or drill down to greater detail qualifies as “analytics.” This isn’t wholly inaccurate, since Webster’s dictionary defines analysis (the root of analytics) as: “Studying the nature of something or determining its essential features and their relations.”


But let’s get real. We should use labels to clarify, not obscure, meaning. Reporting tools have given users the ability to filter, sort, rank, and drill for years, and yet we never called them analytical tools. Instead, we’ve used the terms ad hoc query, end-user reporting, dashboards, scorecards, and OLAP. To borrow a biblical phrase, let’s not put old wine into new wineskins. Otherwise, we simply spoil something good.


Let’s be clear: analytics is a natural progression of BI from reporting on historical activity to applying statistical techniques and algorithms to identify hidden patterns in large volumes of data. We can use these patterns and relationships to create models that help us understand customer and market dynamics and predict future events and behavior. Thus, analytics is a marked departure from traditional reporting in terms of scope, depth, and impact. It is easy to say but hard to do. This is why so many vendors have jumped on the analytics bandwagon without sporting any real credentials.
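A toy example may help draw the line. In the sketch below (with made-up monthly figures), the “reporting” step summarizes history, while the “analytics” step fits a simple model to the same history and projects the next period. Real analytics involves far richer statistical techniques than a trend line, but the difference in intent is the point.

```python
import numpy as np

monthly_sales = np.array([100.0, 104.0, 110.0, 113.0, 119.0, 125.0])  # made-up data

# Reporting: describe what happened.
print("Last month:", monthly_sales[-1], "Average:", monthly_sales.mean())

# Analytics: fit a model to the history and predict what is likely to happen next.
months = np.arange(len(monthly_sales))
slope, intercept = np.polyfit(months, monthly_sales, deg=1)  # simple linear trend
forecast = slope * len(monthly_sales) + intercept
print("Projected next month: %.1f" % forecast)
```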

One of the few vendors that has marked a steady course in its usage of analytics is SAS. I talked with Keith Collins, SAS’ chief technology officer, last week about this issue, and all he could do was shake his head and commiserate about the bastardization of a term that is near and dear to his company’s heart. On the bright side, there could be worse things than for neophyte vendors to co-opt your semantics. Even if their current products don’t support honest-to-goodness analytics, their product roadmaps do.

In the big scheme of things, the bastardization of analytics is actually a bridge to the future. We use words to propel our thinking ahead as we prepare our bodies to follow. So, give it five or ten years and analytics will have a distinct meaning from reporting. By that time, I’m sure we’ll be bastardizing some new BI term.

Posted by Wayne Eckerson on September 10, 2009


Netezza Turns a Corner

Once compared to David facing Goliath, database vendor Netezza yesterday traded in its wooden slingshot for steel blades and armor as it both celebrated its victories to date and geared up to fight bigger adversaries.

With nearly 300 customers and a new “commodity-based” architecture, Netezza is the clear leader in the emerging analytics database market that it evangelized beginning in 2002. It celebrated its stature and market position with a high-energy, one-day event in Boston that kicked off a worldwide roadshow.

Almost 400 attendees (a good percentage being Netezza employees to be fair) heard CEO Jim Baum introduce the new Netezza TwinFin data warehouse and analytic appliance that boasts higher performance, greater modularity, and more openness and flexibility to handle various workloads than its predecessor boxes. Netezza primarily designed TwinFin to address new customer requirements but secondarily to strike a blow to “new” arch competitors Teradata and Oracle.

“We’re 50 times faster than Oracle and ten times faster than Teradata without tuning,” said Baum. Other Netezza executives added that the company’s platform is no longer a one-trick pony. “We’re moving from supporting fast scans to multiple workloads that address broader market demand,” said Phil Francisco, VP of product management and product marketing, in a pre-briefing this summer.

While none of the customers it highlighted during the day-long event on Boston’s waterfront were using TwinFin in production, most touted dramatic performance gains compared to competitor products in head-to-head tests. Some were testing TwinFin and most seemed eager to take the plunge, especially high-end customers who could benefit from the open architecture that makes it easier to embed advanced analytics programs.

Migration and Markets

What wasn’t discussed, however, was the time and cost to migrate from Netezza’s existing boxes to TwinFin. I raised this question with Netezza marketing cohorts Phil Francisco and Tim Young this summer. They said that most existing Netezza customers will eventually swap out old boxes for TwinFin as they outgrow the capacity or performance of their existing machines. One of the limitations of the old architecture was relatively fixed-capacity boxes with limited extensibility for growth. They implied that this type of forklift migration would be no surprise to existing customers, which is one of the downsides of selling “appliances” where everything is pre-built inside a box.

However, that doesn’t mean that Netezza will stop selling appliances—by definition, a complete hardware/software solution that requires no assembly and minimal configuration and tuning. You won’t be seeing a software-only version of Netezza TwinFin for sale. Netezza will still ship out appliance machines, except now the new open architecture dramatically improves the system’s overall flexibility and extensibility.

Perhaps more importantly, TwinFin is designed to address a new market. “We’ve moved past the early adopter, risk-taker market and are addressing a more mainstream customer,” said Young. While existing customers will be the first to adopt TwinFin, ultimately the new product is designed to position Netezza as an attractive alternative to leading database vendors.

Open Architecture

Still, it takes some courage to revamp your architecture when you have a large installed base. But many young, ambitious companies have had to undergo this painful baptism—sometimes multiple times—to position themselves for future growth. Business Objects, MicroStrategy, and Cognos all come to mind.

But Netezza may get off easy. No one seemed up in arms about the architectural shift, probably because most people—customers, analysts, press—seem to understand the benefits of building database systems on commodity hardware. (We should thank Greenplum, Aster, and other analytic database upstarts for that.) The shift means that it will spend less time engineering hardware and more time delivering value-add capabilities through software.

Netezza already is attracting many partners who find it easier to build applications and plug-ins within the open architecture running on blade-based servers, commodity disk storage, Intel chips, Linux operating system, and SQL, ODBC, JDBC, and OLE DB data access interfaces. Also, Netezza will enable customers to deploy TwinFin appliances in hub-spoke configurations with up to 10 racks to increase extensibility. It will also soon introduce specialized TwinFin appliances tuned for different workloads, including high capacity storage (up to 5 PB) and extreme performance (1 TB RAM per rack), a move eerily similar to Teradata’s lineup of new appliance machines.

Despite the hype, the future of TwinFin rests in the hands of customers who will vote with their pocketbooks. Netezza has claimed some impressive victories to date, but it has many battles to fight before it can claim the crown.

Posted by Wayne Eckerson on September 3, 2009