
TDWI Blog

Data Analysis and Design Blog Posts

See the most recent Data Analysis and Design related items below.


Do We Really Need Semantic Layers?

It used to be that a semantic layer was the sine qua non of a sophisticated BI deployment and program. Today, I’m not so sure.

A semantic layer is a set of predefined business objects that represent corporate data in a form that is accessible to business users. These business objects, such as metrics, dimensions, and attributes, shield users from the complexity of the schemas, tables, and columns in one or more back-end databases. Business Objects (now part of SAP) took its name from this notion of a semantic layer, which was the company’s chief differentiator at its inception in the early 1990s. But a semantic layer takes time to build and slows down deployment of an initial BI solution.
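To make this concrete, here’s a minimal sketch of what a semantic layer does. The metric and dimension names, tables, and joins below are all hypothetical (this is not any vendor’s actual API): business objects map friendly names to physical columns, and the tool generates the SQL so users never touch the schema.

    # A minimal semantic layer sketch; all names and schema are hypothetical.
    # Business objects map friendly names to physical tables and columns.
    METRICS = {
        "Revenue": "SUM(f.sales_amt)",
        "Units Sold": "SUM(f.qty)",
    }
    DIMENSIONS = {
        "Region": "d.region_name",
        "Quarter": "d.fiscal_qtr",
    }

    def build_query(metric, dimension):
        """Generate SQL from business objects; the user never sees tables or joins."""
        return (
            f"SELECT {DIMENSIONS[dimension]} AS {dimension.replace(' ', '_')}, "
            f"{METRICS[metric]} AS {metric.replace(' ', '_')} "
            "FROM sales_fact f JOIN date_dim d ON f.date_key = d.date_key "
            f"GROUP BY {DIMENSIONS[dimension]}"
        )

    print(build_query("Revenue", "Region"))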

A semantic layer is critical for supporting ad hoc queries by non-IT professionals. As such, it’s a vital part of supporting self-service BI, which is all the rage today. So what’s my beef? Well, about 80% of BI users don’t need to create ad hoc queries. The self-service requirements of “casual users” are easily fulfilled using parameterized reports or interactive dashboards, which do not require a semantic layer to build or deploy.
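By contrast, a parameterized report needs no semantic layer at all: a developer fixes the SQL at design time, and the casual user merely supplies a filter value. A rough sketch, using a made-up sales table:

    import sqlite3

    # Sketch of a parameterized report: the query is written once by a developer;
    # casual users only pick the parameter value (e.g., from a dashboard dropdown).
    REPORT_SQL = "SELECT product, SUM(amount) FROM sales WHERE region = ? GROUP BY product"

    def run_report(conn, region):
        return conn.execute(REPORT_SQL, (region,)).fetchall()

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                     [("toner", "East", 100.0), ("paper", "East", 40.0), ("toner", "West", 70.0)])
    print(run_report(conn, "East"))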

Accordingly, most pure-play dashboard vendors don’t incorporate a semantic layer. Corda, iDashboards, Dundas, and others are fairly quick to install and deploy precisely because they have a lightweight architecture (i.e., no semantic layer). Granted, most are best used for departmental rather than enterprise deployments, but nonetheless, these low-cost, agile products often support sophisticated BI solutions.

Besides casual users, there are “power users,” who constitute about 20% of total users. Most power users are business analysts who typically query a range of databases, including external sources. In my experience, most bona fide analysts feel constrained by a semantic layer, preferring to use SQL to examine and extract source data directly.

So is there a role for a semantic layer today? Yes, but not in the traditional sense of providing “BI to the masses” via ad hoc query and reporting tools. Since the “masses” don’t need such tools, the question becomes who does?

Super Users. The most important reason to build a semantic layer is to support a network of “super users.” Super users are technically savvy business people in each department who gravitate to BI tools and wind up building ad hoc reports on behalf of colleagues. Since super users aren’t IT professionals with formal SQL training, they need more assistance and guardrails than a typical application developer. A semantic layer ensures that super users conform to standard data definitions and create accurate reports that align with enterprise standards.

Federation. Another reason a semantic layer might be warranted is when you have a federated BI architecture where power users regularly query the same sets of data from multiple sources to support a specific application. For example, a product analyst may query historical data from a warehouse, current data from sales and inventory applications, and market data from a syndicated data feed. If this usage is consistent, then the value of building a semantic layer outweighs its costs.

Distributed Development. Mature BI teams often get to the point where they become the bottleneck for development. To alleviate the backlog of projects, they distribute development tasks back out to IT professionals in each department who are capable of building data marts and complex reports and dashboards. To make distributed development work, the corporate BI team needs to establish standards for data and metric definitions, operational procedures, software development, project management, and technology. A semantic layer ensures that all developers use the same definitions for enterprise metrics, dimensions, and other business objects.

Semi-legitimate Power Users. Some power users are inexperienced: they don’t know how to write proper SQL and aren’t very familiar with the source systems they want to access. This type of power user is probably more akin to a super user than a business analyst and would be a good candidate for a semantic layer. However, before outfitting these users with ad hoc query tools, first determine whether a parameterized report, an interactive dashboard, or a visual analysis tool (e.g., Tableau) can meet their needs.

So there you have it. Semantic layers facilitate ad hoc query and reporting. But the only people who need ad hoc query and reporting tools these days are super users and distributed IT developers. However, if you are trying to deliver BI to the masses of casual users, then a semantic layer might not be worth the effort. Do you agree?

Posted by Wayne Eckerson on July 28, 2010


Do Your Team a Favor: Stop Acting Like IT

(Caution: This blog may contain ideas that are hazardous to your career.)

I’ve argued in previous blogs that business intelligence (BI) professionals must think more like business people and less like IT managers if they are to succeed. However, while many BI professionals have their hearts in the right place, their actions speak differently. They know what they need to do but can’t seem to extricate themselves from an IT mindset. That takes revolutionary thinking and a little bit of luck.

Radical Thinking

So, here’s a radical idea that will help you escape the cultural bonds of IT: don’t upgrade your BI software.

Now, if you gasped after reading that statement, you’re still an IT person at heart. An IT person always believes in the value of new software features and fears losing vendor support and leverage by not staying fairly current with software licenses and versions.

Conversely, the average business person sees upgrades as a waste of time and money. Most don’t care about the new functionality or appreciate the financial rationale or architectural implications. To them, the upgrade is just more “IT busywork.”

Here’s another radical idea: stick with the BI tools you have. Why spend a lot of money and time migrating to a new platform when the one you have works? So what if the tools are substandard and missing features? Is it really a problem if the tools force your team to work overtime to make ends meet? Who are the tools really designed to support: you or the users?

In the end, it’s not the tools that matter, it’s how you apply them. Case in point: in high school, I played clarinet in the band. One day, I complained vociferously to the first chair, a geeky guy named Igor Kavinsky who had an expensive, wooden clarinet (which I coveted), that my cheap, plasticized version wasn’t working very well and I needed a replacement. Before I could list my specific complaints, he grabbed my clarinet, replaced the mouthpiece, and began playing.

Lo and behold, the sound that came from my clarinet was beautiful, like nothing I had ever produced! I was both flabbergasted and humiliated. It was then I realized that the problem with my clarinet was not the instrument but me! Igor showed me that it’s the skill of the practitioner, not the technology, that makes all the difference.

Reality Creeps In

Igor notwithstanding, if you’re a good, well-trained IT person, you probably think my prior suggestions are unrealistic, if not ludicrous. In the “real world,” you say, there is no alternative to upgrading and migrating software from time to time. These changes—although painful—improve your team’s ability to respond quickly to new business needs and avoid a maintenance nightmare. And besides, many users want the new features and tools, you insist.

And of course, you are right. You have no choice.

Yet, given the rate of technology obsolescence and vendor consolidation, your team probably spends 50% of its time upgrading and migrating software. And it spends its remaining time maintaining both new and old versions (because everyone knows that old applications never die). All this busywork leaves your team with precious little time and resources to devise new ways to add real value to the business.

Am I wrong? Is this a good use of your organization’s precious capital? What would a business person think about the ratio of maintenance to development dollars in your budget?

Blame the Vendors. It’s easy to blame software vendors for this predicament. In their quest for perpetual growth and profits, vendors continually sunset existing products, leaving you (the hapless customer) with no choice but to upgrade or lose support and critical features. And just when you’ve fully installed their products, they merge with another company and reinvent their product line, forcing another painful migration. It’s tempting to think that these mergers and acquisitions are simply diabolical schemes by vendors to sell customers expensive replacement products. Just ask any SAP BI customer!

Breaking the Cycle

If this describes your situation, what do you do about it? How do you stop thinking like an IT person and being an IT cuckold to software vendors?

Most BI professionals are burrowed more deeply in an IT culture than they know. Breaking free often requires a cataclysmic event that rattles their cages and creates an opening to escape. This might be a change in leadership, deregulation, a new competitor, or a new computing platform. Savvy BI managers seize such opportunities to reinvent themselves and change the rules of the game.

Clouds Coming. Lucky for you, the Cloud—or more specifically, Software as a Service (SaaS)—is one of those cataclysmic events. The Cloud has the potential to liberate you and your team from an overwrought IT culture that is mired in endless, expensive upgrades and painful product migrations, among other things.

The beauty of a multi-tenant, cloud-based solution is that you never have to upgrade software again. In a SaaS environment, the upgrades happen automatically. To business and IT people, this is magical: cool new features appear, and no one has to do any work or suffer any inconvenience. SaaS also eliminates vendor lock-in, since you can easily change cloud vendors (as long as you maintain your data) by just pointing users to a new URL. The Cloud is a radical invention that promises to alter IT culture forever.

Getting Started. To break the cycle, start experimenting with cloud-based BI solutions. Learn how these tools work and who offers them. Use the cloud for prototypes or small, new projects. Some cloud BI vendors offer a 30-day free trial while more scalable solutions promise to get you up and running quickly. If you have a sizable data warehouse, leave your data on premise and simply point the cloud BI tools to it. Performance won’t suffer.

Unless you experiment with ways to break free from an IT culture, you never will. Seize the opportunity that the Cloud affords and others that are sure to follow. Carpe diem!

Posted by Wayne Eckerson on June 25, 2010


How to Organize Business Analysts

Business analysts are a key resource for creating an agile organization. These MBA- or PhD-credentialed number crunchers can quickly unearth insights and correlations so executives can make critical decisions. Yet, one decision that executives haven’t analyzed thoroughly is the best way to organize business analysts to enhance their productivity and value.

Distributed Versus Centralized

Traditionally, executives either manage business analysts as a centralized, shared service or allow each business unit or department to hire and manage its own business analysts. Ultimately, neither a centralized nor a distributed approach is optimal.

Distributed Approach. In a distributed approach, a department or business unit head hires the analyst to address local needs and issues. This is ideal for the business head and departmental managers who get immediate and direct access to an analyst. And the presence of one or more analysts helps foster a culture of fact-based decision making. For example, analysts will often suggest analytical methods for testing various ideas, helping managers become accustomed to basing decisions on fact rather than gut feel alone.

However, in the distributed approach, business analysts often become a surrogate data mart for the department. They get bogged down creating low-value, ad hoc reports instead of conducting more strategic analyses. If the business analyst is highly efficient, the department head often doesn’t see the need to invest in a legitimate, enterprise decision-making infrastructure. For their part, analysts often feel pigeonholed in a distributed approach. They see little room for career advancement and few opportunities to expand their knowledge in new areas. They often feel isolated, with few opportunities to exchange ideas and collaborate on projects with fellow analysts. In effect, they are “buried” in departmental silos.

Centralized Approach. In a centralized approach, business analysts are housed centrally and managed as a shared service under the control of a director of analytics, or more likely, a chief financial officer or director of information management. One benefit of this approach is that organizations can assign analysts to strategic, high priority projects rather than tactical, departmental ones, and the director can establish a strong partnership with the data warehousing and IT teams which control access to data, the fuel for business analysts. Also, by being co-located, business analysts can more easily collaborate on projects, mentor new hires, and cross-train in new disciplines. This makes the environment a more rewarding place to work for business analysts and increases their retention rate.

The downside of the centralized approach is that business analysts are a step removed from the people, processes, and data that drive the business. Without firsthand knowledge of these things, business analysts are less effective. It takes them longer to get up to speed on key issues and deliver useful insights, and they may miss various nuances that are critical for delivering a proper assessment. In short, without a close working relationship with the people they support and intimate knowledge of local processes and systems, they are running blind.

Hybrid Approach

A more optimal approach combines elements of both the distributed and centralized methods. In a hybrid environment, business analysts are embedded in departments or business units but report directly to a director of analytics. This sounds easy enough, but it’s hard to do. It works best when the company is geographically consolidated in one place, so members of the analytics team can easily reconvene in person to share ideas and discuss issues.

Zynga. For example, Zynga, an internet gaming company, uses a hybrid approach for managing its analysts. All of Zynga’s business analysts report to Ken Rudin, director of analytics for the company. However, about 75% of the analysts are embedded in business units, working side by side with product managers to enhance games and retain customers. The remainder sit with Rudin and work on strategic, cross-functional projects. (See “Proactive Analytics That Drive the Business” in my blog for more information on Zynga’s analytics initiative.) This setup helps deliver the best of both centralized and distributed approaches.

Every day, both distributed and centralized analysts come together for a quick “stand up” meeting where they share ideas and discuss issues. This helps preserve the sense of team and fosters a healthy exchange of knowledge among all the analysts, both embedded and centralized. Although Zynga’s analysts all reside on the same physical campus, a geographically distributed team could simulate “stand up” meetings with virtual Web meetings or conference calls.

Center of Excellence. The book “Analytics at Work” by Tom Davenport, Jeanne Harris, and Robert Morison describes five approaches for organizing analysts, most of which are variations on the themes described above. One approach, the “Center of Excellence,” is similar to the hybrid approach above. The differences are that all (not just some) business analysts are embedded in business units, and all are members of (and perhaps report dotted line to) a corporate center of excellence for analytics. Here, the Center of Excellence functions more like a program office that coordinates the activities of dispersed analysts than a singular, cohesive team, as in the case of Zynga.

Either approach works, although Zynga’s makes it easier for an inspired director of analytics to shape and grow the analytics department quickly and foster a culture of analytics throughout the organization.

Summary. As an organization recognizes the value of analytics, it will evolve the way it organizes its business analysts. Typically, companies will start off on one extreme—either centralized or distributed—and then migrate to a more nuanced hybrid approach in which analysts report directly to a director of analytics (i.e., Zynga) or are part of a corporate Center of Excellence.

Posted by Wayne Eckerson on May 23, 2010


Evolving Your BI Team from a Data Provider to a Solutions Provider

In her presentation on “BI Roadmaps” at TDWI’s BI Executive Summit last month, Jill Dyche explained that BI teams can either serve as “data providers” or “solutions providers.” Data providers focus on delivering data in the form of data warehouses, data marts, cubes, and semantic layers that can be used by BI developers in the business units to create reports and analytic applications. Solutions providers, on the other hand, go one step further, by working hand-in-hand with the divisions to develop BI solutions.

I firmly believe that BI teams must evolve into the role of solutions provider if they want to succeed long term. They must interface directly with the business, serving as a strategic partner that advises the business on how to leverage data and BI capabilities to solve business problems and capitalize on business opportunities. Otherwise, they will become isolated and viewed as an IT cost-center whose mission will always be questioned and whose budget will always be on the chopping block.

Data Provisioning by Default. Historically, many BI teams became data providers by default because business units already had reporting and analysis capabilities, which they developed over the years in the absence of corporate support. These business units are loath to turn over responsibility for BI development to a nascent corporate BI group that doesn’t know their business and wants to impose corporate standards for architecture, semantics, and data processing. Given this environment, most corporate BI teams take what they can get and focus on data provisioning, leaving the business units to weave gold out of the data hay they deliver.

Mired Down by Specialization

However, over time, this separation of powers fails to deliver value. The business units lose skilled report developers, and they don’t follow systematic procedures for gathering requirements, managing projects, and developing software solutions. They end up deploying multiple tools, embedding logic into reports, and spawning multiple, inconsistent views of information. Most of all, they don’t recognize the data resources available to them, and they lack the knowledge and skills to translate data into robust solutions using new and emerging BI technologies and techniques, such as OLAP cubes, in-memory visualization, agile methods, dashboards, scorecards, and predictive analytics.

On the flip side, the corporate BI team gets mired down with a project backlog that it can’t seem to shake. Adopting an industrialized, assembly-line mindset, it hires specialists to handle every phase of the information factory (e.g., requirements, ETL, cube building, semantic modeling) to improve efficiency, yet it can’t easily accelerate development. Its processes have become too rigid and sequential. When divisions get restless waiting for the BI team to deliver, CFOs and CIOs begin to question their investment and put the team’s budget on the chopping block.

Evolving into Solutions Providers

Rethink Everything. To overcome these obstacles, a corporate BI team needs to rethink its mission and the way it’s organized. It needs to actively engage with the business and take some direct responsibility for delivering business solutions. In some cases, it may serve as an advisor to a business unit which has some BI expertise while in others it may build the entire solution from scratch where no BI expertise exists. By transforming itself from a back-office data provider to a front-office solutions developer, a corporate BI team will add value to the organization and have more fun in the process.

It will also figure out new ways to organize itself to serve the business efficiently. To provide solutions assistance without adding budget, it will break down intra-organizational walls and cross-train specialists to serve on cross-functional project teams that deliver an entire solution from A to Z. Such cross-fertilization will invigorate many developers who will seize the chance to expand their skill sets (although some will quit when forced out of their comfort zones). Most importantly, they will become more productive and before long eliminate the project backlog.

A High Performance BI Team

For example, Blue Cross/Blue Shield of Tennessee has evolved into a BI solutions provider over the course of many years. BI is now housed in an Information Management (IM) organization that reports to the CIO and is separate from the IT organization. The IM group consists of three subgroups: 1) the Data Management group, 2) the Information Delivery group, and 3) the IM Architecture group.

  • The Data Management group consists of 1) a data integration team that handles ETL work and data warehouse administration, and 2) a database administration team that designs, tunes, and manages IM databases.
  • The Information Delivery group consists of 1) a BI and Performance Management team, which purchases, installs, and manages BI and PM tools and solutions and provides training, and 2) two customer-facing solutions delivery teams that work with business units to build applications. The first is the IM Health Informatics team, which builds clinical analytic applications using reporting, OLAP, and predictive analytics capabilities; the second is the IM Business Informatics team, which builds analytic applications for other internal departments (i.e., finance, sales, marketing).
  • The IM Architecture group builds and maintains the IM architecture, which consists of the enterprise data warehouse, data marts, and data governance programs, as well as closed loop processing and the integration of structured and unstructured data.

Collaborative Project Teams. Frank Brooks, director of data management and information delivery at BCBS of Tennessee, says that the IM group dynamically allocates resources from each IM team to support business-driven projects. Individuals from the Informatics teams serve as project managers, interfacing directly with the customers. (While Informatics members report to the IM group, many spend most of their time in the departments they serve.) One or more members from each of the other IM teams (data integration, database administration, and BI/PM) are assigned to the project team, and they work collaboratively to build a comprehensive solution for the customer.

In short, the BI team at BCBS of Tennessee has organized itself as a BI solutions provider, consolidating all the functions needed to deliver comprehensive solutions in one group that reports to one individual, who can ensure the various teams collaborate efficiently and effectively to meet and exceed customer requirements. BCBS of Tennessee has won many awards for its BI solutions and will be speaking at this summer’s TDWI BI Executive Summit in San Diego (August 16-18).

The message is clear: if you want to deliver value to your organization and assure yourself a long-term, fulfilling career at your company, then don’t be satisfied with being just a data provider. Make sure you evolve into a solutions provider that is viewed as a strategic partner to the business.

Posted by Wayne Eckerson on March 16, 2010


Zen BI: The Wisdom of Letting Go

One of my takeaways from last week’s BI Executive Summit in Las Vegas is that veteran BI directors are worried about the pace of change at the departmental level. More specifically, they are worried about how to support the business’ desire for new tools and data stores without undermining the data warehousing architecture and single version of truth they have worked so hard to deliver.

At the same time, many have recognized that the corporate BI team they manage has become a bottleneck. They know that if they don’t deliver solutions faster and reduce their project backlog, departments will circumvent them and develop renegade BI solutions that undermine the architectural integrity of the data warehousing environment.

The Wisdom of Letting Go

In terms of TDWI’s BI Maturity Model, these DW veterans have achieved adulthood (i.e. centralized development and EDW) and are on the cusp of landing in the Sage stage. However, to achieve true BI wisdom (i.e. Sage stage), they must do something that is both counterintuitive and terrifying: they must let go. They must empower departments and business units to build their own DW and BI solutions.

Entrusting departments to do the right thing is a terrifying prospect for most BI veterans. They fear that the departments will create islands of analytical information and undermine the data consistency they have worked so hard to achieve. The thought of empowering departments makes them grip the proverbial BI steering wheel tighter. But asserting control at this stage usually backfires. The only option is to adopt a Zen-like attitude and let go.

Trust in Standards

I'm reminded of the advice Yoda gives his Jedi warriors-in-training in the movie “Star Wars”: “Let go and trust the force.” In this case, though, DW veterans need to trust their standards: the BI standards they’ve developed in the BI Competency Center, including definitions for business objects (i.e., business entities and metrics), processes for managing BI projects, techniques for developing BI software, and processes and procedures for managing ETL jobs and handling errors, among other things.

Some DW veterans who have gone down this path add the caveat: “trust but verify.” Although educating and training departmental IT personnel about proper BI development is critical, it’s also important to create validation routines where possible to ensure business units conform to standards.

Engage Departmental Analysts

The cagiest veterans also recognize that the key to making distributed BI development work is to recruit key analysts in each department to serve on a BI Working Committee. The Working Committee defines DW and BI standards, technologies, and architectures and essentially drives the BI effort, reporting its recommendations for approval to a BI Steering Committee composed of business sponsors. Engaging the analysts who are most apt to create renegade BI systems ensures the DW serves their needs and helps secure buy-in and support.

By adopting a Zen-like approach to BI, veteran DW managers can eliminate project backlogs, ensure a high level of customer satisfaction, and achieve BI nirvana.

Posted by Wayne Eckerson on February 28, 2010


Reflections from TDWI's BI Executive Summit

More than 150 BI Directors and BI Sponsors from small, medium, and large companies plus a dozen or so sponsors attended TDWI’s BI Executive Summit last week, a record turnout.

Here are a few of the things I learned:

- Veteran BI directors are worried about the pace of change at the departmental level. More specifically, they are worried about how to support the business’ desire for new tools and data stores without undermining the data warehousing architecture and single version of truth they have worked so hard to deliver. (See “Zen BI: The Wisdom of Letting Go.”)

- Jill Dyche explained that corporate BI teams can either be Data Providers or Solutions Providers. The idea that this is a choice was new to me. However, after some thought, I believe that unless BI teams help deliver solutions, the data they provision will be underutilized. Unless the BI team helps solve business problems by delivering business solutions, it can never be viewed as a strategic partner.

- Most BI teams have project backlogs and don’t have a great way to get in front of them. Self-service BI can help eliminate a lot of the onesy and twosey requests for custom reports. BI portfolios and roadmaps can help prioritize deliverables, but executives always override their own priorities. Many veteran BI managers are looking to push more development back into the departments as a way to accelerate projects.

- There is a lot of interest in predictive analytics, dashboards, and the cloud. Those were the top three vote-getters for the question, “Which technologies will have the most impact on your BI program in three years?”

- Most of the case studies at the Summit described real-time data delivery environments, often coupled with analytics. GE Rails applied statistical models to real-time data to help customer service agents figure out the optimal repair facility for fixing railroad cars; Linkshare captures and displays Web activity and commissions for external customers (publishers and advertisers); and Seattle Teachers’ Credit Union delivers real-time recommendations to customer service agents.

- There was a lot of interest in how to launch an analytics practice and Aldo Mancini provided some great tips from his experiences at Discover Financial Services. To get SAS analysts to start using the data warehouse as a way to accelerate model development, he had them help design the subject areas and variables that should go into it. Then, he taught them how to use SQL so they could transform data into their desired format.

We’re already gearing up for our next Summit which will be held August 16-18 in San Diego. Hope to see you there!

Posted by Wayne Eckerson on February 28, 2010


Launching an Analytics Practice: Ten Steps to Success

Everyone wants to move beyond reporting to deliver value-added insights through analytics. The problem is that few organizations know where to begin. Here is a ten-step guide for launching a vibrant analytics practice.

Launching the Practice

Step 1: Find an Analyst. You can’t do analytics without an analyst! Most companies have one or more analysts burrowed inside a department. Look for someone who is bright, curious, and understands key business processes inside and out. The analyst should like to work with numbers, have strong Excel, SQL, OLAP, and database skills, and ideally understand some statistics and data mining tools.

Step 2: Find a Business Person. The quickest way to kill an analytics practice is to talk about predictive models, optimization, or statistics with a business person. Instead, find one or more executives who are receptive to testing key assumptions about how the business works. For example, a retail executive might want to know, “Why do some customers stop buying our product?” A social service agency might want to know, “Which spouses are most likely not to pay alimony?” Ask them to dream up as many hypotheses as possible to answer their questions, and then use those as inputs for your analysis.

Step 3: Gain Sponsorship. If step two piqued an executive’s interest, then you have a sponsor. Tell the sponsor what resources you need, if any, to conduct the test. Perhaps you need permission to free up an analyst for a week or two or to hire a consultant to conduct the analysis. Ideally, you should be able to make do with the people and tools you have in-house. A good analyst can work miracles with Excel and SQL, and there are many open source data mining packages on the market today, as well as low-cost statistical add-ins for Excel and BI tools.

Step 4: Don’t Get Uppity. “You never want to come across as smarter than the executive you are supporting,” says Matthew Schwartz, a former director of business analytics at Corporate Express. Don’t ever portray the model results as “the truth”; executives don’t trust models unless they make intuitive sense or prove their value in dollars and cents. For example, Schwartz was able to get his director of marketing to buy into the results of a market basket analysis for Web site recommendations because the director recognized the model’s cross-selling logic: “Ah! It knows that people are buying office kits for new employees.”
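At bottom, a market basket analysis like the one Schwartz describes is co-occurrence counting. Here’s a toy sketch (invented orders, not his actual model) that scores how much more often two items are bought together than chance alone would predict:

    from itertools import combinations
    from collections import Counter

    # Toy market basket analysis: count how often item pairs co-occur in orders
    # and compute lift = P(A and B) / (P(A) * P(B)). Data is made up.
    orders = [
        {"stapler", "office kit", "toner"},
        {"office kit", "stapler"},
        {"paper", "toner"},
        {"office kit", "stapler", "paper"},
    ]
    n = len(orders)
    item_counts = Counter(item for order in orders for item in order)
    pair_counts = Counter(frozenset(p) for order in orders
                          for p in combinations(sorted(order), 2))

    for pair, joint in pair_counts.most_common(3):
        a, b = tuple(pair)
        lift = (joint / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        print(f"{a} + {b}: lift={lift:.2f}")  # lift > 1 suggests a cross-sell rule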

Step 5: Make It Actionable. A model is worthless if people can’t act on it. This often means embedding the model in an operational application, such as a Web site or customer-facing application, or distributing the results in reports to salespeople or customer service representatives. In either case, you need to strip out the mathematics and decompose the model so it’s understandable and usable by people in the field. For example, a sales report might say, “These five customers are likely to stop purchasing office products from us because they haven’t bought toner in four weeks.”
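As one hedged illustration of stripping out the mathematics, this sketch turns a churn model’s score and its dominant driver into a sentence a salesperson can act on (the customers, scores, and 0.7 threshold are all invented):

    # Sketch: turn a churn model's output into a plain-language, actionable line.
    # Scores, reasons, and the 0.7 threshold are invented for illustration.
    scored_customers = [
        {"name": "Acme Co", "churn_score": 0.82, "reason": "no toner purchase in 4 weeks"},
        {"name": "Beta LLC", "churn_score": 0.31, "reason": "steady ordering pattern"},
    ]

    for c in scored_customers:
        if c["churn_score"] > 0.7:  # only surface customers worth a call
            print(f"{c['name']} is likely to stop purchasing office products "
                  f"({c['reason']}). Suggested action: call this week.")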

Step 6: Make It Proactive. The kiss of death for an analytical model is to tell people something they already know. Rather than tell salespeople that customers who are purchasing fewer products than in the prior period are likely to churn (like the example I gave in step five above), tell them about customers who will buy fewer products in the future because they have fallen below a critical statistical threshold and are vulnerable to competitive offers. Or, rather than forecast the number of loans that will go into default, identify the characteristics of good loans and bake those criteria into the loan origination process. If you deliver results that enable people to work proactively, you’ll become an overnight hero.

Sustaining the Analytics Practice

Let’s assume your initial modeling efforts worked their magic and garnered you strong executive sponsorship. How do you build and sustain an analytics practice? What organizational and technical strategies do you employ to ensure that your analysts are as productive as possible? The following four steps will solidify your analytics practice.

Step 7: Centralize and Standardize the Data. The thing that slows down analysts the most is having to collect data spread across multiple systems and then clean, harmonize, and integrate it. Only then can they start to analyze the data. Obviously, this is what a data warehouse is designed to do, not an analyst. But a data warehouse only helps if it contains all or most of the data analysts need in a format they can readily use so they don’t have to hunt and reconcile data on their own. Typically, analytical modelers need wide, flat tables with hundreds of attributes to create models.
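To illustrate the wide, flat table idea, here’s a minimal sketch (hypothetical schema, using pandas) that denormalizes transactional data into one row per customer with the kind of aggregate attributes modelers feed into their tools:

    import pandas as pd

    # Sketch: flatten normalized tables into one wide row per customer,
    # the shape analytical modelers typically want. Schema is hypothetical.
    orders = pd.DataFrame({
        "customer_id": [1, 1, 2, 2, 2],
        "amount": [100, 50, 200, 75, 25],
        "product": ["toner", "paper", "toner", "desk", "paper"],
    })
    customers = pd.DataFrame({"customer_id": [1, 2], "segment": ["SMB", "Enterprise"]})

    features = (
        orders.groupby("customer_id")
        .agg(order_count=("amount", "size"),
             total_spend=("amount", "sum"),
             distinct_products=("product", "nunique"))
        .reset_index()
        .merge(customers, on="customer_id")  # one wide row per customer
    )
    print(features)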

Step 8: Provide Open Access to Data. Data warehouse administrators need to give analysts access to the data warehouse without making them file a request and wait weeks for an answer. Rather than broker access to the data warehouse, administrators should create analytical sandboxes, using partitions and workload management, that let analysts upload their own data and commingle it with data in the warehouse. This creates an analytical playground for analysts and keeps them from creating renegade data marts under their desks.
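Here’s a minimal sketch of the sandbox idea, using SQLite’s ATTACH as a stand-in for the partitioning and workload-management features of a real warehouse DBMS (tables and data are invented for illustration):

    import sqlite3

    # Sketch of an "analytical sandbox": the analyst's own uploaded data lives in
    # a separate area but can be joined against warehouse tables. SQLite's ATTACH
    # stands in for the partitioning/workload features of a real warehouse DBMS.
    wh = sqlite3.connect(":memory:")
    wh.execute("CREATE TABLE warehouse_sales (customer_id INT, amount REAL)")
    wh.execute("INSERT INTO warehouse_sales VALUES (1, 120.0), (2, 80.0)")

    wh.execute("ATTACH DATABASE ':memory:' AS sandbox")  # the analyst's playground
    wh.execute("CREATE TABLE sandbox.my_targets (customer_id INT, segment TEXT)")
    wh.execute("INSERT INTO sandbox.my_targets VALUES (1, 'test'), (2, 'control')")

    rows = wh.execute("""
        SELECT t.segment, SUM(s.amount)
        FROM warehouse_sales s JOIN sandbox.my_targets t USING (customer_id)
        GROUP BY t.segment
    """).fetchall()
    print(rows)  # analyst data commingled with warehouse data, no renegade mart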

Step 9: Centralize Analysts. Contrary to current practice, it’s best to centralize analysts in an Analytical Center of Excellence under the supervision of a Director of Analytics. This creates a greater sense of community and camaraderie among analysts and gives them more opportunities for advancement within the organization. It also minimizes the chance that they’ll be lured away by recruiters. Although they may be part of a shared services group, analysts should be physically embedded within the departments they support and have dotted-line responsibility to those department heads.

Step 10: Offload Reporting. The quickest way to undermine the productivity of your top analysts is to force them to field requests for ad hoc reports from business users. To eliminate the reporting backlog, the BI team and analysts need to work together to create a self-service BI architecture that empowers business users to generate their own reports and views. When designed properly, these interactive reports and dashboards will meet 60% to 80% of users’ information needs, freeing up business analysts and BI report developers to focus on more value-added activities.

So there you have it, ten steps to analytical nirvana. Easy to write, hard to do! Keep me informed about your analytics journey and the lessons you learn along the way! I’d love to hear your stories. You can reach me at [email protected].

Posted by Wayne Eckerson on February 26, 2010


Sleep Well at Night: Abstract Your Source Systems

It’s odd that our industry has established a best practice for creating a layer of abstraction between business users and the data warehouse (i.e., a semantic layer or business objects), but we have not done the same thing on the back end.

Today, when a database administrator adds, changes, or deletes fields in a source system, it breaks the feeds to the data warehouse. Usually, source system owners don’t notify the data warehousing team of the changes, forcing us to scramble to track down the source of the errors, rerun ETL routines, and patch any residual problems before business users awake in the morning and demand to see their error-free reports.

It’s time we get some sleep at night and create a layer of abstraction that insulates our ETL routines from the vicissitudes of source systems changes. This sounds great, but how?

Insulating Netflix

Eric Colson, whose novel approaches to BI appeared two weeks ago in my blog “Revolutionary BI: When Agile is Not Fast Enough,” has found a simple way to abstract source systems at Netflix. Rather than pulling data directly from source systems, Colson’s BI team pulls data from a file that source systems teams publish to. It’s kind of an old-fashioned publish-and-subscribe messaging system that insulates both sides from changes in the other.

“This has worked wonderfully with the [source systems] teams that are using it so far,” says Colson, who believes this layer of abstraction is critical when source systems change at breakneck speed, like they do at Netflix. “The benefit for the source systems team is that they get to go as fast as they want and don’t have to communicate changes to us. One team migrated a system to the cloud and never even told us! The move was totally transparent.”

On the flip side, the publish-and-subscribe system relieves Colson’s team of having to 1) access source systems, 2) run queries on those systems, 3) know the names and logic governing tables and columns in those systems, and 4) keep up with changes in those systems. The team also gets much better quality data from source systems this way.

Changing Mindsets

However, Colson admits that he might get pushback from some source systems teams. “We are asking them to do more work and take responsibility for the quality of data they publish into the file,” says Colson. “But this gives them a lot more flexibility to make changes without having to coordinate with us.” If the source team wants to add a column, it simply appends it to the end of the file.
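Here’s a rough sketch of that contract (the file format is invented, not Netflix’s actual implementation): because the consumer reads columns by name from a header row, a column appended to the end of the file never breaks the existing feed.

    import csv, io

    # Sketch of the publish-and-subscribe file contract (format invented here):
    # the source team publishes a delimited file with a header row; the BI team
    # reads columns by NAME, so appending new columns never breaks the feed.
    published_v1 = "customer_id,plan\n42,premium\n"
    published_v2 = "customer_id,plan,signup_channel\n42,premium,web\n"  # column appended

    def consume(published_file):
        reader = csv.DictReader(io.StringIO(published_file))
        for row in reader:
            # Only reference the columns we subscribed to; ignore anything new.
            print(row["customer_id"], row["plan"])

    consume(published_v1)
    consume(published_v2)  # the appended column is transparently ignored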

This approach is a big mindset change from the way most data warehousing teams interface with source systems teams. The mentality is: “We will fix whatever you give us.” Colson’s technique, on the other hand, forces the source systems teams to design their databases and implement changes with downstream analysis in mind. For example, says Colson, “they will inevitably avoid adding proprietary logic and other weird stuff that would be hard to encapsulate in the file.”

Time to Deploy

Call me a BI rube, but I’ve always assumed that BI teams by default create such an insulating layer between their ETL tools and source systems. Perhaps for companies that don’t operate at the speed of Netflix, ETL tools offer enough abstraction. But, it seems to me that Colson’s solution is a simple, low-cost way to improve the adaptability and quality of data warehousing environments that everyone can and should implement.

Let me know what you think!

Posted by Wayne Eckerson on February 8, 2010


Proactive Analytics That Drive the Business

“I love the chart, but what am I supposed to do about it?” With that simple question, Ken Rudin is schooling analysts at Zynga in how to deliver information that makes a difference in the way the wildly successful gaming company creates and enhances games for customers.

“My mantra these days is ‘It’s gotta be actionable,’” says Rudin, former CEO of the early BI SaaS vendor LucidEra who now runs analytics at Zynga, creators of Farmville, Mafia Wars, and other popular applications for Facebook, iPhone, and other networks. “Just showing that revenue is down doesn’t help our product managers improve the games. But if we can show the lifecycle with which a subgroup uses the game, we can open their eyes to things they never realized before.”

It’s surprising that Rudin has to do any analytics tutoring at Zynga. Its data warehouse is a critical piece of its gaming infrastructure, providing recommendations to players based on profiles compiled daily in the data warehouse and cached to memory. With over 40 million players and 3TB of new data a day, Zynga’s 200-node, columnar data warehouse from Vertica is no analytical windup toy. If it goes down for a minute, all hell breaks loose because product managers have no visibility into game traffic and trends.

Moreover, the company applies A/B testing to every new feature before deploying and has a bevy of statisticians who continually dream up ways that product managers can enhance games to improve retention and collaboration among gaming users. “I’ve never seen a company that is so analytically driven. Sometimes I think we are an analytics company masquerading as a gaming company. Everything is run by the numbers,” says Rudin.
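For flavor, here’s a minimal sketch of the kind of A/B comparison this implies: a two-proportion z-test on retention (the counts are invented, and this is not Zynga’s actual tooling):

    from math import sqrt
    from statistics import NormalDist

    # Sketch: two-proportion z-test comparing day-7 retention of a new game
    # feature (B) against control (A). Counts are invented for illustration.
    retained_a, players_a = 4_100, 10_000   # control
    retained_b, players_b = 4_350, 10_000   # new feature

    p_a, p_b = retained_a / players_a, retained_b / players_b
    p_pool = (retained_a + retained_b) / (players_a + players_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / players_a + 1 / players_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    print(f"lift={p_b - p_a:.3f}, z={z:.2f}, p={p_value:.4f}")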

Anticipating Questions

Yet, when Rudin came to Zynga in early 2009, he discovered the analytics team was mostly in reaction mode, taking orders from product managers for custom reports. So, he split the team into two groups: 1) a reporting team that creates reports for product managers and 2) an analytics team that tests hypotheses and creates models using statistical and analytical methods. A third part of his team runs the real-time, streaming data warehousing environment.

The reporting team currently uses a homegrown SQL-based tool for creating parameterized reports. Rudin hopes to migrate them to a richer, self-service dashboard environment that delivers most of the routine information product managers need and lets them generate ad hoc views without the help of a SQL professional.

Rudin is encouraging the analytics team to be more proactive. Instead of waiting for product managers to submit requests for hypotheses to test, analysts should suggest gaming enhancements that increase a game's "stickiness" and customer satisfaction. “It’s one thing to get answers to questions and it’s another to know what questions to ask in the first place. We need to show them novel ways that they can enhance the games to increase customer retention.”

Zynga is already an analytics powerhouse, but it sees an infinite opportunity to leverage the terabytes of data it collects daily to enhance the gaming experience of its customers. “My goal for the year is to use analytics to come up with new product innovations,” says Rudin. By proactively working with the business to improve core products, the analytics team is fast becoming an ideas factory to improve Zynga’s profitability.

Editor's Note: By the way, the Zynga Analytics team is growing as fast as the company, so if you’re interested in talking to them, please contact Ken at [email protected].

Posted by Wayne Eckerson on February 2, 2010


Revolutionary BI: When Agile is Not Fast Enough

Developers of BI unite! It is time that we liberate the means of BI production from our industrial past.

Too many BI teams are shackled by outdated modes of industrial organization. In our quest for efficiency, we’ve created rigid fiefdoms of specialization that have hijacked the development process (and, frankly, sucked all the enjoyment out of it as well).

We’ve created an insidious assembly line in which business specialists document user requirements that they throw over the wall to data management specialists who create data models that they throw over the wall to data acquisition specialists who capture and transform data that they throw over the wall to reporting specialists who create reports for end users that they throw over the wall to a support team who helps users understand and troubleshoot reports. The distance from user need to fulfillment is longer than Odysseus’ journey home from Troy and just as fraught with peril.


Flattened BI Teams

Contrary to standard belief, linear development based on specialization is highly inefficient. “Coordination [between BI groups] was killing us,” says Eric Colson, director of BI at Netflix. Colson inherited an industrialized BI team set up by managers who came from a banking environment. The first thing he did on taking the job was tear down the walls and cross-train everyone on the BI staff. “Everyone can now handle the entire stack--from requirements to database to ETL to BI tools.”

Likewise, the data warehousing team at the University of Illinois found its project backlog growing bigger each year until it reorganized itself into nine small, self-governing interdisciplinary groups. By cross-training its staff and giving members the ability to switch groups every year, the data warehousing team doubled the number of projects it handles with the same staff.

The Power of One

Going one step further, Colson believes that even small teams are too slow. “What some people call agile is actually quite slow.” Colson believes that one developer trained in all facets of a BI stack can work faster and more effectively than a team. For example, it’s easier and quicker for one person to decide whether to apply a calculation in the ETL or BI layer than a small team, he says.

Furthermore, Colson doesn’t believe in requirements documents or quality assurance (QA) testing; he disbanded those groups when he took charge. He believes developers should work directly with users, which is something I posited in a recent blog titled “The Principle of Proximity.” And he thinks QA testing actually lowers quality because it relieves developers from having to understand the context of the data with which they are working.

It’s safe to say that Colson is not afraid to shake up the establishment. He admits, however, that his approach may not work everywhere: Netflix is a dynamic environment where source systems change daily so flexibility and fluidity are keys to BI success. He also reports directly to the CEO and has strong support as long as he delivers results.

Both the University of Illinois and Netflix have discovered that agility comes from a flexible organizational model and versatile individuals who have the skills and inclination to deliver complete solutions. They are BI revolutionaries who have successfully unshackled their BI organizations from the bondage of industrial era organizational models and assembly line development processes.

Posted by Wayne Eckerson on January 27, 2010