Managing System Change
    
BI environments are like personal computers: after a year or two, performance starts to degrade and you are never quite sure why. The best explanation is that these systems start accumulating a lot of “gunk” that is hard to identify and difficult to eliminate.
Personal computers, for example, become infected with viruses, spyware, and other malware that wreak havoc on performance. But we cause many problems ourselves by installing lots of poorly designed software, adding too many memory-resident programs, accidentally deleting key system files, changing configuration settings, and failing to perform routine maintenance. And when the system finally freezes up, we execute unscheduled (i.e., three-finger) shutdowns, which usually compound performance issues. Many of us quickly get to the point where it’s easier and cheaper to replace our personal computers than to try to fix them.
Unfortunately, BI environments are much harder and more expensive to return to a pristine state. Over time, many queries become suboptimal because of changes we make to logical models, physical schemas, or indexes, or because we create incompatibilities when we upgrade or replace drivers and other software. Each time we touch any part of the BI environment, we create a ripple effect of problems that makes IT averse to making any changes at all, even to fix known problems! One data architect recently confessed to me, “I’ve been trying for 10 years to get permission to get rid of one table in our data warehousing schema that is adversely affecting performance, but I haven’t succeeded.”
But when IT is slow to make changes and maintenance efforts begin to dwarf development initiatives, the business revolts and refuses to work with IT or fund its projects.
The architect above said the solution is “better regression testing.” The idea is that if we perform continuous regression testing, IT will be less hesitant to change things because it will see quickly whether the impact of a change is deleterious. However, this is like using a hammer and chisel to chop down a tree: it will work, but it’s not very efficient.
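To make the regression-testing idea concrete, here is a minimal sketch in Python of what such a check might look like: run a query, compare its results to a previously saved baseline, and flag any difference. The connection, query, and baseline file are hypothetical, not drawn from any particular tool.

    import json
    import sqlite3

    def snapshot(conn, query):
        # Run the query and return its rows in a stable, comparable form.
        return [list(row) for row in conn.execute(query).fetchall()]

    def check_regression(conn, query, baseline_path):
        # Compare current results against a baseline saved before the change.
        current = snapshot(conn, query)
        with open(baseline_path) as f:
            baseline = json.load(f)
        return "match" if current == baseline else "regression: results differ from baseline"

    # Hypothetical usage against a SQLite-based warehouse:
    # conn = sqlite3.connect("warehouse.db")
    # print(check_regression(conn,
    #     "SELECT region, SUM(amount) FROM fact_sales GROUP BY region ORDER BY region",
    #     "baseline_sales_by_region.json"))

Run continuously against a library of known-good queries, even a simple check like this tells IT quickly whether a schema or driver change broke something downstream.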
The better approach is to implement end-to-end metadata so you can see what impact any change in one part of the BI environment will have on every other part. Of course, a metadata management system has been an elusive goal for many years. But we are starting to see new classes of tools emerge that begin to support impact analysis and data lineage. ETL vendors, such as Informatica and IBM, have long offered metadata management tools for the parts of the BI environment they touch. And a new class of tools that I call data warehouse automation tools, which automatically generate star schemas and semantic layers for reporting, also provides a glimmer of hope for easier change management and reporting. These tools include Kalido, BI Ready, Wherescape, and Composite Software with its new BI Accelerator product. You’ll hear more about these tools from me in future blogs.
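To show what impact analysis over end-to-end metadata amounts to, here is a minimal sketch in Python: a lineage graph mapping each object to the objects built from it, traversed to find everything a proposed change would touch. The object names and dependencies are invented for illustration.

    from collections import deque

    # Hypothetical lineage metadata: each object maps to its downstream dependents.
    lineage = {
        "stg_orders": ["dw_fact_orders"],
        "dw_fact_orders": ["mart_sales_summary", "semantic_orders_view"],
        "semantic_orders_view": ["rpt_monthly_sales"],
    }

    def downstream_impact(graph, changed_object):
        # Breadth-first traversal: collect every object that depends,
        # directly or indirectly, on the one being changed.
        impacted, queue = set(), deque([changed_object])
        while queue:
            node = queue.popleft()
            for dependent in graph.get(node, []):
                if dependent not in impacted:
                    impacted.add(dependent)
                    queue.append(dependent)
        return impacted

    print(sorted(downstream_impact(lineage, "dw_fact_orders")))
    # ['mart_sales_summary', 'rpt_monthly_sales', 'semantic_orders_view']

The hard part, of course, is capturing the lineage metadata itself; that is what the tools above promise to automate.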
 
	Posted by Wayne Eckerson