DevOps and BI: Software Development in Transition
Software development is in the midst of a transitional period, thanks to the emergence of use-oriented models like DevOps, which are revolutionizing application development, delivery, maintenance, and enhancement.
- By Steve Swoyer
- February 24, 2016
Software development is in the midst of a transitional period thanks to the emergence of new use-oriented models such as DevOps -- an agile fusion of development and operations (hence the portmanteau) -- that are revolutionizing application development, delivery, maintenance, and enhancement. This is most true of the conventional app-dev life cycle -- i.e., procedural coding (see Note 1 at end of article).
DevOps programming and automation tools target conventional software development and are not designed to address the unique requirements of business intelligence (BI) and analytic development. Put simply, the DevOps paradigm can't easily be translated to BI development.
BI isn't completely disenfranchised from this movement, however. Some vendors, such as Teradata Corp., have started using DevOps-like concepts in their own product development efforts.
Elsewhere, programs such as data warehouse automation (DWA) place a DevOps-like emphasis on continuous development, rapid deployment and operationalization, and ongoing maintenance.
Last September, Teradata announced a new Python module it says permits programmers to more easily build apps that access data in the Teradata data warehouse. Teradata claims this makes it easier for programmers to invoke Python libraries and to run them against Teradata, too. The idea is that coders can exploit Python's vast trove of data manipulation, machine learning, and advanced analytical libraries, among others. At the same time, Teradata's Python module falls far short of a full-on embrace of DevOps. This isn't the result of some kind of implementation failure, however; DevOps just isn't an easy -- i.e., automatic -- fit for BI. Any effort to bring the two worlds together (or, more positively, to develop a DevOps-like program for BI and analytics) is going to take some time.
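To make the general idea concrete, here is a minimal sketch of the pattern described above: pull data out of a database with ordinary SQL, then hand it to a Python library for analysis. It does not use or depict Teradata's module. As an assumption for illustration, Python's built-in `sqlite3` stands in for the warehouse connection, and the `orders` table, its columns, and the helper names are all hypothetical.

```python
import sqlite3
import statistics

# Assumption: an in-memory SQLite database stands in for the warehouse.
# This is NOT Teradata's module or its API; it only illustrates the pattern.


def load_order_totals(conn):
    """Fetch one numeric column with a plain SQL query."""
    rows = conn.execute("SELECT total FROM orders ORDER BY id").fetchall()
    return [row[0] for row in rows]


def summarize(values):
    """Run a standard-library analysis over the fetched values."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical table and data, created here so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders (total) VALUES (?)",
    [(10.0,), (20.0,), (30.0,)],
)

summary = summarize(load_order_totals(conn))
print(summary)
conn.close()
```

In practice the standard-library `statistics` call would likely be swapped for a richer data manipulation or machine learning library; the shape of the code (query, fetch, analyze) stays the same.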
Teradata's Alan Greenspan, product marketing manager for its Teradata Database, concedes that Teradata's inaugural DevOps play has an "aspirational" aspect to it, but stresses that his company is dead serious about following through. "I don't have a formal, clear roadmap for where that module is going. I know there's a couple of things being discussed. There's going to be some get-some-feedback, see how people are using it, see what should be added, see what people ask for in comments and such, and do it," said Greenspan at Teradata's October Partners conference.
The material point, he says, is that doing DevOps in the context of data warehousing is an extraordinarily complex proposition. "Continuous deployment of applications in a data-driven world means continuous change and deployment of the workload in the data warehouse," he indicates.
"You can't just do a backup of your data and then go back to your data. That works for an app that's monolithic. Fine, roll your data back to a backup. In the data warehouse, if you take the data warehouse back to a backup that you made a week before or more, then what happens to all of the thousands of other apps that have made legitimate changes to that [in the interim]? In a world where data is a shared resource, there are many more interactions. It's much more complicated."
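The rollback problem Greenspan describes can be illustrated with a toy model. The sketch below is not how Teradata or any other product handles it; the `SharedWarehouse` class, its methods, and the app names are hypothetical, invented purely for illustration. It contrasts restoring a full backup, which also wipes out other apps' legitimate changes, with reversing only the changes made by the one app being rolled back.

```python
import copy


class SharedWarehouse:
    """Toy in-memory stand-in for a warehouse that many apps write to."""

    def __init__(self):
        self.tables = {}
        # Each entry: (app, table, key, existed_before, old_value)
        self.change_log = []

    def write(self, app, table, key, value):
        rows = self.tables.setdefault(table, {})
        self.change_log.append((app, table, key, key in rows, rows.get(key)))
        rows[key] = value

    def snapshot(self):
        """Take a full backup of every table."""
        return copy.deepcopy(self.tables)

    def restore(self, snap):
        """Roll the whole warehouse back to a backup."""
        self.tables = copy.deepcopy(snap)

    def undo_app(self, app):
        """Reverse only one app's writes, newest first.

        Simplification: assumes no other app has since overwritten
        the same keys; a real system would have to detect that.
        """
        for who, table, key, existed, old in reversed(self.change_log):
            if who != app:
                continue
            if existed:
                self.tables[table][key] = old
            else:
                self.tables[table].pop(key, None)
        self.change_log = [e for e in self.change_log if e[0] != app]


wh = SharedWarehouse()
wh.write("billing", "customers", 1, "Acme")
backup = wh.snapshot()

# After the backup, a faulty new app and a healthy app both write.
wh.write("new_app", "customers", 2, "bad row")
wh.write("crm", "customers", 3, "Globex")

# Option A: restore the full backup. The CRM app's row is lost too.
naive = SharedWarehouse()
naive.tables = wh.snapshot()
naive.restore(backup)
print(naive.tables["customers"])

# Option B: reverse only the faulty app's changes. The CRM row survives.
wh.undo_app("new_app")
print(wh.tables["customers"])
```

Even this selective approach glosses over the hard part: once apps read and build on one another's data, untangling a single app's changes is no longer a simple matter of replaying a log backward.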
Data warehouse development has two different aspects, each with its own data life cycle. The first of these comprises the proximal (and ongoing) development project that commences with initial construction. This construction "project" actually consists of a slew of ongoing sub-projects, each one of which adds one or more subject areas to the warehouse or integrates them into it. The second aspect is that of the production data environment, with its ongoing operations and its tally of change requests, large and small. Conventional data warehouse development gives priority to the first aspect at the expense of the second. A DevOps-like take on the data warehouse SDLC would change this.
Data warehouse automation gets at something like DevOps. It isn't in any sense the same thing as DevOps: we are, after all, talking about two very different domains, one of which (app dev) gives priority to code, while the other (BI and analytic dev) gives priority to data. Even so, DWA, like DevOps, proposes to radically transform the application life cycle by accelerating application delivery, simplifying (by automating) operations and maintenance, and promoting the use of agile, resilient development practices that are better able to accommodate change and disruption.
What's more, DWA, like DevOps, gives priority to use: i.e., to developing reports, dashboards, analytics, or other artifacts for production; to rapidly deploying them in production; to refining, improving, and optimizing them once they're in production; and to automating, to the greatest extent possible, their operation and maintenance in production environments.
The DevOps paradigm understands that ongoing development -- maintenance, optimization, refactoring, and, yes, retirement and migration -- is inescapable. In the same way, DWA proponents like to say that systems must be flexible enough in their design to accommodate radically changing conditions. Increasingly, the data warehousing industry seems to be coming around to this idea, too. Look no further than Teradata's Data Lab analytic sandbox environment: Teradata doesn't just market it as a sandbox, but as a proving ground for developing prototypes and expediting them into production. Here and elsewhere, the idea is that ongoing analytic development is a core component of day-to-day data warehouse operations. Ongoing development is core because change itself is core. Change never ceases, so analytic development never ceases. The upshot is that you have to erect the equivalent of permanent scaffolding around your warehouse or big data system.
In the DevOps paradigm, there isn't anything wrong with this. The same kind of thing -- i.e., a similar, more or less permanent app-dev scaffolding -- has been a staple of data warehouse operations since the first data warehouse systems were built. Yet we've tended to treat it as ephemeral: as something that must and should go away.
It's time we stopped being in denial.
Note 1: The traditional software development process strictly isolates development or coding from operations, which is responsible for deployment and maintenance. The effect of this is to promote domain-specific segregation -- the equivalent of a kind of Apartheid arrangement -- between development and operations. This can lead to a scenario in which a team of developers effectively "throws" its work over the proverbial wall to operations: app dev does its thing without giving much thought to how the apps or services it's building will be deployed and maintained.
Among other problems, this gap divides responsibilities between teams and introduces fragility in processes that span environments: a change on one side, implemented without reference or recourse to the other, results in a breakdown of the process as a whole. Because developers are insulated from the whole, they don't grasp the effects of their actions, nor do they have any incentive to. Consequently, they build hard-to-maintain processes and systems that are unreliable and difficult to support. "Deployment? Support? Maintenance? What, us worry? They're operational problems!"
DevOps changes this by combining development and operations in a continuous delivery model.
Stephen Swoyer is a technology writer with 20 years of experience. His writing has focused on business intelligence, data warehousing, and analytics for almost 15 years. Swoyer has an abiding interest in tech, but he’s particularly intrigued by the thorny people and process problems technology vendors never, ever want to talk about. You can contact him at firstname.lastname@example.org.