Can Automation Accelerate Machine Learning Programs?
Auto ML is a powerful concept for the next generation of AI tools. It's part of a general movement to extend AI-based automation to data science.
- By Brian J. Dooley
- February 9, 2018
Just within the past several years, the possibilities created by machine learning and deep learning have exploded across many industries. Unfortunately, machine learning is difficult and tedious, and there aren't enough qualified practitioners. Although many companies are envisioning a future of ubiquitous AI, a lack of data scientists experienced with machine learning will prevent them from making that vision a reality.
However, as with many aspects of programming in the past, new levels of abstraction and automation are now available. They make experienced practitioners more efficient and allow a wider range of people who are not AI specialists to produce machine learning solutions.
The emerging technology that enables this is Auto ML. It has been developing for several years, but offerings are now becoming sophisticated enough to be applied to practical problems. Auto ML is designed to automate machine learning processes and open the gates for wider use.
How Much to Automate?
It seems reasonable that machine learning's multiple repetitive tasks and iterations should be automated, but the degree to which this is possible is not yet clear. Some machine learning tasks are suited to an automated approach, but complex features or algorithms and large, complicated data sets can make automation difficult.
Current Auto ML programs mainly handle the highly repetitive tasks required to create machine learning models, chiefly selecting appropriate algorithms, tuning hyperparameters, extracting features, and iterating on models. Hyperparameter tuning is particularly significant for deep neural networks.
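To make two of those tasks concrete, here is a minimal sketch of algorithm selection and hyperparameter tuning in plain Python. It is an illustration under stated assumptions, not how any particular Auto ML product works: the toy data set, the two candidate algorithms (k-nearest neighbors and nearest centroid), the candidate values of `k`, and every function name are hypothetical choices made for this example. The search simply scores each candidate by cross-validation and keeps the best one.

```python
import random
from collections import Counter

def make_blobs(n_per_class, seed=0):
    """Toy two-class data set: two Gaussian clusters in 2-D (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(data)
    return data

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(train, point, k):
    """Majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda row: sq_dist(row[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def centroid_predict(train, point):
    """Assign the class whose mean point is closest."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    centroids = {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], point))

def cross_val_accuracy(predict, data, folds=5):
    """Mean accuracy over `folds` train/validation splits."""
    scores = []
    for i in range(folds):
        valid = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        hits = sum(predict(train, p) == label for p, label in valid)
        scores.append(hits / len(valid))
    return sum(scores) / folds

def auto_select(data):
    """Score every (algorithm, hyperparameter) candidate and keep the best."""
    candidates = [("knn", {"k": k}) for k in (1, 3, 5, 7)] + [("centroid", {})]
    results = []
    for name, params in candidates:
        if name == "knn":
            # Bind k as a default argument so each candidate keeps its own value.
            predict = lambda tr, p, k=params["k"]: knn_predict(tr, p, k)
        else:
            predict = centroid_predict
        results.append((cross_val_accuracy(predict, data), name, params))
    return max(results, key=lambda r: r[0]), results

data = make_blobs(n_per_class=40)
best, all_results = auto_select(data)
print("best (accuracy, algorithm, hyperparameters):", best)
```

Real tools search far larger spaces with smarter strategies than this exhaustive loop, but the human role is the same as in the sketch: someone still has to choose the objective (here, cross-validated accuracy) and check that the winning model makes sense.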
Numerous programs available today provide assistance in these areas. Auto ML routines have already been validated in small studies, offering demonstrable benefits over manual human performance in some cases. However, these programs cannot yet be relied upon to achieve a desired result entirely without human intervention. At the very least, data scientists must set objectives and verify results.
Transparency is also of the utmost importance, particularly in fields such as healthcare and finance, which have strong regulatory requirements. Regulation abhors a black box; a company using a machine learning process must be able to demonstrate clearly that the process complies with regulations and that its decisions cannot violate them in ways that go unnoticed.
Today's Auto ML Market
Current Auto ML programs include a range of proprietary and open source initiatives. Due to the skills shortage, this area is now viewed with increasing urgency. Major analytics firms, including Google, Microsoft, Salesforce, and Facebook, are rapidly developing Auto ML capabilities. Available offerings include Google AutoML, DataRobot, Auto-WEKA, TPOT, and Auto-sklearn. The strong market for a proficient Auto ML program is demonstrated by the success of the start-up DataRobot, which recently raised $54 million from venture capitalists, bringing its total funding to $111 million.
Many threads are being followed within the academic community in researching Auto ML capabilities and extending these capabilities across all machine learning. Already, Auto ML has been demonstrated for both supervised and unsupervised variants of machine learning.
Interest in Auto ML extends across the analytics community, vendors involved in AI, researchers, and governments. In June 2016, DARPA established a program called Data-Driven Discovery of Models (D3M). It is aimed at developing techniques to automate machine learning model building, from data ingestion through model evaluation.
Competition is likely to increase in this space, and the rewards for solving some of the problems will be high.
The Next Generation of AI Tools
Auto ML is a powerful concept for the next generation of AI tools, and it is part of a general movement to extend AI-based automation to programming and data science. The trend will continue toward inclusion of some degree of Auto ML in vendor analytics toolsets along with other automated data science tools. Mirroring AI usage in other sectors, some of these tools will be invisible; automation will simply be taken for granted.
In some ways, the creation of an automated layer of processing is like the development of fourth-generation programming languages (4GLs) in the 1980s. These introduced an additional level of abstraction, making it easier to develop programs to perform routine tasks.
As with other abstractions, use of Auto ML will broaden the types of users able to employ the technology as well as ensure that specialists (data scientists, in this case) can function more efficiently. With the complexities hidden and the repetition eased, it will be possible to move to another layer of complexity in the use of machine learning and produce more powerful results.
Brian J. Dooley is an author, analyst, and journalist with more than 30 years' experience in analyzing and writing about trends in IT. He has written six books, numerous user manuals, hundreds of reports, and more than 1,000 magazine features.