TDWI Best Practices Awards: Recognition for Innovative Data and Analytics Implementations



TDWI Best Practices Awards 2018 Winner Summaries

Category: Advanced Analytics and Data Science

HDFC Bank Ltd.

More than 10 million of HDFC Bank's customers are eligible for cash loans, giving the bank a strong incentive to identify the customers most likely to respond to cross-sell offers for such loans. This targeting reduces costs, increases the productivity of sales managers, and cuts irrelevant emails to customers.

To accomplish this objective, HDFC Bank needed to develop a predictive trigger to identify customers likely to apply for cash loans.

Traditionally, rule-based triggers are developed at the customer level and do not consider the nonlinear relationships or multidimensional data that could reveal hidden trends. HDFC Bank's innovative approach shifts the paradigm from predictive modeling at the customer level to predictive modeling at the transaction level, producing predictive rules that a particular transaction can trigger.

The development team built models in Python using advanced machine learning techniques. The trigger was developed using an eight-step framework that begins with focus group discussions and proceeds through pattern detection and text mining. Advanced feature engineering at the transaction level detected hidden patterns and behaviors. For example, many customers borrowed money from friends or relatives, then applied for a bank loan to pay back these personal loans.
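
As a rough illustration of what transaction-level modeling can involve, the Python sketch below derives per-transaction features (including a proxy for the borrow-from-relatives pattern noted above) and fits a classifier. The toy schema, feature definitions, and model choice are hypothetical, not HDFC Bank's actual pipeline.

```python
# Hypothetical sketch: transaction-level features feeding a cash-loan trigger.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Toy transaction data; in practice this would come from the bank's systems.
txns = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 2],
    "txn_type":    ["CREDIT", "DEBIT", "CREDIT", "DEBIT", "DEBIT", "CREDIT"],
    "counterparty": ["INDIVIDUAL", "MERCHANT", "INDIVIDUAL",
                     "MERCHANT", "MERCHANT", "MERCHANT"],
    "amount":      [50000, 1200, 45000, 800, 950, 2000],
    "applied_for_cash_loan": [0, 0, 1, 0, 0, 0],  # label within N days of txn
})

# Feature: inbound transfers from individuals, a proxy for borrowing
# from friends or relatives before seeking a formal loan.
txns["is_personal_credit"] = (
    (txns["txn_type"] == "CREDIT") & (txns["counterparty"] == "INDIVIDUAL")
).astype(int)

# Feature: transaction size relative to the customer's own average.
txns["amount_to_avg_ratio"] = (
    txns["amount"] / txns.groupby("customer_id")["amount"].transform("mean")
)

X = txns[["is_personal_credit", "amount_to_avg_ratio"]]
y = txns["applied_for_cash_loan"]
model = GradientBoostingClassifier(random_state=0).fit(X, y)
print(model.predict_proba(X)[:, 1])  # per-transaction trigger probability
```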

Text mining identified 11 possible reasons a customer might need a cash loan, which became part of a customized solicitation. Sharper identification of customers in need of a cash loan, combined with contextual communication, significantly improved the relevance of recommendations, nearly doubling conversion rates compared with traditional methods.
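
One plausible form of this text-mining step is topic modeling over free-text interaction notes. The sketch below applies scikit-learn's NMF to a toy corpus; the corpus, vectorizer settings, and topic count are purely illustrative, and only the idea of extracting reason themes comes from the summary above.

```python
# Hedged sketch: surfacing loan-need reasons from free-text notes via NMF.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

notes = [
    "needed money for sister's wedding expenses",
    "medical emergency, borrowed from a relative to pay the hospital",
    "home renovation before the festival season",
    "repaying a personal loan taken from a friend",
]
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(notes)

# Two topics for this toy corpus; HDFC Bank's analysis surfaced 11 reasons.
model = NMF(n_components=2, random_state=0).fit(X)
terms = tfidf.get_feature_names_out()
for i, weights in enumerate(model.components_):
    top = [terms[j] for j in weights.argsort()[-3:][::-1]]
    print(f"reason theme {i}: {top}")
```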

Additional benefits include a 60 percent reduction in customer contacts with no loss of opportunities and a two-and-a-half-fold increase in sales.

Category: BI and Analytics on a Limited Budget

Polaris

Solution Sponsor: Paxata

Polaris, whose mission includes using data-driven strategies to combat modern-day slavery, needed a new approach to deal with the large and diverse data sets researchers used to help end human trafficking, and to protect those who are pushed into this crime against their will.

Despite access to rich, aggregated data sets collected from Polaris-operated hotlines and various global partner services, maintaining confidentiality, finding patterns, normalizing fields across data sets, and addressing data quality required extensive, primarily manual effort in Excel. Combined with limited data engineering resources to code data science models, this made it extremely difficult and time-consuming for Polaris to prepare the data for analysis.

Leveraging a data preparation solution from Paxata, Polaris is now able to transform raw data into ready information instantly and across a number of data sources. Once unified, the aggregated, non-personally identifiable data is made available to counter-trafficking personnel in the field for analysis so they can create the most effective and targeted strategies possible for systemically disrupting the human-trafficking networks that rob human beings of their lives and freedom. Using Paxata, Polaris gets data ready for analytics by bringing together multistructured data from diverse sources, preparing it, and combining it with additional data to provide context.
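
For readers unfamiliar with this kind of preparation work, a rough pandas analogue of the steps described (normalizing fields across sources, stripping personally identifiable information, and combining the results) might look like the sketch below. The actual work is done in Paxata, and every field shown is hypothetical.

```python
# Rough pandas analogue of the preparation steps; all fields are hypothetical.
import pandas as pd

hotline = pd.DataFrame({
    "country": [" united states", "mexico "],
    "reported_date": ["2018-01-12", "2018-02-03"],
    "caller_name": ["REDACTED", "REDACTED"],   # PII to be dropped
    "case_type": ["labor", "sex"],
})
partner = pd.DataFrame({
    "case_country": ["United States", "Guatemala"],
    "case_dt": ["2018-01-20", "2018-03-11"],
    "case_type": ["labor", "labor"],
})

# Normalize field names and values so the sources share one schema.
partner = partner.rename(columns={"case_country": "country",
                                  "case_dt": "reported_date"})
for df in (hotline, partner):
    df["country"] = df["country"].str.strip().str.title()
    df["reported_date"] = pd.to_datetime(df["reported_date"])

# Remove personally identifiable fields before the data leaves preparation.
hotline = hotline.drop(columns=["caller_name"])

# Combine into one aggregated, de-identified data set for field analysts.
combined = pd.concat([hotline, partner], ignore_index=True)
print(combined.groupby(["country", "case_type"]).size())
```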

Polaris is quickly and easily transforming and reformatting the structured and unstructured data obtained from the National Human Trafficking Hotline, and it uses data preparation to exchange and combine data from its worldwide counterparts, identifying cross-border trends for analysis. With the self-service data prep solution, Polaris has removed its reliance on labor-intensive, error-prone Excel spreadsheets and significantly reduced the number of requests made to its one staff developer, which allows the organization to respond to threats faster.

Category: BI, Visual Analytics, and Data Discovery

Medical and Health Sciences Foundation, University of Pittsburgh and UPMC

Solution Sponsor: ADVIZOR Solutions, Inc.

Founded in 2003, the Medical and Health Sciences Foundation (MHSF) raises funds for the University of Pittsburgh's six health science schools and UPMC hospitals. The MHSF is the central source for patients, alumni, and friends to contribute to any clinical or research endeavor at UPMC and Pitt.

The Moves Management Dashboard (MMD) equips development officers to prioritize the identification, assignment, qualification, cultivation, solicitation, and stewardship of prospects. Content comes from a set of integrated algorithms; the Moves Management Algorithm (MMA) classifies prospect quality and prospect-management engagement based on prospect point scoring, prospect manager progress scoring, proposal details, and other factors.

Two interwoven algorithms also provide information: the PRM Feeder-Leads Algorithm (PFLA) and the Prospect Leads Algorithm (PLA). The PFLA finds unassigned, unrated, recent donors who meet additional criteria. Researchers conduct ratings analyses on this cohort; these values feed into the second algorithm, the PLA, which yields a dynamic prospect leads score.
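
A toy sketch of this two-stage logic appears below; the filter criteria, score weights, and columns are hypothetical stand-ins rather than MHSF's production algorithms, which live in the data warehouse.

```python
# Illustrative two-stage leads logic with hypothetical criteria and weights.
import pandas as pd

donors = pd.DataFrame({
    "donor_id": [101, 102, 103],
    "prospect_manager": [None, "J. Smith", None],
    "capacity_rating": [None, 4, None],
    "last_gift_date": pd.to_datetime(["2018-04-01", "2017-01-15", "2018-05-20"]),
    "research_rating": [7.0, 3.0, 9.0],   # supplied later by researchers
    "lifetime_giving": [15000, 250000, 4000],
})

as_of = pd.Timestamp("2018-06-01")

# PFLA stage: unassigned, unrated donors with a gift in the last 12 months.
feeder = donors[
    donors["prospect_manager"].isna()
    & donors["capacity_rating"].isna()
    & (donors["last_gift_date"] >= as_of - pd.DateOffset(months=12))
]

# PLA stage: dynamic score blending researcher ratings with giving history.
leads_score = (0.7 * feeder["research_rating"]
               + 0.3 * feeder["lifetime_giving"].rank(pct=True) * 10)
print(feeder.assign(leads_score=leads_score)
            .sort_values("leads_score", ascending=False))
```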

The Dashboard's interactive visualizations are simple, providing gift officers with straightforward ways to strategically segment their prospects. The graphs are arranged as nesting dolls of logic, ordered to show how each chart concept fits inside another. This is business intelligence, visual analytics, and data discovery for nontechnical users under pressure to raise significant sums of money.

The vision has been fully implemented. However, MHSF is integrating with the University’s Institutional Advancement department, which poses long-term questions for the project. The hope is to make the Moves Management Dashboard the gold standard for equipping gift officers in the larger, integrated enterprise. By design, the heavy lifting is done in the data warehouse, which allows the freedom to easily port this project to any number of applications.

Category: Big Data

Freddie Mac

In 2015, Freddie Mac launched the Loan Advisor Suite initiative to deliver flexible, customer-centric, componentized, and value-added technology solutions that offer customers Greater Purchase Certainty ("GPC") by driving loan manufacturing quality. Credit Analytics and Reporting participates in this process by researching third-party data sets and using that data to codify credit policies and rules and to proactively monitor loans for quality and performance.

The time to decide on vendor data and launch a product is extremely short. New product offerings launch initially as pilots and are then planned for broader rollout. This approach drives the need for access to the data at launch for customer management and enhancements. However, data warehouses and data marts are traditionally updated on a quarterly or semiannual basis, depending on business priority and funding; it typically takes six months to more than a year for downstream consumers to gain access to new data sets.

Data scientists and business users can spend as much as 80 percent of their time on "data janitor" work, a key hurdle to insight. Our objective was to lift the "data munging" burden from the user, reduce the time spent on data preparation, and at the same time give users the flexibility to enter their requirements or overrides prior to the final presentation of the data.

To solve the mundane issues of data management, we developed and deployed a hybrid, robust, high-reuse, and highly flexible data framework utilizing the Hortonworks big data and Actian Matrix DB platforms, enabling business users to generate reports for risk monitoring. Our innovative framework can be implemented in any organization looking to leverage a big data platform for large-volume data sets. Our homegrown approach gives us the flexibility to leverage new, emerging technologies and continually innovate.
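
The framework itself is proprietary, but its basic flow, landing vendor data immediately, applying user-entered overrides, and publishing reporting-ready output, can be sketched conceptually in PySpark. All tables, columns, and override logic below are hypothetical illustrations only.

```python
# Conceptual sketch of the framework's flow with hypothetical data.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("credit-risk-reporting").getOrCreate()

# 1. Land a new third-party data set as-is so users can query it at product
#    launch rather than waiting for a traditional warehouse release cycle.
vendor = spark.createDataFrame(
    [("L001", 640), ("L002", 710)], ["loan_id", "vendor_score"]
)

# 2. Apply user-entered requirements or overrides before final presentation.
overrides = spark.createDataFrame([("L002", 695)], ["loan_id", "override_score"])
presented = (
    vendor.join(overrides, "loan_id", "left")
          .withColumn("risk_score", F.coalesce("override_score", "vendor_score"))
)

# 3. Publish a reporting-ready view for risk monitoring.
presented.createOrReplaceTempView("loan_risk_monitor")
spark.sql("SELECT loan_id, risk_score FROM loan_risk_monitor").show()
```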

Category: Data Management Strategies

Department of the Navy – Manpower, Personnel, Training, and Education

The U.S. Department of the Navy's Bureau of Naval Personnel is the human resources division for all Navy enlisted and officer personnel serving throughout the world. The division manages and maintains personnel and pay records for each active duty, reserve, or retired sailor for up to 62 years. This task requires an extensive data management program that emphasizes and increases the value of data to the organization by enforcing standardized processes and controls, culminating in a single, secure data source that effectively supports the needs of the Navy.

The Bureau of Naval Personnel Enterprise Information Management (EIM) team successfully matured its data management program during FY17, playing a pivotal role in the Navy Information Technology Transformation. Through data management and governance, the team and the EIM Board have provided the standardized business terms, data interfaces, data architecture artifacts, and guidance needed for the Navy Personnel & Pay modernization, MyNavy Portal, and Authoritative Data Environment transformation projects.

The EIM team standardized 1,859 business terms and created relationships to 8,182 approved data element dictionary terms, assisting with data needed for all the MPT&E modernization projects. The team processed 190 compliance reviews for data transfers, successfully ensuring the availability, integrity, security, and confidentiality of naval personnel data—a 44 percent increase in received packages over FY16. Responding to customer requests for the obfuscation and declassification of real data, the team developed and instituted standards for test data management governance across MPT&E.

EIM developed and maintains a database that allows the user to browse data exchanges, conduct searches, map data elements to the MPT&E Enterprise Logical Data Model, import and export DoD Architecture Framework (DoDAF) artifacts, trace the flow of a single data element as it moves between multiple systems, and run reports.
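
As a simplified illustration of the element-tracing capability, the Python sketch below walks hypothetical interface records with a breadth-first search. The interface records are invented for this example; the real EIM tool is a full database with DoDAF import/export, searching, mapping, and reporting.

```python
# Simplified trace of one data element across systems (hypothetical records).
from collections import deque

# Each record: (source_system, target_system, data_element)
interfaces = [
    ("Personnel System", "MyNavy Portal", "paygrade"),
    ("MyNavy Portal", "Authoritative Data Environment", "paygrade"),
    ("Personnel System", "Authoritative Data Environment", "duty_status"),
]

def trace_element(element, start):
    """Breadth-first walk of the interfaces that carry a single element."""
    edges = [(s, t) for s, t, e in interfaces if e == element]
    hops, seen, queue = [], {start}, deque([start])
    while queue:
        system = queue.popleft()
        for src, tgt in edges:
            if src == system and tgt not in seen:
                hops.append((src, tgt))
                seen.add(tgt)
                queue.append(tgt)
    return hops

print(trace_element("paygrade", "Personnel System"))
# [('Personnel System', 'MyNavy Portal'),
#  ('MyNavy Portal', 'Authoritative Data Environment')]
```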

Category: Emerging Technologies and Methods

Verizon

Solution Sponsor: Kyvos Insights

Verizon's marketing team wanted to analyze data from its 6 million customers across media, viewers, geographies, and diagnostics, and to use the insights to power marketing decisions and improve the customer experience.

Because the data was extremely large and complex, the IT team had trouble analyzing it in real time; poor query performance delayed analysis. An ad hoc query would often take hours or days to complete, which made it difficult for the media team to respond to network, content, customer, and company-related inquiries. Analysis could only be run periodically, and trend analysis or machine learning became almost impossible because queries involving month-over-month or year-over-year comparisons would take several days to execute.

To solve these problems, Verizon built a big data streaming analytics platform that could provide quick insights into its viewer data. Verizon adopted a hybrid architecture using technologies from several partners. A huge legacy data warehouse handles structured data at rest; a Hadoop implementation works with both structured and unstructured data in motion. The two systems coexist and complement each other. Verizon also built a BI consumption layer using "OLAP on Hadoop" technology that pre-aggregates data into cubes across multiple dimensions, providing near-instant responses to big data queries. Finally, the topmost layer consists of self-service BI tools that allow users to connect with big data interactively and analyze it visually, independent of changes to the underlying data.
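
To make the "OLAP on Hadoop" idea concrete: pre-aggregation builds rollups for every combination of chosen dimensions so interactive queries never scan the raw fact rows. Kyvos is a commercial engine; the Spark sketch below, with hypothetical columns and toy data, illustrates only the pre-aggregation concept.

```python
# Illustrating cube pre-aggregation with Spark's cube() operator.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("viewer-cube").getOrCreate()
facts = spark.createDataFrame(
    [("east", "drama", "stb", 123), ("east", "sports", "app", 456),
     ("west", "drama", "app", 789)],
    ["region", "genre", "device", "customer_id"],
)

# cube() materializes aggregates for every combination of the listed
# dimensions, including subtotals and the grand total.
viewer_cube = (
    facts.cube("region", "genre", "device")
         .agg(F.count("*").alias("events"),
              F.countDistinct("customer_id").alias("viewers"))
)
viewer_cube.show()  # BI queries would read this small rollup, not raw facts
```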

This innovative self-service analytics architecture has helped Verizon gain deeper customer behavior insights, allowing it to create better programming for its customers. Verizon's media team now has interactive access to 14 months of granular data at over 400 million fact table rows per day, amounting to 168 billion rows.

Category: Enterprise Data Warehouses

Cerner Corporation

Solution Sponsor: Vertica

Summary detail coming soon.