TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Three Models Leading the Neural Network Revolution

In recent years, we have seen great advances in machine learning and artificial intelligence that could usher in a new era of progress. In the area of natural language processing, three algorithms have been the cornerstone of this innovation: GPT, BERT, and T5.

By Troy Hiltbrand
February 13, 2023

In the past couple of years, there have been some revolutionary advances in machine learning (ML) and artificial intelligence (AI). These advances are demonstrating that ML and AI are moving from science fiction to science fact and that they have the capacity for transformational change across many industries. From DALL-E and Lensa demonstrating how machines can create art to ChatGPT demonstrating that machines can write articles, poetry, song lyrics, and even programming code, this domain is on the cusp of huge advances.

Underlying these amazing demonstrations of the business value of ML and AI is a set of technologies that fall into the family of neural networks called transformers. As an analytics leader, you don’t necessarily have to understand all the technical details associated with how these are programmed and the inner workings of their code, but it is important to understand what they are and what makes them unique.

In 2017, a group of researchers at Google and the University of Toronto developed a new type of neural network architecture: the transformer. Originally, the goal of this team was to enable machine translation, but their findings have gone beyond just translation and have revolutionized multiple arenas in the ML world. Unlike the recurrent neural nets (RNNs) of the past, which processed data sequentially, one element at a time, transformers allow the work to be distributed and parallelized. This means they can be trained on huge amounts of data and can scale to very large models.

What Makes the Transformer Special

There are three concepts that enable these transformers to succeed where the RNN didn’t: positional encoding, attention, and self-attention.

Positional encoding removes the need to process one word of a sentence at a time. Each word is encoded with both its content and its position in the sentence, so word order is carried in the data itself rather than in the order of processing. This allows the model to be built in a distributed fashion across multiple processors and leverage mass parallelization.
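To make the idea concrete, here is a minimal sketch of one way to attach position information to word vectors. It assumes the sinusoidal scheme from the original transformer research; the article does not say which scheme a given model uses, and the function names and the tiny four-dimensional word vectors are purely illustrative.

```python
import math


def positional_encoding(position, d_model):
    """Return a sinusoidal position vector of length d_model (assumed scheme)."""
    encoding = []
    for i in range(d_model):
        # Each pair of dimensions (2k, 2k+1) shares one frequency.
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


def encode_sentence(word_vectors):
    """Add position information to every word vector independently."""
    d_model = len(word_vectors[0])
    return [
        [w + p for w, p in zip(vector, positional_encoding(pos, d_model))]
        for pos, vector in enumerate(word_vectors)
    ]


# Hypothetical embeddings: the same word appears at positions 0 and 1.
sentence = [[0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4], [0.5, 0.5, 0.5, 0.5]]
encoded = encode_sentence(sentence)

# The repeated word now has two different vectors, one per position.
print(encoded[0] != encoded[1])
```

Because each position is encoded without reference to its neighbors, the words of a sentence can be prepared and processed in parallel rather than one after another.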

Attention is a very important concept for machine translation. When translating language, it is not enough to just translate the words. The process needs to see patterns of word placement in the input and output sentences in the training content and mirror those patterns when performing machine translation on new phrases. This ability to leverage these patterns is at the core of attention. In addition to matching word positions between sentences, this pattern-matching concept applies to word gender determination, plurality, and other rules of grammar associated with translation.

Self-attention is the mechanism in a neural network where features are identified from within the data itself, by relating each word in a sentence to the other words in that same sentence. In computer vision problems and convolutional neural nets (CNNs), the neural network can identify features such as object edges and shapes from within unlabeled data and use these in the model. In natural language processing (NLP), self-attention finds similar patterns in the unlabeled data that represent parts of speech, grammar rules, homonyms, synonyms, and antonyms. These features extracted from within the data are then used to better train the neural network for future processing.
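The attention idea described above can be sketched in a few lines. This is a simplified illustration, assuming scaled dot-product scoring and omitting the learned query, key, and value projections and the multiple attention heads that production transformers use; the function names and the toy two-dimensional vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Relate every word vector to every other word vector in the sentence."""
    d = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against each word in the same sentence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted blend of all word vectors.
        blended = [
            sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)
        ]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights


# Toy sentence: the first and third words have identical vectors.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(vectors)

# The first word attends more strongly to the matching third word
# than to the dissimilar second word.
print(weights[0][2] > weights[0][1])
```

In the translation setting, the same scoring is applied between words of the output sentence and words of the input sentence instead of within a single sentence.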

With these concepts, multiple groups have built large language models that leverage these transformers to do some incredible machine learning tasks related to NLP.

The Top Three Transformer Models

GPT stands for Generative Pre-trained Transformer. GPT-3 is the third generation of this transformer model and is the one gaining momentum today with an anticipated GPT-4 on the near-term horizon. GPT-3 was developed by OpenAI, which started from roughly 45TB of raw text, much of it crawled from the public web, and filtered it down to a much smaller, higher-quality training set.

GPT-3 is a neural network with 175 billion machine learning parameters that allow it to effectively perform natural language processing and natural language generation (NLG). The results of the GPT model are very human-like in word usage, sentence structure, and grammar. This model family is the foundation of ChatGPT, released by OpenAI to demonstrate how it can solve real-world problems.

BERT stands for Bidirectional Encoder Representations from Transformers. In this neural net, every output element is connected to every input element. This enables the bidirectional nature of the language model. In past language models, the text was processed sequentially either left-to-right or right-to-left, but only in a single direction. The BERT framework was pre-trained by Google using unlabeled text from English Wikipedia and the BooksCorpus data set, and it can be fine-tuned with task-specific data such as question-and-answer data sets.
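The difference between single-direction and bidirectional processing can be pictured as a visibility rule over the words of a sentence. The sketch below is only an illustration of that rule, not BERT's actual implementation; the helper names and the example sentence are hypothetical.

```python
def attention_mask(seq_len, bidirectional):
    """mask[i][j] is True when position i is allowed to see position j."""
    return [
        [bidirectional or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_context(tokens, mask, index):
    """List the words that the word at `index` can draw on."""
    return [tok for tok, allowed in zip(tokens, mask[index]) if allowed]


tokens = ["the", "bank", "of", "the", "river"]
left_to_right = attention_mask(len(tokens), bidirectional=False)
both_directions = attention_mask(len(tokens), bidirectional=True)

# A left-to-right model interpreting "bank" sees only what came before it.
print(visible_context(tokens, left_to_right, 1))
# A bidirectional model also sees "river", which disambiguates "bank".
print(visible_context(tokens, both_directions, 1))
```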

The BERT model aims to understand the context and meaning of words within a sentence. BERT can be leveraged for tasks such as semantic role labeling of words, sentence classification, or word disambiguation based on the sentence context. BERT can support interaction in over 70 languages. Google leverages BERT as a core component of many of its products, including developer-facing services in the Google Cloud Platform.

T5 stands for Text-to-Text Transfer Transformer. T5 was developed by Google in 2019. Researchers were looking for an NLP model that would leverage transfer learning and have the features of a transformer, which is why it is called a transfer transformer. This model is different from the BERT model in that it uses both an encoder and a decoder, so its inputs and outputs are both text strings. This is where the text-to-text portion of the name comes from.
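One way to picture the text-to-text framing is that every task, whatever its type, is expressed as an input string paired with a target string. The sketch below assumes the task-prefix style used in the T5 research ("translate English to German:", "summarize:"); the helper function is hypothetical, and the placeholder strings stand in for real documents.

```python
def to_text_to_text(task_prefix, text):
    """Express any task as a plain input string by prepending a task prefix."""
    return f"{task_prefix}: {text}"


# Each example is an (input text, target text) pair, regardless of task.
examples = [
    (to_text_to_text("translate English to German", "That is good."),
     "Das ist gut."),
    (to_text_to_text("summarize", "<long article text>"),
     "<short summary>"),
]

for model_input, target in examples:
    print(model_input, "->", target)
```

Because inputs and outputs are always strings, the same model and training procedure can be reused across tasks, which is what makes the fine-tuning step straightforward.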

The model was trained, leveraging both unsupervised and supervised methods, on a large portion of the Common Crawl data set. T5 was designed to be transferable to other use cases by using its model as a base and then transferring it and fine-tuning it to solve domain-specific tasks.

Common Use Cases

Because of the transformer revolution we are experiencing, many NLP problems and use cases are being solved using these new and improved methods. This makes it possible for businesses to more effectively perform tasks that require text summarization, question answering, automatic text classification, text comparison, text and sentence prediction, natural language querying (including voice search), and message blocking based on policy violations (e.g., offensive or vulgar material, profanity).

As companies experience the power of these new models, many additional use cases will be identified, and businesses will find ways to derive value from integrating them into their existing and new products. We will see more products arrive on the market with intelligent features leveraging these three models.

Looking Forward

At this stage, many of these algorithms are still in the demonstration and experimentation phase, but companies such as Microsoft and Google are actively looking at ways to incorporate them into other products to make them better, smarter, and more capable of interacting in an intelligent manner with users. The AI revolution that is upon us will possibly define the coming decade much in the same way that the introduction of the internet defined the 1990s and 2000s, so it is important to understand what these algorithms are and start to identify where on your strategic road map they should be planned.

About the Author

Troy Hiltbrand is the senior vice president of digital product management and analytics at Partner.co where he is responsible for its enterprise analytics and digital product strategy. You can reach the author via email.
