Distinguishing Generative AI from Traditional AI Machine Learning Models

Artificial Intelligence & MLOps

Learn from Caylent's GenAI experts to understand the difference between Generative AI and Traditional (Analytical) AI to make more informed decisions and leverage these technologies effectively for your business.

This topic has come up often in my various customer conversations. It is tempting to answer it with: Generative AI is ChatGPT, but this wouldn't be correct.

In popular culture, Generative AI (GenAI) refers to AI models we interact with using natural language for various purposes, such as generating content (e.g., text, images, audio, and video), answering questions, and summarizing text.

GenAI became popular due to advancements in AI’s conversational capabilities, including its ability to retain knowledge and conversation history and to answer questions while keeping this knowledge, history, and any other relevant context the user introduces in mind. Consequently, GenAI acquired somewhat of a reputation as being exclusively an intelligent conversational solution that can do it all. Furthering this reputation are GenAI’s increasingly multimodal and multi-task advancements, where a single model can handle various data types and perform a wide range of tasks along with its conversational abilities, such as translating, generating content (e.g., text, images, audio, and video) from text instructions, transcribing audio, describing content, recreating content based on stylistic characteristics of other content, and completing missing parts of content.

It would seem that GenAI is the epitome of AI, nullifying the need for all other types of AI. Though we may eventually get there, GenAI, as it is today, is not the golden hammer for all your AI problems. A clearer understanding of these models and distinguishing their capabilities from those of traditional AI models is necessary to understand when to apply one over the other.

In a way, addressing this topic feels like addressing my 9-year-old son's question about whether tomatoes are vegetables or fruits or whether a platypus is a mammal, a bird, or a fish. Like my son's questions, the distinction between GenAI and traditional AI is challenging to explain without delving into excruciatingly long technical details.

The distinction from a technical perspective is somewhat simple for AI practitioners: 

  • GenAI refers to models built using specialized architectures, such as Generative Adversarial Networks, Recurrent Neural Networks, Transformers, and Autoencoders to generate numbers (yes, numbers), text, images, videos, audio, etc. 
  • Those that don't are not Generative models. 

But then, some of these architectures can be used in tasks primarily associated with traditional AI, such as forecasting and recommender engines. Would this make them non-generative? 

The distinguishing characteristics of these two are not entirely black and white, especially when they do not involve technical details. 

I will refer to traditional AI algorithms and models as Analytical AI. I will also use the term algorithm to refer to the code that executes a learning process against some training data to discover and encode patterns into a data structure referred to as a model. So, I will use the term model to refer to the product of the training process where the “lessons learned” are stored and used for inferencing (i.e., predicting, forecasting, etc.). A good analogy for the relationship between algorithms and models is the relationship between recipes and dishes.
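
To make the algorithm/model distinction concrete, here is a minimal sketch using least-squares line fitting. The function names `train_linear` and `predict` and the tiny dataset are assumptions chosen for illustration, not part of any particular library.

```python
from statistics import mean

def train_linear(xs, ys):
    """The 'algorithm': ordinary least squares for a single input feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    # The 'model': the learned parameters, stored as plain data.
    return {"slope": slope, "intercept": intercept}

def predict(model, x):
    """Inference: apply the stored parameters to a new input."""
    return model["slope"] * x + model["intercept"]

# Hypothetical training data that follows y = 2x + 1.
model = train_linear([1, 2, 3, 4], [3, 5, 7, 9])
print(model, predict(model, 10))
```

The training code can be discarded once it has run; only the small dictionary of parameters is needed for inferencing, much like a dish no longer needs its recipe.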

I will also distinguish between two categories of data: structured and unstructured data. 

  • Structured data refers to data organized and formatted in a specific, pre-defined schema containing fields, records, and tables. Each data element has a clearly defined data type, making it easily searchable, analyzable, and retrievable. 
  • Unstructured data refers to data that does not conform to a predefined schema or organization, making it more challenging to analyze and process using traditional methods. It can include different data types, such as text, images, audio, videos, social media posts, emails, 3D models, etc.

What is GenAI, and how does it differ from Analytical (Traditional) AI?

I will answer this question by examining GenAI and Analytical AI from several perspectives: 

  1. What tasks can they perform, i.e., their primary function?
  2. How are they trained, i.e., their training process?
  3. What data types can they ingest and produce, i.e., their input and output data? and
  4. What use cases are they used for, i.e., their application areas?

Primary Function

The primary function of Analytical AI and GenAI is the closest characteristic to a black-and-white distinction. 

Analytical AI refers to a collection of machine learning algorithms that serve the primary function of automatically identifying patterns in structured and unstructured data to solve analytical tasks faster and more efficiently than humans, such as being able to 

  • classify whether an image is that of an apple, orange, or banana, or if a credit card transaction is fraudulent or not, 
  • predict the likelihood of events occurring, such as a car breaking down, or forecast demand for our products in the upcoming months,
  • cluster or segment products into groups based on purchasing patterns or the frequency they are purchased together, or our customers into various segments, such as high-paying frequent visitors and bargain hunters,
  • recommend items for users to buy or movies to watch.

These algorithms result in highly specialized models for the specific analytical task. They typically handle a single data type (i.e., they are unimodal) and perform a single task using that data type.

GenAI, on the other hand, refers to algorithms that understand and generate structured and unstructured data on par with or even better than humans, e.g., understanding and generating code, images, audio, videos, and 3D models. Unlike Analytical AI, these models are becoming increasingly multimodal, with a single model capable of simultaneously handling different data types, as seen in the text and image capabilities of Anthropic’s Claude Sonnet and Amazon’s Titan Multimodal Embeddings. Moreover, these models are becoming increasingly capable of a wide range of tasks, such as conversing/chatting, writing code, generating images, generating structured and unstructured data, translating, transcribing audio, describing images and videos, recreating text, images, and videos based on stylistic characteristics of other content, understanding and acting upon causality and reasoning, and even tackling brain-computer interactions.

Training Process

Analytical AI and GenAI fall within the machine learning (ML) domain. Both aim to learn from data to perform some tasks. Similar to humans, machines learn in various ways, the most popular of which are:

  • Supervised Learning: machines are provided with labeled data (input-output pairs representing decision points/factors and the decision) and are trained to learn the mapping between the inputs and outputs. For example, they learn to classify emails into spam or non-spam classes from examples of spam and non-spam emails.
  • Unsupervised Learning: machines are provided with unlabeled data to identify patterns, structures, or representations from this data without explicit supervision or labeled output. For example, they can segment customers based on their purchasing habits.
  • Semi-supervised Learning: a combination of both supervised and unsupervised learning techniques.
  • Self-supervised Learning: models learn to predict some part of the input data from other parts of the same data without requiring explicit supervision or labeled output. For example, learning to predict the missing word in a sentence or predicting the following sentence given previous ones. 
  • Reinforcement Learning: Machines learn to make decisions through trial and error and a reward mechanism without needing labeled data. The reward mechanism is designed to reinforce positive behavior and penalize unwanted ones, resulting in models capable of, for example, driving a vehicle or chatting appropriately with users.
  • Online Learning/Continuous Pre-training: incrementally training a model based on new input data acquired over time instead of starting from scratch. This approach significantly reduces the time required to train a model.
  • Transfer Learning: leveraging knowledge gained while training a model on one task and applying it to a different but related task, significantly reducing the labeled data required to train a model. For example, assume we have a small dataset of flower images, which we would like to use to build a model capable of classifying them. This dataset is too small to train a capable flower classification model. So, we take a model pre-trained on ImageNet, a much larger image dataset, transfer its “learning” to a new model, and further train this new model on our classification task instead of starting training from scratch. Since the ImageNet-trained model is a powerful image model, our model will inherit its power without incurring the full training cost.
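
The difference between the first two learning styles in the list above can be sketched on a toy customer-spend dataset. The data, labels, and function names below are hypothetical, and the two-cluster routine is a deliberately simplified stand-in for k-means that assumes the values are not all identical.

```python
from statistics import mean

# Hypothetical monthly spend per customer (a single numeric feature).
spend = [12.0, 15.0, 14.0, 95.0, 102.0, 99.0]

# Supervised: labels are provided, so we learn one centroid per label.
labels = ["bargain", "bargain", "bargain", "premium", "premium", "premium"]

def fit_centroids(values, targets):
    groups = {}
    for value, target in zip(values, targets):
        groups.setdefault(target, []).append(value)
    return {target: mean(vals) for target, vals in groups.items()}

def classify(centroids, value):
    # Assign the label whose centroid is closest to the new value.
    return min(centroids, key=lambda target: abs(centroids[target] - value))

# Unsupervised: no labels; a two-cluster search discovers the groups itself.
def two_means(values, iterations=10):
    low, high = min(values), max(values)  # simple deterministic starting points
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return low, high

centroids = fit_centroids(spend, labels)
clusters = two_means(spend)
print(classify(centroids, 20.0), clusters)
```

Both routes end up with the same two group centers here, but only the supervised one knows what the groups are called, because that information came from the labels.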

Analytical AI models are trained using supervised, unsupervised, semi-supervised, and reinforcement learning techniques. They can be trained from scratch or using transfer and online learning techniques. However, most are trained from scratch due to their relatively small size and data requirements and the proprietary nature of the data used to train them. So, the utility of transfer and online learning techniques is limited to large Analytical AI models that are costly to train, like recommender systems or computer vision models, which constitute a small segment of Analytical AI models.

GenAI models are trained using techniques different from those of Analytical AI. For example, Large Language Models (LLMs) are trained using the self-supervised learning technique, attempting to predict the missing word in a given sentence and the following sentence in a given paragraph. By doing so, these models learn the semantics and syntax of the target language, giving them the ability to understand it. This language understanding can then be used for downstream tasks, such as summarization, translation, or question-answering.
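
A rough sketch of how self-supervision manufactures its own labels from raw text: each word is hidden in turn and becomes the target to predict. The `[MASK]` token and the function name are assumptions for illustration; real training pipelines work on sub-word tokens and mask only a fraction of them.

```python
def masked_examples(sentence, mask="[MASK]"):
    """Turn one raw sentence into (input, target) pairs by hiding each word in turn."""
    words = sentence.split()
    pairs = []
    for i, word in enumerate(words):
        masked = words[:i] + [mask] + words[i + 1:]
        pairs.append((" ".join(masked), word))
    return pairs

for model_input, target in masked_examples("the cat sat"):
    print(f"{model_input!r} -> {target!r}")
```

No human labeling is involved: the text itself supplies both the question and the answer, which is what makes training on very large public corpora practical.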

The deep learning nature of GenAI demands a larger dataset than is required for most Analytical AI models. So, they are typically trained on public sources of information, such as Wikipedia pages or large image repositories, resulting in larger models than most Analytical AI models - hence the popularity of transfer learning and online learning techniques with this type of AI. 

This is why GenAI is typically pre-trained and useful out of the box but can also be fine-tuned to understand nuanced domain-specific terminology or to perform specific tasks, such as identifying named entities (e.g., people, organizations, locations, dates, etc.) within a body of text.

Many pre-trained Analytical AI models, especially large ones, are useful off-the-shelf. However, this usefulness is more prominent with GenAI models due to the broad applicability of their tasks (e.g., understanding language), multiple task capabilities, and increasing multimodality.

Input & Output Data

Analytical AI and GenAI can be trained on structured and unstructured data to perform their respective functions but with some nuance.

Analytical AI algorithms learn from structured numerical data to perform a task. For example, given a table containing historical credit card transactions, such as customer ID, purchase date and time, transaction location code, amount spent, currency code, etc., and a label for each transaction indicating whether it is fraudulent or not (obtained from our esteemed fraud analysts), an Analytical AI algorithm can learn the necessary patterns within this data to replicate the decisions our fraud experts have made, effectively automating this task.

Analytical AI can also learn from unstructured data. However, this data must be transformed into structured numerical data before training commences. For example, to train an Analytical AI model to distinguish between spam emails and those that are not, the email text must first be transformed into numerical structured data using approaches such as bag-of-words and n-grams.
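
As an illustration of that transformation, the sketch below turns email text into bag-of-words count vectors and shows how n-grams are formed. The example emails and helper names are hypothetical, and the whitespace tokenizer is a simplification of what production systems use.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def ngrams(tokens, n):
    """Contiguous runs of n tokens, e.g. n=2 gives bigrams."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_vocabulary(texts, n=1):
    """Every distinct n-gram across the texts, in a fixed (sorted) order."""
    return sorted({gram for text in texts for gram in ngrams(tokenize(text), n)})

def vectorize(text, vocab, n=1):
    """Count how often each vocabulary entry occurs in the text."""
    counts = Counter(ngrams(tokenize(text), n))
    return [counts[term] for term in vocab]

emails = ["win a free prize now", "meeting notes for the free afternoon"]
vocab = build_vocabulary(emails)
vectors = [vectorize(email, vocab) for email in emails]
print(vocab)
print(vectors)
print(ngrams(tokenize("win a free"), 2))
```

Each email becomes a row of numbers with one column per vocabulary entry, which is exactly the kind of structured numerical table an Analytical AI algorithm can train on.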

Once training is complete, these models can be queried by passing them a new instance of data in the same structure as the data they were trained on to receive numeric predictions, forecasts, recommendations (e.g., product IDs), or patterns in the data, such as cluster/segment IDs and their members.

GenAI algorithms can also learn from structured and unstructured data, although they are more popularly associated with unstructured data. As mentioned above, LLMs, for example, are trained on raw text using the self-supervised learning technique. 

These algorithms can also learn from structured numerical data. One of the most popular examples is the fraud detection use case using Generative Adversarial Networks (GANs), in which two models are pitted against each other: one (the discriminator) learns to differentiate between fake and real transactions, and the other (the generator) learns to generate synthetic transactions to fool the discriminator. By doing so, these two models learn what typical transactions look like and become capable of identifying unusual transactions (i.e., potentially fraudulent transactions). These two models use the same structured numerical data described above. Another example comes from an advancement made by Amazon’s research team, where they introduced Chronos, an approach for using LLM architecture to perform forecasting, a task closely associated with Analytical AI.

Similar to Analytical AI models, these models can be queried by passing them a new instance of the data (a question + context, a text excerpt to be translated, a credit card transaction, etc.) in the same structure as the data they were trained on. However, in most cases, these models are not restricted to outputting numerical data, as is the case with Analytical AI. They can also generate unstructured data, such as textual responses, translations, or audio transcripts.

Model Size

Though both these types of AI can have models of varying sizes, GenAI has become increasingly massive. Analytical AI models are typically small compared to GenAI, with some being large, such as recommender systems or computer vision models (e.g., think Amazon/Netflix recommender systems or computer vision models for autonomous vehicles). GenAI models, on the other hand, are typically large, with some being small, especially unimodal, single-task GenAI models.

Applications

GenAI can understand and generate content, which is the only capability exclusive to GenAI. This capability includes chatting, document generation, document understanding, summarization and translation, named entity recognition, music creation, data generation, protein folding, semantic search (i.e., searching through Knowledge Bases for relevant information), code writing, and content personalization.

However, as mentioned above, GenAI can be leveraged in tasks traditionally associated with Analytical AI, such as classifying transactions into fraud/non-fraud buckets or forecasting. The reverse also holds: Analytical AI can be applied to understanding tasks, such as email understanding for classification into spam/non-spam buckets, sentiment analysis, or image understanding for text extraction.

Another example demonstrating this overlap is support email classification. This task aims to automate determining the support category a customer requests in their email, which brings substantial cost savings for organizations receiving many such emails. 

An Analytical AI approach to automating this requires curating a dataset of emails and their previously determined support category. Emails must be transformed into numerical values and the categories into category IDs, resulting in a structured numerical dataset on which an Analytical AI model gets trained to learn the relationship between email text and categories. This model can then be queried whenever a new email arrives to determine its support category automatically. 
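
A minimal sketch of this Analytical AI route, assuming a tiny hypothetical labeled history and using a simple Naive Bayes classifier over word counts as a stand-in for whatever algorithm a real project would choose:

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled history: (email text, previously determined category).
history = [
    ("i cannot log in to my account", "access"),
    ("password reset link does not work", "access"),
    ("i was charged twice this month", "billing"),
    ("please refund the duplicate invoice charge", "billing"),
]

def train(examples):
    word_counts = defaultdict(Counter)  # category -> word frequencies
    doc_counts = Counter()              # category -> number of emails
    for text, category in examples:
        doc_counts[category] += 1
        word_counts[category].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return {"words": word_counts, "docs": doc_counts, "vocab": vocab}

def classify(model, text):
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    best, best_score = None, -math.inf
    for category, n_docs in model["docs"].items():
        counts = model["words"][category]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # prior for the category
        for word in text.lower().split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best, best_score = category, score
    return best

model = train(history)
print(classify(model, "refund my charge"))
```

The quality of such a model depends heavily on how many labeled emails are available; four examples are enough to show the mechanics but far too few for real use.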

GenAI can also be applied in this case in many ways. One such way would be to provide an LLM with the support categories’ descriptions along with each new email, prompting it to infer the best category to which the email belongs. This approach requires no labeled data.
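
One way this prompting approach might look in code is sketched below. The category descriptions, prompt wording, and helper names are all assumptions; the actual call to an LLM is intentionally left out because it depends on the provider and SDK in use.

```python
def build_classification_prompt(categories, email_text):
    """Assemble a zero-shot prompt from category descriptions and one email."""
    lines = ["Classify the support email into exactly one category.", "", "Categories:"]
    for name, description in categories.items():
        lines.append(f"- {name}: {description}")
    lines += ["", "Email:", email_text.strip(), "", "Answer with the category name only."]
    return "\n".join(lines)

def parse_category(response, categories, default="unknown"):
    """Map the model's free-text reply back onto a known category name."""
    answer = response.strip().lower()
    for name in categories:
        if name.lower() == answer:
            return name
    return default

# Hypothetical support categories.
categories = {
    "billing": "charges, invoices, refunds, and payment methods",
    "access": "login problems, password resets, and account lockouts",
}
prompt = build_classification_prompt(categories, "I was charged twice this month.")
print(prompt)
# The prompt would be sent to an LLM here; its reply is then normalized:
print(parse_category(" Billing\n", categories))
```

Adding or renaming a support category only requires editing the descriptions, whereas the trained-model approach would need new labeled examples and retraining.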

In this particular example, both approaches accomplish a classification task, an Analytical AI task, emphasizing that Analytical AI and GenAI solve different problems but can also solve the same problem differently. 

The figure below shows some common use cases and their association with Analytical and GenAI. 

Lessons Learned

This overlap between Analytical AI’s and GenAI’s capabilities feeds the misconception that GenAI has nullified the need for any other type of AI. Despite how it seems, GenAI is not (yet) a General Intelligence (GI) solution or a golden hammer to solve all AI-related problems. Each type of AI has its place and time to be used.

In many cases, both these types of AI can be used together to improve the outcomes of a use case. However, it would still help to identify which to start with. In situations where you are looking to classify, predict, forecast, cluster, or recommend and have adequate data, use Analytical AI. If you lack data, look into using pre-trained Analytical AI models or GenAI, as described in the support email classification example. In situations where you are looking to produce content, use GenAI. 
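
The rule of thumb above can be written down as a small helper. This is only a sketch of the guidance in this section; the task names and the function are hypothetical, and real projects will weigh more factors than these two.

```python
ANALYTICAL_TASKS = {"classify", "predict", "forecast", "cluster", "recommend"}

def suggest_starting_point(task, has_adequate_data):
    """Suggest which type of AI to start with for a given task."""
    if task == "generate":  # producing content
        return "GenAI"
    if task in ANALYTICAL_TASKS:
        if has_adequate_data:
            return "Analytical AI"
        return "pre-trained Analytical AI model or GenAI"
    return "needs a closer look"

print(suggest_starting_point("forecast", True))
print(suggest_starting_point("classify", False))
print(suggest_starting_point("generate", False))
```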

From predictive and analytical use cases to content development and customer experience, we have tactful expertise in deploying the right flavor of AI to achieve successful outcomes. Embrace the future of business with AI, and let us help you achieve the technological transformation that will fuel tomorrow’s innovation.

Khobaib Zaamout

Dr. Khobaib Zaamout is the Principal Architect for AI Strategy at Caylent, where his main focus lies in AIML and Generative AI. He brings a solid background with over ten years of experience in software, Data, and AIML. Khobaib has earned a master's in Machine Learning and holds a doctorate in Data Science. His professional journey also involves extensive consulting, solutioning, and leadership roles. Based in Chestermere, Alberta, Canada, Khobaib enjoys a laid-back life. Outside of work, he likes cooking for his family and friends and finds relaxation in camping trips to the Rocky Mountains.
