Both Machine Learning and Generative AI aim to learn from data to perform some tasks. Similar to humans, machines learn in various ways, the most popular of which are:
- Supervised Learning: we provide machines with labeled data (input-output pairs representing decision points/factors and the decision) and train them to learn the mapping between the inputs and outputs. For example, they learn to classify emails into spam or non-spam classes from examples of spam and non-spam emails.
- Unsupervised Learning: we provide machines with unlabeled data to identify patterns, structures, or representations from this data without explicit supervision or labeled output. For example, they can segment customers based on their purchasing habits.
- Semi-supervised Learning: a combination of supervised and unsupervised techniques, typically using a small amount of labeled data together with a large amount of unlabeled data.
- Self-supervised Learning: models learn to predict some part of the input data from other parts of the same data without requiring explicit supervision or labeled output. For example, learning to predict the missing word in a sentence or predicting the following sentence given previous ones.
- Reinforcement Learning: machines learn to make decisions through trial and error and a reward mechanism, without needing labeled data. The reward mechanism reinforces desired behaviors and penalizes unwanted ones, resulting in models capable of, for example, driving a vehicle or chatting appropriately with users.
- Online Learning/Continuous Pre-training: incrementally training a model on new data acquired over time instead of starting from scratch. This approach significantly reduces the time required to train a model.
- Transfer Learning: leveraging knowledge gained while training on one task and applying it to a different but related task, significantly reducing the labeled data required. For example, suppose we have a small dataset of flower images from which we would like to build a flower classifier. The dataset is too small to train a capable model from scratch. Instead, we transfer the “learning” of a model pre-trained on ImageNet, a much larger dataset of images, and further train that model on our classification task. Because the pre-trained model already encodes powerful image representations, ours inherits that power without incurring the full training cost.
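To make the transfer-learning idea concrete, here is a toy, dependency-free sketch: a frozen "pretrained" feature extractor (a stand-in for a real network trained on a large dataset such as ImageNet) is reused as-is, and only a small classifier head is trained on our tiny dataset. All functions, data, and numbers below are illustrative, not a real model.

```python
# Toy sketch of transfer learning: the "pretrained" feature extractor is
# frozen and reused; only a lightweight classifier head is trained.

def pretrained_features(x):
    # Stand-in for a network trained on a large corpus: it maps a raw
    # input vector to a richer feature vector and is never updated here.
    return [x[0] + x[1], x[0] - x[1], x[0] * x[1]]

def train_head(samples, labels, epochs=20, lr=0.1):
    # Train only the small "head" (a perceptron) on our tiny dataset;
    # the extractor's knowledge is inherited without retraining it.
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = pretrained_features(x)
            pred = 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * fi for wi, fi in zip(w, f)]
            b += lr * err
    return w, b

def predict(w, b, x):
    f = pretrained_features(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0

# Tiny made-up "flowers" dataset: class 1 when both measurements are large.
X = [[1.0, 1.0], [0.9, 1.1], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
w, b = train_head(X, y)
```

The key point is the division of labor: the expensive part (the feature extractor) is reused, and only the cheap head is fitted to the small dataset.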
Machine Learning
Machine Learning models can be trained with supervised, unsupervised, semi-supervised, and reinforcement learning techniques, either from scratch or via transfer and online learning. However, most are trained from scratch due to their relatively small size and data requirements and the proprietary nature of their training data. The utility of transfer and online learning is therefore limited to large, costly-to-train models, such as recommender systems or computer vision models.
Generative AI
GenAI models use different training techniques from those of ML. For example, with Large Language Models (LLMs), we would use a self-supervised learning technique, attempting to predict the missing word in a given sentence and the following sentence in a given paragraph. By doing so, these models learn the semantics and syntax of the target language, giving them the ability to understand it. This language understanding can then be used for downstream tasks, such as summarization, translation, or question-answering.
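The defining trait of self-supervised learning is that the "labels" (the missing words) come from the raw text itself, so no human annotation is needed. The count-based toy below illustrates only that idea; real LLMs learn this objective with neural networks at vastly larger scale, and the corpus here is made up.

```python
from collections import Counter, defaultdict

# Toy self-supervised objective: predict the next word using only the
# raw text. No labels are supplied; the text supervises itself.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Count which word tends to follow each word in the corpus.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        following[prev][nxt] += 1

def predict_next(word):
    # Predict the continuation seen most often during "pre-training".
    return following[word].most_common(1)[0][0]
```

Even this trivial model has absorbed some regularities of its tiny "language" (e.g., that "sat" is usually followed by "on"); an LLM does the same with billions of parameters, which is where its understanding of semantics and syntax comes from.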
The deep learning nature of GenAI demands far larger datasets than most Analytical AI models require. GenAI models are therefore typically trained on public sources of information, such as Wikipedia pages or large image repositories, and end up much larger than most Analytical AI models, hence the popularity of transfer learning and online learning with this type of AI.
This is why GenAI is typically pre-trained and useful out of the box but can also be fine-tuned to understand nuanced domain-specific terminology or to perform specific tasks, such as identifying named entities (e.g., people, organizations, locations, dates, etc.) within a body of text.
Many pre-trained Analytical AI models, especially large ones, are useful off-the-shelf. However, this usefulness is more prominent with GenAI models due to the broad applicability of their tasks (e.g., understanding language), multi-task capabilities, and increasing multimodality.
Input & Output Data
ML and GenAI can be trained on structured and unstructured data to perform their respective functions but with some nuance.
Machine Learning
ML algorithms learn from structured data to perform a task. For example, given a table containing historical credit card transactions, such as customer ID, purchase date and time, transaction location code, amount spent, currency code, etc., and a label for each transaction indicating whether it is fraudulent or not (obtained from our esteemed fraud analysts), an ML algorithm can learn the necessary patterns within this data to replicate the decisions our fraud experts have made, effectively automating this task.
ML can also learn from unstructured data. However, we need to transform this data into structured numerical data before training commences. For example, to train an ML model to distinguish between spam emails and those that are not, we have to convert the email text into numerical structured data using approaches such as bag-of-words and n-grams.
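The transformation from raw text to structured numbers can be sketched in a few lines. This is a bare-bones bag-of-words representation; production code would typically use a library implementation (e.g., scikit-learn's CountVectorizer, which also handles n-grams), and the example emails are invented.

```python
# Turn raw email text into structured numeric feature vectors using a
# bag-of-words representation: one count per vocabulary word.
emails = ["win a free prize now", "meeting moved to friday", "free free prize"]

# Build the vocabulary from all emails (sorted for a stable column order).
vocab = sorted({word for email in emails for word in email.split()})

def bag_of_words(text):
    words = text.split()
    return [words.count(word) for word in vocab]

# Each email becomes a fixed-length numeric vector an ML model can consume.
vectors = [bag_of_words(email) for email in emails]
```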
Once training is complete, we can query these models by passing a new instance of data in the same structure as the data they were trained on, receiving numeric predictions, forecasts, recommendations (e.g., product IDs), or patterns in the data, such as cluster/segment IDs and their members.
Generative AI
GenAI algorithms can also learn from structured and unstructured data, although they are more popularly associated with unstructured data. As mentioned above, LLMs, for example, are trained on raw text using the self-supervised learning technique.
These algorithms can also learn from structured numerical data. A popular example is the fraud detection use case using Generative Adversarial Networks (GANs). Another comes from Amazon’s research team, which introduced Chronos, an approach that adapts LLM architectures to time-series forecasting.
We can query these models by passing them a new instance of data (a question plus context, a text excerpt to be translated, a credit card transaction, etc.) in the same structure as the data they were trained on. However, in most cases, these models are not restricted to outputting numerical data: they can also generate unstructured data, such as textual responses, translations, or audio transcripts.
Model Size
Though both types of AI can have models of varying sizes, GenAI models have become increasingly massive. ML models are typically small by comparison, with some notable large exceptions (think Amazon/Netflix recommender systems or computer vision models for autonomous vehicles). GenAI models, on the other hand, are typically large, with some small exceptions, especially unimodal, single-task models.
Applications and Use Cases
GenAI can understand and generate content, the one capability exclusive to it. This capability spans chatting, document generation, document understanding, summarization and translation, named entity recognition, music creation, data generation, protein folding, semantic search (i.e., searching through Knowledge Bases for relevant information), code writing, and content personalization.
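Semantic search, one of the capabilities listed above, reduces to comparing embedding vectors. The sketch below is a toy: the document names and 3-dimensional vectors are invented, whereas real systems use embeddings with hundreds of dimensions produced by an embedding model.

```python
import math

# Toy semantic search: documents and the query are represented as
# embedding vectors; the closest document by cosine similarity wins.
docs = {
    "refund policy":  [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "password reset": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec):
    # Return the document whose embedding is most similar to the query.
    return max(docs, key=lambda name: cosine(query_vec, docs[name]))
```

Because similarity is computed in embedding space rather than by keyword overlap, a query phrased differently from the document can still retrieve it, which is what makes the search "semantic".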
However, as mentioned above, GenAI can be leveraged for tasks traditionally associated with ML, such as classifying transactions into fraud/non-fraud buckets or forecasting. Conversely, ML can tackle tasks that seem to require GenAI-style understanding, such as classifying emails into spam/non-spam buckets, sentiment analysis, or extracting text from images.
Another example demonstrating this overlap is support email classification. The task is to automatically determine which support category a customer's email belongs to, bringing substantial cost savings to organizations that receive many such emails.
When should you use Generative AI vs Machine Learning?