Analytical AI and GenAI fall within the machine learning (ML) domain. Both aim to learn from data to perform some tasks. However, they use different training processes.
Analytical AI relies on supervised training
Analytical AI models are trained using supervised, unsupervised, semi-supervised, and reinforcement learning techniques. They can be trained from scratch or using transfer and online learning techniques. However, most are trained from scratch due to their relatively small size and data requirements and the proprietary nature of the data used to train them.
As a result, the utility of transfer and online learning techniques is limited to large Analytical AI models that are costly to train, like recommender systems or computer vision models, which constitute a small segment of Analytical AI models.
GenAI uses self-supervised training
GenAI models are trained using techniques different from those of Analytical AI. For example, Large Language Models (LLMs) are trained using self-supervised learning, in which the training signal comes from the text itself: the model attempts to predict the missing word in a given sentence and the following sentence in a given paragraph. By doing so, these models learn the semantics and syntax of the target language, giving them the ability to understand it. This language understanding can then be used for downstream tasks, such as summarization, translation, or question answering.
The deep learning nature of GenAI demands a larger dataset than most Analytical AI models require. GenAI models are therefore typically trained on public sources of information, such as Wikipedia pages or large image repositories, and end up larger than most Analytical AI models, hence the popularity of transfer learning and online learning techniques with this type of AI.
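To make the idea of self-supervision concrete, here is a minimal sketch of how masked-word training pairs can be derived from raw text without any human labeling. The function name `make_masked_examples` and the `[MASK]` token are illustrative assumptions; real LLM pipelines tokenize into sub-word units and mask only a sample of positions.

```python
def make_masked_examples(sentence, mask_token="[MASK]"):
    """Turn one raw sentence into (masked_input, target_word) training pairs.

    The label for each pair is simply the word that was hidden, so no
    human annotation is needed: the text supervises itself.
    """
    words = sentence.split()
    examples = []
    for i, target in enumerate(words):
        # Replace the i-th word with the mask token and keep it as the target.
        masked = words[:i] + [mask_token] + words[i + 1:]
        examples.append((" ".join(masked), target))
    return examples


examples = make_masked_examples("the cat sat on the mat")
for masked_input, target in examples:
    print(f"{masked_input!r} -> {target!r}")
```

Each printed pair shows an input with one hidden word and the word a model would be asked to recover.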
Pre-trained models
This is why GenAI is typically pre-trained and useful out of the box. It can also be fine-tuned to understand nuanced domain-specific terminology or to perform specific tasks, such as identifying named entities (e.g., people, organizations, locations, and dates) within a body of text.
Many pre-trained Analytical AI models, especially large ones, are useful off-the-shelf. However, this usefulness is more prominent with GenAI models due to the broad applicability of their tasks (e.g., understanding language), multiple task capabilities, and increasing multimodality.
Data
Analytical AI and GenAI can both be trained on structured and unstructured data to perform their respective functions, though each favors one type.
Analytical AI relies more on structured data
Analytical AI algorithms learn primarily from structured numerical data to perform a task. For example, given a table of historical credit card transactions with columns such as customer ID, purchase date and time, transaction location code, amount spent, and currency code, plus a label for each transaction indicating whether it is fraudulent (obtained from our esteemed fraud analysts), an Analytical AI algorithm can learn the patterns in this data needed to replicate the decisions our fraud experts have made, effectively automating this task.
Analytical AI can also learn from unstructured data. However, this data must be transformed into structured numerical data before training commences. For example, to train an Analytical AI model to distinguish spam emails from legitimate ones, the email text must first be transformed into structured numerical data using approaches such as bag-of-words and n-grams.
Once training is complete, these models can be queried by passing a new instance of data in the same structure as the data they were trained on. In return, they provide numeric predictions, forecasts, recommendations (e.g., product IDs), or patterns in the data, such as cluster/segment IDs and their members.
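As a rough illustration of learning from labeled structured data, the sketch below trains a nearest-centroid classifier on a tiny, entirely hypothetical transaction table. The two features (amount spent and hour of day), the values, and the function names are assumptions made for the example. A production fraud model would use many more columns, scale its features, and rely on a stronger algorithm.

```python
# Hypothetical toy rows: (amount_spent, hour_of_day), with analyst-provided labels.
transactions = [
    (12.50, 13), (40.00, 18), (8.75, 9),
    (980.00, 3), (1500.00, 2), (1200.00, 4),
]
labels = ["legit", "legit", "legit", "fraud", "fraud", "fraud"]


def fit_centroids(rows, row_labels):
    """Average the feature vectors of each class (the 'training' step)."""
    sums, counts = {}, {}
    for row, label in zip(rows, row_labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest to the new row."""
    def squared_distance(label):
        return sum((c - x) ** 2 for c, x in zip(centroids[label], row))
    return min(centroids, key=squared_distance)


centroids = fit_centroids(transactions, labels)
# Query with a new transaction in the same structure as the training rows.
print(predict(centroids, (1100.00, 3)))
print(predict(centroids, (15.00, 12)))
```

The new transactions are passed in the same two-column structure used for training, and the model returns one of the labels it learned from the analysts' decisions.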
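The transformation from text to structured numbers can be sketched as follows. This is a minimal, standard-library illustration of bag-of-words with unigrams and bigrams; the helper names and the two sample emails are hypothetical, and a real pipeline would typically use a dedicated vectorizer from a machine learning library.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text, replace punctuation with spaces, and split into words."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return cleaned.split()


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents, n_values=(1, 2)):
    """Collect every distinct n-gram across the corpus, in sorted order."""
    vocab = set()
    for doc in documents:
        tokens = tokenize(doc)
        for n in n_values:
            vocab.update(ngrams(tokens, n))
    return sorted(vocab)


def vectorize(text, vocabulary, n_values=(1, 2)):
    """Map a text to a fixed-length list of n-gram counts, one per vocabulary entry."""
    tokens = tokenize(text)
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return [counts[term] for term in vocabulary]


# Hypothetical sample emails, used only to show the shape of the output.
emails = ["Win a free prize now", "Meeting notes for the free session"]
vocabulary = build_vocabulary(emails)
vectors = [vectorize(email, vocabulary) for email in emails]
print(len(vocabulary), vectors[0])
```

Every email becomes a row of counts with the same columns, which is the structured numerical form a spam classifier can be trained on.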
GenAI relies more on unstructured data
GenAI algorithms can also learn from structured and unstructured data, although they are more popularly associated with unstructured data. As mentioned above, LLMs, for example, are trained on raw text using the self-supervised learning technique.
These algorithms are also capable of learning from structured numerical data. One of the most popular examples is the fraud detection use case using Generative Adversarial Networks (GANs), in which two models are pitted against each other: one (the discriminator) learns to differentiate between fake and real transactions, and the other (the generator) learns to generate synthetic transactions to fool the discriminator. By doing so, these two models learn what typical transactions look like and become capable of identifying unusual transactions (i.e., fraudulent transactions). These two models use the same structured numerical data described above.
Similar to Analytical AI models, these models can be queried by passing them a new instance of the data (a question + context, a text excerpt to be translated, a credit card transaction, etc.) in the same structure as the data they were trained on. However, in most cases, these models are not restricted to outputting numerical data, as is the case with Analytical AI. They can also generate unstructured data, such as textual responses, translations, or audio transcripts.
Model size
Though both types of AI can have models of varying sizes, GenAI models have become increasingly massive. Analytical AI models are typically small compared to GenAI models, with some exceptions, such as recommender systems or computer vision models (think Amazon/Netflix recommender systems or computer vision models for autonomous vehicles). GenAI models, on the other hand, are typically large, with some being small, especially unimodal, single-task GenAI models.
Use cases