Artificial Intelligence & MLOps
Apply artificial intelligence (AI) to your data to automate business processes and predict outcomes. Gain a competitive edge in your industry and make more informed decisions.
Get up to speed on all the GenAI-, AI-, and ML-focused 300- and 400-level sessions from re:Invent 2023!
We know that watching all the re:Invent session videos can be a daunting task, but we don't want you to miss out on the gold that is often found in them! In this blog, you can find quick summaries of all the 300- and 400-level sessions, grouped by track. Enjoy!
At AWS re:Invent 2023, Tony Cerqueira from AWS and Sam Corzine from Fetch discussed Fetch's use of machine learning (ML) to enhance their business operations. Fetch, a leading consumer rewards app, relies on cloud-native digital strategies and data-focused approaches. Their system processes millions of receipts weekly, requiring robust and efficient ML models to handle the massive data flow. Their approach involves using AWS services and innovative ML strategies to manage data, offering personalized customer experiences and ensuring smooth business operations.
Sam Corzine, a lead ML engineer at Fetch, shared insights into their ML journey, including the challenges and strategies for developing and deploying ML models. He emphasized the importance of starting with a comprehensive demo using tools like Streamlit to validate ML project feasibility. Corzine highlighted the critical role of data annotation and the need for careful planning and execution in this phase. He also discussed the challenges of integrating ML into existing systems, particularly in high-performance environments, and the necessity of continuous experimentation and adjustment to improve model accuracy and efficiency.
Looking ahead, Fetch plans to focus on personalization, discovery, and recommendation systems, leveraging the detailed purchase data from users. They aim to develop stateful ML services and explore the potential of generative AI models to augment their core ML pipelines. These efforts are directed towards making their app more user-centric and data-driven, thereby enhancing the overall customer experience. The presentation concluded with a call for feedback and appreciation for the audience's attendance.
The AWS re:Invent 2023 session focused on scaling machine learning (ML) development using Amazon SageMaker Studio. Sumit Thakur, a senior manager at AWS, led the session with contributions from Giuseppe Porcelli, a principal ML solutions architect at AWS, and Marc Neumann, a product owner for the ML and AI platform at BMW Group. They discussed the advancements in SageMaker Studio, emphasizing its integrated development environment that combines various ML workflow tools into a unified interface. Key trends in AI development were highlighted, including the increasing pace of AI adoption, the proliferation of AI-specific job roles, and challenges in moving ML models from prototyping to production.
Giuseppe Porcelli presented a live demonstration, showcasing the streamlined startup experience of SageMaker Studio, its support for multiple code editors, and its new AI-based assistance for ML tasks. The demo illustrated the process of detecting machinery failures using the AI4I Predictive Maintenance Dataset, highlighting the ease of building, training, and deploying ML models with SageMaker Studio. Porcelli demonstrated the capabilities of SageMaker Studio in creating and managing machine learning workflows, including data preprocessing, model training, evaluation, and deployment.
Marc Neumann from BMW Group shared insights into how BMW is leveraging SageMaker Studio for scaling ML development across various domains, including quality inspections in vehicle production and demand prediction in supply chain management. He outlined BMW's journey from fragmented ML environments to adopting SageMaker Studio, which met their diverse requirements for tooling, data access, security, and collaboration. Neumann also discussed future plans to integrate MLOps solutions, leverage features for improved productivity, and provide access to large language models within their development environment. This session underscored the significant advancements in SageMaker Studio and its impact on simplifying and accelerating ML development in large organizations.
The AWS re:Invent 2023 session titled "Large Model Training on AWS with DLAMIs and PyTorch, featuring Pinterest" was a detailed exploration of the challenges and solutions in training large machine learning models. Led by Arindam Paul, a senior product manager at AWS, the session also featured insights from Karthik, the engineering manager of ML Platforms at Pinterest, and Zlatan, a principal solutions architect at AWS. The presentation focused on the difficulties in training large models, such as the need for substantial compute and storage resources, the unavailability of GPU resources, and the challenges in orchestrating large-scale infrastructure. The team showcased how they leverage AWS Deep Learning AMIs (DLAMIs) and Elastic Fabric Adapter (EFA) to optimize the training process.
Karthik's section of the presentation delved into Pinterest's specific strategies and infrastructure for training large-scale models. He emphasized the use of PyTorch and AWS services like Batch and UltraCluster to handle the complexity of large models. He highlighted the importance of distributed training techniques, such as Distributed Data Parallel and Fully Sharded Data Parallel, and the need for efficient data preprocessing and model sharding. Karthik also stressed the benefits of standardized ML stacks for faster upgrades, efficient use of distributed data parallel for recommendation models, and the significant performance improvements gained from using AWS UltraCluster for their billion-plus parameter models.
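The core idea behind the Distributed Data Parallel technique Karthik mentioned can be illustrated with a pure-Python toy: each worker holds a full model replica, computes gradients on its own shard of the batch, and an all-reduce averages the gradients so every replica applies the identical update. This is a conceptual sketch with made-up function names, not Pinterest's code; real training uses `torch.nn.parallel.DistributedDataParallel` over high-speed interconnects like EFA.

```python
# Conceptual sketch of Distributed Data Parallel (DDP). Toy model:
# loss = mean((weight * x - x)^2), so the optimum is weight = 1.

def shard_batch(batch, num_workers):
    """Split a global batch into one shard per worker (DDP's data split)."""
    return [batch[i::num_workers] for i in range(num_workers)]

def local_gradient(shard, weight):
    # Gradient of the toy loss w.r.t. weight, computed on one worker's shard.
    return sum(2 * (weight * x - x) * x for x in shard) / len(shard)

def all_reduce_mean(grads):
    """The averaging step DDP performs across workers after backward()."""
    return sum(grads) / len(grads)

def ddp_step(batch, weight, num_workers, lr=0.1):
    shards = shard_batch(batch, num_workers)
    grads = [local_gradient(s, weight) for s in shards]
    # Every replica applies the same averaged gradient, staying in sync.
    return weight - lr * all_reduce_mean(grads)
```

Because the all-reduce averages shard gradients, the update is mathematically the same as a single-worker step on the full batch, which is what makes the technique attractive for recommendation models at scale.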
The final part of the session, presented by Zlatan, focused on the solution architecture and the collaborative effort between AWS and Pinterest. He discussed the role of AWS Batch in simplifying job orchestration and the use of UltraCluster for high-density compute environments with low-latency inter-node networking. Zlatan also highlighted the strategic use of object storage (S3) by Pinterest for training data and the various types of large-scale models used by Pinterest, including compute-intensive models and graph neural networks. He concluded by emphasizing the importance of the features developed in collaboration with AWS, such as fair-share scheduling and dynamic compute environment updates, which significantly improved the efficiency and reliability of their ML training processes.
The AWS re:Invent 2023 session on "Scaling Foundation Model Inference with SageMaker" showcased the advancements in deploying and managing large-scale generative AI applications using Amazon SageMaker. Dhawal Patel, leading the Solution Architecture team at AWS, introduced the session, emphasizing the integration of Generative AI into daily operations, enhancing productivity through Foundation models. Foundation models are pre-trained, transformer-based models, often requiring significant computational resources. The session demonstrated the use of SageMaker for hosting and scaling these models efficiently, touching on the challenges of managing large models with hundreds of billions of parameters and the need for effective resource utilization.
Alan Tan from the SageMaker Product team presented new SageMaker Inference features, focusing on the efficient deployment of multiple models. These features include smart routing for lower latency, streaming responses for real-time applications, and granular CloudWatch metrics for better monitoring. This approach allows for the hosting of multiple models on a single endpoint, scaling each model independently, and optimizing hardware utilization. The session also highlighted a new smart routing algorithm, which improves request handling by directing traffic to the most available instances, significantly reducing latency.
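The intuition behind smart routing can be sketched as a least-outstanding-requests policy: rather than round-robin, each request goes to the instance with the fewest in-flight requests. The class and method names below are illustrative; SageMaker's actual routing algorithm is internal to the service.

```python
# Toy sketch of availability-aware request routing: pick the instance with
# the fewest in-flight requests ("most available"), not the next in rotation.

class SmartRouter:
    def __init__(self, instance_ids):
        # Track outstanding (in-flight) requests per instance.
        self.in_flight = {iid: 0 for iid in instance_ids}

    def route(self):
        # Send the request to the least-loaded instance.
        target = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[target] += 1
        return target

    def complete(self, instance_id):
        # Call when a response finishes, freeing capacity on that instance.
        self.in_flight[instance_id] -= 1
```

With long, uneven generation times typical of LLMs, load-aware routing avoids queuing a fast request behind a slow one, which is where the latency reduction comes from.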
Bhavesh Doshi from Salesforce shared insights on how Salesforce Einstein 1 Platform utilizes SageMaker for hosting and scaling generative AI applications. He emphasized Salesforce's commitment to an integrated, intelligent, and automated platform, underlining the importance of secure, scalable, and efficient AI deployments. Salesforce's experience with SageMaker Inference Components showcased significant improvements in resource utilization and cost efficiency. The platform's ability to independently scale models and leverage auto-scaling features has streamlined operational workflows and reduced the computational footprint, all while maintaining high performance.
The AWS re:Invent 2023 session, "Accelerate FM development with Amazon SageMaker JumpStart" (AIM328), led by Karl Albertsen, Jeff Boudier, and Marc Karp, focused on the advancements and challenges in adopting generative AI, particularly in SageMaker JumpStart. Albertsen discussed the rapid evolution in AI, emphasizing the array of large language models and the importance of security, compliance, and cost-effective scalability. He highlighted AWS's efforts to simplify these processes, particularly through services like Amazon Bedrock and SageMaker.
Jeff Boudier from Hugging Face elaborated on their mission to democratize good machine learning. He showcased Hugging Face's vast AI model repository, including models for various tasks and languages. Boudier discussed the significance of transfer learning, which allows pre-trained models to be adapted efficiently to new tasks. He highlighted models such as StarCoder, Idefics, and Zephyr, developed on SageMaker. The collaboration with AWS aims to make AI accessible, easy to use, secure, and cost-effective.
Marc Karp's segment concentrated on the practical application of AI models using SageMaker. He guided through the process of selecting, implementing, and evaluating AI models, covering prompt engineering, retrieval-augmented generation (RAG), and model fine-tuning. Karp also explained deploying models in SageMaker, ensuring secure, scalable, and integrated solutions with applications. The session underscored the collaborative efforts of AWS and Hugging Face in enhancing the accessibility and efficiency of AI technologies in various applications.
The AWS re:Invent 2023 session focused on deploying foundational models on Amazon SageMaker for optimizing price and performance in generative AI applications. Venkatesh Krishnan, the lead for Amazon SageMaker, alongside Rama Thamman, who heads a team of specialist solutions architects, discussed strategies for deploying and tuning foundational models to achieve high performance at a reduced cost. They emphasized the critical balance between complexity, performance, and cost, highlighting the challenges in deploying large models like language and image processing models. The session also introduced Travis Mehlinger from Cisco Systems, who shared insights on leveraging SageMaker for Cisco's AI applications.
The speakers delved into the specifics of deploying models on SageMaker, emphasizing features like multi-model deployment, cost-effective instance types powered by AWS Inferentia and Trainium chips, and dynamic auto-scaling to manage costs efficiently. They introduced the Large Model Inference (LMI) container, packed with tools for optimizing large language models, and discussed the benefits of deploying models on Inferentia-based instances for better price performance. The session also highlighted the significance of building a robust platform for hosting large language models and the advantages of using a managed service like Amazon SageMaker for this purpose.
The presentation concluded with a demonstration of deploying a large language model on SageMaker using a Notebook and UI experience. Rama showcased the ease of deployment using the ModelBuilder class and schema builder, simplifying the process significantly compared to traditional methods. Travis from Cisco shared their experience in utilizing SageMaker for call summarization, highlighting the ease of integration, model evaluation, and scaling globally. He emphasized the future improvements possible with SageMaker, such as deploying multiple models on the same instances, scaling models to zero to optimize costs, and adopting new developer experience improvements for more efficient model deployment and management.
In this presentation at AWS re:Invent 2023, Brent Swidler, a principal product manager at Amazon, introduced Amazon Titan, a new family of models for text and multimodal tasks. He was joined by Ben Snively, a senior principal solution architect, and Satin, an application architect from Electronic Arts, to discuss Amazon Bedrock, a platform allowing users to select from various foundational models for text and image generation tasks. They highlighted Amazon Titan's capabilities, including Text Lite and Text Express models, which support rapid iterations and more robust applications in multiple languages. Titan's integration into systems and best practices for its use were also showcased through architectures and demos.
Electronic Arts shared their real-world applications of Amazon Titan, demonstrating how the models enhanced their productivity and player experience. They showcased tools for developers to generate automated unit tests, for quality engineers to write complex test scenarios, and for players to have an in-game help chat. Additionally, they highlighted a tool for business teams to extract social media insights from marketing campaigns. These applications of Titan models showed the potential for innovation and improved efficiency in various areas of their operations.
The presentation concluded with a Q&A session, where the team addressed common inquiries about Amazon Titan and Bedrock. They emphasized the security and privacy of user data, clarifying that data used within Bedrock is not stored or used for training models. The session also touched upon the challenges of model evaluation, explaining the balance between automated and human evaluations for determining model effectiveness. This comprehensive overview showcased Amazon Titan's impact on different industries, highlighting its potential for future development and integration.
At AWS re:Invent 2023, Rohit Mittal introduced the Amazon Bedrock team's latest advancements in image generation and search using Foundational Models. The session highlighted two new models: Titan Multimodal Embeddings and Titan Image Generator. Titan Multimodal Embeddings enhances image search by capturing semantic meaning from images and text, significantly improving accuracy and efficiency over traditional keyword-based search methods. The Titan Image Generator, on the other hand, focuses on generating high-quality, diverse images from text prompts, including complex scenes and text incorporation within images. Both models emphasize ease of use, high accuracy, customization, and responsible AI, ensuring diversity and preventing harmful content generation.
Andres Velez from OfferUp shared their experience integrating Titan Multimodal Embeddings into their platform. OfferUp, a leading mobile marketplace in the U.S., initially relied on keyword search but evolved to use neural and semantic search, resulting in significant improvements in search relevance and stability. The integration of multimodal embeddings further enhanced their system, especially in low-density locations, by incorporating image data into the search process. This approach improved the discoverability of listings with limited textual information, demonstrating the practical benefits of these advanced AI models in a real-world application.
The session concluded with a summary of the key takeaways and an invitation for attendees to explore the potential of these models for their own applications. The Titan models' capabilities in image generation and search, combined with their focus on responsible AI and ease of use, present exciting opportunities for businesses to innovate and enhance their customer experiences. The success story of OfferUp illustrates the tangible impact these technologies can have on improving search functionality and overall user engagement in digital platforms.
The AWS re:Invent 2023 session, "Explore text-generation Foundation Models for top use cases with Amazon Bedrock (AIM333)", featured speakers Usman Anwer, Denis Batalov, and Daniel Charles discussing the potential and applications of text-generation models in Bedrock. They highlighted how these models can process diverse data types, including long-form documents and messages, unlocking significant organizational value. The session emphasized the ease of integrating these models into applications via the Bedrock API, providing best practices for maximizing their utility. Dan from Alida shared their journey with generative AI, demonstrating its transformative impact on text analytics, especially in understanding customer feedback and improving user experience.
The session also delved into various use cases demonstrating the flexibility and efficiency of text-generation models. Examples included customer support automation, where AI can answer queries and gather necessary information without human intervention, and content creation, such as blog post generation and summarization of complex documents like legal contracts. The models' ability to process and analyze large datasets was showcased, highlighting their application in generating insights for customer-service improvement and product catalog refinement.
Lastly, the session touched upon the responsible use of AI, with Denis discussing the importance of building AI systems responsibly. Amazon Bedrock's control measures, like data encryption, access management, and content filters, were discussed to ensure ethical AI usage. Bedrock Guardrails, a feature ensuring models stay within specified topics and avoid generating offensive content, was introduced. The session concluded with resources for getting started with Amazon Bedrock, emphasizing continuous learning and adaptation in the rapidly evolving field of AI and machine learning.
In this presentation at AWS re:Invent 2023, Romi Datta, a product manager at SageMaker, and her colleagues Amanda Lester and Ketaki Shriram (CTO of Krikey AI) discuss improving foundational models with Amazon SageMaker's human-in-the-loop capabilities. Romi outlines the challenges in building and operationalizing foundational models, emphasizing the need for human-in-the-loop (HIL) AI/ML services to enhance these models. She introduces Amazon SageMaker Ground Truth as a solution, highlighting its capabilities in model evaluation, data collection, and customization through supervised fine-tuning or reinforcement learning. The platform provides workflows and expert workforces to annotate data at scale, addressing issues like inaccuracies, toxicity, stereotypes, biases, and hallucinations in models.
Amanda Lester, senior business development lead at AWS, demonstrates the practical application of SageMaker Ground Truth. She showcases how to create demonstration data, preference ranking data, and caption images and videos for various use cases, including documents and sports. This tool allows users to fine-tune AI models according to specific business needs, reinforcing learning with human feedback, and generate question-answer pairs for documents. Amanda's demo underlines the platform's efficiency in aligning model outputs with human preferences and business objectives.
Finally, Ketaki Shriram presents how Krikey AI used Amazon SageMaker Ground Truth to train their AI animation foundational model. She explains the challenges of manually labeling animation data and how partnering with the Ground Truth team enabled them to efficiently label a vast data set, saving significant time and resources. Krikey AI's use case demonstrates the transformative impact of SageMaker Ground Truth in enabling rapid development and deployment of AI models across various sectors, including entertainment, healthcare, and education, by democratizing the creation of high-quality 3D content.
In the AWS re:Invent 2023 session on "Train and tune state-of-the-art ML models on Amazon SageMaker (AIM335)," Gal Oshri from AWS, alongside Emily Webber and Thomas Kollar, discussed the challenges and solutions for training large-scale machine learning models using Amazon SageMaker. The session began with Gal emphasizing the rapid advancements in machine learning, particularly in deep learning models for computer vision and natural language processing. He showcased the transformation in image generation quality over the years, highlighting the roles of algorithmic improvements and increased data and model sizes. Gal introduced SageMaker's capabilities in managing large-scale model training, including efficient hardware utilization, fault resistance, and cost-effective orchestration.
Emily Webber, who leads the generative AI foundations technical field community at AWS, further explored the customization of large language models (LLMs) on SageMaker. She delineated various customization techniques ranging from prompt engineering to pre-training new foundation models. Emily elaborated on the ease and efficiency of pre-training large models on SageMaker, detailing the steps from data gathering and preprocessing to model evaluation. She highlighted SageMaker's distributed training libraries, which enhance training efficiency and scalability. Emily concluded her segment with a demonstration of pre-training a 7 billion parameter Llama model on SageMaker, showcasing the platform's seamless integration of various functionalities.
Thomas Kollar from the Toyota Research Institute (TRI) shared insights on how TRI leverages SageMaker for machine learning acceleration. He presented TRI's projects, including autonomous drift driving and robotic systems for grocery shelf stocking. Thomas explained the use of SageMaker at TRI for experimentation, large-scale training, and model serving. He emphasized the platform's scalability, performance, and the ability to handle large datasets efficiently. Thomas presented TRI's achievements in training state-of-the-art models and their aspirations to develop foundation models capable of performing various robotic tasks in response to language and other inputs. The session concluded with Gal providing additional resources for learning more about SageMaker.
The presentation at AWS re:Invent 2023 focused on improving responses in generative AI applications using Retrieval-Augmented Generation (RAG). The speakers, Ruhaab Markas and Mani Khanuja, emphasized customizing foundation models to enhance generative AI's performance. They discussed the need for customization to adapt to domain-specific language, improve performance on unique tasks, and enhance context awareness with external company data. They presented common approaches to customization, including prompt engineering, RAG, model fine-tuning, and training models from scratch. The talk detailed the benefits and complexities of each approach, particularly highlighting the effectiveness of RAG in leveraging external knowledge sources for accurate responses.
RAG works by retrieving relevant text from a corpus and using it as context in a foundational model to generate responses grounded in a company's specific data. This method enhances content quality by reducing hallucinations and biases in AI responses. The presenters introduced Knowledge Bases for Amazon Bedrock, a tool that simplifies building RAG applications by managing data ingestion, retrieval, and response generation processes. They showcased how Knowledge Bases could be used for various applications, including context-based chatbots, personalized searches, and text summarization.
Finally, the presentation covered the integration of Knowledge Bases with other tools in the Bedrock ecosystem, such as Agents for Bedrock, for building more dynamic and responsive AI applications. This integration allows for real-time data fetching and interaction with databases and APIs, offering a more seamless experience. The speakers demonstrated using the Bedrock console and open-source frameworks like LangChain to build knowledge bases and retrieve information, showcasing the ease and flexibility of these tools. They concluded by inviting feedback and interaction on their LinkedIn profiles and encouraged attendees to take a survey for future improvements.
The AWS re:Invent 2023 session, led by Gal Oshri, a product manager at AWS, Emily Webber, and Thomas Kollar, focused on the advancements in machine learning model training and tuning using Amazon SageMaker. They discussed the challenges and solutions in training large-scale ML models, emphasizing the importance of hardware advancements, efficient data handling, and cost management. Notably, they demonstrated the ease of starting with SageMaker, highlighting features like the estimator API, remote python decorator, and the ability to easily manage and track training experiments.
Emily Webber presented on customizing large language models (LLMs), outlining various techniques from prompt engineering to pre-training. She explained the benefits of fine-tuning smaller models for specific domains, offering a balance between accuracy and resource efficiency. Emily showcased a demonstration of pre-training a 7 billion parameter Llama model on SageMaker, detailing the steps from data gathering and processing to model evaluation. The demo highlighted SageMaker's support for various instance types, distributed training libraries, and seamless integration with AWS services like Amazon FSx for Lustre.
Thomas Kollar from Toyota Research Institute shared how they leverage SageMaker for a range of applications, from small-scale experiments to large-scale training of state-of-the-art models. He discussed the significance of SageMaker's scalability, performance, and features like Warm Pools and cluster repair for efficient ML operations. Kollar provided insights into TRI's use of SageMaker for tasks like autonomous drift driving, robotics, and language model training, including their efforts towards developing a foundation model capable of performing diverse robotics tasks in response to language inputs.
At AWS re:Invent 2023, Harshal Pimpalkhute, a senior product manager at Amazon Bedrock, along with Mark Roy, a principal machine learning architect at AWS, and Shawn, CTO of Athene Holdings, presented on the capabilities of Amazon Bedrock and its Agents. They discussed how Amazon Bedrock is designed to simplify the building and scaling of generative AI applications, with a focus on model choices, customization, and integration. They highlighted Bedrock's recent updates, including new models, evaluation capabilities, customization options like fine-tuning and retrieval-augmented generation, and an API to simplify rank capabilities. The session mainly focused on Agents for Amazon Bedrock, which extends foundation models to perform tasks and automate processes. These Agents can interact with various APIs and data sources to complete user-requested tasks, providing solutions for automation challenges in generative AI application development.
Shawn from Athene Holdings shared a case study illustrating how Athene used Amazon Bedrock Agents to streamline and improve their data processing tasks, particularly in handling large data files and transforming data for their systems. The Agents automated the generation of data mapping documents, saving significant time and allowing analysts to focus on more complex tasks. This application of Agents significantly reduced the time required for data analysis and documentation creation, demonstrating the practical benefits of using Agents in a business environment. Athene's future plans include expanding the use of Agents to more data blocks and fine-tuning their prompts, with the goal of having Agents assist in code writing for data transformation.
Mark demonstrated several practical applications of Agents, showing how they can be deployed in real-world scenarios like customer relationship management, perception surveys, and tax policy inquiries. He showcased how Agents can intelligently orchestrate actions, use knowledge bases, and generate meaningful responses based on user queries. This included a demonstration of Agents creating other Agents, highlighting the advanced capabilities and potential for automation within the Amazon Bedrock framework. The presentation concluded with a Q&A session, where they further discussed the implications and future developments of Agents in various industries.
The AWS re:Invent 2023 session focused on building responsible AI applications using Guardrails for Amazon Bedrock. Harshal Pimpalkhute, a senior manager at Amazon Bedrock, provided an overview of the recent advancements in generative AI and their integration with Amazon Bedrock, including the launch of Guardrails. Guardrails is designed to ensure AI applications adhere to organizational policies and ethical standards, addressing challenges such as staying on topic, avoiding toxicity, and ensuring privacy and fairness. The session demonstrated how Guardrails can be configured with policies for denied topics, content filtering, and privacy protection, offering flexibility to tailor AI responses according to specific use cases and compliance requirements.
The session also detailed the process of creating and testing Guardrails. This involves configuring various policies like denied topics and content filters, and testing them with different scenarios to ensure they work as intended. The integration of Guardrails with CloudWatch logs enables monitoring and analyzing interactions, helping developers identify and respond to violations of usage policies. This feature is crucial for maintaining the integrity and compliance of AI applications in real-world deployments.
Lastly, the session introduced the concept of 'Agent' in Amazon Bedrock, which automates multi-step tasks. Agents are created with specific instructions and can be integrated with action groups and knowledge bases for enhanced functionality. Importantly, Guardrails can be applied to Agents, ensuring that the automated tasks adhere to the defined ethical and compliance standards. This integration of Guardrails with Agents represents a significant step in developing responsible and reliable AI applications, capable of handling complex tasks while adhering to organizational and ethical guidelines.
At AWS re:Invent 2023, Ian Gibbs, a product manager with Amazon SageMaker, and his colleagues introduced Amazon SageMaker HyperPod, a new product designed to address the increasing demands for computational power in large-scale foundation model training. They highlighted the challenges in starting foundation model development, which includes data collection, cluster creation, code development for distributed workloads, and the necessity of scale for efficient training. HyperPod is designed to simplify these processes and handle issues like hardware failures and model iteration.
The team demonstrated how SageMaker HyperPod provides a resilient training environment, supporting self-healing clusters that automatically recover from hardware failures, reloading from the last checkpoint, and resuming training without customer intervention. This feature can reduce training time by up to 20%. Additionally, access to SageMaker’s distributed training libraries optimized for AWS networking infrastructure can further reduce training time. HyperPod also offers a user-friendly experience, enabling customers to customize various layers of the tech stack during training, which is crucial for rapid iterations on model design.
Ian Gibbs and his team shared customer stories to illustrate HyperPod's impact. Stability AI benefited from HyperPod by saving 50% in training time and costs due to automatic instance replacement during hardware failures. Perplexity AI doubled their experiment throughput with HyperPod's distributed training libraries. Hugging Face, a partner in using HyperPod, could deeply customize their training environment, enhancing their ability to innovate quickly. The session concluded with a live demonstration of HyperPod’s auto-healing feature, showcasing its efficiency in dealing with hardware failures and maintaining the training process with minimal interruption.
The AWS re:Invent 2023 session, hosted by Shyam Srinivasan of AWS, introduced new capabilities in Amazon SageMaker Canvas, focusing on enhancements in low-code/no-code machine learning (ML) solutions and large language model (LLM) integration. Shyam demonstrated the new 'Chat for Data Prep' feature, enabling users to interact with and prepare data using natural language, simplifying data transformation and visualization without the need for extensive coding. He also highlighted the ability to fine-tune LLMs within SageMaker Canvas, allowing users to customize model outputs for specific industry needs while ensuring data security.
Purna Doddapaneni from Bain & Company shared practical applications of these innovations, illustrating how different ventures, including Chiefsight, Inside-Out, and Aura, leverage SageMaker Canvas for efficient data processing and ML model building. These examples demonstrated the application of low-code/no-code environments in various business scenarios, including sales prediction, cybersecurity threat analysis, and workforce analytics, showcasing the platform's versatility and user-friendliness.
The session emphasized the growing trend of combining low-code/no-code solutions with generative AI, highlighting its impact on business efficiency and the democratization of ML technology. It stressed the importance of an effective operating model, the need for awareness programs, and the value of collaboration between business and technical teams. The session concluded by encouraging attendees to explore and provide feedback on SageMaker Canvas, emphasizing its ease of use and broad applicability in diverse business contexts.
The AWS re:Invent 2023 session on Amazon SageMaker Clarify focused on evaluating foundation models, particularly large language models (LLMs), for accuracy and responsibility. The presentation highlighted the challenges in ensuring the reliability of LLMs, discussing infamous errors made by them, such as factual inaccuracies and stereotyping. Mike Diamond, a Principal Product Manager with SageMaker Clarify, introduced the new tool for foundation model evaluations, emphasizing its ability to evaluate LLMs for both quality and responsible AI aspects. The tool aims to facilitate the assessment of models for businesses, helping them make informed decisions while aligning with regulatory standards and managing risks like hallucinations and biases in AI responses.
Emily Webber, who leads the generative AI foundations technical field community at AWS, delved into the practical aspects of LLM evaluation, outlining the processes and user interfaces available in SageMaker Studio. She demonstrated how the tool can be used to select and customize the right LLMs for specific use cases, highlighting its ease of use and integration with various models and tasks. Webber showcased the tool's flexibility, offering both human-driven and automatic evaluation options, and stressed the importance of incorporating evaluation into the AI development and deployment lifecycle for effective governance and cost management.
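To make the automatic-evaluation idea concrete, here is a hedged, generic sketch (not SageMaker Clarify's actual implementation) that scores a model on exact-match accuracy plus a crude blocked-term check standing in for a real toxicity detector:

```python
def evaluate(model_fn, dataset, blocked_terms):
    """Automatic-style evaluation: exact-match accuracy plus a naive
    blocked-term rate; real toxicity detectors are far more sophisticated."""
    correct = flagged = 0
    for example in dataset:
        answer = model_fn(example["prompt"])
        correct += answer.strip().lower() == example["expected"].strip().lower()
        flagged += any(term in answer.lower() for term in blocked_terms)
    n = len(dataset)
    return {"accuracy": correct / n, "flagged_rate": flagged / n}
```

Swapping in different datasets (or demographic slices of one dataset) is how the per-group disparity checks described in the next paragraph would be scored under this kind of scheme.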
Taryn Heilman from Indeed's Responsible AI Team shared insights into how their organization approaches AI, emphasizing fairness and responsibility. She explained how Indeed uses LLMs for tasks like matching job seekers with jobs and generating content, underscoring the importance of ensuring factual accuracy, reducing biases, and avoiding toxic outputs. Heilman also discussed the use of SageMaker Clarify in evaluating these models, focusing on its capability to detect discrimination, toxicity, and performance disparities across different demographic groups, which is crucial for maintaining fairness and compliance in AI-driven processes.
The presentation at AWS re:Invent 2023 focused on evaluating and selecting the best foundation model for specific use cases using Amazon Bedrock. Jessie, a senior product manager at AWS, introduced the topic, highlighting the recent addition of a model evaluation feature in Amazon Bedrock. The session outlined the importance of foundation model evaluation, emphasizing the need for businesses to align models with their brand voice, ensure applications are data-appropriate, and maintain safety and trustworthiness. The challenges customers face in this process, such as selecting a suitable model, deciding on evaluation metrics, and managing data sets, were also discussed.
The presentation detailed two types of evaluations: automatic and human. The automatic evaluation allows users to select from predefined metrics like accuracy, robustness, and toxicity, and offers built-in algorithms for evaluation. Users can either use AWS-provided data sets or import their own. The human evaluation component enables a more nuanced assessment, with options for users to bring their own evaluation team or use an AWS-managed team. The process includes defining custom metrics and setting up an evaluation portal for the team. This segment was demonstrated by Kishore, who walked through the process of setting up both automatic and human evaluations, showcasing the platform's capabilities.
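The automatic-evaluation setup Kishore walked through can be approximated as a job spec. The field and metric names below are illustrative guesses at the shape of such a request, not an exact API reference; consult the Amazon Bedrock documentation for the real `CreateEvaluationJob` parameters:

```python
def build_auto_eval_job(job_name, model_id, metrics, dataset_s3_uri, output_s3_uri):
    """Assemble an automatic-evaluation job spec: one model, one dataset,
    a list of predefined metrics, and an S3 location for results."""
    return {
        "jobName": job_name,
        "evaluationConfig": {
            "automated": {
                "datasetMetricConfigs": [{
                    "taskType": "QuestionAndAnswer",
                    "dataset": {"name": "custom",
                                "datasetLocation": {"s3Uri": dataset_s3_uri}},
                    "metricNames": metrics,
                }]
            }
        },
        "inferenceConfig": {"models": [{"bedrockModel": {"modelIdentifier": model_id}}]},
        "outputDataConfig": {"s3Uri": output_s3_uri},
    }

job = build_auto_eval_job(
    "claude-qa-eval",
    "anthropic.claude-v2",
    ["Builtin.Accuracy", "Builtin.Robustness", "Builtin.Toxicity"],
    "s3://my-bucket/eval/dataset.jsonl",
    "s3://my-bucket/eval/results/",
)
```

A human evaluation would replace the `automated` branch with a work-team configuration and custom metric definitions, matching the two paths demonstrated in the session.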
In conclusion, the session wrapped up with a summary of the features and benefits of model evaluation on Amazon Bedrock. It emphasized the ease of setup, the variety of evaluation methods, and the integration of this feature within the Bedrock environment. Attendees were encouraged to start using these features for their projects and were directed to the Amazon Bedrock webpage for further information. The presentation ended with an invitation to fill out a survey and an offer to answer questions in a more informal setting.
The AWS re:Invent 2023 session titled "Prompt engineering best practices for LLMs on Amazon Bedrock (AIM377)" featured discussions on prompt engineering for large language models (LLMs), emphasizing the significance of clear and detailed instructions in eliciting accurate responses from LLMs. The session, led by AWS principal engineer John Baker and Nicholas Marwell from Anthropic, highlighted various techniques and best practices in prompt engineering. These techniques include the use of examples (few-shot learning), chain-of-thought prompting for complex problems, and role prompting to elicit specific tones or levels of complexity in responses. The presentation underscored the importance of detailed, human-understandable instructions to guide LLMs effectively.
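The three techniques can be combined in a simple prompt builder. The wording and helper below are illustrative only, not a template prescribed in the session:

```python
def build_prompt(role, examples, question, chain_of_thought=False):
    """Compose a prompt combining role prompting, few-shot examples,
    and an optional chain-of-thought cue."""
    parts = [f"You are {role}. Answer precisely."]      # role prompting
    for ex in examples:                                 # few-shot: show the format
        parts.append(f"Q: {ex['q']}\nA: {ex['a']}")
    if chain_of_thought:                                # nudge the model to reason
        parts.append("Think step by step before giving the final answer.")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)
```

Even one or two well-chosen examples often do more to constrain output format than paragraphs of abstract instruction, which is why few-shot prompting leads most best-practice lists.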
The session also delved into advanced prompting techniques like function calling or tool use, enabling LLMs to perform a range of functions by combining user inputs with descriptions of available tools. This approach allows for more dynamic and responsive interaction with LLMs, where the model can decide the relevance and application of various tools to answer a query. This technique was demonstrated as particularly effective in scenarios like Retrieval Augmented Generation (RAG), where the LLM can selectively retrieve and use external data to answer queries more accurately.
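A minimal sketch of that tool-use loop, with a toy tool registry and a stand-in `model_fn`; a real system would also send the tool descriptions to the model and loop until no further calls are requested:

```python
import json

TOOLS = {  # toy registry; descriptions of these would accompany the prompt
    "get_weather": lambda city: f"Sunny in {city}",
}

def run_agent(model_fn, user_query):
    """Single-round tool-use loop: the model either answers directly or
    returns a JSON tool call; the tool result is fed back for a final answer."""
    reply = model_fn(user_query, tool_result=None)
    try:
        call = json.loads(reply)             # the model requested a tool
    except json.JSONDecodeError:
        return reply                         # the model answered directly
    if not isinstance(call, dict) or "tool" not in call:
        return reply
    result = TOOLS[call["tool"]](call["input"])
    return model_fn(user_query, tool_result=result)
```

In a RAG setting, the registered tool would be a retriever, so the model only pulls in external documents when it decides the query needs them.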
The speakers emphasized the empirical nature of prompt engineering, advocating for a methodical approach involving writing test cases, creating detailed and comprehensive prompts, and iteratively refining these prompts based on test results. They stressed the importance of detailed tool descriptions (akin to developer documentation) for effective function calling, and the significant role of clear and structured prompts in reducing hallucinations and prompt injections, thereby enhancing the reliability and safety of LLM responses. The session concluded with a reminder about the wealth of resources and detailed documentation available for developers interested in prompt engineering for LLMs.
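That test-driven workflow can be as simple as a harness that checks each response for expected substrings. The helper below is a hypothetical sketch for illustrating the iteration loop, not part of any AWS SDK:

```python
def run_prompt_tests(model_fn, prompt_template, cases):
    """Regression-test a prompt: a case passes when every expected
    substring appears in the model's response. Returns failing case names."""
    failures = []
    for case in cases:
        response = model_fn(prompt_template.format(**case["inputs"]))
        if not all(s.lower() in response.lower() for s in case["expects"]):
            failures.append(case["name"])
    return failures
```

Each prompt revision is then judged by which cases it fixes or breaks, giving the iterative refinement the speakers described an objective footing.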
These are summaries of all the 300 and 400 level AI/ML sessions. We hope they help you both get an overview of the new AI/ML content and decide which sessions to watch.
Brian is an AWS Community Hero and Alexa Champion, runs the Boston AWS User Group, and has ten US patents and a bunch of certifications. He's also part of the New Voices mentorship program, where Heroes teach traditionally underrepresented engineers how to give presentations. He is a private pilot and a rescue scuba diver, and earned his Master's in Cognitive Psychology working with bottlenose dolphins.