Building a RAG AI with OpenSearch Serverless and LangChain
Learn how to build a RAG-based GenAI bot on AWS using OpenSearch Serverless, through our step-by-step example.
Join Caylent’s Connie Chen and Mark Olson as they discuss the possibilities of developing a conversational chatbot that leverages Generative AI. Learn the steps needed to develop it, how organizations can benefit from it, and how this innovative approach is set to transform customer interactions.
Mark Olson: So Connie, in addition to what we’re doing for customers, I know that as leader of the marketing department, you’ve got an interesting use case for AI, generative AI specifically, within Caylent. I'd love to hear a little bit more about what you’re trying to do with generative AI.
Connie Chen: Yeah, so recently, we just launched our new website, and we’ve always been talking about including a chatbot, but there are so many chatbots out there where you go to it and say “Hi, I’d like to speak to someone” and it just generates something that is completely canned. I want something that actually talks to you like a human, and I feel that with a chatbot leveraging generative AI, we can pull in data sources not only from our website, but from our social networks and from all of the videos that we’ve created, to actually be able to answer the question that the user has entered into the chat space.
MO: That’s really interesting, because I feel like when I see a chatbot on a website, I’m actively avoiding them because typically their keywords are hard to follow, it’s like dealing with phone trees. So you're thinking about a bot that's a little bit more conversational, that's a little bit more like dealing with one of your team members on the marketing team or branding and really understands Caylent and can talk to a customer.
CC: Exactly, we’ve all dealt with chatbots before, half the time when we get about 2 questions in, they’re like “Let me connect you to a representative” and then you’re sitting there waiting for however long. I just imagine with the inventions that are going on with generative AI, I believe that it's almost going to be like speaking to a human and they will be just as knowledgeable as an actual human behind the keyboard that's answering this chat.
MO: Right, so I have good news for you. It turns out we actually have an application pattern that can solve that. Before we jump into that though, just to set a base expectation, I know people are working with ChatGPT and are having fun with it, making poems, and doing all kinds of things like making it talk like a pirate and all these other use cases. But that model and others like it aren't necessarily trained on Caylent. There was no reason for it to be very specific to Caylent. However, one efficient approach that we're taking to customize those large language models without retraining them is a technique called Retrieval Augmented Generation. Now that’s a fancy term for, essentially, prompting the chatbot with more information that is very specific to Caylent. So if I wanted to ask about Caylent’s offerings, we could prompt that chatbot with information about our Catalysts, which help accelerate customer experience. And so, with that, we take the conversational abilities of a large language model, add the knowledge of Caylent, and have it summarize that knowledge in conversations with clients, hopefully giving them much more useful information that also ties back to primary sources. So if we're going to talk about Caylent’s Enhanced Control Tower Catalyst, we can link back to the page and do interesting things like that. I think the timing of the chatbot that you're dreaming of is coming along, it's just a matter of getting a little time from our developers to put something together for you.
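Editor's sketch: the "prompting the chatbot with more information" step that Mark describes could look roughly like the Python below. This is a minimal illustration, not Caylent's implementation. The function name `build_rag_prompt`, the passage fields `text` and `url`, the instruction wording, and the example.com address are all assumptions made for the example, and the call to an actual language model is left out.

```python
def build_rag_prompt(question, passages):
    """Assemble a prompt that grounds the model in retrieved passages.

    `passages` is a list of dicts with hypothetical keys "text" and "url".
    """
    context_lines = []
    for number, passage in enumerate(passages, start=1):
        # Number each passage so the model can cite it and the UI can link back.
        context_lines.append(f"[{number}] {passage['text']} (source: {passage['url']})")
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Placeholder passage; a real system would fetch this from a search index.
example_passages = [
    {"text": "Placeholder description of an offering.", "url": "https://example.com/offering"},
]
example_prompt = build_rag_prompt("What does the offering include?", example_passages)
```

The resulting string would then be sent to whichever large language model the team chooses. Keeping the source URL next to each passage is what makes it possible to link answers back to primary pages.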
CC: Now if we're actually thinking about architecting this, what kind of work needs to go into it? What type of talent do we need to create this? Generally, how long would it take? Is it something we can buy out of the box to do?
MO: Today, there are a few frameworks that are closer to out of the box, but there's a little bit of work for technical staff in terms of making sure that we collect all that data and we're able to index it. In generative AI terms, that means creating embeddings, which are essentially numerical representations of the data in these documents, storing those in a vector database, and using that to search. So we're getting into some technical nitty gritty, but that's the kind of thing our Cloud Native Applications and Data Engineering team would do, it's a blend of their skill sets to prepare that. There's also a prompt engineering aspect to tune the chatbot to the type of brand voice that you would like it to have and the kinds of interactions that you would like it to have with customers. So there's a couple of different roles that play a part in that.
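Editor's sketch: to make "embeddings stored in a vector database and used to search" concrete, here is a toy, dependency-free Python version. It is only an illustration under loud assumptions: the `embed` function uses plain word counts as a stand-in for a learned embedding model, and the in-memory `VectorIndex` class stands in for a real vector database. All names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.entries = []  # list of (doc_id, vector)

    def add(self, doc_id, text):
        self.entries.append((doc_id, embed(text)))

    def search(self, query, k=2):
        """Return up to k doc ids, most similar first, skipping zero-similarity docs."""
        query_vector = embed(query)
        scored = [(cosine(query_vector, vector), doc_id) for doc_id, vector in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc_id for score, doc_id in scored[:k] if score > 0]


index = VectorIndex()
index.add("podcast", "Our podcast covers cloud migration stories")
index.add("chat", "Chatbots answer customer questions on the website")
matches = index.search("website questions")
```

Word counts only match exact words, so this toy misses synonyms and paraphrases. Learned embeddings are used in practice precisely because they place related phrasings close together.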
CC: Got it. And I do remember in traditional ML when you have to introduce a new data source, you might need to retrain the model, is that an issue?
MO: That's actually a really insightful question. The nice thing about this approach is that as new data comes in, we are essentially updating not the model itself but the contextual information that we're feeding to the model to summarize for our users' conversations. So it's nice to not have to retrain, especially in the case of a large language model, where training could cost hundreds of thousands to millions of dollars. We just don't have the budget for that. But indexing one new document as it's added to the website is extremely cost efficient, so it's a really nice way to do that.
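Editor's sketch: the point that new data updates the index rather than the model can be illustrated with a small incremental indexer. This is one possible approach, not something described in the conversation: it assumes pages are identified by URL and uses a content hash to skip pages that have not changed. The class name `IncrementalIndex` and the example.com address are hypothetical.

```python
import hashlib


class IncrementalIndex:
    """Tracks which pages are indexed so only new or changed pages are processed."""

    def __init__(self):
        self.hashes = {}  # url -> SHA-256 digest of the last indexed content
        self.docs = {}    # url -> last indexed text

    def upsert(self, url, text):
        """Index the page if it is new or changed. Returns True when work was done."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self.hashes.get(url) == digest:
            return False  # unchanged, nothing to re-index
        self.hashes[url] = digest
        # A real pipeline would recompute embeddings for this one page here.
        self.docs[url] = text
        return True


site_index = IncrementalIndex()
first = site_index.upsert("https://example.com/blog/post", "Original text")
second = site_index.upsert("https://example.com/blog/post", "Original text")
third = site_index.upsert("https://example.com/blog/post", "Edited text")
```

The model itself is never touched in this flow. The only per-document cost is re-processing the one page that changed.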
CC: And outside of documents, because I want to be able to input different kinds of data sources, are there any limitations to the type of data sources that we use for this model?
MO: For this use case, it would be textual sources, and we would want some structure around them so that the model is able to summarize and understand them in terms of the English language. But that's basically what you're talking about when you talk about things like content from Slack or other sources that might show up in marketing. Those textual use cases are really easily handled by this kind of model.
CC: Now say I wanted to add in video and audio, because we have a bunch of podcasts as well as, for instance, the video we're filming now, some other videos that we've done, webinars, and recorded speaking engagements. There's a lot of good information there and I feel like that could definitely add to the knowledge base of the chatbot.
MO: Yeah, and in the short term I think what you'll see is that transcripts of those kinds of things, and metadata about the video, rather than a native understanding of the video itself, are going to be your fastest path to production. But what we do see in the generative AI space is a convergence on multimodal models, which can understand text and video and images as well. So I don't think we're there today for what you want to do, but if we use the tools that we have to generate a transcript of your video, we can determine that it could be relevant to the user's query and point them in the right direction. So I think we can make it look good from the user experience perspective. The back end right now isn't just one answer, it's going to be a combination of pieces that our development and data teams can put together to accomplish the solutions that you're looking for.
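Editor's sketch: one way to turn a video transcript into searchable text that still points back to the video is to split it into chunks that carry the video address and a start time. The Python below is an assumption-laden illustration: it presumes a transcription tool has already produced `(start_seconds, text)` segments, and the function name, field names, and word limit are hypothetical.

```python
def chunk_transcript(segments, video_url, max_words=50):
    """Group (start_seconds, text) segments into chunks of at most max_words words.

    A single segment longer than max_words becomes its own chunk.
    """
    chunks = []
    current_words = []
    current_start = None

    def flush():
        chunks.append({
            "video_url": video_url,
            "start_seconds": current_start,
            "text": " ".join(current_words),
        })

    for start, text in segments:
        words = text.split()
        # Close the current chunk if adding this segment would exceed the limit.
        if current_words and len(current_words) + len(words) > max_words:
            flush()
            current_words = []
            current_start = None
        if current_start is None:
            current_start = start  # remember where this chunk begins in the video
        current_words.extend(words)

    if current_words:
        flush()
    return chunks


sample_segments = [(0, "a b c"), (5, "d e f"), (10, "g h")]
sample_chunks = chunk_transcript(sample_segments, "https://example.com/video", max_words=5)
```

Each chunk can then be indexed like any other text document, and a matching chunk lets the chatbot point the user to the right moment in the video.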
CC: Now say that the answer that the chatbot spit out was completely not useful to the user. Are we able to see that, react to that, or change it? How are we able to see that something’s wrong?
MO: Yeah, that's a good question. So there's a couple of ways to do that. One of the obvious ways, which you may have seen in your own experience, is the idea of “Was this answer useful?” Thumbs up or thumbs down. That basically gives you human feedback to understand whether the chatbot is on the right track or not. If it's not useful, we can redirect that user either to a live person or we can restart the conversation and try again with the chatbot. But also, the key would be, behind the scenes, logging that information and understanding that Connie had a bad experience with the chatbot and she gave it a thumbs down. What was the history of that interaction, and can we tune the prompts that the chatbot got to lead it to a better answer in the future if it gets those same kinds of questions?
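Editor's sketch: the thumbs up or thumbs down logging that Mark describes could be captured along these lines. This is a minimal illustration that stores entries as JSON lines in a Python list. A real deployment would write to a durable store, and every name here (`record_feedback`, `unhelpful_conversations`, the entry fields) is hypothetical.

```python
import json
import time


def record_feedback(log, conversation_id, messages, helpful):
    """Append one feedback entry, including the conversation history, to the log."""
    entry = {
        "conversation_id": conversation_id,
        "helpful": bool(helpful),
        "messages": list(messages),  # full history, so the interaction can be reviewed
        "timestamp": time.time(),
    }
    log.append(json.dumps(entry))
    return entry


def unhelpful_conversations(log):
    """Return the logged entries that received a thumbs down."""
    entries = [json.loads(line) for line in log]
    return [entry for entry in entries if not entry["helpful"]]


feedback_log = []
record_feedback(feedback_log, "conv-1", ["What do you offer?", "Here is an overview."], helpful=True)
record_feedback(feedback_log, "conv-2", ["Where is pricing?", "I am not sure."], helpful=False)
needs_review = unhelpful_conversations(feedback_log)
```

Reviewing the thumbs-down entries together with their message history is what lets the team see which prompts or missing documents led to a poor answer.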
CC: Okay, I mean that's probably one of my bigger fears: the chatbot spitting out something that's completely not useful, and the person getting very frustrated. But if we're able to have some guardrails in place, or something that lets us know that a customer had a bad experience with the chatbot, and we're able to adjust the outputs from the prompts, then that makes it a lot easier.
MO: Yeah, for sure. Your whole point is that you're trying to improve the experience of using our website, and you want to come away with happy users. If we introduce automation or a chatbot that purports to make them happier, we want to measure that it does and make sure that we're tuning with that in mind. So absolutely, it’s best practice to consider that.
CC: Is there anything that I should be concerned about with generative AI, speaking back to our customers?
MO: Well, I think the biggest thing is the one that you brought up: making sure that the information is useful and that we're getting truthful information back. So I think a little bit of that becomes the development and testing cycle, making sure that our users are throwing curveballs at the bot and understanding how it reacts in those situations. So again, the best practices that we would follow when rolling out generative AI for customers, we would follow internally for Caylent use cases.
CC: Well, that sounds like a lot of fun, having the entire company test out our chatbot!
MO: Yeah, let’s get started!
Are you exploring ways to take advantage of Analytical or Generative AI in your organization? Partnered with AWS, Caylent's data engineers have been implementing AI solutions extensively and are also helping businesses develop AI strategies that will generate real ROI. For some examples, take a look at our Generative AI offerings.
As Caylent's VP of Customer Solutions, Mark leads a team that's entrusted with envisioning and proposing solutions to an infinite variety of client needs. He's passionate about helping clients transform and leverage AWS services to accelerate their objectives. He applies curiosity and a systems thinking mindset to find the optimal balance among technical and business requirements and constraints. His 20+ years of experience spans team leadership, technical sales, consulting, product development, cloud adoption, cloud native development, and enterprise-wide as well as line of business solution architecture and software development from Fortune 500s to startups. He recharges outdoors - you might find him and his wife climbing a rock, backpacking, hiking, or riding a bike up a road or down a mountain.