Evolving MultiAgentic Systems
Explore how organizations can evolve their agentic AI architectures from complex multi-agent systems to streamlined, production-ready designs that deliver greater performance, reliability, and efficiency at scale.
Learn from Caylent’s Randall Hunt and Mark Olson as they discuss prompt engineering and how to use it effectively to improve response relevance and accuracy.
With the rise of generative AI models, comes a need to structure text that can be interpreted and understood by the model. That’s where prompt engineering comes in. Prompt engineering entails the act of optimizing the input given to large language models (LLMs) to produce the most effective responses.
Here is a list of tips we recommend you use when inputting a prompt:
These models are attempting to predict the next token, so it is important to give them context in the form of tokens to work with. If they are given a short prompt, that gives the models too few tokens upon which it is able to create new information from the context. Longer prompts are ideal, within the model's context window, as you can usually generate much better results. It's a balancing act between prompt length and prompt performance (maximizing per-token utility).
If you’re using a third party model, you can also implement some guardrails to restrict the paths that the model will take and the suggestions that it will provide. With some APIs and models, you can differentiate between user prompt vs. system prompt, and you can have system prompts that influence model behavior outside of the normal Human/Bot loop.
You can run few shot or zero shot learning scenarios. This involves giving the model a series of example inputs and outputs that you want the model to emulate. From those tokens, it will learn how it’s supposed to predict the outcome of subsequent features. For instance, you could give the model a list of items and instruct it to return the list back in the specific format you are looking for.
When prompting the models, ensure that you are using common formatting tools such as colons, quotes, and even markdown. A common scenario that we have encountered involves translating natural language to SQL or other query languages like, an AWS OpenSearch query. We give it an example and put the query with the templated names in with three backticks (```) to help the LLM identify it as code and not text. It is important to ensure that things are formatted appropriately.
Another common technique is to ask the models to think step-by-step. LLMs need tokens about which to reason with, so it’s important to give it more tokens to generate, within a context window. You give the model specific instructions to think step-by-step about the prompts you are providing and ask it to seek approval before pursuing a task, and if it is not absolutely confident, it should prompt you for more information.
Occasionally, conversations will also get stuck down the wrong path, so how do you correct it? Different models will behave differently. If a model has gone down a wrong path, you can ask it occasionally to correct itself and it will go and treat the full context of the situation of the prompt as if it has made a mistake along the way and it will attempt to predict what the next version is going to be. However, a lot of models won’t be able to do that as once they've gone down the wrong path, they're stuck in a local minima, so it's sometimes easier to start over with a new prompt.
Another key recommendation around prompt engineering involves asking the LLM to adopt a persona when given a prompt, however you have to be careful asking for persona’s that have limited or no context. When you tell the model to adopt a persona around which it has very little data in the model, it will give an answer worse than if you were to ask it to reply in a manner similar to an average human. This is because the model relies on the token generation within its existing model space and it has a data set that is extremely small for something like “world’s foremost expert on beanie babies”, and there have likely been reward modeling or other techniques utilized to deselect those paths as they often give incorrect results for the more common array of questions. To get better results, it is recommended to ask the models to behave as an expert or as an rather than as the best expert in the world.
Prompt engineering & Generative AI are still relatively new and there is still plenty of room to dig in and learn.
Are you exploring ways to take advantage of Analytical or Generative AI in your organization? Partnered with AWS, Caylent's data engineers have been implementing AI solutions extensively and are also helping businesses develop AI strategies that will generate real ROI. For some examples, take a look at our Generative AI offerings.
Leveraging our accelerators and technical experience
Browse GenAI OfferingsRandall Hunt, Chief Technology Officer at Caylent, is a technology leader, investor, and hands-on-keyboard coder based in Los Angeles, CA. Previously, Randall led software and developer relations teams at Facebook, SpaceX, AWS, MongoDB, and NASA. Randall spends most of his time listening to customers, building demos, writing blog posts, and mentoring junior engineers. Python and C++ are his favorite programming languages, but he begrudgingly admits that Javascript rules the world. Outside of work, Randall loves to read science fiction, advise startups, travel, and ski.
View Randall's articlesMark Olson, Caylent's Portfolio CTO, is passionate about helping clients transform and leverage AWS services to accelerate their objectives. He applies curiosity and a systems thinking mindset to find the optimal balance among technical and business requirements and constraints. His 20+ years of experience spans team leadership, technical sales, consulting, product development, cloud adoption, cloud native development, and enterprise-wide as well as line of business solution architecture and software development from Fortune 500s to startups. He recharges outdoors - you might find him and his wife climbing a rock, backpacking, hiking, or riding a bike up a road or down a mountain.
View Mark's articlesExplore how organizations can evolve their agentic AI architectures from complex multi-agent systems to streamlined, production-ready designs that deliver greater performance, reliability, and efficiency at scale.
Explore the newly launched Claude Haiku 4.5, Anthropic's first Haiku model to include extended thinking, computer use, and context awareness capabilities.
Explore Anthropic’s newly released Claude Sonnet 4.5, including its record-breaking benchmark performance, enhanced safety and alignment features, and significantly improved cost-efficiency.