Prompt Engineering

August 24, 2023

Generative AI & LLMOps

Learn from Caylent’s Randall Hunt and Mark Olson as they discuss prompt engineering and how to use it effectively to improve response relevance and accuracy.

With the rise of generative AI models, comes a need to structure text that can be interpreted and understood by the model. That’s where prompt engineering comes in. Prompt engineering entails the act of optimizing the input given to large language models (LLMs) to produce the most effective responses.

Tips for paths to success

Here is a list of tips we recommend you use when inputting a prompt:

Use formatting
Adopt a persona
Ask it to think step-by-step
Ask it to do one thing at a time
Don't ask it to combine disparate responses into one common response
Instruct the model to ask for clarification

These models are attempting to predict the next token, so it is important to give them context in the form of tokens to work with. If they are given a short prompt, that gives the models too few tokens upon which it is able to create new information from the context. Longer prompts are ideal, within the model's context window, as you can usually generate much better results. It's a balancing act between prompt length and prompt performance (maximizing per-token utility).

If you’re using a third party model, you can also implement some guardrails to restrict the paths that the model will take and the suggestions that it will provide. With some APIs and models, you can differentiate between user prompt vs. system prompt, and you can have system prompts that influence model behavior outside of the normal Human/Bot loop.

What if I get back something I didn’t want?

You can run few shot or zero shot learning scenarios. This involves giving the model a series of example inputs and outputs that you want the model to emulate. From those tokens, it will learn how it’s supposed to predict the outcome of subsequent features. For instance, you could give the model a list of items and instruct it to return the list back in the specific format you are looking for.

When prompting the models, ensure that you are using common formatting tools such as colons, quotes, and even markdown. A common scenario that we have encountered involves translating natural language to SQL or other query languages like, an AWS OpenSearch query. We give it an example and put the query with the templated names in with three backticks (```) to help the LLM identify it as code and not text. It is important to ensure that things are formatted appropriately.

Another common technique is to ask the models to think step-by-step. LLMs need tokens about which to reason with, so it’s important to give it more tokens to generate, within a context window. You give the model specific instructions to think step-by-step about the prompts you are providing and ask it to seek approval before pursuing a task, and if it is not absolutely confident, it should prompt you for more information.

Occasionally, conversations will also get stuck down the wrong path, so how do you correct it? Different models will behave differently. If a model has gone down a wrong path, you can ask it occasionally to correct itself and it will go and treat the full context of the situation of the prompt as if it has made a mistake along the way and it will attempt to predict what the next version is going to be. However, a lot of models won’t be able to do that as once they've gone down the wrong path, they're stuck in a local minima, so it's sometimes easier to start over with a new prompt.

Adopting a persona

Another key recommendation around prompt engineering involves asking the LLM to adopt a persona when given a prompt, however you have to be careful asking for persona’s that have limited or no context. When you tell the model to adopt a persona around which it has very little data in the model, it will give an answer worse than if you were to ask it to reply in a manner similar to an average human. This is because the model relies on the token generation within its existing model space and it has a data set that is extremely small for something like “world’s foremost expert on beanie babies”, and there have likely been reward modeling or other techniques utilized to deselect those paths as they often give incorrect results for the more common array of questions. To get better results, it is recommended to ask the models to behave as an expert or as an rather than as the best expert in the world.

Next Steps

Prompt engineering & Generative AI are still relatively new and there is still plenty of room to dig in and learn.

Are you exploring ways to take advantage of Analytical or Generative AI in your organization? Partnered with AWS, Caylent's data engineers have been implementing AI solutions extensively and are also helping businesses develop AI strategies that will generate real ROI. For some examples, take a look at our Generative AI offerings.

Accelerate your GenAI initiatives

Leveraging our accelerators and technical experience

Browse GenAI Offerings

Generative AI & LLMOps

Randall Hunt

Randall Hunt, Chief Technology Officer at Caylent, is a technology leader, investor, and hands-on-keyboard coder based in Los Angeles, CA. Previously, Randall led software and developer relations teams at Facebook, SpaceX, AWS, MongoDB, and NASA. Randall spends most of his time listening to customers, building demos, writing blog posts, and mentoring junior engineers. Python and C++ are his favorite programming languages, but he begrudgingly admits that Javascript rules the world. Outside of work, Randall loves to read science fiction, advise startups, travel, and ski.

View Randall's articles

Mark Olson

Mark Olson, Caylent's Portfolio CTO, is passionate about helping clients transform and leverage AWS services to accelerate their objectives. He applies curiosity and a systems thinking mindset to find the optimal balance among technical and business requirements and constraints. His 20+ years of experience spans team leadership, technical sales, consulting, product development, cloud adoption, cloud native development, and enterprise-wide as well as line of business solution architecture and software development from Fortune 500s to startups. He recharges outdoors - you might find him and his wife climbing a rock, backpacking, hiking, or riding a bike up a road or down a mountain.

View Mark's articles

Integrating MLOps and DevOps on AWS

From notebooks to frictionless production: learn how to make your ML models update themselves every week (or earlier). Complete an MLOps + DevOps integration on AWS with practical architecture, detailed steps, and a real case in which a Startup transformed its entire process.

Analytical AI & MLOps

Infrastructure & DevOps Modernization

Generative AI & LLMOps

October 30, 2025

Jumpstart Your AWS Cloud Migration

Learn how small and medium businesses seeking faster, more predictable paths to AWS adoption can leverage Caylent's SMB Migration Quick Start to overcome resource constraints, reduce risk, and achieve cloud readiness in as little as seven weeks.

Migrations

Generative AI & LLMOps

October 17, 2025

Evolving MultiAgentic Systems

Explore how organizations can evolve their agentic AI architectures from complex multi-agent systems to streamlined, production-ready designs that deliver greater performance, reliability, and efficiency at scale.

Generative AI & LLMOps

View all blog posts

Tips for paths to success

What if I get back something I didn’t want?

Adopting a persona

Next Steps

Accelerate your GenAI initiatives

Randall Hunt

Mark Olson

Related Blog Posts

Integrating MLOps and DevOps on AWS

Jumpstart Your AWS Cloud Migration

Evolving MultiAgentic Systems