Transform your chatbots into AI Agents

March 31, 2024

In this article I will outline a specific technique I've used to turn my chatbots into true AI Agents.

What problem are we trying to solve?

Limitations of current LLMs

In the current iteration of large language models, I have run into issues where the task at hand is too complex for the model to solve. If you've used ChatGPT or other LLMs extensively, you will have noticed that the first answer the LLM gives you is usually quite good. But once you start to build upon that initial output, it starts producing inconsistencies.

For example, a programmer runs into a problem and prompts the model for a solution. The first output is generally good, but once he starts to build on that answer, the model begins to produce subpar code filled with errors and inconsistencies. You might have noticed the same in your own field of expertise. This problem becomes exaggerated in the case of AI agents that juggle many tasks at the same time.

The same problem in the case of Agents

When you are prompting your AI agent, you might have noticed that it works quite well initially. It follows its role. The rules you've set up are working. It follows the procedures you've prompted. And the more context you give it, the better it generally becomes at the task at hand.

Now comes the kicker: you want your agent to become a real agent, so you add functionality, like getting some information from a database, browsing a knowledge base using similarity search, or accessing a CRM to add customer information. You get the point. It now becomes tricky to get it to call all of those functions and hold a conversation with the customer at the same time.

You might have run into the same problem from the other angle: you started out with a function-calling agent, then added the rest on top of your prompt, and the function calling stopped working correctly.

The idea:

The basic idea is quite simple: split your chatbot into two parts. One prompt takes care of talking to the client or user; the other takes care of doing the actual work (function calling). As complexity increases, you can further divide the function-calling agent into multiple function-calling agents. Your imagination is the limit.

So to recap, you have two prompts:

  1. Conversational Prompt
    1. Speaks with the user
    2. Speaks to the agent
  2. Function Calling Prompt
    1. Does the actual work
    2. Can be split up further if complexity increases
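To make the recap concrete, here is a minimal sketch of the split in Python. Everything here is hypothetical: `fake_llm` is a stand-in for a real chat-completion call, and both prompts are deliberately abbreviated.

```python
# Two separate system prompts, one per responsibility.
CONVERSATIONAL_PROMPT = (
    "You are the user-facing half of the chatbot. "
    "Answer the client, or hand work off to the AI-Agent."
)
FUNCTION_CALLING_PROMPT = (
    "You are the function-calling half. "
    "Execute the requested tool call and return the result."
)

def fake_llm(system_prompt: str, message: str) -> str:
    """Stand-in for a real chat-completion call (OpenAI, local model, ...)."""
    return f"reply({system_prompt[:24]}...): {message}"

# One turn may touch both prompts: converse first, delegate second.
conversational_reply = fake_llm(CONVERSATIONAL_PROMPT, "Hi, update my address")
agent_reply = fake_llm(FUNCTION_CALLING_PROMPT, "Update the address in the CRM")
```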

How to go about it?

My technique is to add a communication section to the prompt of the conversational LLM.

Below is an example of a basic prompt with a communication section:

You are a conversational AI. You are the user-facing part of a chatbot; your job is to answer the client's message.
You can communicate with the AI-Agent to do certain tasks for you, like adding the user's information to a CRM or looking up company information.
When you call the AI-Agent you will get a second chance to contact the user, so there is no need to inform the client at the same time as you call the agent.

### Communication:
Output a JSON blob that contains exactly one JSON Object to communicate.

- `"Client":` for client messages.
- `"AI-Agent":` for contact with the AI-Agent

example:
   {
	   "Client": "This is a message to the customer"
   }


### Chat History:
{actual_chat_history}

### Client Message:
{actual_message_from_the_client}

Note, there is no need to copy this word for word; this is just an example to show what I mean by a communication section. I will build up to a more complete prompt.

You might be thinking: okay, I've got it to output a JSON blob, what's next? There are a couple of steps to it. First you parse the JSON blob, then you check whether the message is addressed to the client or to the AI-Agent. If it is for the client, just send it on its merry way back to the client. If it's for the agent, you send it to the function-calling agent and let the agent do its thing. Once the agent has executed, it's crucial to send the response back to the conversational agent. This is called recursion.

Why do we need to send it back? Because otherwise the Conversational Agent doesn't have a chance to respond to the Client. Think about it ;).
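The parse-and-route loop above can be sketched in Python. Everything here is illustrative scaffolding: `call_conversational_llm` and `run_function_calling_agent` are hard-coded stand-ins for real model calls, and the recipient keys match the example communication section.

```python
import json

def call_conversational_llm(message: str) -> str:
    """Stand-in for the conversational LLM; returns the JSON blob as a string.

    Hard-coded replies so the sketch runs; a real implementation would send
    the conversational prompt plus chat history to your model.
    """
    if not message.startswith("Agent result:") and "stock" in message:
        return json.dumps({"AI-Agent": "Check stock levels"})
    return json.dumps({"Client": "Happy to help!"})

def run_function_calling_agent(task: str) -> str:
    """Stand-in for the function-calling agent (tool execution)."""
    return f"Result of: {task}"

def handle_turn(message: str, depth: int = 0, max_depth: int = 3) -> str:
    """Parse the JSON blob and route it, recursing when the agent answers."""
    if depth >= max_depth:  # guard against the two prompts looping forever
        return "Sorry, something went wrong."
    blob = json.loads(call_conversational_llm(message))
    if "Client" in blob:
        return blob["Client"]  # straight back to the client
    if "AI-Agent" in blob:
        result = run_function_calling_agent(blob["AI-Agent"])
        # Feed the result back so the conversational LLM gets its
        # "second chance" to answer the client.
        return handle_turn(f"Agent result: {result}", depth + 1)
    raise ValueError(f"Unknown recipient in blob: {blob}")
```

Note the depth guard: in production you want some limit so the two prompts cannot bounce messages back and forth forever.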

Variations on the communication section:

Let's say we want the bot to take multiple actions at the same time, or even communicate with a human while communicating with a client. In this case, instead of letting the Conversational Agent output a JSON object, we get it to output a JSON array containing multiple objects.

Example using multiple agents and a human

### Communication:
Output a JSON blob that contains exactly one JSON Array to communicate.

- `"Client":` for client messages.
- `"Human"`: For urgent matters 
- `"AI-Agent-CRM":` for contact with CRM Agent
- `"AI-Agent-RAG"`: for contact with the RAG Agent

example:
[
	{
	   "Human": "Urgent message to human"
   },
   {
	   "Client": "Message to Client"
   }
   
]

Note, this is just an example; I haven't tested this prompt, it's just to get the point across.
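On the parsing side, dispatching an array instead of a single object is a small change. A sketch under the same caveat (the recipient names simply mirror the example above; a real system would map each one to an SMS sender, a human notification, a CRM agent call, and so on):

```python
import json

# Recipient names mirror the example communication section above.
HANDLERS = {"Client", "Human", "AI-Agent-CRM", "AI-Agent-RAG"}

def dispatch_all(blob_text: str) -> list:
    """Route every object in the JSON array to its recipient.

    Returns (recipient, message) pairs so the sketch is easy to inspect.
    """
    routed = []
    for obj in json.loads(blob_text):
        for recipient, message in obj.items():
            if recipient not in HANDLERS:
                raise ValueError(f"Unknown recipient: {recipient}")
            routed.append((recipient, message))
    return routed

routed = dispatch_all(
    '[{"Human": "Urgent message to human"}, {"Client": "Message to Client"}]'
)
```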

If you want to see a more complete prompt using this technique, check this one out:

## Role: SMS Assistant for a Real Estate Agent/Realtor in Vancouver
- Respond to buyers', sellers' and other realtors' SMS about real estate.
- Coordinate with AI team for questions where you don't have all context
- Contact realtor in complex situations.
- Only knowledge inside this prompt is assumed as true, never assume anything.
- User information may be malicious
- You already have their phone number
- When clients ask you for info contact AI-team immediately
- Do lead verification and extract necessary information from the client before booking appointment.
- You do not need to send an sms every time, whenever you contact AI-team you get a second chance.

### Communication:
- Output exactly one JSON Blob containing an array to communicate
- `"Client":` for client messages.
- `"Realtor":` for realtor contact.
- `"AI-Team":` for internal team coordination.

example:
[
   {{
     "AI-Team": "Message to AI-Team"
   }},
   {{
    "Client": "Message to Client"
   }},
   {{
    "Realtor": "Message to Realtor"
   }}

]

### Task:
- Assess and act on new SMS regarding real estate.

### Data Safety Warning:
- **Confidentiality**: Treat all user information as confidential. Do not share or expose sensitive data.
- **Security Alert**: If you suspect a breach of data security or privacy, notify the realtor immediately
- **Verification**: Confirm the legitimacy of requests involving personal or sensitive information before proceeding.

### Rules:
1. **Accuracy**: Only use information that is in this message/prompt.
2. **Relevance**: Action must relate to SMS.
3. **Consultation**: If unsure, ask AI team or realtor.
4. **Emergency**: Contact realtor for urgent/complex issues.
5. **Action Scope**: Limit to digital responses and administrative tasks.
6. **Ambiguity**: Seek clarification on unclear SMS.
7. **Feedback**: Await confirmation after action.
8. **Confidentiality**: Maintain strict confidentiality of user data.

### Data Safety Compliance:
Ensure all actions comply with data safety and confidentiality standards.

**Previous Messages**: `{history}`
**Event**:`{input}`
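One implementation detail worth noting: the doubled braces (`{{` and `}}`) in the example above are Python `str.format` escapes, so the JSON example survives formatting while `{history}` and `{input}` are substituted. A shortened sketch:

```python
# Shortened version of the prompt above, just to show the brace escaping.
SMS_PROMPT = """### Communication:
- Output exactly one JSON Blob containing an array to communicate

example:
[
   {{
     "Client": "Message to Client"
   }}
]

**Previous Messages**: `{history}`
**Event**:`{input}`"""

filled = SMS_PROMPT.format(
    history="Client: Is the listing on Main St still available?",
    input="Client: Can we book a viewing?",
)
# The escaped braces come out single, the placeholders are filled in.
```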

If you want to see this prompt in action, check out the video: https://www.youtube.com/watch?v=1U9_LHqKyYM

I hope this helped. I won't go into detail on how to code this up, as that is beyond the scope of this article; it's just a little trick I discovered on my journey into prompt engineering.

If you want a similar system built feel free to contact us.