
The YouTube video for this section is still under creation. Please be patient ^^
As shown in the previous examples, each agent maintains its own message history, which forms its memory. When a new request is sent to the LLM (e.g., via Ollama), the entire history is forwarded to the inference server. The LLM responds to the latest prompt, using the context provided by previous messages and, if available, the initial system prompt.
There are 4 types of messages. The "System" prompt, if present, always goes first; it is then followed by the alternating user prompts and assistant responses, as you will see in the pretty-printed histories below.
Keep in mind that the longer the history grows and the more unrelated messages it accumulates, the poorer the LLM's answers tend to become. To counteract those negative effects it is recommended that you clean the history whenever possible. For instance, you could use the forget=True parameter of the Task() class so that neither the prompt nor the LLM's response gets saved to the history. You'll see there are many ways to protect the history from useless noise.
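As a quick preview, here is a minimal sketch of forget=True in action (assuming the same OllamaAgent setup used throughout this section):
from yacana import OllamaAgent, Task

agent = OllamaAgent("AI assistant", "llama3.1:8b")
# Neither this prompt nor the LLM's answer will be stored in the agent's history
Task("Say hello in Spanish.", agent, forget=True).solve()
# The history stays clean of that one-off exchange
agent.history.pretty_print()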
The Agent class comes with a .history property of type History. It exposes methods to manipulate and view the history.
For instance, you can use the .pretty_print() method to print the history to the standard output using the classic color scheme:
agent1.history.pretty_print()
Or you can use .get_messages_as_dict(), which returns the messages as a list of dictionaries and is great for parsing!
messages = agent1.history.get_messages_as_dict()
messages[0]["content"]  # First message
messages[1]["content"]  # Second message
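Since .get_messages_as_dict() returns plain dictionaries with 'role' and 'content' keys (as visible in the dump further below), you can parse them with regular Python. For instance, a small sketch that keeps only the assistant's answers:
# Filter the history to print only the LLM's responses
for message in agent1.history.get_messages_as_dict():
    if message["role"] == "assistant":
        print(message["content"])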
Let's discover all the different ways to interact with an agent's history!
As shown above, the History class exposes the .pretty_print() method, which prints the history in an easy-to-read, color-coded form, and .get_messages_as_dict(), which is great for parsing.
Look at this simple example:
from yacana import LoggerManager, OllamaAgent, Task
# Let's deactivate automatic logging so that only OUR prints are shown
LoggerManager.set_log_level(None)
agent1 = OllamaAgent("Cook", "llama3.1:8b", system_prompt="You are a pastry chef")
Task("Generate 5 pastry names followed by the associated estimated calorie.", agent1).solve()
Task("Rank the pastries from the lowest calorie count to the largest.", agent1).solve()
print("############## Agent 1 history pretty print ##################")
agent1.history.pretty_print()
print("############## END ##################")
print("")
print("-------------- Agent 1 history dictionnary --------------")
print(str(agent1.history.get_messages_as_dict()))
print("-------------- END --------------")
############## Agent 1 history pretty print ##################
[user]:
Generate 5 pastry names followed by the associated estimated calorie.
[assistant]:
Here are 5 pastry names with their associated estimated calorie counts:
1. **Cinnamon Swirl Brioche** (250-300 calories) - A sweet, buttery brioche filled with a gooey cinnamon swirl.
2. **Lemon Lavender Mille-Feuille** (400-450 calories) - Layers of flaky pastry, lemon curd, and lavender cream create a bright and airy dessert.
3. **Chocolate Soufflé Cake** (500-550 calories) - A rich, decadent chocolate cake that rises like a cloud from the oven, served with a scoop of vanilla ice cream.
4. **Raspberry Almond Croissant** (200-250 calories) - Flaky, buttery croissants filled with sweet and tart raspberry jam and topped with sliced almonds.
5. **Pistachio Rosewater Macarons** (150-200 calories) - Delicate, chewy macarons flavored with pistachio and rosewater, sandwiched together with a light and creamy filling.
Note: The estimated calorie counts are approximate and may vary based on specific ingredients and portion sizes used.
[user]:
Rank the pastries from the lowest calorie count to the largest.
[assistant]:
Based on the estimated calorie counts I provided earlier, here are the pastries ranked from lowest to highest:
1. **Pistachio Rosewater Macarons** (150-200 calories)
2. **Raspberry Almond Croissant** (200-250 calories)
3. **Cinnamon Swirl Brioche** (250-300 calories)
4. **Lemon Lavender Mille-Feuille** (400-450 calories)
5. **Chocolate Soufflé Cake** (500-550 calories)
Let me know if you have any other questions!
############## END ##################
-------------- Agent 1 history dictionary --------------
[{'role': 'system', 'content': 'You are a pastry chef'}, {'role': 'user', 'content': 'Generate 5 pastry names followed by the associated estimated calorie.'}, {'role': 'assistant', 'content': 'Here are 5 pastry names with their associated estimated calorie counts:\n\n1. **Cinnamon Swirl Brioche** (250-300 calories) - A sweet, buttery brioche filled with a gooey cinnamon swirl.\n2. **Lemon Lavender Mille-Feuille** (400-450 calories) - Layers of flaky pastry, lemon curd, and lavender cream create a bright and airy dessert.\n3. **Chocolate Soufflé Cake** (500-550 calories) - A rich, decadent chocolate cake that rises like a cloud from the oven, served with a scoop of vanilla ice cream.\n4. **Raspberry Almond Croissant** (200-250 calories) - Flaky, buttery croissants filled with sweet and tart raspberry jam and topped with sliced almonds.\n5. **Pistachio Rosewater Macarons** (150-200 calories) - Delicate, chewy macarons flavored with pistachio and rosewater, sandwiched together with a light and creamy filling.\n\nNote: The estimated calorie counts are approximate and may vary based on specific ingredients and portion sizes used.'}, {'role': 'user', 'content': 'Rank the pastries from the lowest calorie count to the largest.'}, {'role': 'assistant', 'content': 'Based on the estimated calorie counts I provided earlier, here are the pastries ranked from lowest to highest:\n\n1. **Pistachio Rosewater Macarons** (150-200 calories)\n2. **Raspberry Almond Croissant** (200-250 calories)\n3. **Cinnamon Swirl Brioche** (250-300 calories)\n4. **Lemon Lavender Mille-Feuille** (400-450 calories)\n5. **Chocolate Soufflé Cake** (500-550 calories)\n\nLet me know if you have any other questions!'}]
-------------- END --------------
Output speaks for itself.
As mentioned earlier, it's better to keep the History clean. Too many prompts and unrelated questions will lead to poorer results, so if you have the opportunity to scrap some portion of it, you should.
Yacana allows you to make history snapshots and roll back to any of them.
This is particularly useful when reaching the end of a flow branch and needing to roll back to start a new one.
It is as simple as this:
# Creating a checkpoint
checkpoint_id: str = agent1.history.create_check_point()
The checkpoint_id is merely a unique string identifier that you can use to load back a save. Like this:
# Go back in time to when the checkpoint was created
agent1.history.load_check_point(checkpoint_id)
Note that you can make a snapshot before
rolling back to a previous save.
This way you could go back… to the future. ^^
Are you okay Marty?
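In code, that back-and-forth could look like this (a minimal sketch reusing the two checkpoint methods shown above):
# Save the "past"
checkpoint_past: str = agent1.history.create_check_point()
Task("Generate 5 pastry names.", agent1).solve()
# Save the "future"
checkpoint_future: str = agent1.history.create_check_point()
# Back to the past...
agent1.history.load_check_point(checkpoint_past)
# ...and back to the future!
agent1.history.load_check_point(checkpoint_future)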
Let's take a concrete example. You have a pastry website that generates pastry recipes. In code, the flow would look like this:
from yacana import LoggerManager, OllamaAgent, Task
# Let's deactivate automatic logging so that only OUR prints are shown; Maybe reactivate (to "info") if you want to see what's happening behind the scenes.
LoggerManager.set_log_level(None)
agent1 = OllamaAgent("Cook", "llama3.1:8b", system_prompt="You are a pastry chef")
# Getting a list of pastries
pastries: str = Task("Generate 5 pastry names displayed as a list. ONLY output the names and nothing else.", agent1).solve().content
print(f"Welcome, you may order one of the following pastries\n{pastries}")
# Looping till the user is satisfied
while True:
    print("")
    # Creating our checkpoint to go back in time
    checkpoint_id: str = agent1.history.create_check_point()
    # Asking for one of the pastries from the list
    user_choice: str = input("Please choose one of the above pastries: ")
    # Printing associated calories for the selected pastry
    pastry_calorie_question: str = Task(f"The user said '{user_choice}'. Your task is to output a specific sentence and replace the <replace> tags with the correct values: 'You selected the <replace>selected pastry</replace>. The average calorie intake for this pastry is <replace>average associated calories for the selected pastry</replace>. Do you wish to continue ?", agent1).solve().content
    print(pastry_calorie_question)
    # Asking if the user wants to continue
    is_satisfied: str = input("Continue ? ")
    # Basic yes / no router
    router_answer: str = Task(f"The user said '{is_satisfied}'. Evaluate if the user was okay with the order. If he was, ONLY output 'yes', if not only output 'no'.", agent1).solve().content
    if "yes" in router_answer.lower():
        print("Thank you for your order.")
        # The user was satisfied with his choice. Exiting the loop...
        break
    else:
        # The user wants to choose another pastry. Let's go back in time by loading our previous checkpoint!
        agent1.history.load_check_point(checkpoint_id)
        # Let's go back to the top of the loop
        continue
print("############## Agent 1 history pretty print ##################")
agent1.history.pretty_print()
print("############## END ##################")
▶️ Output:
Welcome, you may order one of the following pastries
1. Whipped Wonders
2. Creamy Confections
3. Flaky Fancies
4. Golden Galettes
5. Sugar Serenades
Please choose one of the above pastries: The Creamy one looks good
You selected the Creamy Confections. The average calorie intake for this pastry is 350-400 calories per serving. Do you wish to continue?
Continue ? no
Please choose one of the above pastries: Hummm. The golden one?
You selected the Golden Galettes. The average calorie intake for this pastry is approximately 250-300 calories per serving. Do you wish to continue?
Continue ? yes
Thank you for your order.
############## Agent 1 history pretty print ##################
[user]:
Generate 5 pastry names displayed as a list. ONLY output the names and nothing else.
[assistant]:
1. Whipped Wonders
2. Creamy Confections
3. Flaky Fancies
4. Golden Galettes
5. Sugar Serenades
[user]:
The user said 'Hummm. The golden one ?'. Your task is to output a specific sentence and replace the <replace> tags with the correct values: 'You selected the <replace>selected pastry</replace>. The average calorie intake for this pastry is <replace>average associated calories for the selected pastry</replace>. Do you wish to continue ?
[assistant]:
You selected the Golden Galettes. The average calorie intake for this pastry is approximately 250-300 calories per serving. Do you wish to continue?
[user]:
The user said 'yes'. Evaluate if the user was okay with the order. If he was, ONLY output 'yes', if not only output 'no'.
[assistant]:
yes
############## END ##################
As you can see in the above output, we went for "the creamy one" but, when shown the calories, refused to continue… After that, we chose the "Golden Galettes", which was satisfying. Then the program ended with an output of the agent's history.
We can see in that history that the agent only remembered us choosing the "Golden Galettes" and not the "Creamy Confections". This is because we loaded the last checkpoint, which rolled us back to making our choice again.
When an LLM struggles to solve a complex task, it may be time to give it a little help. In large language models, the approach to prompting can significantly influence the model's performance.
Yacana provides a way to add new Messages to the History manually. The History class exposes an .add_message(...) method. It takes an argument of type Message(), built from two parameters: a MessageRole enum and the message string itself.
For example:
from yacana import OllamaAgent, Message, MessageRole
# Creating a basic agent with an empty history
agent1 = OllamaAgent("AI assistant", "llama3.1:8b")
# We create a fake prompt identified as coming from the user (Thx to `MessageRole.USER`)
user_message = Message(MessageRole.USER, "What's 2+2 ?")
# We create a fake answer identified as coming from the LLM (Thx to `MessageRole.ASSISTANT`)
fake_ai_response = Message(MessageRole.ASSISTANT, "The answer is 4")
# Let's add these two Messages to the Agent's History
agent1.history.add_message(user_message)
agent1.history.add_message(fake_ai_response)
# Print the content of the history
agent1.history.pretty_print()
Outputs:
[user]:
What's 2+2 ?
[assistant]:
The answer is 4
The Agent's History successfully contains the two messages we manually added.
The .add_message() method always adds new messages at the end of the stack; however, you can insert messages wherever you want using other History methods.
⚠️ Try to keep the alternation of USER and ASSISTANT messages, as this is how "instruct" LLMs have been trained.
Let's see a 0-shot example asking for a JSON output extracted from a given sentence:
from yacana import OllamaAgent, Task
agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
Task(f"Print the following sentence as JSON, extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1).solve()
Outputs:
INFO: [PROMPT][To: Ai assistant]: Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'
INFO: [AI_RESPONSE][From: Ai assistant]: Here is the sentence rewritten in JSON format:
{
"people": [
{
"name": "Marie",
"action": "walking"
},
{
"name": "Ryan",
"action": "watching through the window"
}
],
"weather": {
"condition": "heavy raindrops",
"sky": "dark sky"
}
}
Let me know if you'd like me to help with anything else!
Not bad, but there's noise. We would like to output the JSON and nothing else. No bedside manners. The "Let me know if you'd like me to help with anything else!" must go.
Let's introduce another optional Task() parameter: json_output=True. This relies on the inference server (Ollama, OpenAI, etc.) to output JSON. It is preferable to also prompt the LLM to "output as JSON" in addition to setting this option.
⚠️ This is only best effort! Meaning that it may fail to output correct JSON. If you need reliable JSON then go try structured output!
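Because it is best effort, you may want to parse the returned string defensively on your side. Here is a minimal sketch using Python's standard json module (nothing Yacana-specific):
import json

from yacana import OllamaAgent, Task

agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
raw_answer: str = Task("Output a JSON object with a single 'name' key set to 'Marie'. Output as JSON.", agent1, json_output=True).solve().content
try:
    # Try to decode the LLM's answer as JSON
    data = json.loads(raw_answer)
    print(data)
except json.JSONDecodeError:
    # The model failed to produce valid JSON: retry, tweak the prompt, or switch to structured output
    print("Invalid JSON received:", raw_answer)
Now, back to our sentence-extraction example.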
Replace the Task with this one:
Task(f"Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1, json_output=True).solve()
Outputs:
INFO: [PROMPT][To: Ai assistant]: Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'
INFO: [AI_RESPONSE][From: Ai assistant]: {"names": ["Marie", "Ryan"], "actions": {"Marie": "is walking", "Ryan": "is watching"}, "description": [{"location": "window", "activity": "watching"}, {"location": "outdoors", "activity": "pouring raindrops"}]}
Way better. No more noise.
However, we would prefer having an array of name and action entries even for the weather (the name would be "sky" and the action "raining").
To achieve this, let's give the LLM an example of what we expect by making it believe it already outputted it correctly once:
from yacana import OllamaAgent, Task, MessageRole, Message
agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
# Making a fake valid interaction
agent1.history.add_message(Message(MessageRole.USER, "Print the following sentence as json extracting the names and rephrasing the actions: 'John is reading a book on the porch while the cold wind blows through the trees.'"))
agent1.history.add_message(Message(MessageRole.ASSISTANT, '[{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}]'))
Task(f"Print the following sentence as json extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1).solve()
Outputs:
INFO: [PROMPT][To: Ai assistant]: Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'
INFO: [AI_RESPONSE][From: Ai assistant]: [{"name": "Marie", "action": "Walking her dog."}, {"name": "Ryan", "action": "Watching Marie and her dog through the window."}, {"name": "The dark sky", "action": "Pouring down heavy raindrops."}]
This is perfect!
(❕ Model temperature may impact performance here. Consider using a lower value.)
You can add multiple fake interactions like this one to cover more advanced use cases and teach the LLM how to react when they happen. That would turn this into multi-shot prompting.
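For instance, a multi-shot sketch could stack a second fake interaction before the real prompt (the extra example sentence below is made up for illustration):
from yacana import OllamaAgent, Task, MessageRole, Message

agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
# First fake interaction
agent1.history.add_message(Message(MessageRole.USER, "Print the following sentence as json extracting the names and rephrasing the actions: 'John is reading a book on the porch while the cold wind blows through the trees.'"))
agent1.history.add_message(Message(MessageRole.ASSISTANT, '[{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}]'))
# Second fake interaction
agent1.history.add_message(Message(MessageRole.USER, "Print the following sentence as json extracting the names and rephrasing the actions: 'Ana writes a letter while the kettle whistles on the stove.'"))
agent1.history.add_message(Message(MessageRole.ASSISTANT, '[{"name": "Ana", "action": "Writing a letter."}, {"name": "Kettle", "action": "Whistling on the stove."}]'))
# Real prompt
Task("Print the following sentence as json extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1).solve()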
You can also do multi-shot prompting with self-reflection. This takes more CPU time because you decompose the task into multiple subtasks but can be beneficial in some scenarios.
For example:
from yacana import OllamaAgent, Task
agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
Task('I will give you a sentence where you must extract as JSON all the names and rephrase all the actions. For example in the following sentence: "John is reading a book on the porch while the cold wind blows through the trees." would result in this JSON output: [{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}] ', agent1).solve()
Task(f"Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.", agent1, json_output=True).solve()
Outputs:
INFO: [PROMPT]: I will give you a sentence where you must extract as JSON all the names and rephrase all the actions. For example in the following sentence: "John is reading a book on the porch while the cold wind blows through the trees." would result in this JSON output: [{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}]
INFO: [AI_RESPONSE]: I'm ready to extract the names and rephrase the actions. What's the sentence?
INFO: [PROMPT]: Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.
INFO: [AI_RESPONSE]: {"name": "Marie", "action": "Walking with her dog."}
:-(
In this case, it didn't work very well, as only one name was extracted as JSON. But in more complex scenarios, we can assure you that letting the LLM reflect on the guidelines beforehand can be very beneficial to solving the task.
Yacana provides a way to store an Agent's state into a file and load it back later. All of the Agent's properties, model settings, etc. are saved, including the History.
To save an Agent do the following:
from yacana import OllamaAgent, Task
agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
Task("What's 2+2 ?", agent1).solve()
# Exporting the agent1 current state to a file called agent1_save.json
agent1.export_to_file("./agent1_save.json")
If you look at the file agent1_save.json
you'll see something like this:
{
    "name": "Ai assistant",
    "model_name": "llama3.1:8b",
    "system_prompt": null,
    "model_settings": {},
    "endpoint": "http://127.0.0.1:11434",
    "history": [
        {
            "role": "user",
            "content": "What's 2+2 ?"
        },
        {
            "role": "assistant",
            "content": "The answer to 2+2 is... (drumroll please)... 4!"
        }
    ]
}
Now let's load back this agent from the dead using .import_from_file()
!
In another Python file add this code snippet:
from yacana import GenericAgent, Task
# You can use any of the Agent classes; the GenericAgent is merely the abstract parent class.
agent2: GenericAgent = GenericAgent.import_from_file("./agent1_save.json")
Task("Multiply by 2 the previous result", agent2).solve()
The .import_from_file(...) method works like a factory, returning a new Agent instance.
▶️ Output:
INFO: [PROMPT]: Multiply by 2 the previous result
INFO: [AI_RESPONSE]: If we multiply 4 by 2, we get...
8!
As you can see when asked to multiply by 2 the previous result, it remembered agent1's result which was 4. Then it performed the multiplication and got us 8. 🎉
The Task() class exposes a parameter to attach tags to messages. You can then use the tags to find specific messages inside the history and read or alter them.
Tags are a convenient way to keep track of things and change the history state at runtime. Adding a tag to a Task will automatically add it to all the messages generated by this Task.
To add tags to your prompt messages, use the tags=[...] parameter like this:
from yacana import OllamaAgent, Task, Message
agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
first_message: Message = Task("What's 2+2 ?", agent1, tags=["first_task"]).solve()
This told Yacana to add the tag "first_task" to both the prompt and the LLM output message generated by the task.
Note that you can also tag an individual Message manually:
...
first_message.add_tags(["first_task_response"])
This will add the tag "first_task_response"
to the LLM's response.
You can also do this all on one line because Task.solve()
returns the message instance:
from yacana import OllamaAgent, Task
agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
# If you don't need the message instance you can do this:
Task("What's 2+2 ?", agent1, tags=["first_task"]).solve().add_tags(["first_task_response"])
Let's make a longer example to show how tagging works. Below is a script that asks the LLM three number-related questions.
The script then changes the second question and its answer to something completely different.
Finally, the script asks the LLM which question was not about numbers. Because we updated the second task, it should tell us that it's the second one.
from yacana import OllamaAgent, Task, GenericMessage
agent1 = OllamaAgent("Ai assistant", "llama3.1:latest")
# Creating 3 numeric related questions with one tag each
Task("What's 2+2 ?", agent1, tags=["first_task"]).solve()
Task("What's 20+20 ?", agent1, tags=["second_task"]).solve()
# For the last one, let's also tag the response
Task("What's 200+200 ?", agent1, tags=["third_task"]).solve().add_tags(["third_task_answer"])
# Let's print the first task's prompt and response
first_task_prompt: GenericMessage = agent1.history.get_messages_by_tags(["first_task"])[0]
first_task_response: GenericMessage = agent1.history.get_messages_by_tags(["first_task"])[1]
print("First task prompt: ", first_task_prompt.content)
print("First task response: ", first_task_response.content)
# Let's print the second task's prompt and response
second_task_prompt: GenericMessage = agent1.history.get_messages_by_tags(["second_task"])[0]
second_task_response: GenericMessage = agent1.history.get_messages_by_tags(["second_task"])[1]
print("Second task prompt: ", second_task_prompt.content)
print("Second task response: ", second_task_response.content)
# Let's print the third task's prompt and response using the tags and not only the index
third_task_prompt: GenericMessage = agent1.history.get_messages_by_tags(["third_task"])[0]
third_task_response: GenericMessage = agent1.history.get_messages_by_tags(["third_task_answer"])[0] # Here we use the tag we added manually with add_tags() above
print("Third task prompt: ", third_task_prompt.content)
print("Third task response: ", third_task_response.content)
# Let's change the second task's prompt
second_task_prompt.content = "Why is the sky blue ?"
# Let's change the second task's response
second_task_response.content = "Because of Rayleigh scattering."
# Now let's ask the LLM what question is not about numbers
Task("What question was not about numbers ?", agent1).solve()
# You'll see the answer during runtime with the logging system but let's print the whole history anyway
print("\n--Showing whole history--\n")
agent1.history.pretty_print()
▶️ Output:
First task prompt: What's 2+2 ?
First task response: The answer is: 4!
Second task prompt: What's 20+20 ?
Second task response: That's a bigger one! The answer is: 40!
Third task prompt: What's 200+200 ?
Third task response: The answer is: 400!
--Showing whole history--
[user]:
What's 2+2 ?
[assistant]:
The answer is: 4!
[user]:
Why is the sky blue ?
[assistant]:
Because of Rayleigh scattering.
[user]:
What's 200+200 ?
[assistant]:
The answer is: 400!
[user]:
What question was not about numbers ?
[assistant]:
That would be "Why is the sky blue?" It's a classic example of an explanation of a phenomenon in science.
There are a few builtin tags that are automatically added to some messages:
| Tag name | Tag description |
|---|---|
| yacana_response | All responses from the LLM are tagged with this. When using enhanced tool calling there will be multiple messages tagged with this. |
| yacana_prompt | All prompts created with a Task are tagged with this. |
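Since every LLM answer carries the builtin yacana_response tag, you could for example collect all of the model's responses in one call (a small sketch reusing get_messages_by_tags() from above):
from typing import List
from yacana import OllamaAgent, Task, GenericMessage

agent1 = OllamaAgent("Ai assistant", "llama3.1:8b")
Task("What's 2+2 ?", agent1).solve()
Task("What's 20+20 ?", agent1).solve()

# Every message produced by the LLM carries the builtin 'yacana_response' tag
llm_answers: List[GenericMessage] = agent1.history.get_messages_by_tags(["yacana_response"])
for answer in llm_answers:
    print(answer.content)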
Let's make another example. This time we'll count how many messages are present between the LLM's first response and the third task prompt.
It's quite specific, so be sure to understand the range we're selecting.
We'll just be counting, but you could do anything with that range, even delete the messages!
from typing import List
from yacana import OllamaAgent, Task, GenericMessage
agent1 = OllamaAgent("Ai assistant", "llama3.1:latest")
start_tag = "start"
end_tag = "end"
Task("What's 2+2 ?", agent1).solve().add_tags([start_tag])
Task("What's 20+20 ?", agent1).solve()
Task("What's 200+200 ?", agent1, tags=[end_tag]).solve()
# Getting all messages from the history
messages: List[GenericMessage] = agent1.history.get_all_messages()
nb_messages = 0
start_counting = False
for message in messages:
    print("\n" + message.content + " ==> " + str(message.tags))
    if start_tag in message.tags:
        start_counting = True
    elif end_tag in message.tags:
        break
    elif start_counting is True:
        nb_messages += 1
print("Number of messages between the two selected messages: ", nb_messages)
Output (without logging):
What's 2+2 ? ==> ['yacana_prompt']
The answer is: 4! ==> ['yacana_response', 'start']
What's 20+20 ? ==> ['yacana_prompt']
Easy one!
The answer is: 40! ==> ['yacana_response']
What's 200+200 ? ==> ['end', 'yacana_prompt']
Number of messages between the two selected messages: 2