IV. Managing Agents' History


As you saw in the previous examples, each agent has its own history of messages that makes up its memory. When a new request is made to the LLM, the whole history is sent to the inference server (i.e., Ollama). The LLM responds to the last prompt in the chain but bases its answer on the context it gets from the previous messages (and the initial system prompt, if present).

This is what a history looks like:

[Figure: an agent's message history showing the system, user, and assistant messages]

There are 3 types of messages:

  • 1: The optional "System" prompt that, if present, always goes first.
Then it's only an alternation between these two:
  • 2: The "User" prompts coming from the Task you set.
  • 3: The "Assistant" messages, which are the answers from the LLM.

However, sending the whole history to the LLM for each Task to solve has some disadvantages that cannot be avoided:

  • The longer the history, the longer the LLM takes to analyze it and return an answer.
  • Each LLM comes with a maximum token window size (its context window). This is the maximum number of tokens an LLM can analyze in one run, and therefore its maximum memory.
  • One token roughly represents 3/4 of an English word, so a 4,096-token window holds roughly 3,000 words.

To counteract these negative effects, it is recommended to clean the history whenever possible. You can use the forget=True parameter of the Task() class so that neither the prompt nor the LLM's response gets saved to the history. You'll see later that there are other ways to keep the history free of useless noise.
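Here is a minimal sketch of forget=True in action (the throwaway question is illustrative):


from yacana import Agent, Task

agent1 = Agent("Cook", "llama3.1:8b", system_prompt="You are a pastry chef")

# Thanks to `forget=True`, neither this prompt nor the LLM's answer is saved to the history
Task("Give me one fun fact about croissants.", agent1, forget=True).solve()

# The exchange above left no trace in the agent's history
agent1.history.pretty_print()


But first, let's look at one Agent's history. Fortunately, Yacana has you covered.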

Printing History

The Agent class comes with a .history property of type History. It exposes methods so you can manipulate and view it (mostly for debugging) at runtime. For instance, the .pretty_print() method prints the content of the history to the standard output using the classic color scheme. It's great for debugging but not for parsing. If you need to parse the history, there is a .get_as_dict() method which, as the name implies, returns the History as a Python dictionary.

Let's see a simple example:


from yacana import LoggerManager, Agent, Task

# Let's deactivate automatic logging so that only OUR prints are shown
LoggerManager.set_log_level(None)

agent1 = Agent("Cook", "llama3.1:8b", system_prompt="You are a pastry chef")

Task("Generate 5 pastry names followed by the associated estimated calorie.", agent1).solve()
Task("Rank the pastries from the lowest calorie count to the largest.", agent1).solve()

print("############## Agent 1 history pretty print ##################")
agent1.history.pretty_print()
print("############## END ##################")

print("")

print("############## Agent 1 history dictionary ##################")
print(str(agent1.history.get_as_dict()))
print("############## END ##################")

############## Agent 1 history pretty print ##################

[user]:
Generate 5 pastry names followed by the associated estimated calorie.

[assistant]:
Here are 5 pastry names with their associated estimated calorie counts:

1. **Cinnamon Swirl Brioche** (250-300 calories) - A sweet, buttery brioche filled with a gooey cinnamon swirl.
2. **Lemon Lavender Mille-Feuille** (400-450 calories) - Layers of flaky pastry, lemon curd, and lavender cream create a bright and airy dessert.
3. **Chocolate Soufflé Cake** (500-550 calories) - A rich, decadent chocolate cake that rises like a cloud from the oven, served with a scoop of vanilla ice cream.
4. **Raspberry Almond Croissant** (200-250 calories) - Flaky, buttery croissants filled with sweet and tart raspberry jam and topped with sliced almonds.
5. **Pistachio Rosewater Macarons** (150-200 calories) - Delicate, chewy macarons flavored with pistachio and rosewater, sandwiched together with a light and creamy filling.

Note: The estimated calorie counts are approximate and may vary based on specific ingredients and portion sizes used.

[user]:
Rank the pastries from the lowest calorie count to the largest.

[assistant]:
Based on the estimated calorie counts I provided earlier, here are the pastries ranked from lowest to highest:

1. **Pistachio Rosewater Macarons** (150-200 calories)
2. **Raspberry Almond Croissant** (200-250 calories)
3. **Cinnamon Swirl Brioche** (250-300 calories)
4. **Lemon Lavender Mille-Feuille** (400-450 calories)
5. **Chocolate Soufflé Cake** (500-550 calories)

Let me know if you have any other questions!

############## END ##################

############## Agent 1 history dictionary ##################
[{'role': 'system', 'content': 'You are a pastry chef'}, {'role': 'user', 'content': 'Generate 5 pastry names followed by the associated estimated calorie.'}, {'role': 'assistant', 'content': 'Here are 5 pastry names with their associated estimated calorie counts:\n\n1. **Cinnamon Swirl Brioche** (250-300 calories) - A sweet, buttery brioche filled with a gooey cinnamon swirl.\n2. **Lemon Lavender Mille-Feuille** (400-450 calories) - Layers of flaky pastry, lemon curd, and lavender cream create a bright and airy dessert.\n3. **Chocolate Soufflé Cake** (500-550 calories) - A rich, decadent chocolate cake that rises like a cloud from the oven, served with a scoop of vanilla ice cream.\n4. **Raspberry Almond Croissant** (200-250 calories) - Flaky, buttery croissants filled with sweet and tart raspberry jam and topped with sliced almonds.\n5. **Pistachio Rosewater Macarons** (150-200 calories) - Delicate, chewy macarons flavored with pistachio and rosewater, sandwiched together with a light and creamy filling.\n\nNote: The estimated calorie counts are approximate and may vary based on specific ingredients and portion sizes used.'}, {'role': 'user', 'content': 'Rank the pastries from the lowest calorie count to the largest.'}, {'role': 'assistant', 'content': 'Based on the estimated calorie counts I provided earlier, here are the pastries ranked from lowest to highest:\n\n1. **Pistachio Rosewater Macarons** (150-200 calories)\n2. **Raspberry Almond Croissant** (200-250 calories)\n3. **Cinnamon Swirl Brioche** (250-300 calories)\n4. **Lemon Lavender Mille-Feuille** (400-450 calories)\n5. **Chocolate Soufflé Cake** (500-550 calories)\n\nLet me know if you have any other questions!'}]
############## END ##################

Output speaks for itself.

Creating and loading checkpoints

As mentioned earlier, it's better to keep the History clean. Too many prompts and unrelated questions will lead to poorer results, so if you have the opportunity to discard a portion of it, you should.
Yacana allows you to take history snapshots and roll back to any of them. This is particularly useful when reaching the end of a flow branch and wanting to go back to another one.

[Figure: creating a checkpoint and rolling the history back to it]

It is as simple as this:


# Creating a checkpoint
checkpoint_id: str = agent1.history.create_check_point()

The checkpoint_id is simply a unique identifier that you can use to load back a saved state. Like this:


# Go back in time to when the checkpoint was created
agent1.history.load_check_point(checkpoint_id)

Note that nothing prevents you from making a snapshot before rolling back to a previous save. This way you could go back… to the future. ^^
Are you okay, Marty?
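A minimal sketch, using only the two checkpoint methods shown above:


# Snapshot the present before rolling back, so we can return to it later
future_id: str = agent1.history.create_check_point()

# Back to the past...
agent1.history.load_check_point(checkpoint_id)

# ...and back to the future!
agent1.history.load_check_point(future_id)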

Let's take a concrete example. You have a pastry website that generates pastry recipes.
The flow will look like this:

  1. Propose 5 pastry names ;
  2. Create a checkpoint ;
  3. The user chooses one of the pastries ;
  4. We show the associated calories of the selected pastry ;
  5. If the user is okay with it we end the program ;
  6. If the user is not okay with the calorie count we go back to the checkpoint and propose to choose from the list again ;
  7. Repeat until satisfied ;
  8. We'll show the final agent's History and make sure that it ONLY stored the selected pastry ;

With a bit of color, it would look like this:

[Figure: the pastry-ordering flow, with the checkpoint rollback branch]


from yacana import LoggerManager, Agent, Task

# Let's deactivate automatic logging so that only OUR prints are shown; set it back to "info" if you want to see what's happening behind the scenes.
LoggerManager.set_log_level(None)

agent1 = Agent("Cook", "llama3.1:8b", system_prompt="You are a pastry chef")

# Getting a list of pastries
pastries: str = Task("Generate 5 pastry names displayed as a list. ONLY output the names and nothing else.", agent1).solve().content
print(f"Welcome, you may order one of the following pastries\n{pastries}")

# Looping until the user is satisfied
while True:
    print("")

    # Creating our checkpoint to go back in time
    checkpoint_id: str = agent1.history.create_check_point()

    # Asking for one of the pastries from the list
    user_choice: str = input("Please choose one of the above pastries: ")

    # Printing associated calories for the selected pastry
    pastry_calorie_question: str = Task(f"The user said '{user_choice}'. Your task is to output a specific sentence and replace the <replace> tags with the correct values: 'You selected the <replace>selected pastry</replace>. The average calorie intake for this pastry is <replace>average associated calories for the selected pastry</replace>. Do you wish to continue ?'", agent1).solve().content
    print(pastry_calorie_question)

    # Asking if the user wants to continue
    is_satisfied: str = input("Continue ? ")

    # Basic yes / no router
    router_answer: str = Task(f"The user said '{is_satisfied}'. Evaluate if the user was okay with the order. If he was, ONLY output 'yes', if not only output 'no'.", agent1).solve().content

    if "yes" in router_answer.lower():
        print("Thank you for your order.")
        # The user was satisfied with his choice. Exiting the loop...
        break
    else:
        # The user wants to choose another pastry. Let's go back in time by loading our previous checkpoint!
        agent1.history.load_check_point(checkpoint_id)
        # Let's go back to the top of the loop
        continue

print("############## Agent 1 history pretty print ##################")
agent1.history.pretty_print()
print("############## END ##################")

▶️ Output:


Welcome, you may order one of the following pastries
1. Whipped Wonders
2. Creamy Confections
3. Flaky Fancies
4. Golden Galettes
5. Sugar Serenades

Please choose one of the above pastries: The Creamy one looks good
You selected the Creamy Confections. The average calorie intake for this pastry is 350-400 calories per serving. Do you wish to continue?
Continue ? no

Please choose one of the above pastries: Hummm. The golden one?
You selected the Golden Galettes. The average calorie intake for this pastry is approximately 250-300 calories per serving. Do you wish to continue?
Continue ? yes
Thank you for your order.

############## Agent 1 history pretty print ##################

[user]:
Generate 5 pastry names displayed as a list. ONLY output the names and nothing else.

[assistant]:
1. Whipped Wonders
2. Creamy Confections
3. Flaky Fancies
4. Golden Galettes
5. Sugar Serenades

[user]:
The user said 'Hummm. The golden one?'. Your task is to output a specific sentence and replace the <replace> tags with the correct values: 'You selected the <replace>selected pastry</replace>. The average calorie intake for this pastry is <replace>average associated calories for the selected pastry</replace>. Do you wish to continue ?'

[assistant]:
You selected the Golden Galettes. The average calorie intake for this pastry is approximately 250-300 calories per serving. Do you wish to continue?

[user]:
The user said 'yes'. Evaluate if the user was okay with the order. If he was, ONLY output 'yes', if not only output 'no'.

[assistant]:
yes

############## END ##################

As you can see in the above output, we went for "the creamy one" but, when shown the calories, refused to continue… After that, we chose the "Golden Galettes", which was satisfying. The program then ended with an output of the agent's history.
We can see in that history that the agent only remembered us choosing the "Golden Galettes" and not the "Creamy Confections". This is because we loaded the last checkpoint, which rolled us back to making our choice again.

Note that the Task replacing the variables might not work very well with dumb LLMs. It could be reworked by splitting it in two: one Task that extracts the name of the chosen pastry from the user's input, and a second one that generates the associated calories. Finally, print the sentence with the variables pre-generated. Using local models is all about knowing the maximum performance of your LLM and adapting the prompts to match that performance. The dumber, the more guidance it needs!
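One possible rework is sketched below. It would replace the single pastry_calorie_question Task inside the while loop; the exact prompts are illustrative:


    # Step 1: extract the name of the chosen pastry from the user's input
    pastry_name: str = Task(f"The user said '{user_choice}'. ONLY output the name of the pastry he chose from the list and nothing else.", agent1).solve().content

    # Step 2: generate the estimated calories for that pastry
    calories: str = Task(f"ONLY output the average calorie intake for '{pastry_name}' and nothing else.", agent1).solve().content

    # Step 3: print the final sentence with the variables pre-generated in Python
    print(f"You selected the {pastry_name}. The average calorie intake for this pastry is {calories}. Do you wish to continue?")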


Zero-shot vs multi-shot prompting

When an LLM struggles to solve a complex task and achieve a good success rate, it may be time to give it a little help.

In large language models, the approach to prompting can significantly influence the model's performance.

  • Zero-shot prompting asks the model to complete a task without any prior examples, relying solely on its pre-existing knowledge. This can lead to varied results, especially in more complex tasks.
  • One-shot prompting improves accuracy by providing the model with a single example, offering some guidance on how to approach the task.
  • Few-shot prompting further enhances performance by supplying multiple examples, allowing the model to have a better understanding of the task's nuances and producing more reliable and accurate results.

Yacana provides a way to add new Messages to the History manually. The History class exposes an .add(...) method.
It takes an argument of type Message() built from two parameters: a MessageRole enum and the string message itself.

For example:


from yacana import Agent, Message, MessageRole

# Creating a basic agent with an empty history
agent1 = Agent("AI assistant", "llama3.1:8b")

# We create a fake prompt identified as coming from the user (Thx to `MessageRole.USER`)
user_message = Message(MessageRole.USER, "What's 2+2 ?")

# We create a fake answer identified as coming from the LLM (Thx to `MessageRole.ASSISTANT`)
fake_ai_response = Message(MessageRole.ASSISTANT, "The answer is 4")

# Let's add these two Messages to the Agent's History
agent1.history.add(user_message)
agent1.history.add(fake_ai_response)

# Print the content of the history
agent1.history.pretty_print()

Outputs:


[user]:
What's 2+2 ?

[assistant]:
The answer is 4

The Agent's History successfully contains the two messages we manually added.

The .add() method can only append messages to the end of the History.

⚠️ Try to keep the alternation of USER and ASSISTANT messages, as this is how "instruct" LLMs have been trained.


Let's see a zero-shot example asking for JSON output extracted from a given sentence:


from yacana import Agent, Task

agent1 = Agent("Ai assistant", "llama3.1:8b")

Task("Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1).solve()

Outputs:


INFO: [PROMPT]: Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'

INFO: [AI_RESPONSE]: Here is the sentence rewritten in JSON format:
{
	"people": [
		{
			"name": "Marie",
			"action": "walking"
		},
		{
			"name": "Ryan",
			"action": "watching through the window"
		}
	],
	"weather": {
		"condition": "heavy raindrops",
		"sky": "dark sky"
	}
}
Let me know if you'd like me to help with anything else!

Not bad, but there's noise. We would like the JSON and nothing else. No bedside manners. The "Let me know if you'd like me to help with anything else!" must go.
Let's introduce another optional Task() parameter: json_output=True. This relies on Ollama to force the output as JSON.
⚠️ It is preferable to also prompt the LLM to output JSON, in addition to setting this option.

Replace the Task with this one:


Task("Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1, json_output=True).solve()

Outputs:


INFO: [PROMPT]: Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'

INFO: [AI_RESPONSE]: {"names": ["Marie", "Ryan"], "actions": {"Marie": "is walking", "Ryan": "is watching"}, "description": [{"location": "window", "activity": "watching"}, {"location": "outdoors", "activity": "pouring raindrops"}]}

Way better. No more noise.
However, we would prefer an array of name/action pairs, even for the weather (the name would be "sky" and the action "raining").

To achieve this, let's give the LLM an example of what we expect by making it believe it has already produced the correct output once:


from yacana import Agent, Task, MessageRole, Message

agent1 = Agent("Ai assistant", "llama3.1:8b")

# Making a fake valid interaction
agent1.history.add(Message(MessageRole.USER, "Print the following sentence as json extracting the names and rephrasing the actions: 'John is reading a book on the porch while the cold wind blows through the trees.'"))
agent1.history.add(Message(MessageRole.ASSISTANT, '[{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}]'))

Task("Print the following sentence as json extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'", agent1).solve()

Outputs:


INFO: [PROMPT]: Print the following sentence as JSON extracting the names and rephrasing the actions: 'Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.'

INFO: [AI_RESPONSE]: [{"name": "Marie", "action": "Walking her dog."}, {"name": "Ryan", "action": "Watching Marie and her dog through the window."}, {"name": "The dark sky", "action": "Pouring down heavy raindrops."}]

This is perfect!
(❕ Model temperature may impact performance here. Consider using a low value.)
You can add multiple fake interactions like this one to cover more advanced use cases and train the LLM on how to react when they happen. This would become multi-shot prompting.
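For instance, a second fake interaction could teach the model how to handle a sentence with no people in it (the content below is illustrative):


# A second fake exchange covering an edge case: no people in the sentence
agent1.history.add(Message(MessageRole.USER, "Print the following sentence as json extracting the names and rephrasing the actions: 'The rain keeps falling on the old tin roof.'"))
agent1.history.add(Message(MessageRole.ASSISTANT, '[{"name": "Rain", "action": "Falling on the old tin roof."}]'))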


You can also do multi-shot prompting with self-reflection. This takes more CPU time because you decompose the task into multiple subtasks, but it can be beneficial in some scenarios.

For example:


from yacana import Agent, Task

agent1 = Agent("Ai assistant", "llama3.1:8b")

Task('I will give you a sentence where you must extract as JSON all the names and rephrase all the actions. For example in the following sentence: "John is reading a book on the porch while the cold wind blows through the trees." would result in this JSON output: [{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}] ', agent1).solve()

Task("Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.", agent1, json_output=True).solve()

Outputs:


INFO: [PROMPT]: I will give you a sentence where you must extract as JSON all the names and rephrase all the actions. For example in the following sentence: "John is reading a book on the porch while the cold wind blows through the trees." would result in this JSON output: [{"name": "John", "action": "Reading a book."}, {"name": "Cold wind", "action": "Blowing through the trees."}]

INFO: [AI_RESPONSE]: I'm ready to extract the names and rephrase the actions. What's the sentence?

INFO: [PROMPT]: Marie is walking her dog. Ryan is watching them through the window. The dark sky is pouring down heavy raindrops.

INFO: [AI_RESPONSE]: {"name": "Marie", "action": "Walking with her dog."}

:-(
In this case, it didn't work very well: only one name was extracted as JSON. But in more complex scenarios, we can assure you that letting the LLM reflect on the guidelines beforehand can be very beneficial to solving the task.


Saving Agent state

Maybe your program needs to start, stop, and resume where it stopped. For this use case, Yacana provides a way to store an Agent's state into a file and load it back later. All of the Agent's properties are saved, including the History. Only checkpoints are lost, as they are more of a runtime thing. We might include them in the save file one day if the need arises.

To save an Agent do the following:


from yacana import Agent, Task

agent1 = Agent("Ai assistant", "llama3.1:8b")

Task("What's 2+2 ?", agent1).solve()

# Exporting the agent1 current state to a file called agent1_save.json
agent1.export_state("./agent1_save.json")

If you look at the file agent1_save.json, you'll see something like this:


{
    "name": "Ai assistant",
    "model_name": "llama3.1:8b",
    "system_prompt": null,
    "model_settings": {},
    "endpoint": "http://127.0.0.1:11434",
    "history": [
        {
            "role": "user",
            "content": "What's 2+2 ?"
        },
        {
            "role": "assistant",
            "content": "The answer to 2+2 is... (drumroll please)... 4!"
        }
    ]
}

Now let's load back this agent from the dead using .get_agent_from_state()!
In another Python file, add this code snippet:


from yacana import Agent, Task

agent2: Agent = Agent.get_agent_from_state("./agent1_save.json")

Task("Multiply by 2 the previous result", agent2).solve()

The .get_agent_from_state(...) method works like a factory, returning a new instance of an Agent.

▶️ Output:


INFO: [PROMPT]: Multiply by 2 the previous result

INFO: [AI_RESPONSE]: If we multiply 4 by 2, we get...

8!

As you can see, when asked to multiply the previous result by 2, it remembered agent1's result, which was 4. It then computed 4 x 2 and got us 8.
