I. Installation

Yacana and Ollama installation

Installing Ollama


Yacana was initially designed to work with Ollama, but it now supports any OpenAI-compatible endpoint, such as ChatGPT or vLLM.
This documentation serves as a comprehensive tutorial, guiding you through Yacana's features step by step. Since the programming API is identical across all inference servers, we'll primarily demonstrate examples using the Ollama agent. You can adapt any code snippet to your preferred inference server by simply swapping the agent type, as shown in the sketch below. Let's start by installing Ollama on your computer: it's one of the simplest inference servers to set up.
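Before anything gets installed, here is a glimpse of what that swap looks like. This is a minimal sketch, assuming the OllamaAgent(name, model) and OpenAiAgent(name, model, api_token=...) constructor shapes and the Task(...).solve() call used throughout the rest of this tutorial:

from yacana import OllamaAgent, OpenAiAgent, Task

# Local inference through Ollama
agent = OllamaAgent("AI assistant", "llama3.1:8b")

# The same program on an OpenAI-compatible endpoint: only the agent type changes
# agent = OpenAiAgent("AI assistant", "gpt-4o", api_token="<your token>")

# Tasks are written the same way whatever inference server sits behind the agent
message = Task("Explain why the sky is blue.", agent).solve()
print(message.content)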

Download the latest release from the Ollama website.


Ollama:
  • Is compatible with all major operating systems (Windows / Mac / Linux);
  • Installs in seconds with a single command;
  • Has a great CLI that even a 4-year-old can use to download models;
  • Has tons of tutorials out there if you run into any trouble.

You can also connect Yacana to a remote Ollama instance; keep reading for details.
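For the impatient, here is a hypothetical sketch of what pointing an agent at a remote server looks like. The endpoint keyword and the address are placeholder assumptions for illustration; the exact syntax is confirmed later in this tutorial:

from yacana import OllamaAgent

# Hypothetical: the 'endpoint' keyword and address are placeholders for your own setup
agent = OllamaAgent("AI assistant", "llama3.1:8b", endpoint="http://192.168.1.50:11434")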


Choosing an LLM model

After Ollama is installed, you can browse the list of available LLMs on the Ollama website and download any model you want (or any model your computer can handle).
For reference, if you don't know which LLM model to choose (we've all been there), here is a list of models you can try on consumer hardware:

| Computer power | LLM models to try | LLM quality |
| --- | --- | --- |
| Out of this world (RTX 4090 / 64 GB RAM) | llama3.3:70b, gemma3:27b, deepseek-r1:32b, mixtral:8x22b | Excellent reasoning and instruction following. |
| Epic (RTX 4090 / 32 GB RAM) | llama3.1:8b, gemma3:27b, dolphin-mixtral:8x7b, dolphin-mixtral:8x7b-v2.5-q6_K | Good reasoning and instruction following. (The q6_K model should be less resource-hungry than the default Mixtral if you run into issues.) |
| Gamer (GTX 1080 Ti / 16 GB RAM) | llama3.1:8b, mistral:7b | Llama still works but is slower. Expect limited reasoning and no more than 2 complex instructions at a time. |
| Potato | phi:2.7b, phi3:3.8b, tinyllama:1.1b | Almost no reasoning, incapable of following more than 1 instruction at a time, English-bound only; dumb as a stone. |

If you have access to a GPU VM with A LOT of RAM, then you should try the state-of-the-art llama3.1:405b model.


Running the model

Once you have chosen your model, it's time to use the Ollama CLI to pull it onto your computer.

  • Download the model with ollama pull <model_name>;
  • List installed models with ollama list;
  • When ready, test the model locally with ollama run <model_name>, which starts a conversation with the LLM. You can also check the server from Python, as shown in the sketch below.
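As a complement to the CLI, you can verify from Python that the Ollama server is running and see which models are pulled. This minimal sketch uses only the standard library and assumes Ollama's default local endpoint (http://localhost:11434) and its /api/tags route, which lists installed models:

import json
import urllib.request

# Ollama's default local API; /api/tags returns the locally installed models
with urllib.request.urlopen("http://localhost:11434/api/tags") as response:
    data = json.load(response)

for model in data.get("models", []):
    print(model["name"])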

Installing Yacana


pip install yacana
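To confirm the installation, you can print the installed version using Python's standard packaging metadata (this is plain Python tooling, not a Yacana API):

from importlib.metadata import version

# Reads the version recorded by pip for the installed 'yacana' package
print(version("yacana"))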

Imports

When using other frameworks, 'import hell' quickly appears. To avoid this, we suggest always importing all of Yacana's modules and, once you're done developing, letting the IDE remove the unused ones (unused imports generally appear grayed out). Prepend these imports to all your files and clean them up later: the IDE will have auto-completion available for everything, which speeds up development considerably.


from yacana import OllamaAgent, OpenAiAgent, GenericAgent, Task, Tool, Message, GenericMessage, MessageRole, GroupSolve, EndChat, EndChatMode, OllamaModelSettings, OpenAiModelSettings, LoggerManager, ToolError, MaxToolErrorIter
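As an illustration of that workflow, the short script below uses only two of the sixteen imports; the rest would show up grayed in the IDE, ready to be auto-removed once the file is finished. The agent and task calls follow the same minimal sketch as earlier in this section:

from yacana import OllamaAgent, OpenAiAgent, GenericAgent, Task, Tool, Message, GenericMessage, MessageRole, GroupSolve, EndChat, EndChatMode, OllamaModelSettings, OpenAiModelSettings, LoggerManager, ToolError, MaxToolErrorIter

# Only OllamaAgent and Task end up used here; the IDE grays out the rest
agent = OllamaAgent("AI assistant", "llama3.1:8b")
message = Task("Say hello!", agent).solve()
print(message.content)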
