Ollama is a tool for running AI/LLM models locally. It can be installed natively or run as a Docker image on a local machine. Ollama supports a library of AI/LLM models for different use cases, for example LLaVA for image description, Falcon for RAG-based question answering, SQLCoder for SQL generation, and Mixtral for function calling. Ollama uses GPUs when they are available and falls back to the CPU otherwise. Spring AI provides Ollama support that makes working with a local model feel similar to calling a hosted AI service. On current CPUs, inference performance is often a limitation; CPU vendors plan to add dedicated AI engines to future processors to address this.
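Spring AI exposes the local Ollama model through its regular chat abstraction, so application code looks much the same as it would against a hosted service. Below is a minimal sketch of a REST endpoint that talks to a locally running model; the spring-ai-ollama-spring-boot-starter dependency, the default Ollama address, and names like LocalChatController and the /chat endpoint are assumptions for illustration, and the exact ChatClient API depends on the Spring AI version in use.

```java
// Minimal sketch: a Spring AI chat endpoint backed by a local Ollama model.
// Assumes the spring-ai-ollama-spring-boot-starter dependency is on the classpath
// and Ollama is reachable at its default address (http://localhost:11434).
// The model can be selected via the spring.ai.ollama.chat.options.model property,
// e.g. "mixtral".
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class LocalChatController {

    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder for the configured Ollama model.
    LocalChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Sends the question to the locally running model and returns its answer text.
    @GetMapping("/chat")
    String chat(@RequestParam String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}
```

Because the model runs behind Ollama on the local machine, switching between CPU and GPU execution or trying a different model requires no change to this application code, only to the Ollama setup and configuration properties.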