
Ollama

The Ollama plugin enables local LLM inference without requiring API keys.

Add the Maven dependency:

<dependency>
<groupId>com.google.genkit</groupId>
<artifactId>genkit-plugin-ollama</artifactId>
<version>1.0.0-SNAPSHOT</version>
</dependency>

Install and run Ollama:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model
ollama pull gemma3n:e4b

# Optional: configure Ollama host (default: http://localhost:11434)
export OLLAMA_HOST=http://localhost:11434
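Before wiring the plugin into Genkit, you can confirm the server is running and the model was pulled. Both `ollama list` and the `/api/tags` endpoint are part of Ollama's standard CLI and REST API; this check requires the local Ollama daemon started above.

```shell
# List locally pulled models (should include gemma3n:e4b)
ollama list
# Or query the REST API directly at the default host
curl http://localhost:11434/api/tags
```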
import com.google.genkit.plugins.ollama.OllamaPlugin;

// Register the Ollama plugin with Genkit.
Genkit genkit = Genkit.builder()
    .plugin(OllamaPlugin.create())
    .build();

// The "ollama/" prefix routes the request to the plugin; the tag
// matches the model pulled above (gemma3n:e4b).
ModelResponse response = genkit.generate(
    GenerateOptions.builder()
        .model("ollama/gemma3n:e4b")
        .prompt("Tell me about AI")
        .build());
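Under the hood, the plugin speaks Ollama's HTTP API. For reference, here is a minimal sketch of the equivalent direct request using only the JDK's `java.net.http` client; the `/api/generate` endpoint and the `model`/`prompt`/`stream` fields are Ollama's documented request format, and actually sending it requires the running server from the setup step.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Default host, overridable via OLLAMA_HOST.
String host = "http://localhost:11434";

// Request body per Ollama's /api/generate API.
String payload = "{\"model\": \"gemma3n\", \"prompt\": \"Tell me about AI\", \"stream\": false}";

HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create(host + "/api/generate"))
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(payload))
    .build();

// Sending requires a running Ollama server:
// HttpResponse<String> res = HttpClient.newHttpClient()
//     .send(request, HttpResponse.BodyHandlers.ofString());
```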

Use any model available in Ollama. Popular choices:

  • ollama/gemma3n — Google Gemma 3n
  • ollama/llama3.1 — Meta Llama 3.1
  • ollama/mistral — Mistral 7B
  • ollama/codellama — Code-focused model

The plugin supports text generation and streaming, and is local-first: no API key is required.

See the ollama sample.