
Google GenAI (Gemini)

The Google GenAI plugin provides access to Google’s Gemini models, text embeddings, Imagen image generation, text-to-speech, and Veo video generation.

Add the plugin dependency to your Maven `pom.xml`:

```xml
<dependency>
  <groupId>com.google.genkit</groupId>
  <artifactId>genkit-plugin-google-genai</artifactId>
  <version>1.0.0-SNAPSHOT</version>
</dependency>
```

Then set your API key as an environment variable:

```shell
export GOOGLE_GENAI_API_KEY=your-api-key
```

Get an API key from Google AI Studio.

To use Vertex AI instead of the Google AI Developer API:

```shell
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT=my-project
export GOOGLE_CLOUD_LOCATION=us-central1 # optional, defaults to us-central1
```
Initialize the plugin and generate text:

```java
import com.google.genkit.plugins.googlegenai.GoogleGenAIPlugin;

Genkit genkit = Genkit.builder()
    .plugin(GoogleGenAIPlugin.create(System.getenv("GOOGLE_GENAI_API_KEY")))
    .build();

ModelResponse response = genkit.generate(
    GenerateOptions.builder()
        .model("googleai/gemini-2.0-flash")
        .prompt("Tell me about AI")
        .build());
```

Generate text embeddings for RAG, semantic search, and similarity tasks:

```java
import java.util.List;

import com.google.genkit.ai.Document;
import com.google.genkit.ai.EmbedResponse;

List<Document> documents = List.of(
    Document.fromText("Genkit is a framework for building AI apps"),
    Document.fromText("Firebase provides cloud services")
);

EmbedResponse response = genkit.embed("googleai/text-embedding-004", documents);

// Access the embedding vectors
float[] vector = response.getEmbeddings().get(0).getValues();
// vector.length == 768
```

Embeddings are used automatically by vector store plugins (Firebase, Pinecone, pgvector, etc.) when you configure an embedder name. You can also use them directly for custom similarity search.
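For direct similarity search, a common approach is cosine similarity over the raw vectors. The helper below is a minimal sketch in plain Java; the class and method names (`EmbeddingSimilarity`, `cosineSimilarity`) are illustrative, not part of the plugin API:

```java
// Minimal cosine-similarity helper for comparing embedding vectors.
public class EmbeddingSimilarity {

    // Returns a value in [-1, 1]; higher means more similar.
    public static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v1 = {1f, 0f, 0f};
        float[] v2 = {1f, 0f, 0f};
        float[] v3 = {0f, 1f, 0f};
        System.out.println(cosineSimilarity(v1, v2)); // identical vectors -> 1.0
        System.out.println(cosineSimilarity(v1, v3)); // orthogonal vectors -> 0.0
    }
}
```

To rank documents against a query, compute the similarity between the query's vector and each document's vector, then sort descending.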

You can pass task-specific options to optimize embedding quality:

```java
Map<String, Object> embedOptions = Map.of(
    "taskType", "RETRIEVAL_DOCUMENT", // or "RETRIEVAL_QUERY", "SEMANTIC_SIMILARITY"
    "title", "Document title",
    "outputDimensionality", 256 // reduce dimensions if needed
);
```

Generate natural-sounding speech from text using Gemini TTS models:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.Map;

Map<String, Object> ttsOptions = Map.of("voiceName", "Zephyr");
GenerationConfig config = GenerationConfig.builder()
    .custom(ttsOptions)
    .build();

ModelResponse response = genkit.generate(
    GenerateOptions.builder()
        .model("googleai/gemini-2.5-flash-preview-tts")
        .prompt("Hello! Welcome to Genkit Java.")
        .config(config)
        .build());

// The response contains audio as a media part (WAV format, base64-encoded),
// e.g. "data:audio/wav;base64,..."
String dataUrl = response.getMessage().getParts().get(0).getMedia().getUrl();
String base64Data = dataUrl.substring(dataUrl.indexOf(",") + 1);
byte[] audioBytes = Base64.getDecoder().decode(base64Data);
Files.write(Path.of("output.wav"), audioBytes);
```

Generate videos from text prompts or images using Google’s Veo models:

```java
Map<String, Object> veoOptions = Map.of(
    "numberOfVideos", 1,
    "durationSeconds", 8,
    "aspectRatio", "16:9",
    "timeoutMs", 600000 // 10 minutes; video generation can take a while
);
GenerationConfig config = GenerationConfig.builder()
    .custom(veoOptions)
    .build();

ModelResponse response = genkit.generate(
    GenerateOptions.builder()
        .model("googleai/veo-3.0-generate-001")
        .prompt("A serene Japanese garden with cherry blossoms falling")
        .config(config)
        .build());

// The response contains video as a media part (base64-encoded)
String videoDataUrl = response.getMessage().getParts().get(0).getMedia().getUrl();
```
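As with the TTS example, the returned media URL can be decoded and written to disk. This sketch assumes the video arrives as a base64 data URL (e.g. `data:video/mp4;base64,...`); the `DataUrlWriter` helper name is illustrative, and the actual MIME type may vary by model:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

public class DataUrlWriter {

    // Decodes the base64 payload of a "data:<mime>;base64,<payload>" URL.
    public static byte[] decodeDataUrl(String dataUrl) {
        int comma = dataUrl.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("not a data URL: missing comma");
        }
        return Base64.getDecoder().decode(dataUrl.substring(comma + 1));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical payload; in practice, pass videoDataUrl from the response.
        String url = "data:video/mp4;base64,"
            + Base64.getEncoder().encodeToString("demo".getBytes());
        Files.write(Path.of("output.mp4"), decodeDataUrl(url));
    }
}
```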

Generate images with Imagen:

```java
Map<String, Object> imagenOptions = Map.of(
    "numberOfImages", 1,
    "aspectRatio", "1:1"
);
GenerationConfig config = GenerationConfig.builder()
    .custom(imagenOptions)
    .build();

ModelResponse response = genkit.generate(
    GenerateOptions.builder()
        .model("googleai/imagen-4.0-fast-generate-001")
        .prompt("A cat wearing a space suit")
        .config(config)
        .build());
```

See the google-genai sample for complete examples of text generation, tool calling, embeddings, image generation, TTS, and video generation.