diff --git a/README.md b/README.md index 3e707b6c..e0c035f2 100644 --- a/README.md +++ b/README.md @@ -30,31 +30,31 @@ available samples. ## Samples Here is the list of samples you can find in the `/samples` folder: -| Samples | | -|:----------------------------------------------------------------------------------------------------------------------------------|-----------| -| Gemini Image Chat sample | ✨🖼️🍌 **Gemini Image Chat**:
A chatbot app using the new [Gemini 2.5 Flash Image model](https://developers.googleblog.com/en/introducing-gemini-2-5-flash-image/) (a.k.a. "NanoBanana") enabling image generation and iterations via conversation with the Gemini model. Ask the model to generate an image and ask for tweaks in the chat.



**[> Browse code](samples/gemini-image-chat)**

| -| | | -| Gemini Chatbot sample | ✨🗣️ **Gemini Chatbot**:
A chatbot app using the Gemini Flash model. You can tweak the [system instructions](https://firebase.google.com/docs/ai-logic/system-instructions) in the model configuration to change the tone or the persona of the model.



**[> Browse code](samples/gemini-chatbot)**

| -| | | -| Gemini Multimodal sample | ✨📸 **Gemini Multimodal**:
A sample leveraging the [multimodal capabilities](https://developer.android.com/ai/gemini/developer-api#generate-text-from-media) of the Gemini Flash model (in this case text and image-to-text) to let you prompt the model with an image.



**[> Browse code](samples/gemini-multimodal)**

| -| | | -| Gemini Nano summarization sample | ✨📱📰 **On-device Summarization**:
A sample letting you summarize text on-device using Gemini Nano via the [GenAI Summarization API](https://developers.google.com/ml-kit/genai/summarization/android).



**[> Browse code](samples/genai-summarization)**

| -| | | -| Gemini Nano Image description | ✨📱🔍 **On-device Image Description**:
A sample letting you generate image descriptions using Gemini Nano via the [GenAI Image Description API](https://developers.google.com/ml-kit/genai/image-description/android).



**[> Browse code](samples/genai-image-description)**

| -| | | -| Gemini Nano Rewrite | ✨📱🖋️ **On-device Writing Assistance**:
A sample letting you proofread and rewrite text using Gemini Nano via the [GenAI Rewriting API](https://developers.google.com/ml-kit/genai/rewriting/android).



**[> Browse code](samples/genai-writing-assistance)**

| -| | | -| Imagen sample | 🖼️ **Image Generation with Imagen**:
A sample using [Imagen to generate images](https://developer.android.com/ai/imagen#generate-image) of landscapes, objects and people in various artistic style.



**[> Browse code](samples/imagen)**

| -| | | -| Magic Selfie sample | 🖼️📸 **Magic Selfie**:
A sample using [ML Kit subject Segmentation SDK](https://developers.google.com/ml-kit/vision/subject-segmentation/android) to remove the background behind a person, and [Imagen](https://developer.android.com/ai/imagen#generate-image) to generate new background.



**[> Browse code](samples/magic-selfie)**

| -| | | -| Gemini Video Summarization sample | ✨🎥 **Gemini Video Summarization**:
A sample using Gemini Flash to [summarize videos](https://firebase.google.com/docs/ai-logic/analyze-video?api=dev) leveraging the [large file support](https://firebase.google.com/docs/ai-logic/solutions/cloud-storage).



**[> Browse code](samples/gemini-video-summarization)**

| -| | | +| Samples | | +|:----------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Gemini Hybrid sample | ✨📱☁️ **Hybrid Inference**:
A sample demonstrating a hybrid approach to generative AI, using both on-device (Gemini Nano via ML Kit) and cloud-based (Gemini via Firebase AI SDK) models. It shows how to fall back to the cloud when on-device capabilities are unavailable.



**[> Browse code](samples/gemini-hybrid)**

| +| | | +| Gemini Image Chat sample | ✨🖼️🍌 **Gemini Image Chat**:
A chatbot app using the new [Gemini 3 Pro Image model](https://deepmind.google/models/gemini-image/pro/) (a.k.a. "Nano Banana Pro") enabling image generation and iterations via conversation with the Gemini model. Ask the model to generate an image and ask for tweaks in the chat.



**[> Browse code](samples/gemini-image-chat)**

| +| | | +| Gemini Chatbot sample | ✨🗣️ **Gemini Chatbot**:
A chatbot app using the Gemini Flash model. You can tweak the [system instructions](https://firebase.google.com/docs/ai-logic/system-instructions) in the model configuration to change the tone or the persona of the model.



**[> Browse code](samples/gemini-chatbot)**

| +| | | +| Gemini Multimodal sample | ✨📸 **Gemini Multimodal**:
A sample leveraging the [multimodal capabilities](https://developer.android.com/ai/gemini/developer-api#generate-text-from-media) of the Gemini Flash model (in this case text and image-to-text) to let you prompt the model with an image.



**[> Browse code](samples/gemini-multimodal)**

| +| | | +| Gemini Nano summarization sample | ✨📱📰 **On-device Summarization**:
A sample letting you summarize text on-device using Gemini Nano via the [GenAI Summarization API](https://developers.google.com/ml-kit/genai/summarization/android).



**[> Browse code](samples/genai-summarization)**

| +| | | +| Gemini Nano Image description | ✨📱🔍 **On-device Image Description**:
A sample letting you generate image descriptions using Gemini Nano via the [GenAI Image Description API](https://developers.google.com/ml-kit/genai/image-description/android).



**[> Browse code](samples/genai-image-description)**

| +| | | +| Gemini Nano Rewrite | ✨📱🖋️ **On-device Writing Assistance**:
A sample letting you proofread and rewrite text using Gemini Nano via the [GenAI Rewriting API](https://developers.google.com/ml-kit/genai/rewriting/android).



**[> Browse code](samples/genai-writing-assistance)**

| +| | | +| Nanobanana sample | 🖼️🍌 **Nanobanana**:
A sample using the [Gemini 3.1 Flash Image model](https://developer.android.com/ai/gemini) (a.k.a. "Nano Banana") to generate images of landscapes, objects and people in various artistic styles.



**[> Browse code](samples/nanobanana)**

| +| | | +| Magic Selfie sample | 🖼️📸 **Magic Selfie**:
A sample using the [ML Kit subject Segmentation SDK](https://developers.google.com/ml-kit/vision/subject-segmentation/android) to remove the background behind a person, and Nano Banana to generate a new background.



**[> Browse code](samples/magic-selfie)**

| +| | | +| Gemini Video Summarization sample | ✨🎥 **Gemini Video Summarization**:
A sample using Gemini Flash to [summarize videos](https://firebase.google.com/docs/ai-logic/analyze-video?api=dev) leveraging the [large file support](https://firebase.google.com/docs/ai-logic/solutions/cloud-storage).



**[> Browse code](samples/gemini-video-summarization)**

| +| | | | Gemini Video Metadata sample | ✨🎥 **Gemini Video Metadata Creation**:
A sample using Gemini Flash to generate thumbnails, descriptions, hashtags, account tags, chapters and links from a video. This sample leverages the ability to provide a [YouTube video link](https://firebase.google.com/docs/ai-logic/input-file-requirements?api=dev#provide-file-using-url) to the model context for inference.



**[> Browse code](samples/gemini-video-metadata-creation)**

| -| | | -| Gemini Live Todo sample | ✨🗣️ **Gemini Live Todo App**:
A Todo List app using the [Gemini Live API](https://developer.android.com/ai/gemini/live) to let the user interact with Gemini Live via voice to update the todo list.



**[> Browse code](samples/gemini-live-todo)**

| -| | | -| Imagen Editing sample | 🖼️🖌️ **Imagen Editing**:
A sample using Imagen to [generate images](https://developer.android.com/ai/imagen#generate-image) and [editing images](https://developer.android.com/ai/imagen#editing) using the mask based editing capabilities of the model.



**[> Browse code](samples/imagen-editing)**

| +| | | +| Gemini Live API to-do sample | ✨🗣️ **Gemini Live API To-do App**:
A to-do list app using the [Gemini Live API](https://developer.android.com/ai/gemini/live) to let the user interact with Gemini via voice to update the to-do list.



**[> Browse code](samples/gemini-live-todo)**

| ## Reporting issues diff --git a/app/build.gradle.kts b/app/build.gradle.kts index cb0cef6c..12329c99 100644 --- a/app/build.gradle.kts +++ b/app/build.gradle.kts @@ -86,13 +86,13 @@ dependencies { implementation(project(":samples:genai-summarization")) implementation(project(":samples:genai-image-description")) implementation(project(":samples:genai-writing-assistance")) - implementation(project(":samples:imagen")) - implementation(project(":samples:imagen-editing")) + implementation(project(":samples:nanobanana")) implementation(project(":samples:magic-selfie")) implementation(project(":samples:gemini-video-summarization")) implementation(project(":samples:gemini-live-todo")) implementation(project(":samples:gemini-video-metadata-creation")) implementation(project(":samples:gemini-image-chat")) + implementation(project(":samples:gemini-hybrid")) testImplementation(libs.junit) androidTestImplementation(libs.androidx.junit) diff --git a/app/src/main/java/com/android/ai/catalog/domain/SampleCatalog.kt b/app/src/main/java/com/android/ai/catalog/domain/SampleCatalog.kt index c4b10287..51b15282 100644 --- a/app/src/main/java/com/android/ai/catalog/domain/SampleCatalog.kt +++ b/app/src/main/java/com/android/ai/catalog/domain/SampleCatalog.kt @@ -31,13 +31,25 @@ import com.android.ai.samples.geminivideosummary.ui.VideoSummarizationScreen import com.android.ai.samples.genai_image_description.GenAIImageDescriptionScreen import com.android.ai.samples.genai_summarization.GenAISummarizationScreen import com.android.ai.samples.genai_writing_assistance.GenAIWritingAssistanceScreen -import com.android.ai.samples.imagen.ui.ImagenScreen -import com.android.ai.samples.imagenediting.ui.ImagenEditingScreen +import com.android.ai.samples.geminihybrid.GeminiHybridScreen +import com.android.ai.samples.nanobanana.ui.NanobananaScreen import com.android.ai.samples.magicselfie.ui.MagicSelfieScreen import com.android.ai.theme.extendedColorScheme +import 
com.google.firebase.ai.type.PublicPreviewAPI +@OptIn(PublicPreviewAPI::class) @RequiresPermission(Manifest.permission.RECORD_AUDIO) val sampleCatalog = listOf( + SampleCatalogItem( + title = R.string.gemini_hybrid_sample_list_title, + description = R.string.gemini_hybrid_sample_list_description, + route = "GeminiHybridScreen", + sampleEntryScreen = { GeminiHybridScreen() }, + tags = listOf(SampleTags.GEMINI_NANO, SampleTags.GEMINI_FLASH, SampleTags.ML_KIT, SampleTags.FIREBASE), + needsFirebase = true, + keyArt = R.drawable.img_keyart_text, + isFeatured = true, + ), SampleCatalogItem( title = R.string.gemini_image_chat_list_title, description = R.string.gemini_image_chat_list_description, @@ -48,16 +60,6 @@ val sampleCatalog = listOf( needsFirebase = true, isFeatured = true, ), - SampleCatalogItem( - title = R.string.imagen_editing_sample_list_title, - description = R.string.imagen_editing_sample_list_description, - route = "ImagenMaskEditing", - sampleEntryScreen = { ImagenEditingScreen() }, - tags = listOf(SampleTags.IMAGEN, SampleTags.FIREBASE), - needsFirebase = true, - keyArt = R.drawable.img_keyart_imagen, - isFeatured = true, - ), SampleCatalogItem( title = R.string.gemini_multimodal_sample_list_title, description = R.string.gemini_multimodal_sample_list_description, @@ -102,11 +104,11 @@ val sampleCatalog = listOf( keyArt = R.drawable.img_keyart_text, ), SampleCatalogItem( - title = R.string.imagen_sample_list_title, - description = R.string.imagen_sample_list_description, - route = "ImagenImageGenerationScreen", - sampleEntryScreen = { ImagenScreen() }, - tags = listOf(SampleTags.IMAGEN, SampleTags.FIREBASE), + title = R.string.nanobanana_sample_list_title, + description = R.string.nanobanana_sample_list_description, + route = "NanobananaImageGenerationScreen", + sampleEntryScreen = { NanobananaScreen() }, + tags = listOf(SampleTags.GEMINI_FLASH, SampleTags.FIREBASE), needsFirebase = true, keyArt = R.drawable.img_keyart_imagen, ), @@ -115,7 +117,7 @@ val 
sampleCatalog = listOf( description = R.string.magic_selfie_sample_list_description, route = "MagicSelfieScreen", sampleEntryScreen = { MagicSelfieScreen() }, - tags = listOf(SampleTags.IMAGEN, SampleTags.FIREBASE, SampleTags.ML_KIT), + tags = listOf(SampleTags.GEMINI_FLASH, SampleTags.FIREBASE), needsFirebase = true, keyArt = R.drawable.img_keyart_magic_selfie, ), diff --git a/app/src/main/res/values/strings.xml b/app/src/main/res/values/strings.xml index 299ac9da..f8876099 100644 --- a/app/src/main/res/values/strings.xml +++ b/app/src/main/res/values/strings.xml @@ -13,20 +13,20 @@ Android AI Samples Android\nAI Samples Open sample - Image generation with Imagen - Generate images with Imagen, Google image generation model - Image Editing with Imagen - Generate images and edit only specific areas of a generated image with inpainting - Magic Selfie with Imagen and ML Kit - Change the background of your selfies with Imagen and the ML Kit Segmentation API + Image generation with Nanobanana + Generate images with Nanobanana, Google image generation model + Magic Selfie with Gemini + Change the background of your selfies with the Gemini Flash model Video Summarization with Gemini and Firebase "Generate a summary of a video (from a cloud URL or Youtube) with Gemini API powered by Firebase" Video Metadata Creation with Gemini and Firebase "Generate metadata of a video (from a cloud URL or Youtube) with Gemini API powered by Firebase" - Gemini Live Todo - "Simple Todo app using the Gemini Live API to interact with the items in the list" - Chat with Nano Banana - Conversational Image generation with Gemini 2.5 Flash Image + Gemini Live API to-do + "Simple to-do app using the Gemini Live API to interact with the items in the list" + Chat with Nano Banana Pro + Conversational Image generation with Gemini 3 Pro Image + Gemini Hybrid + Inference with Firebase Hybrid SDK using either Gemini Nano on-device or Gemini Flash in the Cloud. 
Firebase Required This feature requires Firebase to be initialized. Close diff --git a/gradle/libs.versions.toml b/gradle/libs.versions.toml index 19f57495..adee43f7 100644 --- a/gradle/libs.versions.toml +++ b/gradle/libs.versions.toml @@ -1,7 +1,8 @@ [versions] agp = "8.8.2" coilCompose = "3.1.0" -firebaseBom = "34.5.0" +firebaseAiOndevice = "16.0.0-beta01" +firebaseBom = "34.11.0" lifecycleRuntimeCompose = "2.9.1" mlkitGenAi = "1.0.0-beta1" kotlin = "2.1.0" @@ -40,8 +41,9 @@ richtext = "1.0.0-alpha02" androidx-core-ktx = { group = "androidx.core", name = "core-ktx", version.ref = "coreKtx" } androidx-lifecycle-runtime-compose = { module = "androidx.lifecycle:lifecycle-runtime-compose", version.ref = "lifecycleRuntimeCompose" } coil-compose = { module = "io.coil-kt.coil3:coil-compose", version.ref = "coilCompose" } +firebase-ai-ondevice = { module = "com.google.firebase:firebase-ai-ondevice", version.ref = "firebaseAiOndevice" } firebase-bom = { module = "com.google.firebase:firebase-bom", version.ref = "firebaseBom" } -firebase-ai = { group = "com.google.firebase", name = "firebase-ai" } +firebase-ai = { module = "com.google.firebase:firebase-ai"} firebase-common-ktx = { group = "com.google.firebase", name = "firebase-common-ktx", version.ref = "firebaseCommonKtx" } genai-image-description = { module = "com.google.mlkit:genai-image-description", version.ref = "mlkitGenAi" } genai-proofreading = { module = "com.google.mlkit:genai-proofreading", version.ref = "mlkitGenAi" } @@ -79,7 +81,6 @@ androidx-media3-ui = { module = "androidx.media3:media3-ui", version.ref = "medi androidx-media3-ui-compose = { module = "androidx.media3:media3-ui-compose", version.ref = "media3"} androidx-media3-transformer = { module = "androidx.media3:media3-transformer", version.ref = "media3" } androidx-ui-tooling-preview-android = { group = "androidx.compose.ui", name = "ui-tooling-preview-android", version.ref = "uiToolingPreviewAndroid" } -mlkit-segmentation = { module = 
"com.google.android.gms:play-services-mlkit-subject-segmentation", version.ref = "mlkitSegmentation" } ui-tooling-preview = { group = "androidx.compose.ui", name = "ui-tooling-preview", version.ref = "uiToolingPreview" } ui-tooling = { group = "androidx.compose.ui", name = "ui-tooling", version.ref = "uiTooling" } androidx-lifecycle-viewmodel-android = { group = "androidx.lifecycle", name = "lifecycle-viewmodel-android", version.ref = "lifecycleViewmodelAndroid" } @@ -94,4 +95,4 @@ google-gms-google-services = { id = "com.google.gms.google-services", version.re hilt-plugin = { id = "com.google.dagger.hilt.android", version.ref = "hilt"} ksp = { id = "com.google.devtools.ksp", version.ref = "ksp" } compose-compiler = { id = "org.jetbrains.kotlin.plugin.compose", version.ref = "kotlin" } -spotless = { id = "com.diffplug.spotless", version.ref = "spotless" } \ No newline at end of file +spotless = { id = "com.diffplug.spotless", version.ref = "spotless" } diff --git a/samples/gemini-hybrid/.gitignore b/samples/gemini-hybrid/.gitignore new file mode 100644 index 00000000..796b96d1 --- /dev/null +++ b/samples/gemini-hybrid/.gitignore @@ -0,0 +1 @@ +/build diff --git a/samples/gemini-hybrid/README.md b/samples/gemini-hybrid/README.md new file mode 100644 index 00000000..650cc640 --- /dev/null +++ b/samples/gemini-hybrid/README.md @@ -0,0 +1,28 @@ +# Gemini Hybrid Sample + +This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository. + +## Description + +This sample demonstrates how to use the Firebase Hybrid SDK, utilizing both on-device (Gemini Nano via [ML Kit Prompt API](https://developers.google.com/ml-kit/genai/prompt/android)) and cloud-based models via the [Firebase AI Logic SDK](https://firebase.google.com/docs/ai-logic). + +The sample lets users generate generic user reviews for a hotel based on a few selected topics. + +
+Gemini Hybrid SDK in action +
+ +## How it works + +Here is how the model is instantiated to leverage hybrid inference: +```kotlin +val model = Firebase.ai(backend = GenerativeBackend.googleAI()) + .generativeModel( + "gemini-2.5-flash-lite", + onDeviceConfig = OnDeviceConfig(mode = InferenceMode.PREFER_ON_DEVICE) + ) + +val response = model.generateContent(prompt) +``` + +Read more about the [Firebase Hybrid SDK](https://firebase.google.com/docs/ai-logic/hybrid/android/get-started?api=dev) in the Firebase documentation. diff --git a/samples/imagen-editing/build.gradle.kts b/samples/gemini-hybrid/build.gradle.kts similarity index 86% rename from samples/imagen-editing/build.gradle.kts rename to samples/gemini-hybrid/build.gradle.kts index bf26d7b5..fa698ae1 100644 --- a/samples/imagen-editing/build.gradle.kts +++ b/samples/gemini-hybrid/build.gradle.kts @@ -13,6 +13,7 @@ * See the License for the specific language governing permissions and * limitations under the License. */ + plugins { alias(libs.plugins.android.library) alias(libs.plugins.jetbrains.kotlin.android) @@ -21,7 +22,7 @@ plugins { } android { - namespace = "com.android.ai.samples.imagenediting" + namespace = "com.android.ai.samples.geminihybrid" compileSdk = 36 buildFeatures { @@ -29,7 +30,8 @@ android { } defaultConfig { - minSdk = 24 + minSdk = 26 + testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner" consumerProguardFiles("consumer-rules.pro") } @@ -43,38 +45,31 @@ android { ) } } - compileOptions { sourceCompatibility = JavaVersion.VERSION_17 targetCompatibility = JavaVersion.VERSION_17 } - kotlinOptions { jvmTarget = "17" } - - lint { - warningsAsErrors = true - } } dependencies { implementation(libs.androidx.core.ktx) implementation(libs.androidx.appcompat) implementation(libs.androidx.material3) + implementation(libs.androidx.activity.compose) implementation(platform(libs.androidx.compose.bom)) implementation(libs.androidx.material.icons.extended) - implementation(platform(libs.firebase.bom)) - 
implementation(libs.firebase.ai) implementation(libs.hilt.android) implementation(libs.hilt.navigation.compose) implementation(libs.androidx.runtime.livedata) - implementation(libs.ui.tooling.preview) + implementation(libs.androidx.lifecycle.runtime.compose) + implementation(platform(libs.firebase.bom)) + implementation(libs.firebase.ai) + implementation(libs.firebase.ai.ondevice) + implementation(project(":ui-component")) debugImplementation(libs.ui.tooling) ksp(libs.hilt.compiler) - - testImplementation(libs.junit) - androidTestImplementation(libs.androidx.junit) - androidTestImplementation(libs.androidx.espresso.core) } diff --git a/samples/gemini-hybrid/consumer-rules.pro b/samples/gemini-hybrid/consumer-rules.pro new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/samples/gemini-hybrid/consumer-rules.pro @@ -0,0 +1 @@ + diff --git a/samples/gemini-hybrid/gemini_hybrid.png b/samples/gemini-hybrid/gemini_hybrid.png new file mode 100644 index 00000000..1324a668 Binary files /dev/null and b/samples/gemini-hybrid/gemini_hybrid.png differ diff --git a/samples/imagen-editing/proguard-rules.pro b/samples/gemini-hybrid/proguard-rules.pro similarity index 94% rename from samples/imagen-editing/proguard-rules.pro rename to samples/gemini-hybrid/proguard-rules.pro index 481bb434..f1b42451 100644 --- a/samples/imagen-editing/proguard-rules.pro +++ b/samples/gemini-hybrid/proguard-rules.pro @@ -18,4 +18,4 @@ # If you keep the line number information, uncomment this to # hide the original source file name. 
-#-renamesourcefileattribute SourceFile \ No newline at end of file +#-renamesourcefileattribute SourceFile diff --git a/samples/gemini-hybrid/src/main/java/com/android/ai/samples/geminihybrid/GeminiHybridScreen.kt b/samples/gemini-hybrid/src/main/java/com/android/ai/samples/geminihybrid/GeminiHybridScreen.kt new file mode 100644 index 00000000..a15fb129 --- /dev/null +++ b/samples/gemini-hybrid/src/main/java/com/android/ai/samples/geminihybrid/GeminiHybridScreen.kt @@ -0,0 +1,486 @@ +/* + * Copyright 2025 The Android Open Source Project + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +@file:OptIn(PublicPreviewAPI::class, ExperimentalMaterial3ExpressiveApi::class) + +package com.android.ai.samples.geminihybrid + +import androidx.activity.compose.LocalOnBackPressedDispatcherOwner +import androidx.compose.foundation.background +import androidx.compose.foundation.layout.Box +import androidx.compose.foundation.layout.Column +import androidx.compose.foundation.layout.ExperimentalLayoutApi +import androidx.compose.foundation.layout.FlowRow +import androidx.compose.foundation.layout.Spacer +import androidx.compose.foundation.layout.fillMaxHeight +import androidx.compose.foundation.layout.fillMaxSize +import androidx.compose.foundation.layout.fillMaxWidth +import androidx.compose.foundation.layout.height +import androidx.compose.foundation.layout.heightIn +import androidx.compose.foundation.layout.imePadding +import androidx.compose.foundation.layout.padding +import androidx.compose.foundation.layout.widthIn +import androidx.compose.foundation.rememberScrollState +import androidx.compose.foundation.shape.RoundedCornerShape +import androidx.compose.foundation.verticalScroll +import androidx.compose.material.icons.Icons +import androidx.compose.material.icons.filled.ArrowDropDown +import androidx.compose.material3.ButtonDefaults +import androidx.compose.material3.DropdownMenu +import androidx.compose.material3.DropdownMenuItem +import androidx.compose.material3.ExperimentalMaterial3Api +import androidx.compose.material3.ExperimentalMaterial3ExpressiveApi +import androidx.compose.material3.Icon +import androidx.compose.material3.MaterialTheme +import androidx.compose.material3.OutlinedTextField +import androidx.compose.material3.OutlinedToggleButton +import androidx.compose.material3.Scaffold +import androidx.compose.material3.SplitButtonDefaults +import androidx.compose.material3.SplitButtonLayout +import androidx.compose.material3.Text +import androidx.compose.material3.TextField +import androidx.compose.material3.TextFieldDefaults +import 
androidx.compose.material3.ToggleButtonDefaults +import androidx.compose.runtime.Composable +import androidx.compose.runtime.getValue +import androidx.compose.runtime.mutableStateOf +import androidx.compose.runtime.remember +import androidx.compose.runtime.setValue +import androidx.compose.ui.Alignment +import androidx.compose.ui.Modifier +import androidx.compose.ui.draw.clip +import androidx.compose.ui.graphics.Color +import androidx.compose.ui.platform.LocalContext +import androidx.compose.ui.res.painterResource +import androidx.compose.ui.res.stringResource +import androidx.compose.ui.text.font.FontWeight +import androidx.compose.ui.unit.dp +import androidx.compose.ui.unit.sp +import androidx.core.content.ContextCompat +import androidx.hilt.navigation.compose.hiltViewModel +import androidx.lifecycle.compose.collectAsStateWithLifecycle +import com.android.ai.theme.AISampleCatalogTheme +import com.android.ai.theme.surfaceContainerHighestLight +import com.android.ai.uicomponent.GenerateButton +import com.android.ai.uicomponent.SampleDetailTopAppBar +import com.android.ai.uicomponent.UndoButton +import com.google.firebase.ai.InferenceMode +import com.google.firebase.ai.type.PublicPreviewAPI + + +@OptIn(ExperimentalMaterial3Api::class, ExperimentalLayoutApi::class) +@Composable +fun GeminiHybridScreen(viewModel: GeminiHybridViewModel = hiltViewModel()) { + + val uiState by viewModel.uiState.collectAsStateWithLifecycle() + + val context = LocalContext.current + val backDispatcher = LocalOnBackPressedDispatcherOwner.current?.onBackPressedDispatcher + + AISampleCatalogTheme { + Scaffold( + modifier = Modifier.fillMaxSize(), + topBar = { + SampleDetailTopAppBar( + sampleName = stringResource(R.string.gemini_hybrid_title), + sampleDescription = stringResource(R.string.gemini_hybrid_description), + sourceCodeUrl = "https://github.com/android/ai-samples/tree/main/samples/gemini-hybrid", + onBackClick = { backDispatcher?.onBackPressed() }, + ) + }, + ) { innerPadding -> + 
Box( + Modifier + .padding(innerPadding) + .fillMaxSize() + .clip(RoundedCornerShape(40.dp)) + .background(color = surfaceContainerHighestLight) + .padding(top = 16.dp, start = 16.dp, end = 16.dp, bottom = 32.dp), + contentAlignment = Alignment.Center, + ) { + val scrollState = rememberScrollState() + + Column( + Modifier + .padding(top = 16.dp) + .imePadding() + .widthIn(max = 646.dp) + .fillMaxHeight() + .verticalScroll(scrollState), + ) { + Text( + text = stringResource(R.string.gemini_hotel_review), + style = MaterialTheme.typography.titleLarge, + modifier = Modifier.padding(8.dp), + ) + + val status = uiState.status + when { + status is GeminiStatus.Initial -> { + InitialReviewUi( + tags = viewModel.tags, + selectedTags = uiState.selectedTags, + onTagToggle = viewModel::toggleTag, + selectedMode = uiState.selectedMode, + onModeSelected = viewModel::setInferenceMode, + onGenerate = { + val tagStrings = + uiState.selectedTags.map { ContextCompat.getString(context, it) } + viewModel.generateReview(tagStrings) + }, + ) + } + + status is GeminiStatus.Generating && !status.isTranslation -> { + GeneratingUi(status) + } + + status is GeminiStatus.Error -> { + ErrorUi(status.message, onReset = viewModel::reset) + } + + else -> { + SuccessReviewUi( + reviewText = uiState.reviewText, + reviewInferenceStatus = uiState.reviewInferenceStatus, + onReviewTextChanged = viewModel::updateReviewText, + languageKeys = viewModel.languageMap.keys.toList(), + languageMap = viewModel.languageMap, + selectedLanguage = uiState.selectedLanguage, + onLanguageSelected = viewModel::setSelectedLanguage, + onTranslate = { + viewModel.translate( + uiState.reviewText, + uiState.selectedLanguage + ) + }, + onReset = viewModel::reset, + generationStatus = status, + ) + } + } + } + } + } + } +} + +@OptIn(ExperimentalLayoutApi::class) +@Composable +fun InitialReviewUi( + tags: List, + selectedTags: List, + onTagToggle: (Int) -> Unit, + selectedMode: InferenceMode, + onModeSelected: (InferenceMode) 
-> Unit, + onGenerate: () -> Unit, +) { + Text( + stringResource(R.string.select_topics_for_your_review), + modifier = Modifier.padding(horizontal = 8.dp, vertical = 4.dp), + style = MaterialTheme.typography.titleMedium, + ) + FlowRow( + modifier = Modifier + .padding(8.dp), + ) { + tags.forEach { tagResId -> + val isSelected = selectedTags.contains(tagResId) + OutlinedToggleButton( + checked = isSelected, + onCheckedChange = { onTagToggle(tagResId) }, + colors = ToggleButtonDefaults.outlinedToggleButtonColors( + contentColor = MaterialTheme.colorScheme.tertiary, + ), + modifier = Modifier.padding(horizontal = 6.dp), + ) { + Text( + stringResource(tagResId), + style = MaterialTheme.typography.labelLarge, + fontWeight = FontWeight.Bold, + ) + } + } + } + Spacer(Modifier.height(50.dp)) + InferenceModeDropdown( + selectedMode = selectedMode, + onModeSelected = onModeSelected, + ) + + GenerateButton( + text = stringResource(R.string.gemini_hybrid_generate_btn), + icon = painterResource(id = com.android.ai.uicomponent.R.drawable.ic_ai_text), + modifier = Modifier + .fillMaxWidth() + .padding(top = 12.dp, start = 8.dp, end = 8.dp), + enabled = selectedTags.isNotEmpty(), + onClick = onGenerate, + ) +} + +@Composable +fun GeneratingUi(status: GeminiStatus.Generating) { + val statusText = if (status.isCloud) { + stringResource(R.string.gemini_hybrid_status_generating_cloud) + } else { + stringResource(R.string.gemini_hybrid_status_generating_on_device) + } + Column(modifier = Modifier.fillMaxWidth()) { + StatusText(statusText) + if (status.partialOutput.isNotEmpty()) { + OutputText(status.partialOutput) + } + } +} + +@Composable +fun SuccessReviewUi( + reviewText: String, + reviewInferenceStatus: Int?, + onReviewTextChanged: (String) -> Unit, + languageKeys: List, + languageMap: Map, + selectedLanguage: String, + onLanguageSelected: (String) -> Unit, + onTranslate: () -> Unit, + onReset: () -> Unit, + generationStatus: GeminiStatus, +) { + Column(modifier = 
Modifier.fillMaxWidth()) { + reviewInferenceStatus?.let { + StatusText(stringResource(it)) + } + OutlinedTextField( + value = reviewText, + onValueChange = onReviewTextChanged, + modifier = Modifier + .padding(4.dp) + .fillMaxWidth() + .heightIn(max = 200.dp), + ) + + Spacer(modifier = Modifier.height(20.dp)) + + Box(modifier = Modifier.padding(start = 8.dp, top = 12.dp)) { + LanguageDropdown( + languageKeys = languageKeys, + languageMap = languageMap, + selectedLanguage = selectedLanguage, + onLanguageSelected = onLanguageSelected, + ) + } + + GenerateButton( + text = stringResource(R.string.gemini_hybrid_translate_btn), + icon = painterResource(id = com.android.ai.uicomponent.R.drawable.ic_ai_text), + modifier = Modifier + .fillMaxWidth() + .padding(top = 12.dp, start = 8.dp, end = 8.dp), + enabled = reviewText.isNotBlank() && generationStatus !is GeminiStatus.Generating, + onClick = onTranslate, + ) + + Spacer(modifier = Modifier.height(20.dp)) + when (generationStatus) { + is GeminiStatus.Generating -> { + if (generationStatus.isTranslation) { + val statusText = if (generationStatus.isCloud) { + stringResource(R.string.gemini_hybrid_status_generating_cloud) + } else { + stringResource(R.string.gemini_hybrid_status_generating_on_device) + } + StatusText(statusText) + if (generationStatus.partialOutput.isNotEmpty()) { + OutputText(generationStatus.partialOutput) + } + } + } + + is GeminiStatus.Success -> { + if (generationStatus.isTranslation) { + val inferenceStatus = if (generationStatus.isCloud) { + R.string.gemini_hybrid_generated_cloud + } else { + R.string.gemini_hybrid_generated_on_device + } + + StatusText(stringResource(inferenceStatus)) + OutputText(generationStatus.output) + } + } + + else -> {} + } + + UndoButton( + modifier = Modifier.padding(start = 8.dp, top = 8.dp), + onClick = onReset, + ) + } +} + +@Composable +fun ErrorUi(message: String, onReset: () -> Unit) { + Column { + StatusText(message) + UndoButton( + modifier = Modifier.padding(start = 
8.dp, top = 8.dp), + onClick = onReset, + ) + } +} + +@Composable +fun LanguageDropdown( + languageKeys: List<String>, + languageMap: Map<String, Int>, + selectedLanguage: String, + onLanguageSelected: (String) -> Unit, +) { + var expanded by remember { mutableStateOf(false) } + + Box { + SplitButtonLayout( + leadingButton = { + SplitButtonDefaults.LeadingButton( + onClick = { expanded = true }, + colors = ButtonDefaults.buttonColors( + containerColor = MaterialTheme.colorScheme.tertiaryContainer, + contentColor = MaterialTheme.colorScheme.onTertiaryContainer, + ), + ) { + Text(stringResource(languageMap[selectedLanguage] ?: R.string.gemini_hybrid_lang_korean).uppercase()) + } + }, + trailingButton = { + SplitButtonDefaults.TrailingButton( + onClick = { expanded = true }, + colors = ButtonDefaults.buttonColors( + containerColor = MaterialTheme.colorScheme.tertiaryContainer, + contentColor = MaterialTheme.colorScheme.onTertiaryContainer, + ), + ) { + Icon( + imageVector = Icons.Default.ArrowDropDown, + contentDescription = null, + ) + } + }, + ) + DropdownMenu( + expanded = expanded, + onDismissRequest = { expanded = false }, + ) { + languageKeys.forEach { key -> + DropdownMenuItem( + text = { Text(stringResource(languageMap[key]!!)) }, + onClick = { + onLanguageSelected(key) + expanded = false + }, + ) + } + } + } +} + +@PublicPreviewAPI +@Composable +fun InferenceModeDropdown( + selectedMode: InferenceMode, + onModeSelected: (InferenceMode) -> Unit, +) { + var expanded by remember { mutableStateOf(false) } + val modes = listOf( + InferenceMode.ONLY_ON_DEVICE to stringResource(R.string.gemini_hybrid_mode_only_on_device), + InferenceMode.ONLY_IN_CLOUD to stringResource(R.string.gemini_hybrid_mode_only_cloud), + InferenceMode.PREFER_ON_DEVICE to stringResource(R.string.gemini_hybrid_mode_prefer_on_device), + InferenceMode.PREFER_IN_CLOUD to stringResource(R.string.gemini_hybrid_mode_prefer_cloud), + ) + val selectedText = modes.find { it.first == selectedMode }?.second ?: "" + + 
Box(modifier = Modifier.padding(start = 8.dp, top = 12.dp)) { + SplitButtonLayout( + leadingButton = { + SplitButtonDefaults.LeadingButton( + onClick = { expanded = true }, + colors = ButtonDefaults.buttonColors( + containerColor = MaterialTheme.colorScheme.tertiaryContainer, + contentColor = MaterialTheme.colorScheme.onTertiaryContainer, + ), + ) { + Text(selectedText) + } + }, + trailingButton = { + SplitButtonDefaults.TrailingButton( + onClick = { expanded = true }, + colors = ButtonDefaults.buttonColors( + containerColor = MaterialTheme.colorScheme.tertiaryContainer, + contentColor = MaterialTheme.colorScheme.onTertiaryContainer, + ), + ) { + Icon( + imageVector = Icons.Default.ArrowDropDown, + contentDescription = null, + ) + } + }, + ) + DropdownMenu( + expanded = expanded, + onDismissRequest = { expanded = false }, + ) { + modes.forEach { (mode, label) -> + DropdownMenuItem( + text = { Text(label) }, + onClick = { + onModeSelected(mode) + expanded = false + }, + ) + } + } + } +} + +@Composable +fun StatusText(text: String) { + Text( + text = text, + style = MaterialTheme.typography.bodySmall, + modifier = Modifier.padding(8.dp), + ) +} + +@Composable +fun OutputText(text: String, modifier: Modifier = Modifier) { + TextField( + value = text, + onValueChange = {}, + readOnly = true, + colors = TextFieldDefaults.colors( + focusedContainerColor = Color.Transparent, + unfocusedContainerColor = Color.Transparent, + focusedIndicatorColor = Color.Transparent, + unfocusedIndicatorColor = Color.Transparent, + disabledIndicatorColor = Color.Transparent, + disabledTextColor = MaterialTheme.colorScheme.onSurface, + ), + modifier = modifier.fillMaxWidth(), + textStyle = MaterialTheme.typography.bodyLarge.copy(fontSize = 16.sp), + ) +} diff --git a/samples/gemini-hybrid/src/main/java/com/android/ai/samples/geminihybrid/GeminiHybridViewModel.kt b/samples/gemini-hybrid/src/main/java/com/android/ai/samples/geminihybrid/GeminiHybridViewModel.kt new file mode 100644 index 
00000000..684b76a1 --- /dev/null +++ b/samples/gemini-hybrid/src/main/java/com/android/ai/samples/geminihybrid/GeminiHybridViewModel.kt @@ -0,0 +1,251 @@ +/* + * Copyright 2025 The Android Open Source Project + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package com.android.ai.samples.geminihybrid + +import android.util.Log +import androidx.lifecycle.ViewModel +import androidx.lifecycle.viewModelScope +import com.google.firebase.Firebase +import com.google.firebase.ai.InferenceMode +import com.google.firebase.ai.InferenceSource +import com.google.firebase.ai.OnDeviceConfig +import com.google.firebase.ai.ai +import com.google.firebase.ai.type.GenerativeBackend +import com.google.firebase.ai.type.PublicPreviewAPI +import dagger.hilt.android.lifecycle.HiltViewModel +import javax.inject.Inject +import kotlinx.coroutines.flow.MutableStateFlow +import kotlinx.coroutines.flow.StateFlow +import kotlinx.coroutines.flow.asStateFlow +import kotlinx.coroutines.flow.update +import kotlinx.coroutines.launch + +sealed interface GeminiStatus { + data object Initial : GeminiStatus + data class Generating( + val isCloud: Boolean, + val partialOutput: String = "", + val isTranslation: Boolean = false + ) : GeminiStatus + + data class Success( + val output: String, + val isCloud: Boolean, + val isTranslation: Boolean = false + ) : GeminiStatus + + data class Error(val message: String) : GeminiStatus +} + +@OptIn(PublicPreviewAPI::class) +data class 
GeminiHybridUiState( + val selectedMode: InferenceMode = InferenceMode.ONLY_ON_DEVICE, + val selectedTags: List<Int> = emptyList(), + val reviewText: String = "", + val reviewInferenceStatus: Int? = null, + val selectedLanguage: String = "Korean", + val status: GeminiStatus = GeminiStatus.Initial +) + +@PublicPreviewAPI +@HiltViewModel +class GeminiHybridViewModel @Inject constructor() : ViewModel() { + private val _uiState = MutableStateFlow(GeminiHybridUiState()) + val uiState: StateFlow<GeminiHybridUiState> = _uiState.asStateFlow() + + val tags = listOf( + R.string.location, + R.string.view, + R.string.service, + R.string.comfort, + R.string.food, + R.string.spacious, + R.string.natural_light, + ) + + val languageMap = mapOf( + "Korean" to R.string.gemini_hybrid_lang_korean, + "Spanish" to R.string.gemini_hybrid_lang_spanish, + "French" to R.string.gemini_hybrid_lang_french, + "German" to R.string.gemini_hybrid_lang_german + ) + + fun setInferenceMode(mode: InferenceMode) { + _uiState.update { it.copy(selectedMode = mode) } + } + + fun toggleTag(tagResId: Int) { + _uiState.update { state -> + val newTags = if (state.selectedTags.contains(tagResId)) { + state.selectedTags - tagResId + } else { + state.selectedTags + tagResId + } + state.copy(selectedTags = newTags) + } + } + + fun updateReviewText(text: String) { + _uiState.update { it.copy(reviewText = text) } + } + + fun setSelectedLanguage(language: String) { + _uiState.update { it.copy(selectedLanguage = language) } + } + + fun generateReview(tagStrings: List<String>) { + if (tagStrings.isEmpty()) { + _uiState.update { it.copy(status = GeminiStatus.Error("Please select at least one tag")) } + return + } + + viewModelScope.launch { + _uiState.update { + it.copy( + status = GeminiStatus.Generating( + isCloud = it.selectedMode == InferenceMode.ONLY_IN_CLOUD, + isTranslation = false + ) + ) + } + try { + val prompt = + "Write a simple, short and generic hotel review positively covering the following themes: ${ + tagStrings.joinToString(", ") + }. 
Generate a generic review strictly from themes, don't hallucinate a hotel name or a location. Return only the review text." + + val model = Firebase.ai(backend = GenerativeBackend.googleAI()) + .generativeModel( + "gemini-2.5-flash-lite", + onDeviceConfig = OnDeviceConfig(mode = _uiState.value.selectedMode) + ) + model.generateContentStream(prompt).collect { chunk -> + val isCloud = chunk.inferenceSource == InferenceSource.IN_CLOUD + _uiState.update { state -> + val currentStatus = state.status + val newStatus = if (currentStatus is GeminiStatus.Generating) { + currentStatus.copy( + isCloud = isCloud, + partialOutput = currentStatus.partialOutput + (chunk.text ?: "") + ) + } else { + GeminiStatus.Generating( + isCloud = isCloud, + partialOutput = chunk.text ?: "", + isTranslation = false + ) + } + state.copy(status = newStatus) + } + } + + val finalState = _uiState.value + val finalStatus = finalState.status + if (finalStatus is GeminiStatus.Generating) { + val output = finalStatus.partialOutput.trimEnd() + val inferenceStatusResId = if (finalStatus.isCloud) { + R.string.gemini_hybrid_generated_cloud + } else { + R.string.gemini_hybrid_generated_on_device + } + _uiState.update { + it.copy( + reviewText = output, + reviewInferenceStatus = inferenceStatusResId, + status = GeminiStatus.Success(output, finalStatus.isCloud, isTranslation = false) + ) + } + } + } catch (e: Exception) { + Log.e("GeminiHybrid", "Inference failed", e) + _uiState.update { + it.copy(status = GeminiStatus.Error(e.localizedMessage ?: "Unknown error occurred")) + } + } + } + } + + fun translate(text: String, language: String) { + if (text.isBlank()) { + _uiState.update { it.copy(status = GeminiStatus.Error("Text to translate cannot be empty")) } + return + } + + viewModelScope.launch { + _uiState.update { + it.copy( + status = GeminiStatus.Generating( + isCloud = it.selectedMode == InferenceMode.ONLY_IN_CLOUD, + isTranslation = true + ) + ) + } + try { + val prompt = + "Translate the following 
text to $language. Return ONLY the translated text, no explanations:\n\n$text" + + val model = Firebase.ai(backend = GenerativeBackend.googleAI()) + .generativeModel( + "gemini-2.5-flash-lite", + onDeviceConfig = OnDeviceConfig(mode = _uiState.value.selectedMode) + ) + + model.generateContentStream(prompt).collect { chunk -> + val isCloud = chunk.inferenceSource == InferenceSource.IN_CLOUD + _uiState.update { state -> + val currentStatus = state.status + val newStatus = if (currentStatus is GeminiStatus.Generating) { + currentStatus.copy( + isCloud = isCloud, + partialOutput = currentStatus.partialOutput + (chunk.text ?: "") + ) + } else { + GeminiStatus.Generating( + isCloud = isCloud, + partialOutput = chunk.text ?: "", + isTranslation = true + ) + } + state.copy(status = newStatus) + } + } + + val finalState = _uiState.value + val finalStatus = finalState.status + if (finalStatus is GeminiStatus.Generating) { + _uiState.update { + it.copy( + status = GeminiStatus.Success( + finalStatus.partialOutput, + finalStatus.isCloud, + isTranslation = true + ) + ) + } + } + } catch (e: Exception) { + Log.e("GeminiHybrid", "Inference failed", e) + _uiState.update { + it.copy(status = GeminiStatus.Error(e.localizedMessage ?: "Unknown error occurred")) + } + } + } + } + + fun reset() { + _uiState.value = GeminiHybridUiState() + } +} diff --git a/samples/gemini-hybrid/src/main/res/values/strings.xml b/samples/gemini-hybrid/src/main/res/values/strings.xml new file mode 100644 index 00000000..82a3fc9e --- /dev/null +++ b/samples/gemini-hybrid/src/main/res/values/strings.xml @@ -0,0 +1,28 @@ + + + Hybrid Inference + Inference with Firebase Hybrid SDK using either Gemini Nano on-device or Gemini Flash in the Cloud. 
+ Hotel review generation + Generate Review + Translate + Korean + Spanish + French + German + Generating on-device… + Generating in cloud… + Generated in the cloud + Generated on device + PREFER ON-DEVICE + PREFER CLOUD + ONLY ON-DEVICE + ONLY CLOUD + LOCATION + VIEW + SERVICE + COMFORT + FOOD + SPACIOUS + NATURAL LIGHT + Select topics for your review: + diff --git a/samples/gemini-image-chat/src/main/res/values/strings.xml b/samples/gemini-image-chat/src/main/res/values/strings.xml index 54d6dfbf..afdeb3b8 100644 --- a/samples/gemini-image-chat/src/main/res/values/strings.xml +++ b/samples/gemini-image-chat/src/main/res/values/strings.xml @@ -1,9 +1,8 @@ Gemini Image Chat - A simple implementation of chatbot using Gemini Flash model that can understand both text and images. + A simple implementation of chatbot using Gemini 3 Pro Image that can understand both text and images and generate images. Type your message - Start by describing an initial image. Then iterate on it by suggesting gemini to apply changes (background, colors, view angle, etc…). See code Something went wrong, try again Dismiss diff --git a/samples/gemini-live-todo/README.md b/samples/gemini-live-todo/README.md index 3adc8196..477c0fe0 100644 --- a/samples/gemini-live-todo/README.md +++ b/samples/gemini-live-todo/README.md @@ -1,13 +1,15 @@ -# Gemini Live Todo Sample +# Gemini Live Todo AI Glasses Sample This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository. +This sample was built with Android Studio Quail 1 Canary 3. Get the latest [Android Studio Canary](https://developer.android.com/studio/preview) to access the AI glasses emulator and its latest features. + ## Description This sample demonstrates how to use the Gemini Live API for real-time, voice-based interactions in a simple ToDo application. 
Users can add, remove, and update tasks by speaking to the app, showcasing a hands-free, conversational user experience powered by the Gemini API.
-Gemini Live Todo in action +Gemini Live Todo in action
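As a rough illustration of the add, remove, and update operations described above, the sketch below routes a named call to an in-memory task list, the way a spoken request might be mapped once it has been turned into a function name and arguments. It is a minimal sketch under stated assumptions: `Task`, `TaskStore`, `dispatch`, and the function names `addTodo`, `removeTodo`, and `toggleTodo` are hypothetical and are not taken from this sample or from the Gemini Live API.

```kotlin
// Hypothetical sketch: Task, TaskStore and dispatch are illustrative names,
// not part of this sample or of the Gemini Live API.
data class Task(val id: Int, val text: String, val done: Boolean = false)

class TaskStore {
    private var nextId = 1
    private val tasks = mutableListOf<Task>()

    // Adds a task and returns its id, or null when the text is blank.
    fun add(text: String): Int? {
        if (text.isBlank()) return null
        val task = Task(id = nextId++, text = text.trim())
        tasks += task
        return task.id
    }

    // Removes the task with the given id; returns false if none matched.
    fun remove(id: Int): Boolean = tasks.removeAll { it.id == id }

    // Flips the completion flag; returns false if the id is unknown.
    fun toggle(id: Int): Boolean {
        val index = tasks.indexOfFirst { it.id == id }
        if (index < 0) return false
        tasks[index] = tasks[index].copy(done = !tasks[index].done)
        return true
    }

    // Returns an immutable copy of the current list.
    fun snapshot(): List<Task> = tasks.toList()
}

// Routes a named call with string arguments to the matching store operation.
fun dispatch(store: TaskStore, name: String, args: Map<String, String>): String {
    val id = args["id"]?.toIntOrNull()
    return when (name) {
        "addTodo" -> store.add(args["text"].orEmpty())?.let { "added $it" } ?: "rejected"
        "removeTodo" -> if (id != null && store.remove(id)) "removed" else "not found"
        "toggleTodo" -> if (id != null && store.toggle(id)) "toggled" else "not found"
        else -> "unknown function"
    }
}

fun main() {
    val store = TaskStore()
    println(dispatch(store, "addTodo", mapOf("text" to "buy bread")))
    println(dispatch(store, "toggleTodo", mapOf("id" to "1")))
    println(store.snapshot())
}
```

Returning a short status string from each call gives the caller something to report back to the user after every request; a real integration would likely return structured data instead.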
## How it works @@ -35,5 +37,27 @@ try { liveSessionState.value = LiveSessionState.Error } ``` +# Google AI Glasses Support +This prototype sample demonstrates how to extend the Gemini Live experience to Google AI Glasses. + +The prototype illustrates how to leverage the glasses' form factor for a heads-up display (HUD) experience while maintaining the core application logic on the host device. + +
+AI Glasses List in action +
+ +
+AI Glasses List in action scrolled +
+ +## Tech Stack +The glasses integration is built using the following libraries: + +### Jetpack Projected: +Used to manage the connection and service lifecycle between the host application (phone) and the client display (glasses). This allows the application to "project" its content onto the glasses. + +### Jetpack Compose Glimmer: +The UI for the glasses is constructed using Glimmer, an Android UI toolkit optimized for transparent, wearable displays. -Read more about the [Gemini Live API](https://developer.android.com/ai/gemini/live) in the Android Documentation. +Read more about the Gemini Live API in the Android Documentation. +Read more about the [Gemini Live API](https://developer.android.com/ai/gemini/live) in the Android Documentation. \ No newline at end of file diff --git a/samples/gemini-live-todo/ai_glasses_list1.png b/samples/gemini-live-todo/ai_glasses_list1.png new file mode 100644 index 00000000..4d3bb0cb Binary files /dev/null and b/samples/gemini-live-todo/ai_glasses_list1.png differ diff --git a/samples/gemini-live-todo/ai_glasses_list2.png b/samples/gemini-live-todo/ai_glasses_list2.png new file mode 100644 index 00000000..91f9f6cd Binary files /dev/null and b/samples/gemini-live-todo/ai_glasses_list2.png differ diff --git a/samples/gemini-live-todo/ai_glasses_todo.png b/samples/gemini-live-todo/ai_glasses_todo.png new file mode 100644 index 00000000..b7df1b5d Binary files /dev/null and b/samples/gemini-live-todo/ai_glasses_todo.png differ diff --git a/samples/gemini-live-todo/build.gradle.kts b/samples/gemini-live-todo/build.gradle.kts index 2eef9bf4..05ccb02d 100644 --- a/samples/gemini-live-todo/build.gradle.kts +++ b/samples/gemini-live-todo/build.gradle.kts @@ -18,11 +18,12 @@ plugins { alias(libs.plugins.jetbrains.kotlin.android) alias(libs.plugins.ksp) alias(libs.plugins.compose.compiler) + alias(libs.plugins.hilt.plugin) } android { namespace = "com.android.ai.samples.geminilivetodo" - compileSdk = 35 + compileSdk = 36 defaultConfig { 
minSdk = 24 @@ -50,7 +51,9 @@ android { } dependencies { - + implementation(libs.androidx.xr.glimmer) + implementation(libs.androidx.xr.projected) + implementation("com.google.firebase:firebase-ai:17.5.0") implementation(libs.androidx.core.ktx) implementation(libs.androidx.appcompat) implementation(platform(libs.androidx.compose.bom)) diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/GlassesActivity.kt b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/GlassesActivity.kt new file mode 100644 index 00000000..534e1c4c --- /dev/null +++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/GlassesActivity.kt @@ -0,0 +1,107 @@ +package com.android.ai.samples.geminilivetodo + +import android.Manifest +import android.content.pm.PackageManager +import android.os.Bundle +import androidx.activity.ComponentActivity +import androidx.activity.compose.setContent +import androidx.activity.result.ActivityResultLauncher +import androidx.compose.material3.Text +import androidx.compose.runtime.Composable +import androidx.compose.runtime.getValue +import androidx.compose.runtime.mutableStateOf +import androidx.compose.runtime.setValue +import androidx.compose.ui.Modifier +import androidx.compose.ui.res.stringResource +import androidx.compose.ui.tooling.preview.Preview +import androidx.core.content.ContextCompat +import androidx.xr.glimmer.GlimmerTheme +import com.android.ai.samples.geminilivetodo.ui.GlimmerTodoScreen +import androidx.xr.projected.experimental.ExperimentalProjectedApi +import androidx.xr.projected.permissions.ProjectedPermissionsRequestParams +import androidx.xr.projected.permissions.ProjectedPermissionsResultContract +import dagger.hilt.android.AndroidEntryPoint + +private const val TAG = "GlassesActivity" + +@AndroidEntryPoint +class GlassesActivity : ComponentActivity() { + + private var isPermissionsGranted by mutableStateOf(false) + + private val requiredPermissions = 
listOf( + Manifest.permission.RECORD_AUDIO + ) + + @OptIn(ExperimentalProjectedApi::class) + private val requestPermissionLauncher: ActivityResultLauncher<List<ProjectedPermissionsRequestParams>> = + registerForActivityResult(ProjectedPermissionsResultContract()) { results -> + val granted = requiredPermissions.all { permission -> + results[permission] == true + } + isPermissionsGranted = granted + setupContent() + } + + override fun onCreate(savedInstanceState: Bundle?) { + super.onCreate(savedInstanceState) + val allGranted = checkAllPermissionsGranted() + isPermissionsGranted = allGranted + + setupContent() + + + if (!allGranted) { + requestPermissions() + } + } + + + private fun checkAllPermissionsGranted(): Boolean { + return requiredPermissions.all { permission -> + ContextCompat.checkSelfPermission(this, permission) == PackageManager.PERMISSION_GRANTED + } + } + + private fun setupContent() { + setContent { + GlimmerTheme { + RootScreen(isGranted = isPermissionsGranted) + } + } + } + + @OptIn(ExperimentalProjectedApi::class) + private fun requestPermissions() { + requestPermissionLauncher.launch( + listOf( + ProjectedPermissionsRequestParams( + permissions = requiredPermissions, + rationale = getString(R.string.permission_rationale_mic_access) + ) + ) + ) + } +} + + +@Composable +fun RootScreen(isGranted: Boolean, modifier: Modifier = Modifier) { + if (isGranted) { + GlimmerTodoScreen(modifier = modifier) + } else { + Text( + text = stringResource(R.string.permissions_denied_mic_access), + modifier = modifier + ) + } +} + + +@Preview(showBackground = true) +@Composable +fun PreviewRootScreen() { + GlimmerTheme { + RootScreen(isGranted = false) + } +} \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/Todo.kt b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/Todo.kt index 953ec7fc..4805b658 100644 --- a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/Todo.kt 
+++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/Todo.kt @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * - * https://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, @@ -15,11 +15,22 @@ */ package com.android.ai.samples.geminilivetodo.data -import kotlin.random.Random +import java.util.UUID.randomUUID +const val MIC_TODO_ID = 111 + +sealed interface GlassesListItem { + val id: Int +} data class Todo( - val id: Int = Random.nextInt(), + override val id: Int = randomUUID().hashCode(), val task: String, val isCompleted: Boolean = false, -) +) : GlassesListItem + +data class MicControl( + override val id: Int = MIC_TODO_ID, + val statusText: String, + val isMicOn: Boolean, +) : GlassesListItem \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/TodoRepository.kt b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/TodoRepository.kt index 8729e1b6..a6617903 100644 --- a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/TodoRepository.kt +++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/data/TodoRepository.kt @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. 
* You may obtain a copy of the License at * - * https://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, @@ -26,47 +26,86 @@ import kotlinx.coroutines.flow.MutableStateFlow import kotlinx.coroutines.flow.asStateFlow import kotlinx.coroutines.flow.update +private const val MIC_STATUS_TODO_ID = -999 + @Singleton class TodoRepository @Inject constructor() { - private val _todos = MutableStateFlow<List<Todo>>( + private val _todos = MutableStateFlow<List<GlassesListItem>>( listOf( Todo(1234, "buy bread", false), Todo(1235, "do the dishes", false), Todo(1236, "buy eggs", false), Todo(1237, "read a book", false), - ), + MicControl( + id = MIC_TODO_ID, + statusText = "Mic Status", + isMicOn = false + ), + ).filterNot { it.id == MIC_STATUS_TODO_ID }, ) - val todos: Flow<List<Todo>> = _todos.asStateFlow() + val todos: Flow<List<GlassesListItem>> = _todos.asStateFlow() - fun getTodoList(): List<Todo> = _todos.value + fun getTodoList(): List<GlassesListItem> = _todos.value - fun addTodo(taskDescription: String) : Int? 
{ + fun addTodo(taskDescription: String) { if (taskDescription.isNotBlank()) { - val newTodo = Todo(task = taskDescription) + val newTodo = Todo(task = taskDescription, isCompleted = false) _todos.update { currentList -> - currentList + newTodo + listOf(newTodo) + currentList + } + } + } + + + fun updateMicStatus(micIsOn: Boolean) { + val newText = if (micIsOn) "Microphone Status: On" else "Microphone Status: Off" + _todos.update { currentList -> + currentList.map { item -> + if (item.id == MIC_TODO_ID && item is MicControl) { + item.copy( + statusText = newText, + isMicOn = micIsOn + ) + } else { + item + } } - return newTodo.id } - return null } fun removeTodo(todoId: Int) { _todos.update { currentList -> - currentList.filterNot { it.id == todoId } + + if (todoId == MIC_TODO_ID) { + currentList + } else { + + currentList.filterNot { it.id == todoId } + } } } fun toggleTodoStatus(todoId: Int) { _todos.update { currentList -> - currentList.map { todo -> - if (todo.id == todoId) { - todo.copy(isCompleted = !todo.isCompleted) - } else { - todo + currentList.map { item -> + when (item) { + is MicControl -> { + if (item.id == todoId) { + item.copy(isMicOn = !item.isMicOn) + } else { + item + } + } + is Todo -> { + if (item.id == todoId) { + item.copy(isCompleted = !item.isCompleted) + } else { + item + } + } } } } } -} +} \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/GlimmerToDoScreen.kt b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/GlimmerToDoScreen.kt new file mode 100644 index 00000000..f8509916 --- /dev/null +++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/GlimmerToDoScreen.kt @@ -0,0 +1,255 @@ +/* + * Copyright 2025 The Android Open Source Project + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.android.ai.samples.geminilivetodo.ui + +import android.app.Activity +import android.util.Log +import androidx.compose.foundation.Image +import androidx.compose.foundation.background +import androidx.compose.foundation.layout.Arrangement +import androidx.compose.foundation.layout.Box +import androidx.compose.foundation.layout.PaddingValues +import androidx.compose.foundation.layout.fillMaxSize +import androidx.compose.foundation.layout.height +import androidx.compose.foundation.layout.size +import androidx.compose.runtime.Composable +import androidx.compose.runtime.LaunchedEffect +import androidx.compose.runtime.getValue +import androidx.compose.ui.Alignment +import androidx.compose.ui.Modifier +import androidx.compose.ui.graphics.Color +import androidx.compose.ui.platform.LocalContext +import androidx.compose.ui.res.painterResource +import androidx.compose.ui.res.stringResource +import androidx.compose.ui.text.style.TextDecoration +import androidx.compose.ui.tooling.preview.Preview +import androidx.compose.ui.unit.dp +import androidx.hilt.navigation.compose.hiltViewModel +import androidx.lifecycle.compose.collectAsStateWithLifecycle +import androidx.xr.glimmer.GlimmerTheme +import androidx.xr.glimmer.ListItem +import androidx.xr.glimmer.Text +import androidx.xr.glimmer.list.VerticalList +import com.android.ai.samples.geminilivetodo.R +import com.android.ai.samples.geminilivetodo.data.Todo +import com.android.ai.uicomponent.R as UiComponentR +import kotlin.math.min + +private val DefaultListItemHeight = 64.dp +private const val 
MaxItemsInList = 4 +private val IconSize = 30.dp +private const val TAG = "GlimmerTodoScreen" +private const val MIC_CONTROL_ID = 111 + +@Composable +fun GlimmerTodoScreen( + modifier: Modifier = Modifier, + viewModel: TodoScreenViewModel = hiltViewModel() +) { + val uiState by viewModel.uiState.collectAsStateWithLifecycle() + + val context = LocalContext.current + val activity = context as? Activity + val onExit = { activity?.finish() } + + LaunchedEffect(uiState) { + if (uiState is TodoScreenUiState.Success) { + val isMicOn = (uiState as TodoScreenUiState.Success).isMicOn + Log.i(TAG, "Glimmer UI MIC STATUS: ${if (isMicOn) "Running" else "Ready"}") + } + } + + GlimmerTheme { + Box( + contentAlignment = Alignment.BottomCenter, + modifier = modifier + .fillMaxSize() + .background(GlimmerTheme.colors.background) + ) { + + GlimmerScreenContent( + uiState = uiState, + onToggleItem = viewModel::toggleTodoStatus, + onExit = { onExit() } + ) + } + } +} + +@Composable +private fun GlimmerScreenContent( + uiState: TodoScreenUiState, + onToggleItem: (Int) -> Unit, + onExit: () -> Unit +) { + when (uiState) { + is TodoScreenUiState.Initial -> { + Text(text = stringResource(R.string.loading_todo_list)) + } + is TodoScreenUiState.Success -> { + TodoListView( + todoItems = uiState.todoItems, + isMicOn = uiState.isMicOn, + onToggleItem = onToggleItem, + onExit = onExit + ) + } + is TodoScreenUiState.Error -> { + Text(text = stringResource(R.string.error_loading_list)) + } + } +} + +@Composable +private fun TodoListView( + todoItems: List<Todo>, + isMicOn: Boolean, + onToggleItem: (Int) -> Unit, + onExit: () -> Unit +) { + + val totalItems = todoItems.size + 2 + + val listHeight = (min(totalItems, MaxItemsInList) * DefaultListItemHeight.value + + min(totalItems - 1, MaxItemsInList) * 12f) + + VerticalList( + modifier = Modifier.height(listHeight.dp), + contentPadding = PaddingValues(horizontal = 16.dp, vertical = 24.dp), + verticalArrangement = Arrangement.spacedBy(12.dp), + ) { + + 
item { + GlimmerMicControlItem( + isMicOn = isMicOn, + onToggle = { onToggleItem(MIC_CONTROL_ID) } + ) + } + + items(todoItems.size, key = { index -> todoItems[index].id }) { index -> + GlimmerTodoItem( + task = todoItems[index], + onToggle = onToggleItem + ) + } + + item { + ListItem( + onClick = onExit, + leadingIcon = { + Image( + painter = painterResource(id = UiComponentR.drawable.ic_close), + contentDescription = stringResource(R.string.exit_app), + modifier = Modifier.size(IconSize) + ) + } + ) { + Text(text = stringResource(R.string.exit_app)) + } + } + } +} + +@Composable +private fun GlimmerMicControlItem( + isMicOn: Boolean, + onToggle: () -> Unit +) { + val icon = if (isMicOn) UiComponentR.drawable.ic_mic_off else UiComponentR.drawable.ic_ai_mic + + + val displayTask = if (isMicOn) { + stringResource(R.string.mic_on_label) + } else { + stringResource(R.string.mic_off_label) + } + + val contentDesc = if (isMicOn) { + stringResource(R.string.mic_status_on) + } else { + stringResource(R.string.mic_status_off) + } + + ListItem( + onClick = onToggle, + leadingIcon = { + Image( + painter = painterResource(id = icon), + contentDescription = contentDesc, + modifier = Modifier.size(IconSize) + ) + } + ) { + Text(text = displayTask) + } +} + +@Composable +private fun GlimmerTodoItem( + task: Todo, + onToggle: (Int) -> Unit +) { + val icon = if (task.isCompleted) UiComponentR.drawable.ic_check else UiComponentR.drawable.ic_circle + + ListItem( + onClick = { onToggle(task.id) }, + leadingIcon = { + Image( + painter = painterResource(id = icon), + contentDescription = if (task.isCompleted) + stringResource(R.string.status_completed) + else + stringResource(R.string.status_pending), + modifier = Modifier.size(IconSize) + ) + } + ) { + Text( + text = task.task, + textDecoration = if (task.isCompleted) TextDecoration.LineThrough else null + ) + } +} + +@Preview(device = "id:ai_glasses_device") +@Composable +private fun GlimmerTodoScreenPreview() { + val mockTodoItems = 
listOf( + Todo(id = 1, task = "Buy groceries", isCompleted = false), + Todo(id = 2, task = "Finish the report", isCompleted = true), + Todo(id = 3, task = "Call mom", isCompleted = false) + ) + + GlimmerTheme { + Box( + contentAlignment = Alignment.BottomCenter, + modifier = Modifier + .fillMaxSize() + .background(GlimmerTheme.colors.background) + ) { + GlimmerScreenContent( + uiState = TodoScreenUiState.Success( + todoItems = mockTodoItems, + isMicOn = true, + liveSessionState = LiveSessionState.Running + ), + onToggleItem = { _ -> }, + onExit = {} + ) + } + } +} \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreen.kt b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreen.kt index 17abc4c1..37dbb8dc 100644 --- a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreen.kt +++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreen.kt @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * - * https://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, @@ -13,11 +13,16 @@ * See the License for the specific language governing permissions and * limitations under the License. 
*/ + package com.android.ai.samples.geminilivetodo.ui import android.app.Activity +import android.content.Intent +import android.os.Build +import android.util.Log import androidx.activity.compose.LocalActivity import androidx.activity.compose.LocalOnBackPressedDispatcherOwner +import androidx.annotation.RequiresApi import androidx.compose.foundation.background import androidx.compose.foundation.layout.Box import androidx.compose.foundation.layout.Column @@ -39,6 +44,7 @@ import androidx.compose.foundation.text.input.setTextAndPlaceCursorAtEnd import androidx.compose.material3.Checkbox import androidx.compose.material3.CircularProgressIndicator import androidx.compose.material3.ExperimentalMaterial3Api +import androidx.compose.material3.ExtendedFloatingActionButton import androidx.compose.material3.FabPosition import androidx.compose.material3.HorizontalDivider import androidx.compose.material3.Icon @@ -54,14 +60,21 @@ import androidx.compose.runtime.LaunchedEffect import androidx.compose.runtime.getValue import androidx.compose.runtime.mutableStateOf import androidx.compose.runtime.remember +import androidx.compose.runtime.rememberCoroutineScope import androidx.compose.ui.Alignment import androidx.compose.ui.Modifier +import androidx.compose.ui.graphics.Color +import androidx.compose.ui.platform.LocalContext import androidx.compose.ui.res.painterResource import androidx.compose.ui.res.stringResource +import androidx.compose.ui.text.font.FontWeight import androidx.compose.ui.text.style.TextDecoration import androidx.compose.ui.unit.dp import androidx.hilt.navigation.compose.hiltViewModel import androidx.lifecycle.compose.collectAsStateWithLifecycle +import androidx.xr.projected.ProjectedContext +import androidx.xr.projected.experimental.ExperimentalProjectedApi +import com.android.ai.samples.geminilivetodo.GlassesActivity import com.android.ai.samples.geminilivetodo.R import com.android.ai.samples.geminilivetodo.data.Todo import 
com.android.ai.uicomponent.GenerateButton @@ -69,12 +82,47 @@ import com.android.ai.uicomponent.SampleDetailTopAppBar import com.android.ai.uicomponent.SecondaryButton import com.android.ai.uicomponent.TextInput -@OptIn(ExperimentalMaterial3Api::class) +private const val TAG = "TodoScreenLaunch" + +private val GlassesConnectedGreen = Color(0xFF34A853) + +@OptIn(ExperimentalProjectedApi::class) +@RequiresApi(Build.VERSION_CODES.VANILLA_ICE_CREAM) +private fun launchGlassesExperience(activity: Activity) { + Log.d(TAG, "Attempting to launch GlassesActivity on connected device...") + + try { + val projectedContext = ProjectedContext.createProjectedDeviceContext(activity) + val options = ProjectedContext.createProjectedActivityOptions(projectedContext) + val intent = Intent(activity, GlassesActivity::class.java).apply { + addFlags(Intent.FLAG_ACTIVITY_NEW_TASK) + } + + activity.startActivity(intent, options.toBundle()) + Log.i(TAG, "Successfully sent launch intent to the projected device.") + + } catch (e: IllegalStateException) { + Log.e(TAG, "Projected device not ready: ${e.message}") + } catch (e: Exception) { + Log.e(TAG, "Error during launch: ${e.message}") + } +} + +@OptIn(ExperimentalMaterial3Api::class, ExperimentalProjectedApi::class) @Composable fun TodoScreen(viewModel: TodoScreenViewModel = hiltViewModel()) { val uiState by viewModel.uiState.collectAsStateWithLifecycle() val activity = LocalActivity.current as Activity + val context = LocalContext.current + val scope = rememberCoroutineScope() + + val isGlassesConnected = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.VANILLA_ICE_CREAM) { + ProjectedContext.isProjectedDeviceConnected(context, scope.coroutineContext) + .collectAsStateWithLifecycle(initialValue = false).value + } else { + false + } LaunchedEffect(Unit) { viewModel.initializeGeminiLive(activity) @@ -90,7 +138,7 @@ fun TodoScreen(viewModel: TodoScreenViewModel = hiltViewModel()) { SampleDetailTopAppBar( sampleName = 
stringResource(R.string.gemini_live_title), sampleDescription = stringResource(R.string.gemini_live_subtitle), - sourceCodeUrl = "https://github.com/android/ai-samples/tree/main/samples/gemini-live-todo", + sourceCodeUrl = "https://github.com/android/ai-samples/tree/main/ai-catalog/samples/gemini-live-todo", topAppBarState = topAppBarState, scrollBehavior = scrollBehavior, onBackClick = { backDispatcher?.onBackPressed() }, @@ -106,55 +154,98 @@ fun TodoScreen(viewModel: TodoScreenViewModel = hiltViewModel()) { .imePadding() .fillMaxSize(), ) { - when (uiState) { + + when (val state = uiState) { is TodoScreenUiState.Initial -> { Box( modifier = Modifier - .fillMaxSize(), + .fillMaxSize() + .weight(1f), contentAlignment = Alignment.Center, ) { CircularProgressIndicator() } } is TodoScreenUiState.Success -> { - val todos = (uiState as TodoScreenUiState.Success).todos + val itemsToRender = state.todoItems LazyColumn( modifier = Modifier .widthIn(max = 646.dp) .align(Alignment.CenterHorizontally) .weight(1f), ) { - itemsIndexed(todos.reversed(), key = { index: Int, item: Todo -> item.id }) { index, todo -> + itemsIndexed(itemsToRender, key = { _, item -> item.id }) { index, item -> TodoItem( - task = todo, - onToggle = { viewModel.toggleTodoStatus(todo.id) }, - onDelete = { viewModel.removeTodo(todo.id) }, + task = item, + onToggle = { viewModel.toggleTodoStatus(item.id) }, + onDelete = { viewModel.removeTodo(item.id) }, ) + if (index != itemsToRender.size - 1) { + HorizontalDivider() + } } } } is TodoScreenUiState.Error -> { - val todos = (uiState as TodoScreenUiState.Error).todos - LazyColumn(modifier = Modifier.weight(1f)) { - itemsIndexed(todos.reversed(), key = { index: Int, item: Todo -> item.id }) { index, todo -> - TodoItem( - task = todo, - onToggle = { viewModel.toggleTodoStatus(todo.id) }, - onDelete = { viewModel.removeTodo(todo.id) }, + Box( + modifier = Modifier + .fillMaxSize() + .weight(1f), + contentAlignment = Alignment.Center, + ) { + Text( + text = 
stringResource(R.string.error_message), + color = MaterialTheme.colorScheme.error + ) + } + } + } + + + if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.VANILLA_ICE_CREAM) { + Box( + modifier = Modifier + .fillMaxWidth() + .widthIn(max = 646.dp) + .align(Alignment.CenterHorizontally) + .padding(bottom = 16.dp), + contentAlignment = Alignment.CenterEnd + ) { + ExtendedFloatingActionButton( + onClick = { + launchGlassesExperience(activity) + }, + expanded = true, + containerColor = if (isGlassesConnected) { + GlassesConnectedGreen + } else { + MaterialTheme.colorScheme.surfaceContainerHighest + }, + contentColor = Color.Black, + icon = { + Icon( + painter = painterResource(id = com.android.ai.uicomponent.R.drawable.ic_glasses), + contentDescription = null, + modifier = Modifier.size(25.dp) + ) + }, + text = { + Text( + text = if (isGlassesConnected) "Open on Glasses" else "Connect Glasses", + fontWeight = FontWeight.Bold ) - if (index != todos.size - 1) { - HorizontalDivider() - } } - } + ) } } val textFieldState = rememberTextFieldState() val textInputEnabled = remember { mutableStateOf(true) } + if (uiState is TodoScreenUiState.Success) { + val state = uiState as TodoScreenUiState.Success when { - (uiState as TodoScreenUiState.Success).liveSessionState is LiveSessionState.Running -> { + state.liveSessionState is LiveSessionState.Running -> { val listeningMessage = stringResource(R.string.listening) LaunchedEffect(Unit) { textFieldState.setTextAndPlaceCursorAtEnd(listeningMessage) @@ -231,10 +322,12 @@ fun TodoItem(task: Todo, onToggle: () -> Unit, onDelete: () -> Unit) { ) IconButton( onClick = onDelete, - modifier = Modifier.background( - color = MaterialTheme.colorScheme.surfaceContainerHighest, - shape = RoundedCornerShape(10.dp), - ).size(32.dp), + modifier = Modifier + .background( + color = MaterialTheme.colorScheme.surfaceContainerHighest, + shape = RoundedCornerShape(10.dp), + ) + .size(32.dp), ) { Icon( 
painterResource(com.android.ai.uicomponent.R.drawable.ic_delete), @@ -243,4 +336,4 @@ fun TodoItem(task: Todo, onToggle: () -> Unit, onDelete: () -> Unit) { ) } } -} +} \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenUiState.kt b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenUiState.kt index fc95618b..57898483 100644 --- a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenUiState.kt +++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenUiState.kt @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * - * https://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, @@ -15,20 +15,20 @@ */ package com.android.ai.samples.geminilivetodo.ui +import androidx.compose.runtime.Immutable import com.android.ai.samples.geminilivetodo.data.Todo +@Immutable sealed interface TodoScreenUiState { data object Initial : TodoScreenUiState data class Success( - val todos: List<Todo> = emptyList(), + val todoItems: List<Todo> = emptyList(), + val isMicOn: Boolean = false, val liveSessionState: LiveSessionState, ) : TodoScreenUiState - data class Error( - val todos: List<Todo> = emptyList(), - val liveSessionState: LiveSessionState, - ) : TodoScreenUiState + data object Error : TodoScreenUiState } sealed interface LiveSessionState { @@ -36,4 +36,4 @@ sealed interface LiveSessionState { data object Ready : LiveSessionState data object Running : LiveSessionState data object Error : LiveSessionState -} +} \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenViewModel.kt 
b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenViewModel.kt index 7106da2a..7a80ede1 100644 --- a/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenViewModel.kt +++ b/samples/gemini-live-todo/src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenViewModel.kt @@ -5,7 +5,7 @@ * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * - * https://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, @@ -13,17 +13,20 @@ * See the License for the specific language governing permissions and * limitations under the License. */ + package com.android.ai.samples.geminilivetodo.ui import android.Manifest -import android.annotation.SuppressLint import android.app.Activity import android.content.pm.PackageManager import android.util.Log +import androidx.annotation.RequiresPermission import androidx.core.app.ActivityCompat import androidx.core.content.ContextCompat import androidx.lifecycle.ViewModel import androidx.lifecycle.viewModelScope +import com.android.ai.samples.geminilivetodo.data.MicControl +import com.android.ai.samples.geminilivetodo.data.Todo import com.android.ai.samples.geminilivetodo.data.TodoRepository import com.google.firebase.Firebase import com.google.firebase.ai.ai @@ -41,30 +44,50 @@ import com.google.firebase.ai.type.Voice import com.google.firebase.ai.type.content import com.google.firebase.ai.type.liveGenerationConfig import dagger.hilt.android.lifecycle.HiltViewModel +import java.lang.ref.WeakReference import javax.inject.Inject +import kotlinx.coroutines.CancellationException import kotlinx.coroutines.flow.MutableStateFlow import kotlinx.coroutines.flow.SharingStarted import kotlinx.coroutines.flow.StateFlow import 
kotlinx.coroutines.flow.combine import kotlinx.coroutines.flow.stateIn +import kotlinx.coroutines.flow.update import kotlinx.coroutines.launch import kotlinx.serialization.json.JsonObject import kotlinx.serialization.json.JsonPrimitive -import kotlinx.serialization.json.int import kotlinx.serialization.json.jsonPrimitive -import kotlinx.serialization.json.long +import kotlinx.serialization.json.int + +private const val MIC_TODO_ID = 111 +private const val MIC_STATUS_TODO_ID = -999 @OptIn(PublicPreviewAPI::class) @HiltViewModel class TodoScreenViewModel @Inject constructor(private val todoRepository: TodoRepository) : ViewModel() { private val TAG = "TodoScreenViewModel" private var session: LiveSession? = null + private var hostActivityRef: WeakReference<Activity>? = null private val liveSessionState = MutableStateFlow<LiveSessionState>(LiveSessionState.NotReady) private val todos = todoRepository.todos - val uiState: StateFlow<TodoScreenUiState> = combine(liveSessionState, todos) { liveSessionState, todos -> - TodoScreenUiState.Success(todos, liveSessionState) + val uiState: StateFlow<TodoScreenUiState> = combine(liveSessionState, todos) { liveSessionState, currentTodos -> + + + val micItem = currentTodos.filterIsInstance<MicControl>().firstOrNull() + val isMicOn = micItem?.isMicOn ?: false + + val todoItems = currentTodos + .filterIsInstance<Todo>() + .filterNot { it.id == MIC_STATUS_TODO_ID } + .reversed() + + TodoScreenUiState.Success( + todoItems = todoItems, + isMicOn = isMicOn, + liveSessionState = liveSessionState + ) }.stateIn( scope = viewModelScope, started = SharingStarted.WhileSubscribed(5000L), @@ -76,40 +99,107 @@ class TodoScreenViewModel @Inject constructor(private val todoRepository: TodoRe } fun removeTodo(todoId: Int) { + if (todoId == MIC_TODO_ID || todoId == MIC_STATUS_TODO_ID) return todoRepository.removeTodo(todoId) } fun toggleTodoStatus(todoId: Int) { + if (todoId == MIC_TODO_ID) { + todoRepository.toggleTodoStatus(MIC_TODO_ID) + return + } + if (todoId == MIC_STATUS_TODO_ID) return todoRepository.toggleTodoStatus(todoId) 
} - @SuppressLint("MissingPermission") - fun toggleLiveSession(activity: Activity) { + @RequiresPermission(Manifest.permission.RECORD_AUDIO) + private fun startLiveSession() { + val activity = hostActivityRef?.get() ?: run { + Log.e(TAG, "Cannot start Live Session: Host Activity reference lost.") + todoRepository.updateMicStatus(micIsOn = false) + return + } + viewModelScope.launch { if (liveSessionState.value is LiveSessionState.NotReady) return@launch - session?.let { - if (liveSessionState.value is LiveSessionState.Ready) { - if (ContextCompat.checkSelfPermission( - activity, - Manifest.permission.RECORD_AUDIO, - ) == PackageManager.PERMISSION_GRANTED - ) { - it.startAudioConversation(::handleFunctionCall) - liveSessionState.value = LiveSessionState.Running + session?.let { currentSession -> + if (ContextCompat.checkSelfPermission( + activity, + Manifest.permission.RECORD_AUDIO, + ) == PackageManager.PERMISSION_GRANTED + ) { + try { + liveSessionState.update { LiveSessionState.Running } + Log.i(TAG, "API Sync: Live Session Started.") + currentSession.startAudioConversation(::handleFunctionCall) + } + + catch (e: CancellationException) { + throw e + } + catch (e: Exception) { + Log.e(TAG, "Error starting Live Session: ${e.message}", e) + todoRepository.updateMicStatus(micIsOn = false) + liveSessionState.update { LiveSessionState.Ready } } } else { - it.stopAudioConversation() - liveSessionState.value = LiveSessionState.Ready + requestAudioPermissionIfNeeded(activity) + todoRepository.updateMicStatus(micIsOn = false) } } } } + private fun stopLiveSession() { + viewModelScope.launch { + session?.let { currentSession -> + if (liveSessionState.value is LiveSessionState.Running) { + try { + currentSession.stopAudioConversation() + liveSessionState.update { LiveSessionState.Ready } + Log.i(TAG, "API Sync: Live Session Stopped.") + } + catch (e: CancellationException) { + throw e + } + catch (e: Exception) { + Log.e(TAG, "Error stopping Live Session: ${e.message}", e) + 
liveSessionState.update { LiveSessionState.Ready } + } + } + } + } + } + + fun toggleLiveSession(activity: Activity) { + todoRepository.toggleTodoStatus(MIC_TODO_ID) + } + fun initializeGeminiLive(activity: Activity) { + hostActivityRef = WeakReference(activity) requestAudioPermissionIfNeeded(activity) + + viewModelScope.launch { + todoRepository.todos.collect @androidx.annotation.RequiresPermission(android.Manifest.permission.RECORD_AUDIO) { todos -> + val isMicOnInUI = todos.find { it.id == MIC_TODO_ID } + ?.let { it as? MicControl }?.isMicOn ?: false + + val currentLiveStatus = liveSessionState.value is LiveSessionState.Running + + if (isMicOnInUI != currentLiveStatus) { + if (isMicOnInUI) { + startLiveSession() + } else { + stopLiveSession() + } + } + } + } + viewModelScope.launch { Log.d(TAG, "Start Gemini Live initialization") + val liveGenerationConfig = liveGenerationConfig { speechConfig = SpeechConfig(voice = Voice("FENRIR")) responseModality = ResponseModality.AUDIO @@ -159,9 +249,8 @@ class TodoScreenViewModel @Inject constructor(private val todoRepository: TodoRe emptyMap(), ) - // See https://firebase.google.com/docs/ai-logic/live-api for an overview of available models - val generativeModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel( - "gemini-2.5-flash-native-audio-preview-09-2025", + val generativeModel = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel( + "gemini-live-2.5-flash-preview-native-audio-09-2025", generationConfig = liveGenerationConfig, systemInstruction = systemInstruction, tools = listOf( @@ -171,21 +260,32 @@ class TodoScreenViewModel @Inject constructor(private val todoRepository: TodoRe ), ) + todoRepository.updateMicStatus(micIsOn = false) + try { session = generativeModel.connect() - } catch (e: Exception) { + liveSessionState.update { LiveSessionState.Ready } + + todoRepository.updateMicStatus(micIsOn = false) + Log.i(TAG, "MIC STATE UPDATE: Session connected (LiveSessionState.Ready).") + } + 
// Change: Rethrow CancellationException so the coroutine cancels properly + catch (e: CancellationException) { + throw e + } + catch (e: Exception) { Log.e(TAG, "Error connecting to the model", e) - liveSessionState.value = LiveSessionState.Error + liveSessionState.update { LiveSessionState.Error } + todoRepository.updateMicStatus(micIsOn = false) + Log.i(TAG, "MIC STATE UPDATE: Connection Error (LiveSessionState.Error).") } - - liveSessionState.value = LiveSessionState.Ready } } private fun handleFunctionCall(functionCall: FunctionCallPart): FunctionResponsePart { return when (functionCall.name) { "getTodoList" -> { - val todoList = todoRepository.getTodoList().reversed() + val todoList = todoRepository.getTodoList().filterNot { it.id == MIC_STATUS_TODO_ID }.reversed() val response = JsonObject( mapOf( "success" to JsonPrimitive(true), @@ -196,48 +296,25 @@ class TodoScreenViewModel @Inject constructor(private val todoRepository: TodoRe } "addTodo" -> { val taskDescription = functionCall.args["taskDescription"]!!.jsonPrimitive.content - val id = todoRepository.addTodo(taskDescription) - - if (id!=null) { - val response = JsonObject( - mapOf( - "success" to JsonPrimitive(true), - "message" to JsonPrimitive("Task $taskDescription added to the todo list (id: $id)"), - ), - ) - FunctionResponsePart(functionCall.name, response, functionCall.id) - } else { - val response = JsonObject( - mapOf( - "success" to JsonPrimitive(false), - "message" to JsonPrimitive("Task $taskDescription wasn't properly added to the list"), - ), - ) - FunctionResponsePart(functionCall.name, response, functionCall.id) - } - + todoRepository.addTodo(taskDescription) + val response = JsonObject( + mapOf( + "success" to JsonPrimitive(true), + "message" to JsonPrimitive("Task $taskDescription added to the todo list"), + ), + ) + FunctionResponsePart(functionCall.name, response, functionCall.id) } "removeTodo" -> { - try { - val taskId = functionCall.args["todoId"]!!.jsonPrimitive.int - 
todoRepository.removeTodo(taskId) - val response = JsonObject( - mapOf( - "success" to JsonPrimitive(true), - "message" to JsonPrimitive("Task was removed from the todo list"), - ), - ) - FunctionResponsePart(functionCall.name, response, functionCall.id) - } catch (e: Exception) { - val response = JsonObject( - mapOf( - "success" to JsonPrimitive(false), - "message" to JsonPrimitive("Something went wrong: ${e.message}"), - ), - ) - FunctionResponsePart(functionCall.name, response, functionCall.id) - } - + val taskId = functionCall.args["todoId"]!!.jsonPrimitive.int + todoRepository.removeTodo(taskId) + val response = JsonObject( + mapOf( + "success" to JsonPrimitive(true), + "message" to JsonPrimitive("Task was removed from the todo list"), + ), + ) + FunctionResponsePart(functionCall.name, response, functionCall.id) } "toggleTodoStatus" -> { val taskId = functionCall.args["todoId"]!!.jsonPrimitive.int @@ -268,4 +345,4 @@ class TodoScreenViewModel @Inject constructor(private val todoRepository: TodoRe ActivityCompat.requestPermissions(activity, arrayOf(Manifest.permission.RECORD_AUDIO), 1) } } -} +} \ No newline at end of file diff --git a/samples/gemini-live-todo/src/main/res/values/strings.xml b/samples/gemini-live-todo/src/main/res/values/strings.xml index 44a131ff..edbd7dfd 100644 --- a/samples/gemini-live-todo/src/main/res/values/strings.xml +++ b/samples/gemini-live-todo/src/main/res/values/strings.xml @@ -14,9 +14,8 @@ ~ limitations under the License. ~ --> - - Gemini Live Todo + Gemini Live API Todo Simple ToDo app using the Gemini Live API to interact with the items in the list. New Task Add @@ -26,4 +25,17 @@ Button to start the live session and interact with the todo list by voice See code I am listening... + Mic Status + Microphone Status: On + Microphone Status: Off + Turn off microphone + Turn on microphone + Exit app + We need microphone access to continue to the main experience. + Permissions Denied. 
Please grant Audio access on the host phone to proceed. + + Loading Todo List... + Error loading list. Please try again. + Completed + Pending \ No newline at end of file diff --git a/samples/gemini-video-metadata-creation/src/main/java/com/android/ai/samples/geminivideometadatacreation/util/VideoList.kt b/samples/gemini-video-metadata-creation/src/main/java/com/android/ai/samples/geminivideometadatacreation/util/VideoList.kt index e4775935..dbba65be 100644 --- a/samples/gemini-video-metadata-creation/src/main/java/com/android/ai/samples/geminivideometadatacreation/util/VideoList.kt +++ b/samples/gemini-video-metadata-creation/src/main/java/com/android/ai/samples/geminivideometadatacreation/util/VideoList.kt @@ -27,25 +27,27 @@ data class VideoItem( val uri: Uri, ) +const val VIDEO_BASE_URL = "https://storage.googleapis.com/androiddevelopers/samples_assets" + val sampleVideoList = listOf( VideoItem( R.string.video_title_android_spotlight_shorts, - "https://storage.googleapis.com/exoplayer-test-media-0/shorts_android_developers/shorts_10.mp4".toUri(), + "$VIDEO_BASE_URL/exoplayer-test-media-0/shorts_android_developers/shorts_10.mp4".toUri(), ), VideoItem( R.string.video_title_big_buck_bunny, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4".toUri(), + "$VIDEO_BASE_URL/gtv-videos-bucket/sample/BigBuckBunny.mp4".toUri(), ), VideoItem( R.string.video_title_tears_of_steel, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/TearsOfSteel.mp4".toUri(), + "$VIDEO_BASE_URL/gtv-videos-bucket/sample/TearsOfSteel.mp4".toUri(), ), VideoItem( R.string.video_title_for_bigger_blazes, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4".toUri(), + "$VIDEO_BASE_URL/gtv-videos-bucket/sample/ForBiggerBlazes.mp4".toUri(), ), VideoItem( R.string.video_title_for_bigger_escape, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerEscapes.mp4".toUri(), + 
"$VIDEO_BASE_URL/gtv-videos-bucket/sample/ForBiggerEscapes.mp4".toUri(), ), ) diff --git a/samples/gemini-video-summarization/src/main/java/com/android/ai/samples/geminivideosummary/util/VideoList.kt b/samples/gemini-video-summarization/src/main/java/com/android/ai/samples/geminivideosummary/util/VideoList.kt index 6cba302b..87dee86b 100644 --- a/samples/gemini-video-summarization/src/main/java/com/android/ai/samples/geminivideosummary/util/VideoList.kt +++ b/samples/gemini-video-summarization/src/main/java/com/android/ai/samples/geminivideosummary/util/VideoList.kt @@ -27,14 +27,16 @@ data class VideoItem( val uri: Uri, ) +const val VIDEO_BASE_URL = "https://storage.googleapis.com/androiddevelopers/samples_assets" + val sampleVideoList = listOf( VideoItem( R.string.video_title_big_buck_bunny, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4".toUri(), + "$VIDEO_BASE_URL/gtv-videos-bucket/sample/BigBuckBunny.mp4".toUri(), ), VideoItem( R.string.video_title_android_spotlight_shorts, - "https://storage.googleapis.com/exoplayer-test-media-0/shorts_android_developers/shorts_10.mp4".toUri(), + "$VIDEO_BASE_URL/exoplayer-test-media-0/shorts_android_developers/shorts_10.mp4".toUri(), ), VideoItem( R.string.video_title_youtube_google_tv, @@ -42,14 +44,14 @@ val sampleVideoList = listOf( ), VideoItem( R.string.video_title_tears_of_steel, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/TearsOfSteel.mp4".toUri(), + "$VIDEO_BASE_URL/gtv-videos-bucket/sample/TearsOfSteel.mp4".toUri(), ), VideoItem( R.string.video_title_for_bigger_blazes, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4".toUri(), + "$VIDEO_BASE_URL/gtv-videos-bucket/sample/ForBiggerBlazes.mp4".toUri(), ), VideoItem( R.string.video_title_for_bigger_escape, - "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerEscapes.mp4".toUri(), + 
"$VIDEO_BASE_URL/gtv-videos-bucket/sample/ForBiggerEscapes.mp4".toUri(), ), ) diff --git a/samples/imagen-editing/.gitignore b/samples/imagen-editing/.gitignore deleted file mode 100644 index 42afabfd..00000000 --- a/samples/imagen-editing/.gitignore +++ /dev/null @@ -1 +0,0 @@ -/build \ No newline at end of file diff --git a/samples/imagen-editing/README.md b/samples/imagen-editing/README.md deleted file mode 100644 index 04cf41b2..00000000 --- a/samples/imagen-editing/README.md +++ /dev/null @@ -1,37 +0,0 @@ -# Imagen Image Editing Sample - -This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository. - -## Description - -This sample demonstrates how to edit images using the Imagen editing model. Users can generate an image, then draw a mask on it and provide a text prompt to inpaint (fill in) the masked area, showcasing advanced image manipulation capabilities with Imagen. - -
-Imagen Image Editing in action -
- -## How it works - -The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Imagen. The core logic is in the [`ImagenEditingDataSource.kt`](https://github.com/android/ai-samples/blob/main/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/data/ImagenEditingDataSource.kt) file. It first generates a base image using the generation model. Then, for editing, it takes the source image, a user-drawn mask, and a text prompt, and sends them to the editing model's `editImage` method to perform inpainting. - -Here is the key snippet of code that performs inpainting from [`ImagenEditingDataSource.kt`](./src/main/java/com/android/ai/samples/imagenediting/data/ImagenEditingDataSource.kt): - -```kotlin -@OptIn(PublicPreviewAPI::class) -suspend fun inpaintImageWithMask(sourceImage: Bitmap, maskImage: Bitmap, prompt: String, editSteps: Int = DEFAULT_EDIT_STEPS): Bitmap { - val imageResponse = editingModel.editImage( - referenceImages = listOf( - ImagenRawImage(sourceImage.toImagenInlineImage()), - ImagenRawMask(maskImage.toImagenInlineImage()), - ), - prompt = prompt, - config = ImagenEditingConfig( - editMode = ImagenEditMode.INPAINT_INSERTION, - editSteps = editSteps, - ), - ) - return imageResponse.images.first().asBitmap() -} -``` - -Read more about [Imagen](https://developer.android.com/ai/imagen) in the Android Documentation. 
diff --git a/samples/imagen-editing/imagen_editing.png b/samples/imagen-editing/imagen_editing.png deleted file mode 100644 index d603efa2..00000000 Binary files a/samples/imagen-editing/imagen_editing.png and /dev/null differ diff --git a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/data/ImagenEditingDataSource.kt b/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/data/ImagenEditingDataSource.kt deleted file mode 100644 index 90559012..00000000 --- a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/data/ImagenEditingDataSource.kt +++ /dev/null @@ -1,123 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package com.android.ai.samples.imagenediting.data - -import android.graphics.Bitmap -import com.google.firebase.Firebase -import com.google.firebase.ai.ai -import com.google.firebase.ai.type.GenerativeBackend -import com.google.firebase.ai.type.ImagenAspectRatio -import com.google.firebase.ai.type.ImagenEditMode -import com.google.firebase.ai.type.ImagenEditingConfig -import com.google.firebase.ai.type.ImagenGenerationConfig -import com.google.firebase.ai.type.ImagenImageFormat -import com.google.firebase.ai.type.ImagenRawImage -import com.google.firebase.ai.type.ImagenRawMask -import com.google.firebase.ai.type.PublicPreviewAPI -import com.google.firebase.ai.type.toImagenInlineImage -import javax.inject.Inject -import javax.inject.Singleton - -/** - * A data source that provides methods for interacting with the Firebase Imagen API - * for various image generation and editing tasks. - * - * This class encapsulates the logic for initializing Imagen models and calling - * their respective functions for image generation, inpainting, outpainting, and style transfer. - * It leverages the Firebase AI SDK for seamless integration with Vertex AI backends. - * - * Note: This class uses `@OptIn(PublicPreviewAPI::class)` as Imagen features - * are currently in public preview. 
- */ -@Singleton -class ImagenEditingDataSource @Inject constructor() { - private companion object { - const val IMAGEN_MODEL_NAME = "imagen-4.0-ultra-generate-001" - const val IMAGEN_EDITING_MODEL_NAME = "imagen-3.0-capability-001" - const val DEFAULT_EDIT_STEPS = 50 - const val DEFAULT_STYLE_STRENGTH = 1 - } - - @OptIn(PublicPreviewAPI::class) - private val imagenModel = - Firebase.ai(backend = GenerativeBackend.vertexAI()).imagenModel( - IMAGEN_MODEL_NAME, - generationConfig = ImagenGenerationConfig( - numberOfImages = 1, - aspectRatio = ImagenAspectRatio.SQUARE_1x1, - imageFormat = ImagenImageFormat.jpeg(compressionQuality = 75), - ), - ) - - @OptIn(PublicPreviewAPI::class) - private val editingModel = - Firebase.ai(backend = GenerativeBackend.vertexAI()).imagenModel( - IMAGEN_EDITING_MODEL_NAME, - generationConfig = ImagenGenerationConfig( - numberOfImages = 1, - aspectRatio = ImagenAspectRatio.SQUARE_1x1, - imageFormat = ImagenImageFormat.jpeg(compressionQuality = 75), - ), - ) - - /** - * Generates an image based on the provided prompt. - * - * This function uses the Imagen model to generate an image from a textual description. - * It returns the generated image as a Bitmap. - * - * @param prompt The textual description to generate the image from. - * @return The generated image as a [Bitmap]. - * @throws Exception if the image generation fails. - */ - @OptIn(PublicPreviewAPI::class) - suspend fun generateImage(prompt: String): Bitmap { - val imageResponse = imagenModel.generateImages( - prompt = prompt, - ) - val image = imageResponse.images.first() - return image.asBitmap() - } - - /** - * Performs inpainting on a source image using a provided mask and prompt. - * - * This function utilizes the Imagen editing model to fill in the masked areas - * of the source image based on the textual prompt. - * - * @param sourceImage The original image to be inpainted. 
- * @param maskImage A bitmap representing the mask, where white areas indicate - * regions to be inpainted and black areas indicate regions to be preserved. - * @param prompt A textual description of what should be generated in the masked areas. - * @param editSteps The number of editing steps to perform. Defaults to `DEFAULT_EDIT_STEPS`. - * @return A [Bitmap] representing the inpainted image. - */ - @OptIn(PublicPreviewAPI::class) - suspend fun inpaintImageWithMask(sourceImage: Bitmap, maskImage: Bitmap, prompt: String, editSteps: Int = DEFAULT_EDIT_STEPS): Bitmap { - val imageResponse = editingModel.editImage( - referenceImages = listOf( - ImagenRawImage(sourceImage.toImagenInlineImage()), - ImagenRawMask(maskImage.toImagenInlineImage()), - ), - prompt = prompt, - config = ImagenEditingConfig( - editMode = ImagenEditMode.INPAINT_INSERTION, - editSteps = editSteps, - ), - ) - return imageResponse.images.first().asBitmap() - } -} diff --git a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingMaskEditor.kt b/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingMaskEditor.kt deleted file mode 100644 index 6d66f321..00000000 --- a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingMaskEditor.kt +++ /dev/null @@ -1,188 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package com.android.ai.samples.imagenediting.ui - -import android.graphics.Bitmap -import android.graphics.Paint -import androidx.compose.foundation.Canvas -import androidx.compose.foundation.Image -import androidx.compose.foundation.background -import androidx.compose.foundation.clickable -import androidx.compose.foundation.gestures.detectDragGestures -import androidx.compose.foundation.layout.Box -import androidx.compose.foundation.layout.Column -import androidx.compose.foundation.layout.Row -import androidx.compose.foundation.layout.fillMaxSize -import androidx.compose.foundation.layout.fillMaxWidth -import androidx.compose.foundation.layout.padding -import androidx.compose.foundation.shape.RoundedCornerShape -import androidx.compose.material.icons.Icons -import androidx.compose.material.icons.automirrored.filled.Undo -import androidx.compose.material.icons.filled.Check -import androidx.compose.material.icons.filled.Delete -import androidx.compose.material3.Icon -import androidx.compose.material3.MaterialTheme -import androidx.compose.runtime.Composable -import androidx.compose.runtime.getValue -import androidx.compose.runtime.mutableFloatStateOf -import androidx.compose.runtime.mutableStateListOf -import androidx.compose.runtime.mutableStateOf -import androidx.compose.runtime.remember -import androidx.compose.runtime.setValue -import androidx.compose.ui.Alignment -import androidx.compose.ui.Modifier -import androidx.compose.ui.geometry.Offset -import androidx.compose.ui.graphics.Color -import androidx.compose.ui.graphics.Path -import androidx.compose.ui.graphics.StrokeCap -import androidx.compose.ui.graphics.StrokeJoin -import androidx.compose.ui.graphics.asAndroidPath -import androidx.compose.ui.graphics.asImageBitmap -import androidx.compose.ui.graphics.drawscope.Stroke -import androidx.compose.ui.graphics.drawscope.withTransform -import androidx.compose.ui.input.pointer.pointerInput -import androidx.compose.ui.layout.ContentScale -import 
androidx.compose.ui.res.stringResource -import androidx.compose.ui.unit.dp -import androidx.core.graphics.createBitmap -import com.android.ai.samples.imagenediting.R -import kotlin.math.min - -@Composable -fun ImagenEditingMaskEditor(sourceBitmap: Bitmap, onMaskFinalized: (Bitmap) -> Unit, onCancel: () -> Unit, modifier: Modifier = Modifier) { - val paths = remember { mutableStateListOf<Path>() } - var currentPath by remember { mutableStateOf<Path?>(null) } - var scale by remember { mutableFloatStateOf(1f) } - var offsetX by remember { mutableFloatStateOf(0f) } - var offsetY by remember { mutableFloatStateOf(0f) } - - Box(modifier = modifier) { - Column( - modifier = Modifier.fillMaxSize(), - horizontalAlignment = Alignment.CenterHorizontally, - ) { - Box( - modifier = Modifier - .weight(1f) - .fillMaxWidth() - .pointerInput(Unit) { - detectDragGestures( - onDragStart = { startOffset -> - val transformedStart = Offset( - (startOffset.x - offsetX) / scale, - (startOffset.y - offsetY) / scale, - ) - currentPath = Path().apply { moveTo(transformedStart.x, transformedStart.y) } - }, - onDrag = { change, _ -> - currentPath?.let { - val transformedChange = Offset( - (change.position.x - offsetX) / scale, - (change.position.y - offsetY) / scale, - ) - it.lineTo(transformedChange.x, transformedChange.y) - currentPath = Path().apply { addPath(it) } - } - change.consume() - }, - onDragEnd = { - currentPath?.let { paths.add(it) } - currentPath = null - }, - ) - }, - ) { - Image( - bitmap = sourceBitmap.asImageBitmap(), - contentDescription = stringResource(R.string.editing_image_to_mask), - modifier = Modifier.fillMaxSize(), - contentScale = ContentScale.Crop, - ) - Canvas(modifier = Modifier.fillMaxSize()) { - val canvasWidth = size.width - val canvasHeight = size.height - val bitmapWidth = sourceBitmap.width.toFloat() - val bitmapHeight = sourceBitmap.height.toFloat() - scale = min(canvasWidth / bitmapWidth, canvasHeight / bitmapHeight) - offsetX = (canvasWidth - bitmapWidth * scale) / 
2 - offsetY = (canvasHeight - bitmapHeight * scale) / 2 - withTransform( - { - translate(left = offsetX, top = offsetY) - scale(scale, scale, pivot = Offset.Zero) - }, - ) { - val strokeWidth = 70f / scale - val stroke = Stroke(width = strokeWidth, cap = StrokeCap.Round, join = StrokeJoin.Round) - val pathColor = Color.White.copy(alpha = 0.5f) - paths.forEach { path -> - drawPath(path = path, color = pathColor, style = stroke) - } - currentPath?.let { path -> - drawPath(path = path, color = pathColor, style = stroke) - } - } - } - - Row( - modifier = Modifier - .padding(16.dp) - .align(Alignment.BottomEnd) - .background(color = MaterialTheme.colorScheme.surfaceContainer, shape = RoundedCornerShape(20.dp)), - ) { - Icon( - Icons.Default.Delete, - contentDescription = stringResource(R.string.cancel_masking), - modifier = Modifier - .padding(10.dp) - .clickable(true) { - onCancel() - }, - ) - Icon( - Icons.AutoMirrored.Filled.Undo, - contentDescription = stringResource(R.string.undo_the_mask), - modifier = Modifier - .padding(10.dp) - .clickable(true) { - if (paths.isNotEmpty()) paths.removeAt(paths.lastIndex) - }, - ) - Icon( - Icons.Default.Check, - contentDescription = stringResource(R.string.save_the_mask), - modifier = Modifier - .padding(10.dp) - .clickable(true) { - val maskBitmap = createBitmap(sourceBitmap.width, sourceBitmap.height) - val canvas = android.graphics.Canvas(maskBitmap) - val paint = Paint().apply { - color = android.graphics.Color.WHITE - strokeWidth = 70f - style = Paint.Style.STROKE - strokeCap = Paint.Cap.ROUND - strokeJoin = Paint.Join.ROUND - isAntiAlias = true - } - paths.forEach { path -> canvas.drawPath(path.asAndroidPath(), paint) } - onMaskFinalized(maskBitmap) - }, - ) - } - } - } - } -} diff --git a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingScreen.kt b/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingScreen.kt deleted file mode 100644 index 
9699e7aa..00000000 --- a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingScreen.kt +++ /dev/null @@ -1,288 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package com.android.ai.samples.imagenediting.ui - -import android.graphics.Bitmap -import android.graphics.BitmapFactory -import androidx.activity.compose.LocalOnBackPressedDispatcherOwner -import androidx.compose.foundation.Image -import androidx.compose.foundation.background -import androidx.compose.foundation.border -import androidx.compose.foundation.layout.Box -import androidx.compose.foundation.layout.BoxScope -import androidx.compose.foundation.layout.fillMaxHeight -import androidx.compose.foundation.layout.fillMaxSize -import androidx.compose.foundation.layout.fillMaxWidth -import androidx.compose.foundation.layout.height -import androidx.compose.foundation.layout.imePadding -import androidx.compose.foundation.layout.padding -import androidx.compose.foundation.layout.size -import androidx.compose.foundation.layout.width -import androidx.compose.foundation.layout.widthIn -import androidx.compose.foundation.shape.RoundedCornerShape -import androidx.compose.foundation.text.input.TextFieldState -import androidx.compose.foundation.text.input.rememberTextFieldState -import androidx.compose.material3.ContainedLoadingIndicator -import androidx.compose.material3.ExperimentalMaterial3Api -import 
androidx.compose.material3.ExperimentalMaterial3ExpressiveApi -import androidx.compose.material3.MaterialTheme -import androidx.compose.material3.Scaffold -import androidx.compose.material3.Text -import androidx.compose.runtime.Composable -import androidx.compose.runtime.getValue -import androidx.compose.runtime.remember -import androidx.compose.ui.Alignment -import androidx.compose.ui.Modifier -import androidx.compose.ui.draw.clip -import androidx.compose.ui.graphics.Color -import androidx.compose.ui.graphics.ColorFilter -import androidx.compose.ui.graphics.ImageShader -import androidx.compose.ui.graphics.ShaderBrush -import androidx.compose.ui.graphics.TileMode -import androidx.compose.ui.graphics.asImageBitmap -import androidx.compose.ui.layout.ContentScale -import androidx.compose.ui.platform.LocalContext -import androidx.compose.ui.platform.LocalSoftwareKeyboardController -import androidx.compose.ui.platform.SoftwareKeyboardController -import androidx.compose.ui.res.painterResource -import androidx.compose.ui.res.stringResource -import androidx.compose.ui.unit.dp -import androidx.hilt.navigation.compose.hiltViewModel -import androidx.lifecycle.compose.collectAsStateWithLifecycle -import com.android.ai.samples.imagenediting.R -import com.android.ai.uicomponent.GenerateButton -import com.android.ai.uicomponent.SampleDetailTopAppBar -import com.android.ai.uicomponent.TextInput - -@OptIn(ExperimentalMaterial3Api::class) -@Composable -fun ImagenEditingScreen(viewModel: ImagenEditingViewModel = hiltViewModel()) { - val uiState: ImagenEditingUIState by viewModel.uiState.collectAsStateWithLifecycle() - val showMaskEditor: Boolean by viewModel.showMaskEditor.collectAsStateWithLifecycle() - val bitmapForMasking: Bitmap? 
by viewModel.bitmapForMasking.collectAsStateWithLifecycle() - - ImagenEditingScreenContent( - uiState = uiState, - showMaskEditor = showMaskEditor, - bitmapForMasking = bitmapForMasking, - onGenerateClick = viewModel::generateImage, - onInpaintClick = { source, mask, prompt -> viewModel.inpaintImage(source, mask, prompt) }, - onImageMaskReady = { source, mask -> viewModel.onImageMaskReady(source, mask) }, - onCancelMasking = viewModel::onCancelMasking, - modifier = Modifier.fillMaxSize(), - ) -} - -@Composable -@OptIn(ExperimentalMaterial3Api::class, ExperimentalMaterial3ExpressiveApi::class) -private fun ImagenEditingScreenContent( - uiState: ImagenEditingUIState, - showMaskEditor: Boolean, - bitmapForMasking: Bitmap?, - onGenerateClick: (String) -> Unit, - onInpaintClick: (source: Bitmap, mask: Bitmap, prompt: String) -> Unit, - onImageMaskReady: (source: Bitmap, mask: Bitmap) -> Unit, - onCancelMasking: () -> Unit, - modifier: Modifier = Modifier, -) { - val isGenerating = uiState is ImagenEditingUIState.Loading - val backDispatcher = LocalOnBackPressedDispatcherOwner.current?.onBackPressedDispatcher - - Scaffold( - containerColor = MaterialTheme.colorScheme.surface, - topBar = { - SampleDetailTopAppBar( - sampleName = stringResource(R.string.editing_title_image_generation_title), - sampleDescription = stringResource(R.string.editing_title_image_generation_subtitle), - sourceCodeUrl = "https://github.com/android/ai-samples/tree/main/samples/imagen-editing", - onBackClick = { backDispatcher?.onBackPressed() }, - ) - }, - modifier = Modifier.fillMaxWidth(), - ) { innerPadding -> - val context = LocalContext.current - val imageBitmap = remember { - val bitmap = BitmapFactory.decodeResource(context.resources, com.android.ai.uicomponent.R.drawable.img_fill) - bitmap.asImageBitmap() - } - val imageShader = remember { - ImageShader( - image = imageBitmap, - tileModeX = TileMode.Repeated, - tileModeY = TileMode.Repeated, - ) - } - - Box( - modifier = Modifier - 
.padding(innerPadding) - .fillMaxSize(), - contentAlignment = Alignment.Center, - ) { - Box( - Modifier - .padding(16.dp) - .imePadding() - .widthIn(max = 440.dp) - .fillMaxHeight(0.85f) - .border( - 1.dp, - MaterialTheme.colorScheme.outline, - shape = RoundedCornerShape(40.dp), - ) - .clip(RoundedCornerShape(40.dp)) - .background(ShaderBrush(imageShader)), - contentAlignment = Alignment.Center, - ) { - val keyboardController = LocalSoftwareKeyboardController.current - - when (uiState) { - is ImagenEditingUIState.Initial -> { - Text( - text = stringResource(R.string.generate_an_image_to_edit), - style = MaterialTheme.typography.bodyMedium, - modifier = Modifier - .padding(24.dp) - .align(Alignment.Center), - ) - - val textFieldState = rememberTextFieldState() - - TextField( - textFieldState, - isGenerating, - onGenerateClick, - keyboardController, - placeholder = stringResource(R.string.describe_the_image_to_generate), - ) - } - - is ImagenEditingUIState.Loading -> { - Box(modifier.fillMaxSize()) { - ContainedLoadingIndicator( - modifier = Modifier - .size(60.dp) - .align(Alignment.Center), - ) - } - } - - is ImagenEditingUIState.ImageGenerated -> { - if (showMaskEditor && bitmapForMasking != null) { - val textFieldState = rememberTextFieldState() - - ImagenEditingMaskEditor( - sourceBitmap = bitmapForMasking, - onMaskFinalized = { maskBitmap -> - onImageMaskReady(bitmapForMasking, maskBitmap) - }, - onCancel = { onCancelMasking() }, - modifier = Modifier.fillMaxSize(), - ) - - Text( - text = "Draw a mask on the image", - style = MaterialTheme.typography.bodyMedium, - modifier = Modifier - .padding(24.dp) - .align(Alignment.TopCenter) - .background(color = MaterialTheme.colorScheme.surfaceContainer), - ) - } else { - val textFieldState = rememberTextFieldState() - - Image( - bitmap = uiState.bitmap.asImageBitmap(), - contentDescription = uiState.contentDescription, - contentScale = ContentScale.Crop, - modifier = Modifier.fillMaxSize(), - ) - TextField( - 
textFieldState, - isGenerating, - onGenerateClick, - keyboardController, - placeholder = stringResource(R.string.describe_the_image_to_generate), - ) - } - } - - is ImagenEditingUIState.ImageMasked -> { - Box(modifier = Modifier.fillMaxSize()) { - Image( - bitmap = uiState.originalBitmap.asImageBitmap(), - contentDescription = stringResource(R.string.editing_generated_image), - modifier = Modifier.fillMaxSize(), - contentScale = ContentScale.Crop, - ) - Image( - bitmap = uiState.maskBitmap.asImageBitmap(), - contentDescription = stringResource(R.string.editing_generated_mask), - modifier = Modifier.fillMaxSize(), - contentScale = ContentScale.Fit, - colorFilter = ColorFilter.tint(Color.Red.copy(alpha = 0.5f)), - ) - } - val textFieldState = rememberTextFieldState() - - TextField( - textFieldState = textFieldState, - isGenerating = isGenerating, - onGenerateClick = { prompt -> onInpaintClick(uiState.originalBitmap, uiState.maskBitmap, prompt) }, - keyboardController, - placeholder = stringResource(R.string.describe_the_image_to_in_paint), - ) - } - - else -> {} - } - } - } - } -} - -@Composable -private fun BoxScope.TextField( - textFieldState: TextFieldState, - isGenerating: Boolean, - onGenerateClick: (String) -> Unit, - keyboardController: SoftwareKeyboardController?, - placeholder: String = "", -) { - TextInput( - state = textFieldState, - placeholder = placeholder, - primaryButton = { - GenerateButton( - text = "", - icon = painterResource(id = com.android.ai.uicomponent.R.drawable.ic_ai_img), - modifier = Modifier - .width(72.dp) - .height(55.dp) - .padding(4.dp), - enabled = !isGenerating, - onClick = { - onGenerateClick(textFieldState.text.toString()) - keyboardController?.hide() - }, - ) - }, - modifier = Modifier - .widthIn(max = 646.dp) - .padding(start = 10.dp, end = 10.dp, bottom = 10.dp) - .align(Alignment.BottomCenter), - ) -} diff --git a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingUIState.kt 
b/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingUIState.kt deleted file mode 100644 index 0a37a73b..00000000 --- a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingUIState.kt +++ /dev/null @@ -1,35 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package com.android.ai.samples.imagenediting.ui - -import android.graphics.Bitmap - -sealed interface ImagenEditingUIState { - data object Initial : ImagenEditingUIState - data object Loading : ImagenEditingUIState - data class ImageGenerated( - val bitmap: Bitmap, - val contentDescription: String, - ) : ImagenEditingUIState - - data class ImageMasked( - val originalBitmap: Bitmap, - val maskBitmap: Bitmap, - val contentDescription: String, - ) : ImagenEditingUIState - - data class Error(val message: String?) 
: ImagenEditingUIState -} diff --git a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingViewModel.kt b/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingViewModel.kt deleted file mode 100644 index ce6f8620..00000000 --- a/samples/imagen-editing/src/main/java/com/android/ai/samples/imagenediting/ui/ImagenEditingViewModel.kt +++ /dev/null @@ -1,93 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package com.android.ai.samples.imagenediting.ui - -import android.graphics.Bitmap -import android.util.Log -import androidx.lifecycle.ViewModel -import androidx.lifecycle.viewModelScope -import com.android.ai.samples.imagenediting.data.ImagenEditingDataSource -import dagger.hilt.android.lifecycle.HiltViewModel -import javax.inject.Inject -import kotlinx.coroutines.flow.MutableStateFlow -import kotlinx.coroutines.flow.StateFlow -import kotlinx.coroutines.launch - -@HiltViewModel -class ImagenEditingViewModel @Inject constructor(private val imagenDataSource: ImagenEditingDataSource) : ViewModel() { - - private val _uiState: MutableStateFlow<ImagenEditingUIState> = MutableStateFlow(ImagenEditingUIState.Initial) - val uiState: StateFlow<ImagenEditingUIState> = _uiState - - private val _bitmapForMasking = MutableStateFlow<Bitmap?>(null) - val bitmapForMasking: StateFlow<Bitmap?> = _bitmapForMasking - - private val _showMaskEditor = MutableStateFlow(false) - val showMaskEditor: StateFlow<Boolean> = _showMaskEditor - - fun generateImage(prompt: String) { - _uiState.value = ImagenEditingUIState.Loading - viewModelScope.launch { - try { - val bitmap = imagenDataSource.generateImage(prompt) - - _bitmapForMasking.value = bitmap - _showMaskEditor.value = true - _uiState.value = ImagenEditingUIState.ImageGenerated(bitmap, contentDescription = prompt) - } catch (e: Exception) { - _uiState.value = ImagenEditingUIState.Error(e.message) - } - } - } - - fun inpaintImage(sourceImage: Bitmap, maskImage: Bitmap, prompt: String, editSteps: Int = 50) { - _uiState.value = ImagenEditingUIState.Loading - viewModelScope.launch { - try { - val inpaintedBitmap = imagenDataSource.inpaintImageWithMask( - sourceImage = sourceImage, - maskImage = maskImage, - prompt = prompt, - editSteps = editSteps, - ) - _uiState.value = ImagenEditingUIState.ImageGenerated( - bitmap = inpaintedBitmap, - contentDescription = "Inpainted image based on prompt: $prompt", - ) - } catch (e: Exception) { - _uiState.value = ImagenEditingUIState.Error(e.localizedMessage ?: "An unknown 
error occurred during inpainting") - } - } - } - - fun onImageMaskReady(originalBitmap: Bitmap, maskBitmap: Bitmap) { - val originalContentDescription = (_uiState.value as? ImagenEditingUIState.ImageGenerated)?.contentDescription ?: "Edited image" - _uiState.value = ImagenEditingUIState.ImageMasked( - originalBitmap = originalBitmap, - maskBitmap = maskBitmap, - contentDescription = originalContentDescription, - ) - _showMaskEditor.value = false - _bitmapForMasking.value = null - } - - fun onCancelMasking() { - Log.d("ImagenEditingViewModel", "onCancelMasking") - _showMaskEditor.value = false - _bitmapForMasking.value = null - _uiState.value = ImagenEditingUIState.Initial - } -} diff --git a/samples/imagen-editing/src/main/res/values/strings.xml b/samples/imagen-editing/src/main/res/values/strings.xml deleted file mode 100644 index 4f7efff1..00000000 --- a/samples/imagen-editing/src/main/res/values/strings.xml +++ /dev/null @@ -1,43 +0,0 @@ - - - - Generate - Generating… - Prompt - Mask edit prompt - Mask Edit - Inpaint - Enter a prompt and tap \"Generate\" to generate an image - Edit Image - Finalize Mask - Generate an image, then tap to draw a mask. - An image of dog working as a chef - An unknown error occurred. - Imagen Editing - Generate images with Imagen, Google\'s image generation model. 
- Image to be masked - The generated image - The generated mask - Draw a mask - Cancel masking - Undo the mask - Save the mask - describe the image to generate - Generate an image to edit - describe the image to in-paint - \ No newline at end of file diff --git a/samples/imagen/.gitignore b/samples/imagen/.gitignore deleted file mode 100644 index 42afabfd..00000000 --- a/samples/imagen/.gitignore +++ /dev/null @@ -1 +0,0 @@ -/build \ No newline at end of file diff --git a/samples/imagen/README.md b/samples/imagen/README.md deleted file mode 100644 index 9adfa02e..00000000 --- a/samples/imagen/README.md +++ /dev/null @@ -1,30 +0,0 @@ -# Imagen Image Generation Sample - -This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository. - -## Description - -This sample demonstrates how to generate images from text prompts using the Imagen model. Users can input a text description, and the generative model will create an image based on that prompt, showcasing the power of text-to-image generation with Imagen. - -
-Imagen Image Generation in action -
- -## How it works - -The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Imagen. The core logic is in the [`ImagenDataSource.kt`](./src/main/java/com/android/ai/samples/imagen/data/ImagenDataSource.kt) file. An `imagenModel` is initialized with specific generation configurations (e.g., number of images, aspect ratio, image format). When a user provides a text prompt, it's passed to the `generateImages` method, which returns the generated image as a bitmap. - -Here is the key snippet of code that calls the generative model from [`ImagenDataSource.kt`](./src/main/java/com/android/ai/samples/imagen/data/ImagenDataSource.kt): - -```kotlin -@OptIn(PublicPreviewAPI::class) -suspend fun generateImage(prompt: String): Bitmap { - val imageResponse = imagenModel.generateImages( - prompt = prompt, - ) - val image = imageResponse.images.first() - return image.asBitmap() -} -``` - -Read more about [Imagen](https://developer.android.com/ai/imagen) in the Android Documentation. diff --git a/samples/imagen/imagen_image_generation.png b/samples/imagen/imagen_image_generation.png deleted file mode 100644 index af4d3ad2..00000000 Binary files a/samples/imagen/imagen_image_generation.png and /dev/null differ diff --git a/samples/imagen/proguard-rules.pro b/samples/imagen/proguard-rules.pro deleted file mode 100644 index 481bb434..00000000 --- a/samples/imagen/proguard-rules.pro +++ /dev/null @@ -1,21 +0,0 @@ -# Add project specific ProGuard rules here. -# You can control the set of applied configuration files using the -# proguardFiles setting in build.gradle. 
-# -# For more details, see -# http://developer.android.com/guide/developing/tools/proguard.html - -# If your project uses WebView with JS, uncomment the following -# and specify the fully qualified class name to the JavaScript interface -# class: -#-keepclassmembers class fqcn.of.javascript.interface.for.webview { -# public *; -#} - -# Uncomment this to preserve the line number information for -# debugging stack traces. -#-keepattributes SourceFile,LineNumberTable - -# If you keep the line number information, uncomment this to -# hide the original source file name. -#-renamesourcefileattribute SourceFile \ No newline at end of file diff --git a/samples/imagen/src/main/java/com/android/ai/samples/imagen/data/ImagenDataSource.kt b/samples/imagen/src/main/java/com/android/ai/samples/imagen/data/ImagenDataSource.kt deleted file mode 100644 index f5c15694..00000000 --- a/samples/imagen/src/main/java/com/android/ai/samples/imagen/data/ImagenDataSource.kt +++ /dev/null @@ -1,49 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -package com.android.ai.samples.imagen.data - -import android.graphics.Bitmap -import com.google.firebase.Firebase -import com.google.firebase.ai.ai -import com.google.firebase.ai.type.GenerativeBackend -import com.google.firebase.ai.type.ImagenAspectRatio -import com.google.firebase.ai.type.ImagenGenerationConfig -import com.google.firebase.ai.type.ImagenImageFormat -import com.google.firebase.ai.type.PublicPreviewAPI -import javax.inject.Inject -import javax.inject.Singleton - -@Singleton -class ImagenDataSource @Inject constructor() { - @OptIn(PublicPreviewAPI::class) - private val imagenModel = Firebase.ai(backend = GenerativeBackend.vertexAI()).imagenModel( - modelName = "imagen-4.0-generate-preview-06-06", - generationConfig = ImagenGenerationConfig( - numberOfImages = 1, - aspectRatio = ImagenAspectRatio.SQUARE_1x1, - imageFormat = ImagenImageFormat.jpeg(compressionQuality = 75), - ), - ) - - @OptIn(PublicPreviewAPI::class) - suspend fun generateImage(prompt: String): Bitmap { - val imageResponse = imagenModel.generateImages( - prompt = prompt, - ) - val image = imageResponse.images.first() - return image.asBitmap() - } -} diff --git a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/GeneratedContent.kt b/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/GeneratedContent.kt deleted file mode 100644 index 7248170d..00000000 --- a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/GeneratedContent.kt +++ /dev/null @@ -1,66 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. 
- * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package com.android.ai.samples.imagen.ui - -import androidx.compose.foundation.Image -import androidx.compose.foundation.layout.fillMaxSize -import androidx.compose.foundation.layout.wrapContentSize -import androidx.compose.material3.Card -import androidx.compose.material3.Text -import androidx.compose.runtime.Composable -import androidx.compose.ui.Alignment -import androidx.compose.ui.Modifier -import androidx.compose.ui.graphics.asImageBitmap -import androidx.compose.ui.layout.ContentScale -import androidx.compose.ui.res.stringResource -import androidx.compose.ui.text.style.TextAlign -import com.android.ai.samples.imagen.R - -@Composable -fun GeneratedContent(uiState: ImagenUIState, modifier: Modifier = Modifier) { - Card( - modifier = modifier, - ) { - when (uiState) { - ImagenUIState.Initial -> { - // - } - - ImagenUIState.Loading -> { - // - } - - is ImagenUIState.ImageGenerated -> { - Image( - bitmap = uiState.bitmap.asImageBitmap(), - contentDescription = uiState.contentDescription, - contentScale = ContentScale.Fit, - modifier = Modifier.fillMaxSize(), - ) - } - - is ImagenUIState.Error -> { - Text( - text = uiState.message ?: stringResource(R.string.error_message_unknown), - modifier = Modifier - .fillMaxSize() - .wrapContentSize(Alignment.Center), - textAlign = TextAlign.Center, - ) - } - } - } -} diff --git a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/GenerationInput.kt b/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/GenerationInput.kt deleted file mode 100644 index f124ed66..00000000 --- 
a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/GenerationInput.kt +++ /dev/null @@ -1,82 +0,0 @@ -/* - * Copyright 2025 The Android Open Source Project - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * https://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package com.android.ai.samples.imagen.ui - -import androidx.compose.foundation.layout.Arrangement -import androidx.compose.foundation.layout.Column -import androidx.compose.foundation.layout.Spacer -import androidx.compose.foundation.layout.fillMaxWidth -import androidx.compose.foundation.layout.size -import androidx.compose.foundation.text.KeyboardActions -import androidx.compose.foundation.text.KeyboardOptions -import androidx.compose.material.icons.Icons -import androidx.compose.material.icons.filled.SmartToy -import androidx.compose.material3.Button -import androidx.compose.material3.ButtonDefaults -import androidx.compose.material3.Icon -import androidx.compose.material3.Text -import androidx.compose.material3.TextField -import androidx.compose.runtime.Composable -import androidx.compose.runtime.getValue -import androidx.compose.runtime.mutableStateOf -import androidx.compose.runtime.saveable.rememberSaveable -import androidx.compose.runtime.setValue -import androidx.compose.ui.Modifier -import androidx.compose.ui.res.stringResource -import androidx.compose.ui.text.input.ImeAction -import androidx.compose.ui.unit.dp -import com.android.ai.samples.imagen.R - -@Composable -fun GenerationInput(onGenerateClick: (String) -> Unit, 
enabled: Boolean, modifier: Modifier = Modifier) { - val placeholder = stringResource(R.string.placeholder_prompt) - var textFieldValue by rememberSaveable { mutableStateOf(placeholder) } - - Column( - verticalArrangement = Arrangement.spacedBy(8.dp), - modifier = modifier, - ) { - TextField( - value = textFieldValue, - onValueChange = { textFieldValue = it }, - label = { Text(stringResource(R.string.prompt_label)) }, - modifier = Modifier.fillMaxWidth(), - enabled = enabled, - keyboardOptions = KeyboardOptions(imeAction = ImeAction.Send), - keyboardActions = KeyboardActions( - onSend = { - onGenerateClick(textFieldValue) - }, - ), - ) - Button( - onClick = { - onGenerateClick(textFieldValue) - }, - enabled = enabled, - contentPadding = ButtonDefaults.ButtonWithIconContentPadding, - modifier = Modifier.fillMaxWidth(), - ) { - Icon( - Icons.Default.SmartToy, - contentDescription = null, - modifier = Modifier.size(ButtonDefaults.IconSize), - ) - Spacer(Modifier.size(ButtonDefaults.IconSpacing)) - Text(text = stringResource(R.string.generate_button)) - } - } -} diff --git a/samples/magic-selfie/README.md b/samples/magic-selfie/README.md index ef6dcd9c..a25a5790 100644 --- a/samples/magic-selfie/README.md +++ b/samples/magic-selfie/README.md @@ -4,7 +4,7 @@ This sample is part of the [AI Sample Catalog](../../). To build and run this sa ## Description -This sample demonstrates how to create a "magic selfie" by replacing the background of a user's photo with a generated image. It uses the ML Kit Subject Segmentation API to isolate the user from their original background and the Imagen API to generate a new background from a text prompt. +This sample demonstrates how to create a "magic selfie" by replacing the background of a user's photo with a generated image. It uses the Nano Banana 2 (`gemini-3.1-flash-image-preview`) model to perform semantic image editing, transforming the background based on a text prompt while preserving the subject.
Magic Selfie in action @@ -12,24 +12,19 @@ This sample demonstrates how to create a "magic selfie" by replacing the backgro ## How it works -The application uses two main components. First, the ML Kit Subject Segmentation API processes the user's selfie to create a bitmap containing only the foreground (the person). Second, the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android interacts with the Imagen model to generate a new background image from a user-provided text prompt. Finally, the application combines the foreground bitmap with the newly generated background to create the final magic selfie. The core logic for this process is in the [`MagicSelfieViewModel.kt`](./src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt) and [`MagicSelfieRepository.kt`](./src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt) files. +The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with the Nano Banana 2 model. Unlike older approaches that require manual subject segmentation and image compositing, Nano Banana 2 can process a multimodal prompt (an image plus text) to modify the scene directly. The application sends the user's selfie and a prompt describing the desired background, and the model generates a new version of the image with the background replaced. The core logic for this process is in the [`MagicSelfieViewModel.kt`](./src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt) and [`MagicSelfieRepository.kt`](./src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt) files. 
-Here is the key snippet of code that orchestrates the magic selfie creation from [`MagicSelfieViewModel.kt`](./src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt): +Here is the key snippet of code that calls the generative model from [`MagicSelfieRepository.kt`](./src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt): ```kotlin -fun createMagicSelfie(bitmap: Bitmap, prompt: String) { - viewModelScope.launch { - try { - _uiState.value = MagicSelfieUiState.RemovingBackground - val foregroundBitmap = magicSelfieRepository.generateForegroundBitmap(bitmap) - _uiState.value = MagicSelfieUiState.GeneratingBackground - val backgroundBitmap = magicSelfieRepository.generateBackground(prompt) - val resultBitmap = magicSelfieRepository.combineBitmaps(foregroundBitmap, backgroundBitmap) - _uiState.value = MagicSelfieUiState.Success(resultBitmap) - } catch (e: Exception) { - _uiState.value = MagicSelfieUiState.Error(e.message) - } +suspend fun generateMagicSelfie(bitmap: Bitmap, prompt: String): Bitmap { + val multimodalPrompt = content { + image(bitmap) + text("Change the background of this image to $prompt") } + val response = generativeModel.generateContent(multimodalPrompt) + return response.candidates.firstOrNull()?.content?.parts?.firstNotNullOfOrNull { it.asImageOrNull() } + ?: throw Exception("No image generated") } ``` diff --git a/samples/magic-selfie/build.gradle.kts b/samples/magic-selfie/build.gradle.kts index 0b5cd2d6..9d59936c 100644 --- a/samples/magic-selfie/build.gradle.kts +++ b/samples/magic-selfie/build.gradle.kts @@ -69,7 +69,6 @@ dependencies { implementation(libs.hilt.android) implementation(libs.hilt.navigation.compose) implementation(libs.androidx.runtime.livedata) - implementation(libs.mlkit.segmentation) implementation(libs.ui.tooling.preview) debugImplementation(libs.ui.tooling) diff --git a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt 
b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt index 2e293777..19952fef 100644 --- a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt +++ b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/data/MagicSelfieRepository.kt @@ -16,87 +16,34 @@ package com.android.ai.samples.magicselfie.data import android.graphics.Bitmap -import android.graphics.Canvas -import android.graphics.Paint import com.google.firebase.Firebase import com.google.firebase.ai.ai import com.google.firebase.ai.type.GenerativeBackend -import com.google.firebase.ai.type.ImagenAspectRatio -import com.google.firebase.ai.type.ImagenGenerationConfig -import com.google.firebase.ai.type.ImagenImageFormat -import com.google.firebase.ai.type.PublicPreviewAPI -import com.google.mlkit.vision.common.InputImage -import com.google.mlkit.vision.segmentation.subject.SubjectSegmentation -import com.google.mlkit.vision.segmentation.subject.SubjectSegmenterOptions +import com.google.firebase.ai.type.ResponseModality +import com.google.firebase.ai.type.asImageOrNull +import com.google.firebase.ai.type.content +import com.google.firebase.ai.type.generationConfig import javax.inject.Inject import javax.inject.Singleton -import kotlin.coroutines.suspendCoroutine -import kotlin.math.roundToInt @Singleton class MagicSelfieRepository @Inject constructor() { - @OptIn(PublicPreviewAPI::class) - private val imagenModel = Firebase.ai(backend = GenerativeBackend.vertexAI()).imagenModel( - modelName = "imagen-4.0-generate-preview-06-06", - generationConfig = ImagenGenerationConfig( - numberOfImages = 1, - aspectRatio = ImagenAspectRatio.PORTRAIT_3x4, - imageFormat = ImagenImageFormat.jpeg(compressionQuality = 75), - ), - ) - - private val subjectSegmenter = SubjectSegmentation.getClient( - SubjectSegmenterOptions.Builder() - .enableForegroundBitmap() - .build(), - ) - - suspend fun 
generateForegroundBitmap(bitmap: Bitmap): Bitmap { - val image = InputImage.fromBitmap(bitmap, 0) - return suspendCoroutine { continuation -> - subjectSegmenter.process(image) - .addOnSuccessListener { - it.foregroundBitmap?.let { foregroundBitmap -> - continuation.resumeWith(Result.success(foregroundBitmap)) - } - } - .addOnFailureListener { - continuation.resumeWith(Result.failure(it)) - } - } - } - - @OptIn(PublicPreviewAPI::class) - suspend fun generateBackground(prompt: String): Bitmap { - val imageResponse = imagenModel.generateImages( - prompt = prompt, + private val generativeModel by lazy { + Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel( + modelName = "gemini-3.1-flash-image-preview", + generationConfig = generationConfig { + responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE) + } ) - val image = imageResponse.images.first() - return image.asBitmap() } - fun combineBitmaps(foreground: Bitmap, background: Bitmap): Bitmap { - val height = background.height - val width = background.width - - val resultBitmap = Bitmap.createBitmap(width, height, background.config!!) 
- val canvas = Canvas(resultBitmap) - val paint = Paint() - canvas.drawBitmap(background, 0f, 0f, paint) - - var foregroundHeight = foreground.height - var foregroundWidth = foreground.width - val ratio = foregroundWidth.toFloat() / foregroundHeight.toFloat() - - foregroundHeight = height - foregroundWidth = (foregroundHeight * ratio).roundToInt() - - val scaledForeground = Bitmap.createScaledBitmap(foreground, foregroundWidth, foregroundHeight, false) - - val left = (width - scaledForeground.width) / 2f - val top = (height - scaledForeground.height.toFloat()) - canvas.drawBitmap(scaledForeground, left, top, paint) - - return resultBitmap + suspend fun generateMagicSelfie(bitmap: Bitmap, prompt: String): Bitmap { + val multimodalPrompt = content { + image(bitmap) + text("Change the background of this image to $prompt") + } + val response = generativeModel.generateContent(multimodalPrompt) + return response.candidates.firstOrNull()?.content?.parts?.firstNotNullOfOrNull { it.asImageOrNull() } + ?: throw Exception("No image generated") } } diff --git a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieScreen.kt b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieScreen.kt index bd08381f..148247be 100644 --- a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieScreen.kt +++ b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieScreen.kt @@ -235,7 +235,6 @@ private fun MagicSelfieScreen( text = "", icon = painterResource(id = com.android.ai.uicomponent.R.drawable.ic_ai_bg), enabled = textFieldState.text.isNotEmpty() && - (uiState !is MagicSelfieUiState.RemovingBackground) && (uiState !is MagicSelfieUiState.GeneratingBackground), ) { onGenerateClick(selfieBitmap, textFieldState.text.toString()) @@ -246,8 +245,7 @@ private fun MagicSelfieScreen( SecondaryButton( text = "", icon = painterResource(id = 
com.android.ai.uicomponent.R.drawable.ic_ai_img), - enabled = (uiState !is MagicSelfieUiState.RemovingBackground) && - (uiState !is MagicSelfieUiState.GeneratingBackground), + enabled = (uiState !is MagicSelfieUiState.GeneratingBackground), onClick = onTakePictureClick, ) }, diff --git a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieUiState.kt b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieUiState.kt index 2ada49bf..ff30be89 100644 --- a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieUiState.kt +++ b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieUiState.kt @@ -19,7 +19,6 @@ import android.graphics.Bitmap sealed interface MagicSelfieUiState { data object Initial : MagicSelfieUiState - data object RemovingBackground : MagicSelfieUiState data object GeneratingBackground : MagicSelfieUiState data class Success(val bitmap: Bitmap) : MagicSelfieUiState data class Error(val message: String?) 
: MagicSelfieUiState diff --git a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt index 30f68adf..64eac062 100644 --- a/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt +++ b/samples/magic-selfie/src/main/java/com/android/ai/samples/magicselfie/ui/MagicSelfieViewModel.kt @@ -34,11 +34,8 @@ class MagicSelfieViewModel @Inject constructor(private val magicSelfieRepository fun createMagicSelfie(bitmap: Bitmap, prompt: String) { viewModelScope.launch { try { - _uiState.value = MagicSelfieUiState.RemovingBackground - val foregroundBitmap = magicSelfieRepository.generateForegroundBitmap(bitmap) _uiState.value = MagicSelfieUiState.GeneratingBackground - val backgroundBitmap = magicSelfieRepository.generateBackground(prompt) - val resultBitmap = magicSelfieRepository.combineBitmaps(foregroundBitmap, backgroundBitmap) + val resultBitmap = magicSelfieRepository.generateMagicSelfie(bitmap, prompt) _uiState.value = MagicSelfieUiState.Success(resultBitmap) } catch (e: Exception) { _uiState.value = MagicSelfieUiState.Error(e.message) diff --git a/samples/magic-selfie/src/main/res/values/strings.xml b/samples/magic-selfie/src/main/res/values/strings.xml index 9a802fd5..7524495a 100644 --- a/samples/magic-selfie/src/main/res/values/strings.xml +++ b/samples/magic-selfie/src/main/res/values/strings.xml @@ -1,7 +1,7 @@ Magic Selfie - Change the background of you selfies with Imagen and the ML Kit Segmentation API + Change the background of your selfies with the Gemini Flash model Add image Unknown error A very scenic view of the grand canyon diff --git a/samples/nanobanana/.gitignore b/samples/nanobanana/.gitignore new file mode 100644 index 00000000..796b96d1 --- /dev/null +++ b/samples/nanobanana/.gitignore @@ -0,0 +1 @@ +/build diff --git a/samples/nanobanana/README.md 
b/samples/nanobanana/README.md new file mode 100644 index 00000000..dd4a4f20 --- /dev/null +++ b/samples/nanobanana/README.md @@ -0,0 +1,27 @@ +# Nanobanana Image Generation Sample + +This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository. + +## Description + +This sample demonstrates how to generate images from text prompts using the Gemini 3.1 Flash Image model (a.k.a. "Nano Banana"). Users can input a text description, and the generative model will create an image based on that prompt, showcasing the power of text-to-image generation with Gemini. + +
+Nanobanana Image Generation in action +
+ +## How it works + +The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Gemini. The core logic is in the [`NanobananaDataSource.kt`](./src/main/java/com/android/ai/samples/nanobanana/data/NanobananaDataSource.kt) file. A `generativeModel` is initialized with specific configurations. When a user provides a text prompt, it's passed to the `generateImage` method, which returns the generated image as a bitmap. + +Here is the key snippet of code that calls the generative model from [`NanobananaDataSource.kt`](./src/main/java/com/android/ai/samples/nanobanana/data/NanobananaDataSource.kt): + +```kotlin +suspend fun generateImage(prompt: String): Bitmap { + val response = generativeModel.generateContent(prompt) + return response.candidates.firstOrNull()?.content?.parts?.firstNotNullOfOrNull { it.asImageOrNull() } + ?: throw Exception("No image generated") +} +``` + +Read more about [Gemini](https://developer.android.com/ai/gemini) in the Android Documentation. 
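The extraction chain in `generateImage` (first candidate, then the first part that yields an image, otherwise throw) can be tried without the SDK. The sketch below is a minimal stand-in under stated assumptions: `Part`, `TextPart`, `ImagePart`, `Content`, `Candidate`, `Response`, `imageBytesOrNull` and `firstImageOrThrow` are hypothetical names invented for illustration, not Firebase AI SDK types. It returns raw bytes instead of a `Bitmap`, and throws `IllegalStateException` where the sample throws a plain `Exception`.

```kotlin
// Hypothetical stand-ins for the SDK response types (illustration only).
sealed interface Part
data class TextPart(val text: String) : Part
class ImagePart(val bytes: ByteArray) : Part

data class Content(val parts: List<Part>)
data class Candidate(val content: Content)
data class Response(val candidates: List<Candidate>)

// Plays the role that asImageOrNull() has in the sample:
// image parts yield their payload, every other part yields null.
fun Part.imageBytesOrNull(): ByteArray? = (this as? ImagePart)?.bytes

// Looks only at the first candidate and returns the first image found in it.
// Text parts that precede the image are skipped rather than treated as errors.
fun firstImageOrThrow(response: Response): ByteArray =
    response.candidates.firstOrNull()
        ?.content?.parts
        ?.firstNotNullOfOrNull { it.imageBytesOrNull() }
        ?: throw IllegalStateException("No image generated")
```

With a response whose first candidate holds a `TextPart` followed by an `ImagePart`, `firstImageOrThrow` returns the image bytes. With an empty candidate list or a text-only candidate it throws, which is the path the ViewModel's `catch` block turns into an `Error` state.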
diff --git a/samples/imagen/build.gradle.kts b/samples/nanobanana/build.gradle.kts similarity index 97% rename from samples/imagen/build.gradle.kts rename to samples/nanobanana/build.gradle.kts index b2ac122a..9a56b458 100644 --- a/samples/imagen/build.gradle.kts +++ b/samples/nanobanana/build.gradle.kts @@ -21,7 +21,7 @@ plugins { } android { - namespace = "com.android.ai.samples.imagen" + namespace = "com.android.ai.samples.nanobanana" compileSdk = 35 buildFeatures { diff --git a/samples/imagen-editing/consumer-rules.pro b/samples/nanobanana/consumer-rules.pro similarity index 100% rename from samples/imagen-editing/consumer-rules.pro rename to samples/nanobanana/consumer-rules.pro diff --git a/samples/imagen/consumer-rules.pro b/samples/nanobanana/proguard-rules.pro similarity index 100% rename from samples/imagen/consumer-rules.pro rename to samples/nanobanana/proguard-rules.pro diff --git a/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/data/NanobananaDataSource.kt b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/data/NanobananaDataSource.kt new file mode 100644 index 00000000..1408c6d3 --- /dev/null +++ b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/data/NanobananaDataSource.kt @@ -0,0 +1,44 @@ +/* + * Copyright 2025 The Android Open Source Project + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package com.android.ai.samples.nanobanana.data + +import android.graphics.Bitmap +import com.google.firebase.Firebase +import com.google.firebase.ai.ai +import com.google.firebase.ai.type.GenerativeBackend +import com.google.firebase.ai.type.ResponseModality +import com.google.firebase.ai.type.asImageOrNull +import com.google.firebase.ai.type.generationConfig +import javax.inject.Inject +import javax.inject.Singleton + +@Singleton +class NanobananaDataSource @Inject constructor() { + private val generativeModel by lazy { + Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel( + modelName = "gemini-3.1-flash-image-preview", + generationConfig = generationConfig { + responseModalities = listOf(ResponseModality.IMAGE) + } + ) + } + + suspend fun generateImage(prompt: String): Bitmap { + val response = generativeModel.generateContent(prompt) + return response.candidates.firstOrNull()?.content?.parts?.firstNotNullOfOrNull { it.asImageOrNull() } + ?: throw Exception("No image generated") + } +} diff --git a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenScreen.kt b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaScreen.kt similarity index 87% rename from samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenScreen.kt rename to samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaScreen.kt index 10afed0e..89e3814b 100644 --- a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenScreen.kt +++ b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaScreen.kt @@ -13,7 +13,7 @@ * See the License for the specific language governing permissions and * limitations under the License. 
*/ -package com.android.ai.samples.imagen.ui +package com.android.ai.samples.nanobanana.ui import android.graphics.BitmapFactory import android.widget.Toast @@ -57,7 +57,7 @@ import androidx.compose.ui.tooling.preview.PreviewScreenSizes import androidx.compose.ui.unit.dp import androidx.hilt.navigation.compose.hiltViewModel import androidx.lifecycle.compose.collectAsStateWithLifecycle -import com.android.ai.samples.imagen.R +import com.android.ai.samples.nanobanana.R import com.android.ai.theme.AISampleCatalogTheme import com.android.ai.uicomponent.GenerateButton import com.android.ai.uicomponent.SampleDetailTopAppBar @@ -65,14 +65,14 @@ import com.android.ai.uicomponent.TextInput @OptIn(ExperimentalMaterial3Api::class, ExperimentalMaterial3ExpressiveApi::class) @Composable -fun ImagenScreen(viewModel: ImagenViewModel = hiltViewModel()) { - val uiState: ImagenUIState by viewModel.uiState.collectAsStateWithLifecycle() +fun NanobananaScreen(viewModel: NanobananaViewModel = hiltViewModel()) { + val uiState: NanobananaUIState by viewModel.uiState.collectAsStateWithLifecycle() - if (uiState is ImagenUIState.Error) { - Toast.makeText(LocalContext.current, (uiState as ImagenUIState.Error).message, Toast.LENGTH_SHORT).show() + if (uiState is NanobananaUIState.Error) { + Toast.makeText(LocalContext.current, (uiState as NanobananaUIState.Error).message, Toast.LENGTH_SHORT).show() } - ImagenScreen( + NanobananaScreen( uiState = uiState, onGenerateClick = viewModel::generateImage, ) @@ -80,16 +80,16 @@ fun ImagenScreen(viewModel: ImagenViewModel = hiltViewModel()) { @Composable @OptIn(ExperimentalMaterial3Api::class, ExperimentalMaterial3ExpressiveApi::class) -private fun ImagenScreen(uiState: ImagenUIState, onGenerateClick: (String) -> Unit) { - val isGenerating = uiState is ImagenUIState.Loading +private fun NanobananaScreen(uiState: NanobananaUIState, onGenerateClick: (String) -> Unit) { + val isGenerating = uiState is NanobananaUIState.Loading val backDispatcher = 
LocalOnBackPressedDispatcherOwner.current?.onBackPressedDispatcher Scaffold( containerColor = MaterialTheme.colorScheme.surface, topBar = { SampleDetailTopAppBar( - sampleName = stringResource(R.string.title_image_generation_screen), - sampleDescription = stringResource(R.string.subtitle_image_generation_screen), - sourceCodeUrl = "https://github.com/android/ai-samples/tree/main/samples/imagen", + sampleName = stringResource(R.string.title_nanobanana_screen), + sampleDescription = stringResource(R.string.subtitle_nanobanana_screen), + sourceCodeUrl = "https://github.com/android/ai-samples/tree/main/samples/nanobanana", onBackClick = { backDispatcher?.onBackPressed() }, ) }, @@ -133,13 +133,13 @@ private fun ImagenScreen(uiState: ImagenUIState, onGenerateClick: (String) -> Un ) { when (uiState) { - is ImagenUIState.ImageGenerated -> Image( + is NanobananaUIState.ImageGenerated -> Image( bitmap = uiState.bitmap.asImageBitmap(), contentDescription = uiState.contentDescription, contentScale = ContentScale.Crop, modifier = Modifier.fillMaxSize(), ) - ImagenUIState.Loading -> { + NanobananaUIState.Loading -> { ContainedLoadingIndicator( modifier = Modifier.size(60.dp) .align(Alignment.Center), @@ -182,10 +182,10 @@ private fun ImagenScreen(uiState: ImagenUIState, onGenerateClick: (String) -> Un @PreviewScreenSizes @Composable @OptIn(ExperimentalMaterial3Api::class) -private fun ImagenScreenPreview() { +private fun NanobananaScreenPreview() { AISampleCatalogTheme { - ImagenScreen( - uiState = ImagenUIState.Initial, + NanobananaScreen( + uiState = NanobananaUIState.Initial, onGenerateClick = {}, ) } diff --git a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenUIState.kt b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaUIState.kt similarity index 74% rename from samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenUIState.kt rename to 
samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaUIState.kt index 3bca1610..15be384a 100644 --- a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenUIState.kt +++ b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaUIState.kt @@ -13,16 +13,16 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -package com.android.ai.samples.imagen.ui +package com.android.ai.samples.nanobanana.ui import android.graphics.Bitmap -sealed interface ImagenUIState { - data object Initial : ImagenUIState - data object Loading : ImagenUIState +sealed interface NanobananaUIState { + data object Initial : NanobananaUIState + data object Loading : NanobananaUIState data class ImageGenerated( val bitmap: Bitmap, val contentDescription: String, - ) : ImagenUIState - data class Error(val message: String?) : ImagenUIState + ) : NanobananaUIState + data class Error(val message: String?) : NanobananaUIState } diff --git a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenViewModel.kt b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaViewModel.kt similarity index 60% rename from samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenViewModel.kt rename to samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaViewModel.kt index b0f77d08..5e65ff05 100644 --- a/samples/imagen/src/main/java/com/android/ai/samples/imagen/ui/ImagenViewModel.kt +++ b/samples/nanobanana/src/main/java/com/android/ai/samples/nanobanana/ui/NanobananaViewModel.kt @@ -13,11 +13,11 @@ * See the License for the specific language governing permissions and * limitations under the License. 
*/ -package com.android.ai.samples.imagen.ui +package com.android.ai.samples.nanobanana.ui import androidx.lifecycle.ViewModel import androidx.lifecycle.viewModelScope -import com.android.ai.samples.imagen.data.ImagenDataSource +import com.android.ai.samples.nanobanana.data.NanobananaDataSource import dagger.hilt.android.lifecycle.HiltViewModel import javax.inject.Inject import kotlinx.coroutines.flow.MutableStateFlow @@ -25,20 +25,20 @@ import kotlinx.coroutines.flow.StateFlow import kotlinx.coroutines.launch @HiltViewModel -class ImagenViewModel @Inject constructor(private val imagenDataSource: ImagenDataSource) : ViewModel() { +class NanobananaViewModel @Inject constructor(private val nanobananaDataSource: NanobananaDataSource) : ViewModel() { - private val _uiState: MutableStateFlow<ImagenUIState> = MutableStateFlow(ImagenUIState.Initial) - val uiState: StateFlow<ImagenUIState> = _uiState + private val _uiState: MutableStateFlow<NanobananaUIState> = MutableStateFlow(NanobananaUIState.Initial) + val uiState: StateFlow<NanobananaUIState> = _uiState fun generateImage(prompt: String) { - _uiState.value = ImagenUIState.Loading + _uiState.value = NanobananaUIState.Loading viewModelScope.launch { try { - val bitmap = imagenDataSource.generateImage(prompt) - _uiState.value = ImagenUIState.ImageGenerated(bitmap, contentDescription = prompt) + val bitmap = nanobananaDataSource.generateImage(prompt) + _uiState.value = NanobananaUIState.ImageGenerated(bitmap, contentDescription = prompt) } catch (e: Exception) { - _uiState.value = ImagenUIState.Error(e.message) + _uiState.value = NanobananaUIState.Error(e.message) } } } diff --git a/samples/imagen/src/main/res/values/strings.xml b/samples/nanobanana/src/main/res/values/strings.xml similarity index 55% rename from samples/imagen/src/main/res/values/strings.xml rename to samples/nanobanana/src/main/res/values/strings.xml index 9edf57df..5f70ad9a 100644 --- a/samples/imagen/src/main/res/values/strings.xml +++ b/samples/nanobanana/src/main/res/values/strings.xml @@ -16,13 +16,8 @@ --> - See 
Code - An oil painting of Alcatraz - Imagen image generation - Generate images with Imagen, Google image generation model - Generate - Generating… - Prompt - Enter a prompt and tap \"Generate\" to generate an image + A yellow banana on a blue background + Nanobanana image generation + Generate images with Nanobanana, Google image generation model Unknown error - \ No newline at end of file +
diff --git a/settings.gradle.kts b/settings.gradle.kts index 977e9376..14bffd8c 100644 --- a/settings.gradle.kts +++ b/settings.gradle.kts @@ -43,11 +43,11 @@ include(":samples:gemini-chatbot") include(":samples:genai-summarization") include(":samples:genai-writing-assistance") include(":samples:genai-image-description") -include(":samples:imagen") -include(":samples:imagen-editing") include(":samples:magic-selfie") include(":samples:gemini-video-summarization") include(":samples:gemini-live-todo") include(":samples:gemini-video-metadata-creation") include(":samples:gemini-image-chat") +include(":samples:gemini-hybrid") +include(":samples:nanobanana") include(":ui-component") diff --git a/ui-component/src/main/java/com/android/ai/uicomponent/VideoPlayer.kt b/ui-component/src/main/java/com/android/ai/uicomponent/VideoPlayer.kt index 397da849..b41b002a 100644 --- a/ui-component/src/main/java/com/android/ai/uicomponent/VideoPlayer.kt +++ b/ui-component/src/main/java/com/android/ai/uicomponent/VideoPlayer.kt @@ -245,11 +245,13 @@ fun VideoPickerDropdown( } } +const val VIDEO_BASE_URL = "https://storage.googleapis.com/androiddevelopers/samples_assets/gtv-videos-bucket/sample" + // Sample data for the picker private val sampleVideosForPicker = listOf( - VideoPickerData("Big Buck Bunny", "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4".toUri()), - VideoPickerData("Tears of Steel", "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/TearsOfSteel.mp4".toUri()), - VideoPickerData("For Bigger Blazes", "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4".toUri()), + VideoPickerData("Big Buck Bunny", "$VIDEO_BASE_URL/BigBuckBunny.mp4".toUri()), + VideoPickerData("Tears of Steel", "$VIDEO_BASE_URL/TearsOfSteel.mp4".toUri()), + VideoPickerData("For Bigger Blazes", "$VIDEO_BASE_URL/ForBiggerBlazes.mp4".toUri()), ) @Preview