A library for integrating real-time, conversational AI voice with Scala and the Google Gemini Live API.
The project's goal is to provide a Scala-friendly wrapper over the google-genai SDK that is simple enough to get started with while still allowing you to override any of the google-genai settings.
As of now, it exposes an fs2 stream through which you send an audio stream to Gemini and receive Gemini's audio stream in return.
One of the key features is support for Automatic Function Calling, which allows Gemini to invoke Scala functions.
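To illustrate the general idea behind function calling, here is a minimal, dependency-free sketch: the model requests a function by name with some arguments, and a registry dispatches that request to a plain Scala function. Every name in it (`FunctionCall`, `ExposedFunction`, `FunctionRegistry`, `save_note`) is hypothetical and chosen only for illustration; this is not the geminilive4s API, whose function types are not shown here.

```scala
// Hypothetical names throughout: this is NOT the geminilive4s API,
// only an illustration of dispatching a model-requested call to Scala code.

// A function call as a model might request it: a name plus string arguments.
final case class FunctionCall(name: String, args: Map[String, String])

// A Scala function exposed to the model, with a description the model can read.
final case class ExposedFunction(
    name: String,
    description: String,
    handler: Map[String, String] => String
)

final class FunctionRegistry(functions: List[ExposedFunction]) {
  private val byName: Map[String, ExposedFunction] =
    functions.map(f => f.name -> f).toMap

  // Look up the requested function and run it; unknown names yield a Left.
  def dispatch(call: FunctionCall): Either[String, String] =
    byName.get(call.name) match {
      case Some(f) => Right(f.handler(call.args))
      case None    => Left(s"Unknown function: ${call.name}")
    }
}

object FunctionCallingSketch {
  val registry = new FunctionRegistry(
    List(
      ExposedFunction(
        name = "save_note",
        description = "Stores a note dictated by the user",
        handler = args => s"Saved note: ${args.getOrElse("text", "")}"
      )
    )
  )

  def main(args: Array[String]): Unit = {
    // A known function is executed and its result returned.
    println(registry.dispatch(FunctionCall("save_note", Map("text" -> "buy milk"))))
    // An unknown function is reported instead of throwing.
    println(registry.dispatch(FunctionCall("delete_all", Map.empty)))
  }
}
```

Returning an `Either` rather than throwing keeps a bad function name from tearing down a long-running conversation; the error can be reported back to the model instead.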
Pick the latest version from the releases page, then add the dependency to your build.sbt:
libraryDependencies += "com.alexitc.geminilive4s" %% "audio" % "0.3.0"
This is what a minimal application looks like; it listens to your microphone and plays Gemini's audio over your speaker:
import cats.effect.{IO, IOApp}
import com.alexitc.geminilive4s.GeminiService
import com.alexitc.geminilive4s.demo.{MicSource, SpeakerSink}
import com.alexitc.geminilive4s.models.{
  AudioStreamFormat,
  GeminiConfig,
  GeminiInputChunk
}

object MinimalDemo extends IOApp.Simple {
  val apiKey = sys.env.getOrElse(
    "GEMINI_API_KEY",
    throw new RuntimeException("GEMINI_API_KEY is required")
  )

  val config = GeminiConfig(
    prompt = "You are a comedian and your goal is making me laugh",
    functions = List.empty
  )

  override def run: IO[Unit] = {
    val audioFormat = AudioStreamFormat.GeminiOutput

    val pipeline = for {
      gemini <- GeminiService.make(apiKey, config)
      // mic to gemini, gemini to speaker
      _ <- MicSource
        .stream(audioFormat)
        .map(bytes => GeminiInputChunk(bytes))
        .through(gemini.conversationPipe(geminiMustSpeakFirst = true))
        .observe(in => in.map(_.chunk).through(SpeakerSink.pipe(audioFormat)))
    } yield ()

    pipeline.compile.drain
  }
}
The simplest way to try this is to pick one of the examples and run it with scala-cli, for example:
scala-cli https://raw.githubusercontent.com/AlexITC/geminilive4s/refs/heads/main/examples/NoteTakerDemo.scala