ASR/TTS Private Deployment SDK Integration Guide

This document explains how third-party developers can integrate the online-speech SDK, initialize it, and call its ASR and TTS features. The SDK can connect to Rokid's public cloud speech services or, by configuring domain, asrPath, and ttsPath, to a private deployment.

1. Modules

  • sdk: Pure Kotlin/JVM ASR/TTS SDK.
  • sdk-android-open: Android adaptation layer that integrates with open-sdk recording and default streaming playback.
  • demo-android: Demo app for functional verification only.

2. Requirements

2.1 Build environment

  • JDK 17
  • Android Gradle Plugin 8.2.2 for Android modules
  • Kotlin 2.2.0

2.2 Runtime environment for Android

  • minSdk = 26
  • Required permissions:
    • android.permission.INTERNET
    • android.permission.RECORD_AUDIO for ASR microphone mode

3. Add dependencies

Add Rokid Maven to your project repository configuration:

  • https://maven.rokid.com/repository/maven-public/

Example settings.gradle(.kts):

kotlin
dependencyResolutionManagement {
    repositories {
        maven(url = "https://maven.rokid.com/repository/maven-public/")
        google()
        mavenCentral()
    }
}

For older Gradle versions, add the same repository to allprojects.repositories in the root build.gradle(.kts).
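
For the older-Gradle case, the root build.gradle.kts entry might look like the sketch below. It mirrors the settings example in Kotlin DSL; the Groovy build.gradle syntax differs slightly.

```kotlin
// Root build.gradle.kts - for builds that do not use dependencyResolutionManagement
allprojects {
    repositories {
        maven(url = "https://maven.rokid.com/repository/maven-public/")
        google()
        mavenCentral()
    }
}
```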

Required dependencies:

  • com.rokid.security.sdk:online-speech:0.1.0
  • com.rokid.security:glass3.open.sdk:2.1.6-E

App module example:

kotlin
dependencies {
    implementation("com.rokid.security.sdk:online-speech:0.1.0")
    implementation("com.rokid.security:glass3.open.sdk:2.1.6-E")
}

4. Initialize and call the SDK

4.1 Initialize

For private deployments, replace domain, asrPath, and ttsPath with the domain and paths provided by the private environment. AK/SK, UID, and device ID should also come from the deployed environment.

kotlin
val cfg = OnlineSpeechSdkConfig(
    domain = "api-test.rokid.com",
    ak = "<AK>",
    sk = "<SK>",
    uid = "<UID>",
    deviceId = "<DEVICE_ID>",
    asrPath = "/ar/audio/api/ws/asr/streaming",
    ttsPath = "/ar/audio/api/ws/tts",
    trustAllCerts = true, // Enable for debugging only. Disable in production.
    // The values in both maps are evaluated once, when cfg is constructed.
    staticHttpHeaders = mapOf(
        "appCredential" to "userInfo",
        "messageId" to "msg-${System.currentTimeMillis()}",
    ),
    staticMessageHeaders = mapOf(
        "appCredential" to "userInfo",
        "messageId" to "msg-${System.currentTimeMillis()}",
    ),
)
val sdk = OnlineSpeechSdk(cfg)
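
Before passing private-deployment values into the config, a quick sanity check can catch common typos. The helper below is a minimal sketch and is not part of the SDK: EndpointSettings and findEndpointProblems are hypothetical names, and the rules (bare host name for domain, paths starting with "/") are assumptions inferred from the sample values above.

```kotlin
// Hypothetical helper, not part of the SDK: sanity-checks private-deployment values.
data class EndpointSettings(val domain: String, val asrPath: String, val ttsPath: String)

fun findEndpointProblems(s: EndpointSettings): List<String> {
    val problems = mutableListOf<String>()
    // Assumption: domain is a bare host name, as in "api-test.rokid.com"
    if (s.domain.isBlank()) problems += "domain is empty"
    if ("://" in s.domain) problems += "domain should not include a scheme"
    if (s.domain.endsWith("/")) problems += "domain should not end with '/'"
    // Assumption: paths are absolute, as in "/ar/audio/api/ws/tts"
    for ((name, path) in listOf("asrPath" to s.asrPath, "ttsPath" to s.ttsPath)) {
        if (!path.startsWith("/")) problems += "$name should start with '/'"
        if (path.any { it.isWhitespace() }) problems += "$name contains whitespace"
    }
    return problems
}

fun main() {
    val ok = EndpointSettings(
        "api-test.rokid.com",
        "/ar/audio/api/ws/asr/streaming",
        "/ar/audio/api/ws/tts",
    )
    println(findEndpointProblems(ok)) // prints []

    // Placeholder values with deliberate mistakes
    val bad = EndpointSettings("https://speech.example.internal/", "asr", "/tts")
    findEndpointProblems(bad).forEach(::println)
}
```

Running a check like this at startup is cheaper to diagnose than a WebSocket timeout caused by a malformed host or path.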

4.2 ASR

kotlin
val asr = sdk.createAsrClient()
    .attachAudioSource(OpenSdkAudioSource())

asr.connect()
asr.startAsrWithMic()
// ...
asr.stopAsrWithMic()
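
One way to keep the connect, start, and stop calls in the right order is a small wrapper that always stops the microphone session, even when the work in between fails. The sketch below is not SDK code: MicAsr is a hypothetical stand-in exposing only the three calls shown above (the real client type may differ), and it assumes connect() is ready, or queues the start, by the time startAsrWithMic() is called.

```kotlin
// Hypothetical stand-in exposing only the three calls shown above;
// the real type returned by createAsrClient() may differ.
interface MicAsr {
    fun connect()
    fun startAsrWithMic()
    fun stopAsrWithMic()
}

// Runs [block] while the microphone session is active and always stops it afterwards.
fun <T> withMicSession(asr: MicAsr, block: () -> T): T {
    asr.connect()            // connect before starting the microphone
    asr.startAsrWithMic()
    try {
        return block()
    } finally {
        asr.stopAsrWithMic() // runs even if block() throws
    }
}

// Fake used only to show the call order without a device or network.
class RecordingAsr : MicAsr {
    val calls = mutableListOf<String>()
    override fun connect() { calls += "connect" }
    override fun startAsrWithMic() { calls += "start" }
    override fun stopAsrWithMic() { calls += "stop" }
}

fun main() {
    val fake = RecordingAsr()
    runCatching { withMicSession(fake) { error("simulated failure") } }
    println(fake.calls) // prints [connect, start, stop]
}
```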

4.3 TTS

kotlin
val tts = sdk.createTtsClient()
    .attachStreamPlayer(AndroidPcmTtsStreamPlayer())

tts.connect()
tts.speak("Hello, welcome to the online speech SDK.")
// ...
tts.stop()

Playback states:

  • IDLE
  • BUFFERING
  • PLAYING
  • COMPLETED
  • STOPPED
  • FAILED
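
An app will usually map these states to UI feedback. The sketch below shows one way to do that with an exhaustive when. PlaybackState here is a hypothetical mirror of the list above, not the SDK's own type, and treating COMPLETED, STOPPED, and FAILED as the states that end a playback is an assumption based on their names.

```kotlin
// Hypothetical mirror of the states listed above; the SDK's own type name may differ.
enum class PlaybackState { IDLE, BUFFERING, PLAYING, COMPLETED, STOPPED, FAILED }

// Assumption: these three states end a playback; the others are idle or in progress.
fun PlaybackState.isFinished(): Boolean = when (this) {
    PlaybackState.COMPLETED, PlaybackState.STOPPED, PlaybackState.FAILED -> true
    PlaybackState.IDLE, PlaybackState.BUFFERING, PlaybackState.PLAYING -> false
}

// Short status text a UI might show for each state (wording is illustrative).
fun PlaybackState.label(): String = when (this) {
    PlaybackState.IDLE -> "Ready"
    PlaybackState.BUFFERING -> "Loading audio"
    PlaybackState.PLAYING -> "Speaking"
    PlaybackState.COMPLETED -> "Done"
    PlaybackState.STOPPED -> "Stopped"
    PlaybackState.FAILED -> "Playback failed"
}

fun main() {
    for (state in PlaybackState.values()) {
        println("$state: ${state.label()} (finished=${state.isFinished()})")
    }
}
```

Because both when expressions list every state without an else branch, the compiler flags this code if a state is ever added to the enum.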

5. Lifecycle recommendations

  • Page (Activity/Fragment) onStart/onResume: call connect() on the ASR/TTS client when needed.
  • Page onStop/onDestroy: call close() on the client to release its WebSocket.
  • App exit: call sdk.close().
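
The page-level recommendations can be sketched as a small holder whose methods are called from the matching lifecycle callbacks. This is a plain-Kotlin illustration, not SDK code: SpeechConnection and PageSpeechBinding are hypothetical names, and only connect() and close() are taken from the text above. Because the guide does not say whether a closed client can reconnect, the sketch creates a fresh client on every start.

```kotlin
// Hypothetical stand-in for an ASR or TTS client; only connect() and close() come from the guide.
interface SpeechConnection {
    fun connect()
    fun close()
}

// Call onStart() and onStop() from the page's lifecycle callbacks of the same name.
class PageSpeechBinding(private val newConnection: () -> SpeechConnection) {
    private var connection: SpeechConnection? = null

    fun onStart() {
        if (connection == null) { // connect only when needed
            connection = newConnection().also { it.connect() }
        }
    }

    fun onStop() {
        connection?.close()       // release the WebSocket
        connection = null
    }
}

fun main() {
    var opened = 0
    var closed = 0
    val binding = PageSpeechBinding {
        object : SpeechConnection {
            override fun connect() { opened++ }
            override fun close() { closed++ }
        }
    }
    binding.onStart()
    binding.onStart() // second call is a no-op while connected
    binding.onStop()
    println("opened=$opened closed=$closed") // prints opened=1 closed=1
}
```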

6. FAQ

  • WebSocket timeout: first check domain, asrPath, and ttsPath, then the certificate policy (trustAllCerts).
  • No ASR result: make sure asr.connect() is called before asr.startAsrWithMic().
  • No TTS audio: make sure tts.connect() is called, then check device volume and audio routing.