ASR/TTS Private Deployment SDK Integration Guide
This document explains how third-party developers can integrate the online-speech SDK, initialize it, and call ASR/TTS features. The SDK can connect to Rokid public cloud speech services, and it can also connect to a private deployment by configuring domain, asrPath, and ttsPath.
1. Modules
- sdk: Pure Kotlin/JVM ASR/TTS SDK.
- sdk-android-open: Android adaptation layer that integrates with open-sdk recording and default streaming playback.
- demo-android: Demo app for functional verification only.
2. Requirements
2.1 Build environment
- JDK 17
- Android Gradle Plugin 8.2.2 for Android modules
- Kotlin 2.2.0
2.2 Runtime environment for Android
- minSdk = 26
- Required permissions:
  - android.permission.INTERNET
  - android.permission.RECORD_AUDIO for ASR microphone mode
3. Add dependencies
Add Rokid Maven to your project repository configuration:
https://maven.rokid.com/repository/maven-public/
Example settings.gradle(.kts):
```kotlin
dependencyResolutionManagement {
    repositories {
        maven(url = "https://maven.rokid.com/repository/maven-public/")
        google()
        mavenCentral()
    }
}
```
For older Gradle versions, add the same repository to allprojects.repositories in the root build.gradle(.kts).
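For reference, the allprojects form in a root build.gradle.kts might look like the sketch below. It reuses the repository URL given above and assumes the Kotlin DSL; keep whatever other repositories your project already declares.

```kotlin
// Root build.gradle.kts (older Gradle layouts without
// dependencyResolutionManagement in settings.gradle.kts).
allprojects {
    repositories {
        maven(url = "https://maven.rokid.com/repository/maven-public/")
        google()
        mavenCentral()
    }
}
```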
Required dependencies:
- com.rokid.security.sdk:online-speech:0.1.0
- com.rokid.security:glass3.open.sdk:2.1.6-E
App module example:
```kotlin
dependencies {
    implementation("com.rokid.security.sdk:online-speech:0.1.1")
    implementation("com.rokid.security:glass3.open.sdk:2.1.6-E")
}
```
4. Initialize and call the SDK
4.1 Initialize
For private deployments, replace domain, asrPath, and ttsPath with the domain and paths provided by the private environment. AK/SK, UID, and device ID should also come from the deployed environment.
```kotlin
val cfg = OnlineSpeechSdkConfig(
    domain = "api-test.rokid.com", // Replace for a private deployment.
    ak = "<AK>",
    sk = "<SK>",
    uid = "<UID>",
    deviceId = "<DEVICE_ID>",
    asrPath = "/ar/audio/api/ws/asr/streaming", // Replace for a private deployment.
    ttsPath = "/ar/audio/api/ws/tts",           // Replace for a private deployment.
    trustAllCerts = true, // Enable for debugging only. Disable in production.
    // The messageId template is evaluated once, when this map is built.
    staticHttpHeaders = mapOf(
        "appCredential" to "userInfo",
        "messageId" to "msg-${System.currentTimeMillis()}",
    ),
    staticMessageHeaders = mapOf(
        "appCredential" to "userInfo",
        "messageId" to "msg-${System.currentTimeMillis()}",
    ),
)
val sdk = OnlineSpeechSdk(cfg)
```
4.2 ASR recommended high-level flow
```kotlin
val asr = sdk.createAsrClient()
    .attachAudioSource(OpenSdkAudioSource())
asr.connect()
asr.startAsrWithMic()
// ...
asr.stopAsrWithMic()
```
4.3 TTS
```kotlin
val tts = sdk.createTtsClient()
    .attachStreamPlayer(AndroidPcmTtsStreamPlayer())
tts.connect()
tts.speak("Hello, welcome to the online speech SDK.")
tts.stop()
```
Playback states:
- IDLE
- BUFFERING
- PLAYING
- COMPLETED
- STOPPED
- FAILED
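When driving UI from these states, it can help to separate the states that end an utterance from those still in progress. The sketch below declares its own PlaybackState enum that mirrors the names listed above. The SDK's actual type name, the terminal/non-terminal grouping, and the labels are all assumptions for illustration, not documented behavior.

```kotlin
// Local mirror of the state names listed in this guide. The SDK's real
// type may be named or shaped differently.
enum class PlaybackState { IDLE, BUFFERING, PLAYING, COMPLETED, STOPPED, FAILED }

// Assumption: COMPLETED, STOPPED, and FAILED end the current utterance.
fun PlaybackState.isTerminal(): Boolean = when (this) {
    PlaybackState.COMPLETED, PlaybackState.STOPPED, PlaybackState.FAILED -> true
    PlaybackState.IDLE, PlaybackState.BUFFERING, PlaybackState.PLAYING -> false
}

// Hypothetical mapping from a state to a user-facing label.
fun PlaybackState.label(): String = when (this) {
    PlaybackState.IDLE -> "Ready"
    PlaybackState.BUFFERING -> "Loading audio"
    PlaybackState.PLAYING -> "Speaking"
    PlaybackState.COMPLETED -> "Done"
    PlaybackState.STOPPED -> "Stopped"
    PlaybackState.FAILED -> "Playback failed"
}
```

Because both when expressions are exhaustive, the compiler flags them if a state is later added to the local enum.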
5. Lifecycle recommendations
- Page onStart/onResume: call connect() when needed.
- Page onStop/onDestroy: call close() to release the WebSocket.
- App exit: call sdk.close().
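The connect/close pairing above can be wrapped in a small holder so that repeated lifecycle callbacks do not connect or close twice. This is a plain-Kotlin sketch: Connectable is a hypothetical stand-in for an ASR or TTS client (this guide only confirms that connect() and close() exist), and ConnectionGuard is not an SDK class. Since the guide does not say whether a client can be reconnected after close(), the sketch creates a fresh client each time.

```kotlin
// Hypothetical stand-in for an SDK client; only connect()/close() are
// taken from this guide.
interface Connectable {
    fun connect()
    fun close()
}

// Keeps lifecycle callbacks idempotent: at most one live client at a time.
class ConnectionGuard(private val createClient: () -> Connectable) {
    private var client: Connectable? = null

    // Call from onStart/onResume.
    @Synchronized
    fun onVisible() {
        if (client == null) {
            // If connect() throws, client stays null and the next call retries.
            client = createClient().also { it.connect() }
        }
    }

    // Call from onStop/onDestroy.
    @Synchronized
    fun onHidden() {
        client?.close()
        client = null
    }
}
```

In an Activity this might be used as guard.onVisible() in onStart and guard.onHidden() in onStop, with the factory lambda returning a newly created client.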
6. FAQ
- WebSocket timeout: check domain/asrPath/ttsPath and certificate policy first.
- No ASR result: make sure asr.connect() is called before asr.startAsrWithMic().
- No TTS audio: make sure tts.connect() is called, then check device volume and audio routing.
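Since connection timeouts often trace back to a malformed domain or path, a small preflight check can catch obvious mistakes before connect() is called. The sketch below is plain Kotlin and does not touch the SDK. The function name checkEndpointConfig and its rules (no scheme or path in domain, paths starting with "/") are assumptions inferred from the example values in section 4.1, not documented SDK requirements.

```kotlin
// Hypothetical preflight helper; not part of the online-speech SDK.
// Returns a list of human-readable problems, empty if none were found.
fun checkEndpointConfig(domain: String, asrPath: String, ttsPath: String): List<String> {
    val problems = mutableListOf<String>()
    if (domain.isBlank()) {
        problems += "domain is empty"
    }
    if ("://" in domain) {
        problems += "domain should not include a scheme: $domain"
    }
    // Look for a path segment after any scheme prefix.
    if ('/' in domain.substringAfter("://")) {
        problems += "domain should not include a path: $domain"
    }
    for ((name, path) in listOf("asrPath" to asrPath, "ttsPath" to ttsPath)) {
        if (!path.startsWith("/")) {
            problems += "$name should start with '/': $path"
        }
        if (path.any { it.isWhitespace() }) {
            problems += "$name contains whitespace: $path"
        }
    }
    return problems
}
```

Logging the returned list before building the config makes a copy-paste error in a private-deployment path visible immediately, rather than as a timeout later.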