Documentation Index
Fetch the complete documentation index at: https://learn.getodin.ai/llms.txt
Use this file to discover all available pages before exploring further.
The VoiceSDK provides advanced voice conversation capabilities with automatic chat system integration and React hooks for building voice-enabled applications. Create natural voice interactions with AI agents, complete with audio visualization, transcription, and conversation management. In this article you'll find a Quick Start example that will not only get you up and running quickly, but also help you understand how everything fits together. The rest of the article details the available methods, what they are used for, and best practices.
Installation
npm install @odin-ai-staging/sdk @elevenlabs/react
Quick Start
Basic Voice Conversation
In this example, you will learn how to implement real-time voice conversations using the VoiceSDK in both TypeScript and React applications.

You start by initializing the VoiceSDK with your API credentials, including a specific agentId that defines which AI voice agent handles the conversation. You then call startVoiceConversation() to begin a voice session, passing callback handlers that let you respond to connection events, receive AI messages, handle disconnections, and process live transcriptions of what the user is saying (with isFinal indicating when a complete phrase has been recognized). The saveToChat option persists the voice conversation as text in your chat history for later reference.

For React applications, use the useVoiceConversation hook, which provides a cleaner interface with state management built in: a status variable to track the connection state, startSession() and endSession() methods to control the conversation, setVolume() to adjust audio levels, getInputByteFrequencyData() for visualizing audio input (perfect for waveform displays), and conversationState containing real-time information such as current volume levels.

Together, this gives you everything you need to build voice-enabled AI applications with real-time speech recognition and synthesis: voice assistants, hands-free interfaces, or conversational AI experiences, with the React hook handling the state management and WebSocket connections for you.
import { VoiceSDK } from '@odin-ai-staging/sdk';

// Initialize the SDK
const voiceSDK = new VoiceSDK({
  baseUrl: 'https://your-api-endpoint.com/',
  projectId: 'your-project-id',
  apiKey: 'your-api-key',
  apiSecret: 'your-api-secret',
  agentId: 'your-agent-id'
});

// Start a voice conversation
async function startVoiceChat() {
  const sessionId = await voiceSDK.startVoiceConversation({
    saveToChat: true,
    callbacks: {
      onConnect: () => console.log('Voice connected'),
      onMessage: (message) => console.log('Voice message:', message),
      onDisconnect: () => console.log('Voice disconnected'),
      onTranscription: (text, isFinal) => {
        if (isFinal) console.log('User said:', text);
      }
    }
  });

  console.log('Voice session started:', sessionId);
}
React Hook Usage
import { useVoiceConversation } from '@odin-ai-staging/sdk';

function VoiceChat() {
  const {
    status,
    startSession,
    endSession,
    setVolume,
    getInputByteFrequencyData,
    conversationState
  } = useVoiceConversation({
    sdkConfig: {
      baseUrl: 'https://your-api-endpoint.com/',
      projectId: 'your-project-id',
      agentId: 'your-agent-id'
    },
    callbacks: {
      onConnect: () => console.log('Connected!'),
      onMessage: (message) => console.log('Message:', message)
    }
  });

  return (
    <div>
      <button
        onClick={() => startSession()}
        disabled={status === 'connected'}
      >
        Start Voice Chat
      </button>
      <button
        onClick={() => endSession()}
        disabled={status !== 'connected'}
      >
        End Chat
      </button>
      <div>Status: {status}</div>
      <div>Volume: {conversationState.volume}</div>
    </div>
  );
}
Configuration
VoiceSDKConfig Interface
interface VoiceSDKConfig extends BaseClientConfig {
  agentId?: string;                     // Default agent ID for conversations
  defaultVoiceSettings?: VoiceSettings; // Default voice configuration
}
VoiceSettings
interface VoiceSettings {
  stability?: number;        // Voice stability (0.0 to 1.0)
  similarityBoost?: number;  // Voice similarity boost (0.0 to 1.0)
  style?: number;            // Voice style (0.0 to 1.0)
  useSpeakerBoost?: boolean; // Enable speaker boost
}
Example Configuration:
const voiceSDK = new VoiceSDK({
  baseUrl: 'https://api.example.com/',
  projectId: 'proj_123',
  apiKey: 'your-api-key',
  apiSecret: 'your-api-secret',
  agentId: 'agent_456',
  defaultVoiceSettings: {
    stability: 0.8,
    similarityBoost: 0.7,
    style: 0.3,
    useSpeakerBoost: true
  }
});
Core Features
Voice Conversation Sessions
The VoiceSDK manages voice conversation sessions with automatic chat integration:
interface VoiceConversationSession {
  id: string;               // Session identifier
  chatId?: string;          // Associated chat ID
  startTime: number;        // Session start timestamp
  endTime?: number;         // Session end timestamp
  messages: VoiceMessage[]; // Voice messages in session
  metadata?: {
    agentId?: string;
    voiceSettings?: VoiceSettings;
    totalDuration?: number;
    userInfo?: { name: string; id: string };
  };
}
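Given the session shape above, a common need is to summarize a finished session for logging or UI. The sketch below is an assumed helper (not part of the SDK) that derives a duration and turn counts from the VoiceConversationSession fields; the Lite interfaces simply mirror the relevant fields for a self-contained example:

```typescript
// Hypothetical helper (not an SDK API): summarize a session using the
// fields documented in VoiceConversationSession above.
interface VoiceMessageLite {
  type: 'user_speech' | 'ai_speech' | 'system';
  text: string;
}

interface SessionLite {
  id: string;
  startTime: number;            // ms epoch, as in the interface above
  endTime?: number;             // absent while the session is in progress
  messages: VoiceMessageLite[];
}

function summarizeSession(session: SessionLite): string {
  // A session still in progress has no endTime yet
  const durationSec = session.endTime !== undefined
    ? Math.round((session.endTime - session.startTime) / 1000)
    : null;
  const userTurns = session.messages.filter(m => m.type === 'user_speech').length;
  const aiTurns = session.messages.filter(m => m.type === 'ai_speech').length;
  return `Session ${session.id}: ${durationSec ?? '?'}s, ` +
         `${userTurns} user turn(s), ${aiTurns} AI turn(s)`;
}
```

For an in-progress session (no endTime) the duration renders as '?', which is useful in live dashboards.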
Voice Messages
interface VoiceMessage {
  id: string;                    // Message ID
  type: 'user_speech' | 'ai_speech' | 'system';
  text: string;                  // Transcribed/generated text
  audioUrl?: string;             // Audio file URL
  timestamp: number;             // Message timestamp
  duration?: number;             // Audio duration in seconds
  voiceSettings?: VoiceSettings; // Voice settings used
  saved?: boolean;               // Whether saved to database
}
Session Management
startVoiceConversation(options?)
Start a new voice conversation session.
async startVoiceConversation(
  options?: StartVoiceConversationOptions
): Promise<string>
StartVoiceConversationOptions:
interface StartVoiceConversationOptions {
  callbacks?: VoiceConversationCallbacks;
  saveToChat?: boolean;          // Auto-save to chat history
  existingChatId?: string;       // Continue existing chat
  agentId?: string;              // Override default agent
  voiceSettings?: VoiceSettings; // Custom voice settings
  userInfo?: { name: string; id: string };
}
Example:
const sessionId = await voiceSDK.startVoiceConversation({
  saveToChat: true,
  existingChatId: 'chat_123',
  voiceSettings: {
    stability: 0.9,
    similarityBoost: 0.8
  },
  userInfo: {
    name: 'John Doe',
    id: 'user_456'
  },
  callbacks: {
    onConnect: () => console.log('Voice conversation started'),
    onMessage: (message) => handleVoiceMessage(message),
    onTranscription: (text, isFinal) => {
      if (isFinal) displayTranscription(text);
    },
    onConversationSaved: (chatId, messageId) => {
      console.log(`Conversation saved to chat ${chatId}`);
    }
  }
});
endVoiceSession(sessionId, reason?)
End a voice conversation session.
async endVoiceSession(sessionId: string, reason?: string): Promise<void>
Example:
await voiceSDK.endVoiceSession(sessionId, 'User ended conversation');
getVoiceState(sessionId)
Get current voice conversation state.
getVoiceState(sessionId: string): VoiceConversationState | null
Example:
const state = voiceSDK.getVoiceState(sessionId);
if (state) {
  console.log('Connection status:', state.connectionStatus);
  console.log('Is speaking:', state.isSpeaking);
  console.log('Volume:', state.volume);
}
React Integration
useVoiceConversation Hook
The useVoiceConversation hook provides React integration with state management:
function useVoiceConversation(options: VoiceHookOptions): {
  // Hook properties
  status: VoiceStatus;
  isSpeaking: boolean;
  startSession: (config?: VoiceSessionConfig) => Promise<string>;
  endSession: () => Promise<void>;
  setVolume: (options: { volume: number }) => void;

  // Enhanced SDK properties
  conversationState: VoiceConversationState;
  currentSessionId: string | null;
  getInputByteFrequencyData: () => Uint8Array | null;
  getOutputByteFrequencyData: () => Uint8Array | null;
}
Complete React Example:
import React, { useState } from 'react';
import { useVoiceConversation } from '@odin-ai-staging/sdk';

function VoiceConversationComponent() {
  const [messages, setMessages] = useState<string[]>([]);
  const [isRecording, setIsRecording] = useState(false);

  const {
    status,
    isSpeaking,
    startSession,
    endSession,
    setVolume,
    conversationState,
    currentSessionId,
    getInputByteFrequencyData
  } = useVoiceConversation({
    sdkConfig: {
      baseUrl: process.env.REACT_APP_API_BASE_URL,
      projectId: process.env.REACT_APP_PROJECT_ID,
      agentId: process.env.REACT_APP_AGENT_ID
    },
    callbacks: {
      onConnect: () => {
        console.log('Connected to voice chat');
        setIsRecording(true);
      },
      onDisconnect: () => {
        console.log('Disconnected from voice chat');
        setIsRecording(false);
      },
      onTranscription: (text, isFinal) => {
        if (isFinal) {
          setMessages(prev => [...prev, `You: ${text}`]);
        }
      },
      onMessage: (message) => {
        if (message.type === 'ai_speech') {
          setMessages(prev => [...prev, `AI: ${message.text}`]);
        }
      },
      onError: (error) => {
        console.error('Voice error:', error);
        setIsRecording(false);
      }
    }
  });

  const handleStartConversation = async () => {
    try {
      await startSession({
        saveToChat: true,
        voiceSettings: {
          stability: 0.8,
          similarityBoost: 0.7
        }
      });
    } catch (error) {
      console.error('Failed to start conversation:', error);
    }
  };

  const handleEndConversation = async () => {
    try {
      await endSession();
    } catch (error) {
      console.error('Failed to end conversation:', error);
    }
  };

  const handleVolumeChange = (volume: number) => {
    setVolume({ volume });
  };

  return (
    <div className="voice-conversation">
      <div className="controls">
        <button
          onClick={handleStartConversation}
          disabled={status === 'connected'}
        >
          Start Voice Chat
        </button>
        <button
          onClick={handleEndConversation}
          disabled={status !== 'connected'}
        >
          End Voice Chat
        </button>
      </div>

      <div className="status">
        <div>Status: {status}</div>
        <div>Speaking: {isSpeaking ? 'Yes' : 'No'}</div>
        <div>Recording: {isRecording ? 'Yes' : 'No'}</div>
        <div>Volume: {conversationState.volume}</div>
      </div>

      <div className="volume-control">
        <label>Volume:</label>
        <input
          type="range"
          min="0"
          max="100"
          value={conversationState.volume}
          onChange={(e) => handleVolumeChange(parseInt(e.target.value))}
        />
      </div>

      <div className="messages">
        {messages.map((message, index) => (
          <div key={index} className="message">
            {message}
          </div>
        ))}
      </div>

      {currentSessionId && (
        <AudioVisualizer
          getInputData={getInputByteFrequencyData}
          isActive={status === 'connected'}
        />
      )}
    </div>
  );
}
Voice Controls
Volume Control
// Set volume (0-100)
await voiceSDK.setVolume(sessionId, 75);
Microphone Control
// Mute/unmute microphone
await voiceSDK.setMicrophoneMuted(sessionId, true); // Mute
await voiceSDK.setMicrophoneMuted(sessionId, false); // Unmute
Voice Settings Updates
// Update voice settings during conversation
await voiceSDK.updateVoiceSettings(sessionId, {
  stability: 0.9,
  similarityBoost: 0.8,
  style: 0.4
});
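Since stability, similarityBoost, and style are documented as 0.0 to 1.0 values, it can be worth clamping user-supplied numbers before passing them to updateVoiceSettings. The helper below is an assumed utility, not an SDK function:

```typescript
// Hypothetical utility (not an SDK API): clamp numeric voice settings
// into the 0.0-1.0 range documented in the VoiceSettings interface.
interface VoiceSettingsLite {
  stability?: number;
  similarityBoost?: number;
  style?: number;
  useSpeakerBoost?: boolean;
}

function clampVoiceSettings(settings: VoiceSettingsLite): VoiceSettingsLite {
  // Leave undefined fields untouched so SDK defaults still apply
  const clamp = (v?: number) =>
    v === undefined ? undefined : Math.min(1, Math.max(0, v));
  return {
    ...settings,
    stability: clamp(settings.stability),
    similarityBoost: clamp(settings.similarityBoost),
    style: clamp(settings.style),
  };
}
```

You would then call, e.g., `voiceSDK.updateVoiceSettings(sessionId, clampVoiceSettings(userInput))` so an out-of-range slider value never reaches the API.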
Audio Visualization
Real-time Audio Data
// Get audio frequency data for visualization
const audioData = voiceSDK.getAudioFrequencyData(sessionId);
if (audioData) {
  const inputData = audioData.input;   // User's audio input
  const outputData = audioData.output; // AI's audio output

  // Use for audio visualization
  renderAudioVisualization(inputData, outputData);
}
Audio Visualization Component
import React, { useRef, useEffect } from 'react';

interface AudioVisualizerProps {
  getInputData: () => Uint8Array | null;
  isActive: boolean;
}

function AudioVisualizer({ getInputData, isActive }: AudioVisualizerProps) {
  const canvasRef = useRef<HTMLCanvasElement>(null);

  useEffect(() => {
    if (!isActive) return;

    const canvas = canvasRef.current;
    if (!canvas) return;

    const ctx = canvas.getContext('2d');
    if (!ctx) return;

    let animationFrameId: number;

    const animate = () => {
      const data = getInputData();
      if (data) {
        // Clear canvas
        ctx.clearRect(0, 0, canvas.width, canvas.height);

        // Draw frequency bars
        const barWidth = canvas.width / data.length;
        for (let i = 0; i < data.length; i++) {
          const barHeight = (data[i] / 255) * canvas.height;
          ctx.fillStyle = `hsl(${i * 2}, 100%, 50%)`;
          ctx.fillRect(
            i * barWidth,
            canvas.height - barHeight,
            barWidth - 1,
            barHeight
          );
        }
      }
      animationFrameId = requestAnimationFrame(animate);
    };

    animate();

    // Stop the animation loop when the effect re-runs or the component unmounts
    return () => cancelAnimationFrame(animationFrameId);
  }, [isActive, getInputData]);

  return (
    <canvas
      ref={canvasRef}
      width={400}
      height={100}
      className="audio-visualizer"
    />
  );
}
Chat Integration
Automatic Chat Saving
Voice conversations can be automatically saved to your chat system:
const sessionId = await voiceSDK.startVoiceConversation({
  saveToChat: true,           // Enable automatic saving
  existingChatId: 'chat_123', // Optional: continue existing chat
  callbacks: {
    onConversationSaved: (chatId, messageId) => {
      console.log(`Voice conversation saved to chat ${chatId}`);
      // Update UI to show the saved conversation
      refreshChatHistory(chatId);
    }
  }
});
Manual Chat Integration
// Get conversation history from voice session
const messages = voiceSDK.getConversationHistory(sessionId);

// Save to chat manually
for (const message of messages) {
  if (message.type === 'user_speech') {
    await chatSDK.sendMessage(message.text, {
      chatId: 'chat_123',
      metadata: {
        voiceMessage: true,
        audioUrl: message.audioUrl,
        sessionId: sessionId
      }
    });
  }
}
Contextual Updates
Send additional context to the voice conversation:
// Send context from chat history
await voiceSDK.sendContextualUpdate(
  sessionId,
  'User previously asked about pricing. Current conversation is about features.'
);
Error Handling
try {
  const sessionId = await voiceSDK.startVoiceConversation({
    callbacks: {
      onError: (error) => {
        console.error('Voice conversation error:', error);

        // Handle specific error types
        if (error.message.includes('microphone')) {
          showMicrophonePermissionDialog();
        } else if (error.message.includes('network')) {
          showNetworkErrorMessage();
        }
      },
      onDisconnect: (details) => {
        console.log('Disconnected:', details?.reason);

        // Handle different disconnection reasons
        if (details?.reason === 'user_ended') {
          showConversationSummary();
        } else if (details?.reason === 'error') {
          showReconnectOption();
        }
      }
    }
  });
} catch (error) {
  console.error('Failed to start voice conversation:', error);

  if (error.message.includes('agent')) {
    showAgentConfigError();
  }
}
Examples
Voice-Enabled Customer Support
import { VoiceSDK, ChatSDK } from '@odin-ai-staging/sdk';

class VoiceCustomerSupport {
  private voiceSDK: VoiceSDK;
  private chatSDK: ChatSDK;
  private activeSession?: string;

  constructor() {
    const config = {
      baseUrl: process.env.API_BASE_URL,
      projectId: process.env.PROJECT_ID,
      apiKey: process.env.API_KEY,
      apiSecret: process.env.API_SECRET,
    };
    this.voiceSDK = new VoiceSDK(config);
    this.chatSDK = new ChatSDK(config);
  }

  async startSupportSession(customerId: string, issueType: string) {
    try {
      // Create a new chat for this support session
      const chat = await this.chatSDK.createChat(
        `Voice Support - ${issueType}`,
        [] // Could add relevant document keys based on issue type
      );

      // Start voice conversation
      this.activeSession = await this.voiceSDK.startVoiceConversation({
        saveToChat: true,
        existingChatId: chat.chat_id,
        agentId: this.getAgentForIssueType(issueType),
        userInfo: {
          name: `Customer ${customerId}`,
          id: customerId
        },
        callbacks: {
          onConnect: () => {
            console.log('Support session started');
            this.logSupportEvent('session_started', { customerId, issueType });
          },
          onTranscription: (text, isFinal) => {
            if (isFinal) {
              this.logSupportEvent('customer_spoke', {
                customerId,
                text: text.substring(0, 100) // Log first 100 chars
              });
            }
          },
          onMessage: (message) => {
            if (message.type === 'ai_speech') {
              this.logSupportEvent('agent_responded', {
                customerId,
                responseLength: message.text.length
              });
            }
          },
          onConversationSaved: (chatId, messageId) => {
            console.log(`Support conversation saved to chat ${chatId}`);
          },
          onDisconnect: (details) => {
            this.logSupportEvent('session_ended', {
              customerId,
              reason: details?.reason,
              duration: this.getSessionDuration()
            });
          }
        }
      });

      return {
        sessionId: this.activeSession,
        chatId: chat.chat_id
      };
    } catch (error) {
      console.error('Failed to start support session:', error);
      throw error;
    }
  }

  async endSupportSession() {
    if (this.activeSession) {
      await this.voiceSDK.endVoiceSession(this.activeSession);
      this.activeSession = undefined;
    }
  }

  private getAgentForIssueType(issueType: string): string {
    const agentMap: Record<string, string> = {
      'technical': 'agent_technical_support',
      'billing': 'agent_billing_support',
      'general': 'agent_general_support'
    };
    return agentMap[issueType] || agentMap['general'];
  }

  private logSupportEvent(event: string, data: any) {
    console.log(`Support Event: ${event}`, data);
    // Send to your analytics/logging system
  }

  private getSessionDuration(): number {
    // Calculate session duration
    return 0; // Placeholder
  }
}
Best Practices
Error Handling and Fallbacks
const voiceSupport = {
  async startWithFallback() {
    try {
      return await this.voiceSDK.startVoiceConversation(options);
    } catch (error) {
      console.warn('Voice failed, falling back to text chat:', error);
      // Fallback to text-only chat
      return await this.chatSDK.createChat('Support Chat (Text)');
    }
  }
};
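The fallback above gives up after a single failed attempt. Because many voice-connection failures are transient network issues, retrying with exponential backoff before falling back to text chat often recovers the session. The following is a sketch with assumed helper names (backoffDelayMs, withRetry), not SDK APIs:

```typescript
// Assumed helpers (not SDK APIs): retry an async operation a few times
// with exponential backoff before letting the caller fall back.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  // 500ms, 1s, 2s, ... capped at maxMs
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt
      await new Promise(resolve => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
  throw lastError;
}
```

Inside startWithFallback you would wrap the voice start as `await withRetry(() => this.voiceSDK.startVoiceConversation(options))` and only fall back to the text chat once all attempts fail.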
Resource Management
class VoiceManager {
  private activeSessions = new Set<string>();

  async startSession(options: any) {
    const sessionId = await this.voiceSDK.startVoiceConversation(options);
    this.activeSessions.add(sessionId);
    return sessionId;
  }

  async cleanup() {
    // End all active sessions
    for (const sessionId of this.activeSessions) {
      try {
        await this.voiceSDK.endVoiceSession(sessionId);
      } catch (error) {
        console.warn('Failed to end session:', sessionId, error);
      }
    }
    this.activeSessions.clear();
  }
}
Performance
// Use React.memo for audio visualization components.
// (useCallback comes from React; throttle from a utility library such as lodash.)
const AudioVisualizer = React.memo(({ getInputData, isActive }) => {
  // Throttle animation updates
  const throttledAnimate = useCallback(
    throttle(() => {
      // Animation logic
    }, 16), // ~60fps
    []
  );

  // ... component logic
});
Accessibility
function VoiceAccessibleChat() {
  const [transcript, setTranscript] = useState('');

  const { startSession } = useVoiceConversation({
    callbacks: {
      onTranscription: (text, isFinal) => {
        setTranscript(text);

        // Update screen reader
        if (isFinal) {
          announceToScreenReader(`You said: ${text}`);
        }
      }
    }
  });

  return (
    <div>
      <button
        aria-label="Start voice conversation"
        onClick={() => startSession()}
      >
        🎤 Start Voice Chat
      </button>
      <div
        aria-live="polite"
        aria-label="Voice transcript"
      >
        {transcript}
      </div>
    </div>
  );
}
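The announceToScreenReader helper used above is not part of the SDK; one possible implementation writes the message into a persistent, visually hidden aria-live region so assistive technology speaks it:

```typescript
// Hypothetical implementation of announceToScreenReader (not an SDK API).
// Assumes it runs in a browser; it is a no-op elsewhere (e.g. during SSR).
function announceToScreenReader(message: string): void {
  const doc = (globalThis as any).document;
  if (!doc) return; // no DOM available

  let region = doc.getElementById('sr-announcer');
  if (!region) {
    region = doc.createElement('div');
    region.id = 'sr-announcer';
    region.setAttribute('aria-live', 'polite');
    // Visually hidden but still exposed to assistive technology
    region.style.cssText =
      'position:absolute;width:1px;height:1px;overflow:hidden;clip:rect(0 0 0 0)';
    doc.body.appendChild(region);
  }

  // Clear first so repeated identical messages are re-announced
  region.textContent = '';
  region.textContent = message;
}
```

Reusing a single polite live region keeps announcements from interrupting the user mid-sentence, while clearing it before each write ensures repeated phrases are still spoken.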