This guide helps you migrate from the legacy fish_audio_sdk (Session-based API) to the new fishaudio (client-based API), available in fish-audio-sdk v1.0+.
Quick Migration
1. Upgrade the package. The import name changes from fish_audio_sdk to fishaudio.
2. Update imports.
3. Replace Session with Client.
4. Update API calls. See the quick reference below for common operations.
Key Changes at a Glance
| Legacy | New | Notes |
|---|---|---|
| Session() | FishAudio() | Client-based architecture |
| session.tts() | client.tts.convert() | Returns complete audio bytes |
| session.asr() | client.asr.transcribe() | Clearer method name |
| session.create_model() | client.voices.create() | “Model” → “Voice” terminology |
| session.list_models() | client.voices.list() | Resource namespacing |
| TTSRequest(...) | Direct parameters | No request objects |
| WebSocketSession | client.tts.stream_websocket() | Integrated into client |
| HttpCodeErr | Specific exceptions | Better error handling |
Text-to-Speech Migration
The new SDK’s convert() returns complete audio bytes instead of chunks. Use stream() for chunk-by-chunk transfer or stream_websocket() for real-time streaming.
Voice Cloning Migration
Speech-to-Text Migration
WebSocket Streaming Migration
Error Handling Migration
Async Support
The new SDK has full async support with AsyncFishAudio.
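As a minimal sketch of what async usage might look like, the helpers below take an already-constructed client (expected to be an AsyncFishAudio instance) and issue several TTS requests concurrently. It assumes the async client mirrors the sync method names, that convert() is awaitable, and that it accepts a text keyword; none of these signatures are confirmed here. The helper names synthesize and synthesize_many are hypothetical.

```python
import asyncio


async def synthesize(client, text: str) -> bytes:
    """Request speech from an async client and return the audio bytes.

    Assumption: `client.tts.convert` is a coroutine accepting `text=`.
    """
    return await client.tts.convert(text=text)


async def synthesize_many(client, texts: list[str]) -> list[bytes]:
    """Run one request per text concurrently; results keep input order."""
    return await asyncio.gather(*(synthesize(client, t) for t in texts))
```

With the real SDK you would construct the client yourself and pass it in, for example via asyncio.run(synthesize_many(client, ["Hello", "World"])).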
Breaking Changes Summary
TTS now returns complete audio, not chunks
Before: Iterator of chunks. After: Complete audio bytes. Use stream() or stream_websocket() if you need chunks.
No more request objects (TTSRequest, ASRRequest)
Before: parameters were wrapped in request objects such as TTSRequest. After: pass parameters directly to methods.
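The two changes above (direct parameters and a single bytes result) can be sketched together. The function takes an existing client, calls client.tts.convert() with a plain keyword argument, and writes the returned bytes in one step. The text keyword and the helper name save_speech are assumptions for illustration, and the legacy call in the comment is recalled from the old API rather than taken from this guide.

```python
from pathlib import Path


def save_speech(client, text: str, path: str) -> int:
    """Synthesize `text`, write the audio to `path`, return bytes written."""
    # Legacy style, for comparison (request object, iterator of chunks):
    #     for chunk in session.tts(TTSRequest(text=text)): f.write(chunk)
    # New style: the parameter goes straight to the method and the result
    # is one bytes object. The `text=` keyword is an assumption.
    audio = client.tts.convert(text=text)
    return Path(path).write_bytes(audio)
```

Passing the client in keeps the helper independent of how the client is constructed or configured.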
ASR timestamps changed from seconds to milliseconds
Before: segment.start in seconds (e.g., 1.5). After: segment.start in milliseconds (e.g., 1500). Convert: seconds = segment.start / 1000
Model terminology changed to Voice
- session.create_model() → client.voices.create()
- session.list_models() → client.voices.list()
- session.get_model() → client.voices.get()
The voices namespace also includes client.voices.update() and client.voices.delete().
Common Issues
ModuleNotFoundError: No module named 'fishaudio'
Upgrade the package to fish-audio-sdk v1.0 or later, which provides the fishaudio module.
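Before upgrading, it can help to confirm which import names the current interpreter actually sees, since this error often means the upgrade went into a different environment. The sketch below uses only the standard library; the function name sdk_import_status and its return labels are hypothetical. If it reports anything other than "new", running pip install --upgrade fish-audio-sdk in the same environment is the likely fix.

```python
import importlib.util


def sdk_import_status() -> str:
    """Report which Fish Audio import names this interpreter can find."""
    has_new = importlib.util.find_spec("fishaudio") is not None
    has_legacy = importlib.util.find_spec("fish_audio_sdk") is not None
    if has_new:
        return "new"          # `import fishaudio` will work
    if has_legacy:
        return "legacy-only"  # installed package predates v1.0; upgrade it
    return "missing"          # package is not installed in this environment


print(sdk_import_status())
```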
Code expects TTS chunks but gets complete audio
The new convert() returns complete audio. Use stream() for chunks.
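A minimal sketch of chunked consumption, for code that previously looped over session.tts(). It assumes client.tts.stream() accepts a text keyword and yields bytes chunks; the helper name save_stream is hypothetical.

```python
def save_stream(client, text: str, path: str) -> int:
    """Write streamed audio chunks to `path`; return the number of chunks."""
    count = 0
    with open(path, "wb") as f:
        # Assumption: stream() accepts `text=` and yields bytes chunks.
        for chunk in client.tts.stream(text=text):
            f.write(chunk)
            count += 1
    return count
```

Existing loops that processed chunks one at a time should map onto the body of this for loop with little change.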
WebSocket requires empty text parameter
Remove the empty text argument and pass your text generator directly.
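The following sketch shows the shape of such a call: a generator produces text incrementally and is handed to stream_websocket() as its only input. It assumes the method takes the generator as its first positional argument and yields bytes chunks of audio; text_pieces and collect_websocket_audio are hypothetical names.

```python
from typing import Iterable, Iterator


def text_pieces() -> Iterator[str]:
    """Yield text incrementally, as an LLM or other live source might."""
    yield "Hello, "
    yield "this is streamed "
    yield "text."


def collect_websocket_audio(client, pieces: Iterable[str]) -> bytes:
    """Send text pieces over the WebSocket API and join the audio chunks."""
    # No empty text argument: the generator is the only input here.
    # Assumption: stream_websocket() yields bytes chunks of audio.
    return b"".join(client.tts.stream_websocket(pieces))
```

For real-time playback you would handle each chunk as it arrives instead of joining them at the end.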
ASR timestamps are off by 1000x
The new SDK reports timestamps in milliseconds instead of seconds.
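If downstream code still expects seconds, a small conversion layer keeps the change in one place. The helpers below are plain Python and do not depend on the SDK; their names are hypothetical.

```python
def ms_to_seconds(ms: float) -> float:
    """Convert a millisecond timestamp to seconds."""
    return ms / 1000


def format_timestamp(ms: float) -> str:
    """Render a millisecond timestamp as M:SS.mmm for display."""
    total_ms = int(round(ms))
    minutes, rem = divmod(total_ms, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"


print(ms_to_seconds(1500))      # 1.5
print(format_timestamp(61250))  # 1:01.250
```

Applying ms_to_seconds(segment.start) at the point where segments are read keeps the rest of a legacy pipeline unchanged.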
Next Steps
- Python SDK Guide - Complete guide for the new SDK
- API Reference - Detailed API documentation
- Text-to-Speech - TTS features and examples
- Voice Cloning - Clone voices and manage models
Need Help?
- GitHub Repository - Report issues or request features
- Discord Community - Get help from the community
- PyPI Package - Package information

