Example: text/event-stream
SSE Audio Stream
Returns audio chunks and metadata via Server-Sent Events. Note that this implementation accepts a JSON body in a GET request.
Headers
Body
application/json
Body
TTSRequest
The model id to use (KSA dialect or MSA).
Allowed values: silma-tts-v2-msa, silma-tts-v2-ksa
The text to synthesize into speech.
<= 250 characters
Example:أنا نموذج سِلْمَا الجديد لتحويل النص إلى كلام، أستطيع التحدث باللغة العربية مع أو بدون تشكيل.
Variance in speech prosody.
Default: 0.2
Speed of the generated speech.
Default: 0.2
The ID of the pre-defined voice.
Allowed values: sarah, salma, salwa, saja, sultan, salman, sulaiman, salim
Optional user identifier, needed only for pronunciation overrides and for loading custom voices. You can find it at https://app.silma.ai/api-keys.
- The ID of your uploaded custom voice to be cloned.
- This should be a voice ID (e.g. voice_1769817467123) from the “Custom Voices” section at https://app.silma.ai/voices.
- If you use this parameter, you should also supply the “user_id” parameter.
Indicates that you have added custom pronunciation overrides to your account via https://app.silma.ai/control. When enabled, the model is automatically customized based on your overrides.
Default: false
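The documented constraints on the request body (allowed model ids, allowed voice ids, and the 250-character text limit) can be checked client-side before sending. The following is a minimal sketch, not the official client: the key names `model`, `text`, and `voice_id` and the helper `build_tts_request` are hypothetical, since only `user_id` is named on this page. It also assumes the character limit is counted in Unicode code points, which is how Python's `len` counts; the service may count differently, especially for text with diacritics.

```python
# Values copied from the "Allowed values" lists on this page.
ALLOWED_MODELS = {"silma-tts-v2-msa", "silma-tts-v2-ksa"}
ALLOWED_VOICES = {
    "sarah", "salma", "salwa", "saja",
    "sultan", "salman", "sulaiman", "salim",
}
MAX_TEXT_CHARS = 250


def build_tts_request(model: str, text: str, voice: str) -> dict:
    """Validate the documented constraints and return a request body dict."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters, limit is {MAX_TEXT_CHARS}"
        )
    # Hypothetical key names: check the TTSRequest schema for the real ones.
    return {"model": model, "text": text, "voice_id": voice}
```

Rejecting an over-long text or a misspelled voice locally avoids a round trip to the server for errors that can be detected up front.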
Response
200 text/event-stream
Response
Stream of SSE events (data: {audio, text, status}).
Allowed values: streaming, completed
Base64 encoded float32 waveform.
The specific text chunk processed.
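Each event in the stream carries a JSON object with `audio`, `text`, and `status`, where `audio` is a Base64-encoded float32 waveform. A rough sketch of decoding one event line follows. The helper `parse_sse_event` is hypothetical, and the byte order is an assumption (little-endian), since this page does not state it.

```python
import base64
import json
import struct


def parse_sse_event(line: str):
    """Parse one 'data: {...}' SSE line into (status, text, samples).

    Returns None for lines that are not data lines (blank lines, comments).
    """
    if not line.startswith("data:"):
        return None
    payload = json.loads(line[len("data:"):].strip())
    raw = base64.b64decode(payload.get("audio") or "")
    # Assumption: little-endian float32 samples, 4 bytes each.
    count = len(raw) // 4
    samples = list(struct.unpack(f"<{count}f", raw[: count * 4]))
    return payload.get("status"), payload.get("text"), samples
```

In a real client, each decoded chunk would be appended to a playback buffer as it arrives, and reading would stop once an event with status `completed` is seen.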
WebSocket Streaming
Establish a full-duplex connection for TTS.
Handshake: Upgrade request to wss://api.silma.ai/tts/v2/ws/stream.
Process:
- Client sends a TTSRequest JSON.
- Server sends messages with status: started, streaming, completed, or failed.
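The message flow above can be sketched as a small loop over incoming server messages. This is an illustration under assumptions, not the service's client: it assumes each server message is a JSON object with a `status` field, that `streaming` messages carry the audio payload, and that `completed` and `failed` end the exchange. The helper `consume_messages` is hypothetical, and the transport (the WebSocket library that yields the messages) is left out so the logic stays self-contained.

```python
import json

# Assumption: these two statuses end the exchange.
TERMINAL_STATUSES = {"completed", "failed"}


def consume_messages(messages):
    """Collect 'streaming' messages until a terminal status arrives.

    `messages` is any iterable of JSON strings as received from the socket.
    Returns (final_status, streaming_messages); final_status is None if the
    iterable ends before a terminal status is seen.
    """
    chunks = []
    for raw in messages:
        msg = json.loads(raw)
        status = msg.get("status")
        if status == "streaming":
            chunks.append(msg)
        elif status in TERMINAL_STATUSES:
            return status, chunks
        # Any other status, such as 'started', carries nothing to collect.
    return None, chunks
```

A caller would typically treat a `failed` or `None` final status as an error and discard or retry, and pass the collected chunks to an audio decoder otherwise.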
Response
101
Response
Switching Protocols to WebSocket.
Authentication
TTSRequest
object
The model id to use (KSA dialect or MSA).
Allowed values: silma-tts-v2-msa, silma-tts-v2-ksa
The text to synthesize into speech.
<= 250 characters
Example:أنا نموذج سِلْمَا الجديد لتحويل النص إلى كلام، أستطيع التحدث باللغة العربية مع أو بدون تشكيل.
Variance in speech prosody.
Default: 0.2
Speed of the generated speech.
Default: 0.2
The ID of the pre-defined voice.
Allowed values: sarah, salma, salwa, saja, sultan, salman, sulaiman, salim
Optional user identifier, needed only for pronunciation overrides and for loading custom voices. You can find it at https://app.silma.ai/api-keys.
- The ID of your uploaded custom voice to be cloned.
- This should be a voice ID (e.g. voice_1769817467123) from the “Custom Voices” section at https://app.silma.ai/voices.
- If you use this parameter, you should also supply the “user_id” parameter.
Indicates that you have added custom pronunciation overrides to your account via https://app.silma.ai/control. When enabled, the model is automatically customized based on your overrides.
Default: false
ErrorResponse
object
Error message details.