SILMA TTS API

SILMA TTS v2 API

Version 2.0.0 (OAS 3.1)

High-quality Arabic Text-to-Speech API supporting multiple dialects and streaming modes.

API Base URL
  • Server 1: https://api.silma.ai/tts/v2

Security

ApiKeyAuth (apiKey)

An API key is a token that you provide when making API calls. Include the token in a header parameter called apiKey.

Example: apiKey: 123
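
As a minimal sketch of the header described above, the following builds (without sending) a request that carries the key. The helper name with_api_key is hypothetical, the key value is the placeholder from the example, and Python's standard urllib is only one possible client.

```python
import urllib.request

BASE_URL = "https://api.silma.ai/tts/v2"  # base URL listed in this reference


def with_api_key(path: str, api_key: str) -> urllib.request.Request:
    """Return an unsent request for BASE_URL + path with the apiKey header set."""
    req = urllib.request.Request(BASE_URL + path)
    # HTTP header names are case-insensitive; urllib stores this one as "Apikey".
    req.add_header("apiKey", api_key)
    return req


req = with_api_key("/stream", "123")  # "123" is a placeholder, not a real key
print(req.full_url, dict(req.header_items()))
```

In real use the key would typically come from an environment variable or a secrets store rather than a literal in the source.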

Binary Waveform Stream

Returns a raw stream of float32 audio bytes.

POST https://api.silma.ai/tts/v2/stream

Body

application/json

TTSRequest

model_id (string, required)

The model ID to use (KSA dialect or MSA).

Allowed values: silma-tts-v2-msa, silma-tts-v2-ksa

text (string, required)

The text to synthesize into speech.

Maximum length: 250 characters

Example: أنا نموذج سِلْمَا الجديد لتحويل النص إلى كلام، أستطيع التحدث باللغة العربية مع أو بدون تشكيل.

(English: "I am the new SILMA model for converting text to speech. I can speak Arabic with or without diacritics.")

creativity (number, float)

Variance in speech prosody.

Default: 0.2

speed (number, float)

Speed of the generated speech.

Default: 0.2

voice_id (string, required)

The ID of the pre-defined voice.

Allowed values: sarah, salma, salwa, saja, sultan, salman, sulaiman, salim

user_id (string)

Optional user identifier, needed only for pronunciation overrides and for loading custom voices. Find it at https://app.silma.ai/api-keys

custom_audio_id (string)
  • The ID of your uploaded custom voice to be cloned.
  • This should be the ID of a voice (e.g. voice_1769817467123) listed in the “Custom Voices” section at https://app.silma.ai/voices.
  • If you use this parameter, you should also set the “user_id” parameter.

enable_server_pronunciation_overrides (boolean)

Indicates that you have added custom pronunciation overrides to your account via https://app.silma.ai/control. When enabled, the model is automatically customized based on your overrides.

Default: false
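
The constraints listed for TTSRequest can be checked on the client before a request is sent. The sketch below is one way to do that; the function name validate_tts_request is hypothetical, and it assumes the 250-character limit counts Unicode code points the way Python's len() does, which this reference does not confirm. The server remains the authority on what it accepts.

```python
ALLOWED_MODELS = {"silma-tts-v2-msa", "silma-tts-v2-ksa"}
ALLOWED_VOICES = {"sarah", "salma", "salwa", "saja",
                  "sultan", "salman", "sulaiman", "salim"}
MAX_TEXT_CHARS = 250  # assumption: counted as Unicode code points


def validate_tts_request(payload: dict) -> list:
    """Return a list of problems found in a TTSRequest-shaped dict (empty if none)."""
    problems = []
    # The three fields marked "required" in the schema.
    for field in ("model_id", "text", "voice_id"):
        if not isinstance(payload.get(field), str):
            problems.append(f"{field}: required string is missing")
    model_id = payload.get("model_id")
    if isinstance(model_id, str) and model_id not in ALLOWED_MODELS:
        problems.append(f"model_id: unknown value {model_id!r}")
    voice_id = payload.get("voice_id")
    if isinstance(voice_id, str) and voice_id not in ALLOWED_VOICES:
        problems.append(f"voice_id: unknown value {voice_id!r}")
    text = payload.get("text")
    if isinstance(text, str) and len(text) > MAX_TEXT_CHARS:
        problems.append(f"text: {len(text)} characters exceeds {MAX_TEXT_CHARS}")
    # The schema says custom_audio_id should be accompanied by user_id.
    if payload.get("custom_audio_id") and not payload.get("user_id"):
        problems.append("custom_audio_id: user_id should be set as well")
    return problems


ok = {"model_id": "silma-tts-v2-msa", "text": "مرحبا", "voice_id": "sarah"}
print(validate_tts_request(ok))
print(validate_tts_request({"model_id": "other", "text": "x" * 251, "voice_id": "sarah"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.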

Response

application/octet-stream

A continuous stream of raw float32 bytes.

string (binary)

Example request: POST /stream

Body

{
  "model_id": "silma-tts-v2-msa",
  "text": "أنا نموذج سِلْمَا الجديد لتحويل النص إلى كلام، أستطيع التحدث باللغة العربية مع أو بدون تشكيل.",
  "voice_id": "sarah"
}

Example response: application/octet-stream
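
Because the response is a raw byte stream, network chunks will not necessarily end on a 4-byte sample boundary. The sketch below shows one way to turn such chunks into float samples. The function name iter_float32_samples is hypothetical, and little-endian byte order is an assumption: this reference states neither the byte order nor the sample rate.

```python
import struct
from typing import Iterable, Iterator


def iter_float32_samples(chunks: Iterable[bytes]) -> Iterator[float]:
    """Yield float samples from byte chunks whose boundaries may split a sample."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        usable = len(pending) - (len(pending) % 4)  # whole 4-byte samples only
        # "<f" = little-endian float32 (assumed, not documented).
        for (sample,) in struct.iter_unpack("<f", pending[:usable]):
            yield sample
        pending = pending[usable:]  # carry leftover bytes into the next chunk


# Simulated network chunks: three samples split at an awkward boundary.
raw = struct.pack("<3f", 0.0, 0.5, -0.25)
samples = list(iter_float32_samples([raw[:5], raw[5:]]))
print(samples)
```

With a real HTTP response object, the chunks could come from reading the body in blocks, for example iter(lambda: resp.read(4096), b"").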

SSE Audio Stream

Returns audio chunks and metadata via Server-Sent Events. Note that this implementation accepts a JSON body in a GET request.

GET https://api.silma.ai/tts/v2/stream_sse
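
Sending a JSON body with GET is unusual, and some clients do not support it (the browser fetch and EventSource APIs, for instance, cannot send a GET body). As a sketch using Python's standard urllib, the request can be built by setting the method explicitly. The short Arabic text ("hello"), the Content-Type header, and the YOUR_API_KEY placeholder are illustrative assumptions; the request is constructed here but not sent.

```python
import json
import urllib.request

payload = {
    "model_id": "silma-tts-v2-msa",
    "text": "مرحبا",  # illustrative text
    "voice_id": "sarah",
}

req = urllib.request.Request(
    "https://api.silma.ai/tts/v2/stream_sse",
    data=json.dumps(payload).encode("utf-8"),
    method="GET",  # without this, urllib switches to POST when data is set
    headers={
        "Accept": "text/event-stream",
        "Content-Type": "application/json",  # assumed to match the documented body type
        "apiKey": "YOUR_API_KEY",  # placeholder
    },
)
print(req.get_method(), len(req.data), "body bytes")
```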

Headers

Accept (string, required)

Example: text/event-stream

Body

application/json

TTSRequest

Same TTSRequest schema as the request body of the Binary Waveform Stream endpoint above.

Response

200 text/event-stream

Stream of SSE events (data: {audio, text, status}).

status (string)

Allowed values: streaming, completed

audio (string)

Base64-encoded float32 waveform.

text (string)

The specific text chunk processed.

Example request: GET /stream_sse

Body

{
  "model_id": "silma-tts-v2-msa",
  "text": "أنا نموذج سِلْمَا الجديد لتحويل النص إلى كلام، أستطيع التحدث باللغة العربية مع أو بدون تشكيل.",
  "voice_id": "sarah"
}

Example response: 200 text/event-stream
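
Each event carries status, text, and a Base64-encoded float32 waveform, so a client needs to parse the data line and decode the audio. The sketch below shows one way to do that for a single line. It assumes each event's JSON fits on one data: line and that samples are little-endian; neither is confirmed by this reference, and the function name parse_sse_data_line is hypothetical.

```python
import base64
import json
import struct


def parse_sse_data_line(line: str):
    """Parse one 'data: {...}' line into (status, text, samples); None for other lines."""
    if not line.startswith("data:"):
        return None  # comments, blank separators, and other SSE fields are skipped
    event = json.loads(line[len("data:"):].strip())
    raw = base64.b64decode(event.get("audio") or "")
    # "<f" = little-endian float32 (assumed). Raises struct.error if the
    # decoded length is not a multiple of 4 bytes.
    samples = [s for (s,) in struct.iter_unpack("<f", raw)]
    return event.get("status"), event.get("text"), samples


# A fabricated event for illustration, not captured from the real service.
audio_b64 = base64.b64encode(struct.pack("<2f", 0.25, -0.5)).decode("ascii")
line = "data: " + json.dumps({"status": "streaming", "text": "hi", "audio": audio_b64})
print(parse_sse_data_line(line))
```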

WebSocket Streaming

Establish a full-duplex connection for TTS.

Handshake: Upgrade request to wss://api.silma.ai/tts/v2/ws/stream.

Process:

  1. Client sends TTSRequest JSON.
  2. Server sends messages with status: started, streaming, completed, or failed.

GET https://api.silma.ai/tts/v2/ws/stream

Response

101

Switching Protocols to WebSocket.
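
The message flow in the process above can be sketched as a small status handler. This assumes server messages are JSON text frames with a top-level status field, which this reference implies but does not spell out; the function name handle_ws_message and the state dictionary are hypothetical, and the WebSocket transport itself is left to whichever client library is used.

```python
import json


def handle_ws_message(raw_message: str, state: dict) -> bool:
    """Update state from one server message; return True when the stream has ended."""
    message = json.loads(raw_message)
    status = message.get("status")
    state["last_status"] = status
    if status == "streaming":
        state["chunks"] = state.get("chunks", 0) + 1
    elif status == "failed":
        # Keep the whole message: its error fields are not documented here.
        state["error"] = message
    elif status not in ("started", "completed"):
        raise ValueError(f"unexpected status: {status!r}")
    return status in ("completed", "failed")


# Fabricated messages for illustration only.
state = {}
for msg in ('{"status": "started"}', '{"status": "streaming"}',
            '{"status": "streaming"}', '{"status": "completed"}'):
    done = handle_ws_message(msg, state)
print(done, state)
```

A receive loop would call this for each incoming frame and close the connection once it returns True.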


TTSRequest

object

model_id (string, required)

The model ID to use (KSA dialect or MSA).

Allowed values: silma-tts-v2-msa, silma-tts-v2-ksa

text (string, required)

The text to synthesize into speech.

Maximum length: 250 characters

Example: أنا نموذج سِلْمَا الجديد لتحويل النص إلى كلام، أستطيع التحدث باللغة العربية مع أو بدون تشكيل.

(English: "I am the new SILMA model for converting text to speech. I can speak Arabic with or without diacritics.")

creativity (number, float)

Variance in speech prosody.

Default: 0.2

speed (number, float)

Speed of the generated speech.

Default: 0.2

voice_id (string, required)

The ID of the pre-defined voice.

Allowed values: sarah, salma, salwa, saja, sultan, salman, sulaiman, salim

user_id (string)

Optional user identifier, needed only for pronunciation overrides and for loading custom voices. Find it at https://app.silma.ai/api-keys

custom_audio_id (string)
  • The ID of your uploaded custom voice to be cloned.
  • This should be the ID of a voice (e.g. voice_1769817467123) listed in the “Custom Voices” section at https://app.silma.ai/voices.
  • If you use this parameter, you should also set the “user_id” parameter.

enable_server_pronunciation_overrides (boolean)

Indicates that you have added custom pronunciation overrides to your account via https://app.silma.ai/control. When enabled, the model is automatically customized based on your overrides.

Default: false

Example