docs v1.0.0

Text-to-Speech

Convert text to natural-sounding speech in multiple languages including Amharic, Somali, Tigrinya, Oromo, and English with authentic native voices.

Overview

Selam API provides OpenAI-compatible text-to-speech with native Ethiopian language support. Choose from 9 authentic voices across 5 languages, with flexible audio formats and speed control.

Multi-Language

Native support for Amharic, Somali, Tigrinya, Oromo, and English with authentic voices.

Multiple Formats

Support for MP3, WAV, Opus, AAC, and FLAC audio formats for any use case.

Speed Control

Adjust playback speed from 0.25x to 4.0x for different use cases.

Information

TTS is available on the BETA tier and above. All users get access to the 9 authentic voices through an OpenAI-compatible API.

Available Voices

Choose from 9 authentic voices across 5 languages. All voices are optimized for natural-sounding speech.

Amharic (አማርኛ)

mekdes (Female)

Native Amharic female voice

ameha (Male)

Native Amharic male voice

betty (Female)

Native Amharic female voice

Somali (Soomaali)

ubax (Female)

Native Somali female voice

muuse (Male)

Native Somali male voice

Tigrinya (ትግርኛ)

sami (Male)

Native Tigrinya male voice

Oromo (Afaan Oromoo)

ibsaa (Male)

Native Oromo male voice

English

aria (Female)

Standard English female voice

andrew (Male)

Standard English male voice

Warning

Important: Use only TTS voices (listed above) for text-to-speech. GPT Audio voices (alloy, echo, fable, etc.) are for Voice Agent only and will respond conversationally instead of reading your text.
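
To catch a wrong voice before a request is sent, the voice list above can be encoded as a small lookup. This is a minimal sketch, not part of the API: the names `TTS_VOICES`, `KNOWN_AGENT_VOICES`, and `check_tts_voice` are hypothetical, and the Voice Agent set holds only the three names the warning mentions, so it is incomplete.

```python
# Voice names copied from the Available Voices list above.
TTS_VOICES = {
    "amharic": ["mekdes", "ameha", "betty"],
    "somali": ["ubax", "muuse"],
    "tigrinya": ["sami"],
    "oromo": ["ibsaa"],
    "english": ["aria", "andrew"],
}

# GPT Audio voices named in the warning. The source list ends with "etc.",
# so this set is deliberately incomplete.
KNOWN_AGENT_VOICES = {"alloy", "echo", "fable"}


def check_tts_voice(voice):
    """Return the language key for a TTS voice, or raise ValueError."""
    for language, voices in TTS_VOICES.items():
        if voice in voices:
            return language
    if voice in KNOWN_AGENT_VOICES:
        raise ValueError(f"'{voice}' is a Voice Agent voice, not a TTS voice")
    raise ValueError(f"Unknown TTS voice: '{voice}'")
```

Calling `check_tts_voice("mekdes")` returns `"amharic"`, while `check_tts_voice("alloy")` raises an error instead of producing a conversational reply from the API.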

Voice Samples

Listen to sample audio from each language to find the perfect voice for your application:

Amharic (አማርኛ)

Voice: betty

Tigrinya (ትግርኛ)

Voice: sami

Oromo (Afaan Oromoo)

Voice: ibsaa

Somali (Soomaali)

Voice: ubax

English

Voice: aria

Information

All samples are generated using the SelamGPT TTS API. Try different voices to find the best fit for your application.

Quick Start

Generate speech from text using the /v1/audio/speech endpoint:

import requests

def generate_speech(text, voice="mekdes", api_key="your-api-key"):
    """Request speech for `text` and save the audio as output.mp3."""
    url = "https://api.selamgpt.com/v1/audio/speech"
    headers = {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }
    data = {
        "model": "tts-1",
        "input": text,
        "voice": voice,
        "response_format": "mp3",
    }

    # A timeout keeps the call from hanging on a stalled connection.
    response = requests.post(url, headers=headers, json=data, timeout=30)

    if response.status_code == 200:
        # On success the response body is written out as the audio file.
        with open("output.mp3", "wb") as f:
            f.write(response.content)
        print("Audio saved as output.mp3")
    else:
        print(f"Error: {response.status_code} - {response.text}")

# Generate Amharic speech
generate_speech("ሰላም! እንዴት ነህ?", "mekdes")

Request Parameters

model (required)

TTS model to use. Options: tts-1 or tts-1-hd

input (required)

Text to convert to speech. Maximum length varies by tier (the same values appear as Max Characters for FREE, BETA, and PRO under Rate Limits):

Standard: 1,500 characters
Enhanced: 5,000 characters
Premium: 12,500 characters

voice (required)

Voice to use for speech generation. See the Available Voices section above.

response_format (optional)

Audio format. Options: mp3, opus, aac, flac, wav

Default: mp3

speed (optional)

Playback speed multiplier. Range: 0.25 to 4.0

Default: 1.0
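
The parameters above can be checked client-side before a request is sent. The following is a sketch under assumptions: `build_speech_payload` is a hypothetical helper, the default `max_chars` of 1,500 is the lowest documented limit, and counting characters with `len()` may not match how the server counts them.

```python
ALLOWED_MODELS = {"tts-1", "tts-1-hd"}
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav"}


def build_speech_payload(text, voice, model="tts-1", response_format="mp3",
                         speed=1.0, max_chars=1500):
    """Validate the documented parameters and return the JSON request body."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if not text:
        raise ValueError("input text must not be empty")
    # Assumption: the limit is counted in Unicode code points, as len() does.
    if len(text) > max_chars:
        raise ValueError(f"input has {len(text)} characters, limit is {max_chars}")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

The returned dictionary can be passed as the `json` argument of `requests.post`, as in the Quick Start example.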

Audio Formats

Choose the right audio format for your use case:

mp3 (Recommended)

Best for general use, storage, and streaming

Compressed, widely supported

opus

Best for real-time applications and low latency

Highly compressed, optimized for speech

aac

Good for mobile applications

Compressed, good quality

flac

Best for archival and highest quality

Lossless compression

wav

Best for audio processing and editing

Uncompressed, large file size
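
The guidance above can be captured as a simple lookup from use case to format. This is only an illustrative sketch: the use-case keys and both function names are hypothetical, and using the format name as the file extension is an assumption (Opus audio, for instance, is often stored with other extensions).

```python
# Use-case keys are made up for this sketch; the format choices follow the list above.
FORMAT_FOR_USE_CASE = {
    "general": "mp3",
    "storage": "mp3",
    "streaming": "mp3",
    "realtime": "opus",
    "mobile": "aac",
    "archival": "flac",
    "editing": "wav",
}


def pick_format(use_case):
    """Return the suggested response_format, falling back to the mp3 default."""
    return FORMAT_FOR_USE_CASE.get(use_case, "mp3")


def output_filename(stem, response_format):
    """Build a file name; assumes the format name doubles as the extension."""
    return f"{stem}.{response_format}"
```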

Rate Limits

TTS rate limits are based on your account tier:

Tier	Requests/Minute	Audio Minutes/Day	Max Characters
FREE	5	5 minutes	1,500
BETA	15	15 minutes	5,000
PRO	30	40 minutes	12,500

Information

Rate limits are tracked per user, not per API key. Upgrade your tier for higher limits.
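
A client can pace itself against the requests-per-minute column instead of waiting for the server to reject a call. The sketch below assumes a sliding 60-second window, which the documentation does not confirm; `TIER_LIMITS` and `MinuteThrottle` are hypothetical names. Because limits are tracked per user, one throttle should be shared across all API keys belonging to that user.

```python
import time
from collections import deque

# Values copied from the rate limit table above.
TIER_LIMITS = {
    "FREE": {"requests_per_minute": 5, "audio_minutes_per_day": 5, "max_chars": 1500},
    "BETA": {"requests_per_minute": 15, "audio_minutes_per_day": 15, "max_chars": 5000},
    "PRO": {"requests_per_minute": 30, "audio_minutes_per_day": 40, "max_chars": 12500},
}


class MinuteThrottle:
    """Client-side pacing; assumes the server uses a sliding 60-second window."""

    def __init__(self, tier, clock=time.monotonic):
        self.limit = TIER_LIMITS[tier]["requests_per_minute"]
        self.clock = clock
        self.sent = deque()  # timestamps of recent requests

    def wait_time(self):
        """Seconds to wait before another request fits in the window."""
        now = self.clock()
        # Forget requests that are at least 60 seconds old.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return 60 - (now - self.sent[0])

    def record(self):
        """Call right after sending a request."""
        self.sent.append(self.clock())
```

A typical loop would call `time.sleep(throttle.wait_time())`, send the request, then call `throttle.record()`.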

Best Practices

Text Preparation

• Clean text of special characters and formatting before sending
• Use proper UTF-8 encoding for Ethiopian scripts
• Remove excessive punctuation that may affect pronunciation
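
One way to apply the text preparation points above is a small normalization pass. This is a sketch, not a required step: `prepare_text` is a hypothetical helper, and the choice to collapse repeated punctuation to a single mark (which also turns "..." into ".") is an assumption about what reads best.

```python
import re
import unicodedata


def prepare_text(text):
    """Normalize text before sending it to the TTS endpoint."""
    # NFC gives one consistent Unicode form for composed characters.
    text = unicodedata.normalize("NFC", text)
    # Drop control characters, but keep whitespace such as tabs and newlines.
    text = "".join(
        ch for ch in text
        if ch.isspace() or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of the same punctuation mark, including the
    # Ethiopic full stop (።) and comma (፣).
    text = re.sub(r"([!?.,።፣])\1+", r"\1", text)
    # Collapse all whitespace runs to a single space.
    return re.sub(r"\s+", " ", text).strip()
```

Python strings are Unicode, and `requests` encodes a `json=` body as UTF-8, so Ethiopic text passes through without extra encoding steps.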

Chunking Long Text

• Split long texts into smaller chunks for better performance
• Break at natural sentence boundaries
• Stay within character limits for your tier
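
The chunking advice above can be sketched as a greedy splitter that packs whole sentences into chunks up to a character limit. `chunk_text` is a hypothetical helper; it treats `.`, `!`, `?`, and the Ethiopic full stop (።) as sentence enders, and it hard-splits any single sentence that exceeds the limit, which may cut a word in half.

```python
import re
from typing import List

# Split on whitespace that follows a sentence-ending mark.
_SENTENCE_END = re.compile(r"(?<=[.!?።])\s+")


def chunk_text(text: str, max_chars: int = 1500) -> List[str]:
    """Split text into chunks of at most max_chars, breaking at sentences."""
    chunks: List[str] = []
    current = ""
    for sentence in _SENTENCE_END.split(text.strip()):
        # A sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk first.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, with `max_chars` set to the limit for your tier.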

Caching

• Cache generated audio for repeated content
• Use content hashing to identify duplicate requests
• Store audio files locally or in CDN for faster delivery
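
Content hashing, as suggested above, can be sketched by hashing every parameter that affects the audio and using the digest as a file name. Both `cache_key` and `cached_audio` are hypothetical helpers, and the local-directory cache is just one possible store; `generate` stands in for whatever function actually calls the API and returns audio bytes.

```python
import hashlib
import json
from pathlib import Path


def cache_key(text, voice, model="tts-1", response_format="mp3", speed=1.0):
    """Return a SHA-256 hex digest over all parameters that shape the audio."""
    payload = json.dumps(
        {"model": model, "input": text, "voice": voice,
         "response_format": response_format, "speed": speed},
        sort_keys=True,       # stable key order gives a stable hash
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_audio(cache_dir, key, response_format, generate):
    """Return cached bytes if present, else call generate() and store the result."""
    path = Path(cache_dir) / f"{key}.{response_format}"
    if path.exists():
        return path.read_bytes()
    audio = generate()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

Identical requests produce the same key, so repeated content is served from disk and does not count against rate limits.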

Error Handling

• Implement retry logic with exponential backoff
• Handle rate limit errors gracefully
• Provide fallback options for failed requests
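
Retry with exponential backoff can be sketched as a wrapper around any function that sends the request. The set of retryable status codes is an assumption (429 is the conventional HTTP code for rate limiting, and this page does not list the API's error codes), and `post_with_retry` is a hypothetical helper. It does not catch network exceptions; those would need separate handling.

```python
import random
import time

# Assumption: these statuses are worth retrying; check the API's error docs.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def post_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable response or attempts run out.

    send must be a zero-argument callable returning an object with status_code.
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in RETRYABLE_STATUS:
            return response
        if attempt < max_attempts - 1:
            # Delay doubles each attempt (1x, 2x, 4x base), plus random jitter
            # so that many clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay / 2)
            sleep(delay)
    # All attempts failed; hand the last response back to the caller.
    return response
```

With `requests`, `send` could be `lambda: requests.post(url, headers=headers, json=data, timeout=30)`. If the final response is still an error, the caller can apply a fallback such as serving cached audio.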

Related Resources

Rate Limits

View detailed rate limits for all tiers

Error Handling

Learn about error codes and troubleshooting
