docs v1.0.0

File Upload & OCR

Upload and process documents, images, and code files with automatic text extraction and OCR capabilities. Fully OpenAI-compatible with support for 40+ file types.

Overview

Selam API extracts text from uploaded documents, runs OCR on images, and processes code files. Uploads use an OpenAI-compatible format and can be combined with vision models, so a single request can draw on both the extracted text and the image itself.

Documents

PDF, DOCX, TXT, and Markdown files with automatic text extraction.

OCR

Extract text from images in English, Amharic, and Tigrinya with high accuracy.

Code Files

Support for 40+ programming languages including Python, JavaScript, Java, and more.

Information

File upload is available for all account tiers. OCR and vision processing run in parallel for comprehensive analysis.

Supported File Types

Selam API supports a wide range of file types for different use cases:

Documents

.pdf

PDF documents (max 5 pages)

.docx

Microsoft Word documents

.txt

Plain text files

.md

Markdown files

Images (with OCR)

.png

.jpg

.jpeg

.bmp

.tiff

.gif

OCR supports English, Amharic (አማርኛ), and Tigrinya (ትግርኛ)

Code Files (40+ Languages)

.py

.js

.ts

.json

.java

.cpp

.go

.rs

.php

.rb

.swift

.kt

.html

.css

.xml

.yaml

Many more are supported, including C, C++, Scala, R, Shell, SQL, GraphQL, and TOML.

File Size Limits

Each file type has a maximum upload size and a maximum amount of extracted text:

File Type	Max Size	Max Text	Notes
PDF	5 MB	10,000 chars	Max 5 pages, 2,000 chars/page
DOCX	5 MB	10,000 chars	Full document
Images	2 MB	2,000 chars	OCR per image
Code Files	5 MB	10,000 chars	UTF-8 encoded
Text Files	5 MB	10,000 chars	Plain text

Warning

Text is automatically truncated if it exceeds limits. Metadata in the response indicates if truncation occurred.
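
A client-side check can catch oversized files before a request is sent. The sketch below encodes the size limits from the table; the names `size_limit` and `fits_limit` are hypothetical, and it assumes that 1 MB means 1024 × 1024 bytes and that the limit applies to the raw file rather than its base64 form, neither of which is stated here.

```python
import os

MB = 1024 * 1024  # assumption: the docs do not say whether 1 MB is 10**6 or 2**20 bytes

# Limits copied from the table above: images get 2 MB, every other type 5 MB.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".gif"}
IMAGE_LIMIT = 2 * MB
DEFAULT_LIMIT = 5 * MB


def size_limit(filename: str) -> int:
    """Return the documented size limit in bytes for this file's extension."""
    ext = os.path.splitext(filename)[1].lower()
    return IMAGE_LIMIT if ext in IMAGE_EXTENSIONS else DEFAULT_LIMIT


def fits_limit(filename: str, size_bytes: int) -> bool:
    """True if a file of size_bytes is within the documented limit."""
    return size_bytes <= size_limit(filename)
```

For a file on disk, this could be called as `fits_limit("receipt.jpg", os.path.getsize("receipt.jpg"))`.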

Quick Start

Upload files using base64 encoding with OpenAI-compatible format:

from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Read and encode file
with open("document.pdf", "rb") as f:
    file_data = base64.b64encode(f.read()).decode('utf-8')

# Create data URI
file_uri = f"data:application/pdf;base64,{file_data}"

# Send request
response = client.chat.completions.create(
    model="selam-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document"},
                {
                    "type": "file",
                    "file": {
                        "url": file_uri,
                        "filename": "document.pdf"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

OCR Capabilities

Optical Character Recognition (OCR) automatically extracts text from images and runs in parallel with vision models for comprehensive analysis:

How OCR Works

1. You send an image in your chat request.
2. OCR extracts text from the image.
3. In parallel, the vision model analyzes the image visually.
4. The model receives both the extracted text and the image.
5. The response draws on both sources.

Supported Languages

English

Highest accuracy, default language

Amharic (አማርኛ)

Native Ethiopian language support

Tigrinya (ትግርኛ)

Eritrean/Ethiopian language support

Fallback Behavior

✓ OCR succeeds, Vision succeeds: best response, using both the extracted text and visual analysis.

✓ OCR fails, Vision succeeds: visual analysis only, which still yields a useful response.

✓ OCR succeeds, Vision fails: the response relies on the extracted text alone.

✗ Both fail: an error is returned with details.
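
The four cases above can be read as a small decision function. The following is only an illustrative model of the documented behaviour, not the server's implementation; `model_inputs` and its return shape are hypothetical.

```python
from typing import Optional


def model_inputs(ocr_text: Optional[str], vision_ok: bool) -> dict:
    """Mirror the four fallback cases; ocr_text is None when OCR failed."""
    if ocr_text is None and not vision_ok:
        # Both pipelines failed: the only case that produces an error.
        raise RuntimeError("OCR and vision both failed")
    return {
        "ocr_text": ocr_text,    # None when OCR failed
        "use_image": vision_ok,  # False when vision failed
    }
```

Only the case where both inputs are missing raises; every other combination still gives the model something to work with.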

Information

OCR runs automatically for all images. No special configuration is needed; just send images in your requests.

Vision Model Integration

File uploads work seamlessly with vision-capable models for combined text and visual analysis:

from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Upload image with OCR
with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

response = client.chat.completions.create(
    model="selam-plus",  # Vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all items and prices from this receipt"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data}"
                    }
                }
            ]
        }
    ]
)

# Response includes both OCR text and visual analysis
print(response.choices[0].message.content)

Code File Processing

Upload code files for review, analysis, or debugging. Perfect for code reviews and documentation:

from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Upload Python code for review
with open("app.py", "rb") as f:
    code_data = base64.b64encode(f.read()).decode('utf-8')

response = client.chat.completions.create(
    model="selam-coder",  # Best for code analysis
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Review this code for bugs and suggest improvements"},
                {
                    "type": "file",
                    "file": {
                        "url": f"data:text/x-python;base64,{code_data}",
                        "filename": "app.py"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Tip

Use the selam-coder model for the best code analysis results. It is optimized for understanding and reviewing code.

Multiple Files

Upload multiple files in a single request for comparison or combined analysis:

from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Upload multiple files
files = ["version1.py", "version2.py"]
content = [{"type": "text", "text": "Compare these two versions"}]

for filename in files:
    with open(filename, "rb") as f:
        file_data = base64.b64encode(f.read()).decode('utf-8')

    content.append({
        "type": "file",
        "file": {
            "url": f"data:text/x-python;base64,{file_data}",
            "filename": filename
        }
    })

response = client.chat.completions.create(
    model="selam-coder",
    messages=[{"role": "user", "content": content}]
)

print(response.choices[0].message.content)

Information

You can upload up to 5 files per request. Each file counts towards the total context window of the model.
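
One way to respect the five-file cap is to enforce it while building the message content. This is a minimal sketch: `build_content` and `MAX_FILES` are hypothetical names, and a single MIME type is assumed for all files to keep the example short.

```python
import base64
import os

MAX_FILES = 5  # per-request cap stated above


def build_content(prompt: str, paths: list[str], mime_type: str) -> list[dict]:
    """Build a content list with one text part followed by one file part per path."""
    if len(paths) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files per request, got {len(paths)}")
    content = [{"type": "text", "text": prompt}]
    for path in paths:
        with open(path, "rb") as f:
            encoded = base64.b64encode(f.read()).decode("ascii")
        content.append({
            "type": "file",
            "file": {
                "url": f"data:{mime_type};base64,{encoded}",
                "filename": os.path.basename(path),
            },
        })
    return content
```

The returned list can be passed as the `content` of a single user message, as in the example above. Failing early on the client avoids encoding files that would not be accepted anyway.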

Base64 Encoding

All files must be base64-encoded and sent as data URIs. Here's how to encode files in different languages:

Python

import base64

# Read and encode file
with open("document.pdf", "rb") as f:
    file_bytes = f.read()
    base64_data = base64.b64encode(file_bytes).decode('utf-8')

# Create data URI
mime_type = "application/pdf"
data_uri = f"data:{mime_type};base64,{base64_data}"

print(f"Data URI length: {len(data_uri)} characters")

JavaScript/Node.js

import fs from 'fs';

// Read and encode file
const fileBytes = fs.readFileSync('document.pdf');
const base64Data = fileBytes.toString('base64');

// Create data URI
const mimeType = 'application/pdf';
const dataUri = `data:${mimeType};base64,${base64Data}`;

console.log(`Data URI length: ${dataUri.length} characters`);

Command Line (Linux/Mac)

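# Encode file to base64 on a single line.
# GNU base64 (Linux) wraps its output unless given -w 0, and the base64
# shipped with macOS has historically not accepted -w, so strip newlines with tr.
base64 < document.pdf | tr -d '\n' > document.b64

# Or inline
FILE_B64=$(base64 < document.pdf | tr -d '\n')

# Create data URI
echo "data:application/pdf;base64,$FILE_B64"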

Warning

Base64 encoding increases file size by approximately 33%. A 3 MB file becomes ~4 MB when encoded.
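
The 33% figure follows from how base64 works: every 3 input bytes become 4 output characters, with the final group padded. The sketch below computes the encoded length for a given raw size; `encoded_size` is a name chosen for illustration, and 1 MB is taken as 1024 × 1024 bytes.

```python
def encoded_size(raw_bytes: int) -> int:
    """Length of standard padded base64 output for raw_bytes of input."""
    return 4 * ((raw_bytes + 2) // 3)


three_mb = 3 * 1024 * 1024
print(encoded_size(three_mb) / (1024 * 1024))  # 4.0
```

The `data:<mime>;base64,` prefix adds a few more characters on top of this.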

OpenAI-Compatible Format

Selam API uses the same file upload format as OpenAI, making it easy to migrate existing code:

File Content Part Structure

{
  "type": "file",
  "file": {
    "url": "data:application/pdf;base64,<base64_data>",
    "filename": "document.pdf"
  }
}

Complete Request Example

{
  "model": "selam-plus",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Analyze this document"
        },
        {
          "type": "file",
          "file": {
            "url": "data:application/pdf;base64,JVBERi0xLjQKJeLjz9...",
            "filename": "report.pdf"
          }
        }
      ]
    }
  ]
}

Best Practices

Optimize File Size

Keep files within the limits above: 5 MB for documents and code, 2 MB for images. Compress images and PDFs before uploading, and remember that base64 encoding adds about 33% to the size of the request payload.

Limit PDF Pages

PDFs are limited to 5 pages. For longer documents, split into multiple files or extract relevant pages before uploading.
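
Splitting a long PDF starts with deciding which pages go into which part. The sketch below only plans the page ranges; the actual splitting needs a PDF library, which is not shown. `page_ranges` is a hypothetical helper, and ranges are zero-based and half-open.

```python
def page_ranges(total_pages: int, max_pages: int = 5) -> list[tuple[int, int]]:
    """Half-open (start, end) page ranges, each covering at most max_pages pages."""
    return [
        (start, min(start + max_pages, total_pages))
        for start in range(0, total_pages, max_pages)
    ]


print(page_ranges(12))  # [(0, 5), (5, 10), (10, 12)]
```

A 12-page document therefore needs three uploads, which still fits within a single request of up to five files.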

Use Descriptive Filenames

Provide clear, descriptive filenames. The model uses filenames as context to better understand the content and purpose of each file.

High-Quality Images

Use clear, high-contrast images for best OCR accuracy. Avoid blurry or low-resolution images. Crop to relevant areas to reduce file size.

Choose the Right Model

Use selam-plus for documents, selam-coder for code review, and selam-thinking for complex analysis. Each model is optimized for different tasks.
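
This guidance can be expressed as a small lookup. The sketch is one possible reading of it, not an official routing rule: `pick_model`, the `complex_analysis` flag, and the choice of which extensions count as code are all assumptions.

```python
import os

# A subset of the code extensions listed under Supported File Types.
CODE_EXTENSIONS = {
    ".py", ".js", ".ts", ".java", ".cpp", ".go",
    ".rs", ".php", ".rb", ".swift", ".kt",
}


def pick_model(filename: str, complex_analysis: bool = False) -> str:
    """Choose a model name from the file extension and the kind of task."""
    if complex_analysis:
        return "selam-thinking"
    ext = os.path.splitext(filename)[1].lower()
    return "selam-coder" if ext in CODE_EXTENSIONS else "selam-plus"
```

For example, `pick_model("app.py")` returns `"selam-coder"`, while `pick_model("report.pdf")` returns `"selam-plus"`.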

Cache Extracted Text

If processing the same file multiple times, cache the extracted text locally to avoid repeated extraction and reduce API calls.
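
A minimal in-memory cache can be keyed by a hash of the file's bytes, so that renamed copies of the same file still hit the cache. In this sketch `cached_text` is a hypothetical helper, and `extract` stands for whatever function you use to obtain the text, for example a wrapper around a chat completion request.

```python
import hashlib
from typing import Callable, Dict

_cache: Dict[str, str] = {}


def cached_text(file_bytes: bytes, extract: Callable[[bytes], str]) -> str:
    """Return extracted text for file_bytes, calling extract only on a cache miss."""
    key = hashlib.sha256(file_bytes).hexdigest()
    if key not in _cache:
        _cache[key] = extract(file_bytes)
    return _cache[key]
```

The cache lives only for the lifetime of the process; persisting it to disk or a database would follow the same keying scheme.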

Handle Errors Gracefully

Implement error handling for file extraction failures. The API continues processing even if extraction fails, but your prompt may not have file context.

Verify MIME Types

Use the correct MIME type in each data URI: application/pdf for PDFs, text/plain for text files, image/jpeg for JPEG images, and so on.
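
Python's standard `mimetypes` module can derive the MIME type from the filename instead of hard-coding it. The sketch below is one way to build a data URI on that basis; `to_data_uri` is a hypothetical helper, and raising on unknown extensions is a design choice rather than anything the API requires.

```python
import base64
import mimetypes


def to_data_uri(filename: str, data: bytes) -> str:
    """Build a base64 data URI, guessing the MIME type from the filename."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        raise ValueError(f"cannot determine MIME type for {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Failing on an unknown extension surfaces the problem locally rather than sending a mislabeled file.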

Related Resources

Vision Guide

Image understanding with AI

Chat Completions

Text generation basics

API Reference

Complete endpoint documentation

Error Handling

Handle API errors gracefully
