File Upload & OCR
Upload and process documents, images, and code files with automatic text extraction and OCR capabilities. Fully OpenAI-compatible with support for 40+ file types.
Overview
Selam API provides comprehensive file upload capabilities with automatic text extraction from documents, OCR for images, and code file processing. All file uploads are OpenAI-compatible and work seamlessly with vision models for combined text and visual analysis.
Documents
PDF, DOCX, TXT, and Markdown files with automatic text extraction.
OCR
Extract text from images in English, Amharic, and Tigrinya with high accuracy.
Code Files
Support for 40+ programming languages including Python, JavaScript, Java, and more.
Supported File Types
Selam API supports a wide range of file types for different use cases:
Documents
.pdf: PDF documents (max 5 pages)
.docx: Microsoft Word documents
.txt: Plain text files
.md: Markdown files
Images (with OCR)
.png, .jpg, .jpeg, .bmp, .tiff, .gif
OCR supports English, Amharic (አማርኛ), and Tigrinya (ትግርኛ)
Code Files (40+ Languages)
.py, .js, .ts, .json, .java, .cpp, .go, .rs, .php, .rb, .swift, .kt, .html, .css, .xml, .yaml
And many more, including C, C++, Scala, R, Shell, SQL, GraphQL, and TOML.
File Size Limits
File size and text extraction limits ensure optimal performance:
| File Type | Max Size | Max Text | Notes |
|---|---|---|---|
| PDF | 5 MB | 10,000 chars | Max 5 pages, 2,000 chars/page |
| DOCX | 5 MB | 10,000 chars | Full document |
| Images | 2 MB | 2,000 chars | OCR per image |
| Code Files | 5 MB | 10,000 chars | UTF-8 encoded |
| Text Files | 5 MB | 10,000 chars | Plain text |
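Checking a file against these limits before encoding it can save a failed request. The following is a minimal sketch, not part of the Selam API: the helper names are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the limit is assumed to apply to the raw file rather than its base64-encoded form.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the documented "MB" means 1024 * 1024 bytes

# Size limits copied from the table above. The extension set comes from the
# supported image types; everything else falls back to the 5 MB limit.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".gif"}
MAX_BYTES_IMAGE = 2 * MB
MAX_BYTES_DEFAULT = 5 * MB


def max_upload_bytes(filename: str) -> int:
    """Return the documented size limit for a file, based on its extension."""
    ext = Path(filename).suffix.lower()
    return MAX_BYTES_IMAGE if ext in IMAGE_EXTS else MAX_BYTES_DEFAULT


def check_upload_size(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file is larger than the documented limit."""
    limit = max_upload_bytes(filename)
    if size_bytes > limit:
        raise ValueError(
            f"{filename} is {size_bytes} bytes; the limit is {limit} bytes"
        )
```

A caller would typically pass `os.path.getsize(path)` as `size_bytes` before reading and encoding the file.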
Quick Start
Upload files using base64 encoding with OpenAI-compatible format:
from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Read and encode file
with open("document.pdf", "rb") as f:
    file_data = base64.b64encode(f.read()).decode('utf-8')

# Create data URI
file_uri = f"data:application/pdf;base64,{file_data}"

# Send request
response = client.chat.completions.create(
    model="selam-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this document"},
                {
                    "type": "file",
                    "file": {
                        "url": file_uri,
                        "filename": "document.pdf"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

OCR Capabilities
Optical Character Recognition (OCR) automatically extracts text from images and runs in parallel with vision models for comprehensive analysis:
How OCR Works
1. You send an image in your chat request.
2. OCR extracts text from the image.
3. The vision model analyzes the image visually.
4. The model receives both the extracted text and the image.
5. The model returns a comprehensive response using both sources.
Supported Languages
English
Highest accuracy, default language
Amharic (አማርኛ)
Native Ethiopian language support
Tigrinya (ትግርኛ)
Eritrean/Ethiopian language support
Fallback Behavior
OCR succeeds, Vision succeeds
Best response using both text and visual analysis
OCR fails, Vision succeeds
Visual analysis only, still provides useful response
OCR succeeds, Vision fails
Text extraction available, guaranteed fallback
Both fail
Error returned with details
Vision Model Integration
File uploads work seamlessly with vision-capable models for combined text and visual analysis:
from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Upload image with OCR
with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

response = client.chat.completions.create(
    model="selam-plus",  # Vision-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract all items and prices from this receipt"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_data}"
                    }
                }
            ]
        }
    ]
)

# Response includes both OCR text and visual analysis
print(response.choices[0].message.content)

Code File Processing
Upload code files for review, analysis, or debugging. Perfect for code reviews and documentation:
from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Upload Python code for review
with open("app.py", "rb") as f:
    code_data = base64.b64encode(f.read()).decode('utf-8')

response = client.chat.completions.create(
    model="selam-coder",  # Best for code analysis
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Review this code for bugs and suggest improvements"},
                {
                    "type": "file",
                    "file": {
                        "url": f"data:text/x-python;base64,{code_data}",
                        "filename": "app.py"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Tip
Use the selam-coder model for the best code analysis results. It is optimized for understanding and reviewing code.

Multiple Files
Upload multiple files in a single request for comparison or combined analysis:
from openai import OpenAI
import base64

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.selamgpt.com/v1"
)

# Upload multiple files
files = ["version1.py", "version2.py"]
content = [{"type": "text", "text": "Compare these two versions"}]

for filename in files:
    with open(filename, "rb") as f:
        file_data = base64.b64encode(f.read()).decode('utf-8')

    content.append({
        "type": "file",
        "file": {
            "url": f"data:text/x-python;base64,{file_data}",
            "filename": filename
        }
    })

response = client.chat.completions.create(
    model="selam-coder",
    messages=[{"role": "user", "content": content}]
)

print(response.choices[0].message.content)
Base64 Encoding
All files must be base64-encoded and sent as data URIs. Here's how to encode files in different languages:
Python
import base64

# Read and encode file
with open("document.pdf", "rb") as f:
    file_bytes = f.read()
    base64_data = base64.b64encode(file_bytes).decode('utf-8')

# Create data URI
mime_type = "application/pdf"
data_uri = f"data:{mime_type};base64,{base64_data}"

print(f"Data URI length: {len(data_uri)} characters")

JavaScript/Node.js
import fs from 'fs';

// Read and encode file
const fileBytes = fs.readFileSync('document.pdf');
const base64Data = fileBytes.toString('base64');

// Create data URI
const mimeType = 'application/pdf';
const dataUri = `data:${mimeType};base64,${base64Data}`;

console.log(`Data URI length: ${dataUri.length} characters`);

Command Line (Linux/Mac)
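# Encode file to base64 (GNU coreutils; -w 0 disables line wrapping)
base64 -w 0 document.pdf > document.b64

# On macOS, where -w may not be available, use -i instead:
# base64 -i document.pdf > document.b64

# Or inline
FILE_B64=$(base64 -w 0 document.pdf)

# Create data URI
echo "data:application/pdf;base64,$FILE_B64"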
OpenAI-Compatible Format
Selam API uses the same file upload format as OpenAI, making it easy to migrate existing code:
File Content Part Structure
{
  "type": "file",
  "file": {
    "url": "data:application/pdf;base64,<base64_data>",
    "filename": "document.pdf"
  }
}

Complete Request Example
{
  "model": "selam-plus",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Analyze this document"
        },
        {
          "type": "file",
          "file": {
            "url": "data:application/pdf;base64,JVBERi0xLjQKJeLjz9...",
            "filename": "report.pdf"
          }
        }
      ]
    }
  ]
}

Best Practices
Optimize File Size
Keep files within the size limits: 5 MB for documents, code, and text files, and 2 MB for images. Compress images and PDFs before uploading. Remember that base64 encoding increases the payload size by about 33%.
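The size increase follows from base64 mapping every 3 input bytes to 4 output characters, with padding on the final group. A short worked example (the function name is illustrative) estimates the encoded length before any encoding is done:

```python
import base64


def base64_length(raw_bytes: int) -> int:
    """Length of standard, padded base64 output for raw_bytes bytes of input."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * ((raw_bytes + 2) // 3)


# Sanity check against the standard library for a small input.
assert base64_length(10) == len(base64.b64encode(bytes(10)))

raw = 3 * 1024 * 1024  # a 3 MiB file
encoded = base64_length(raw)
print(encoded, round(encoded / raw, 3))  # prints: 4194304 1.333
```

The `data:<mime>;base64,` prefix of the data URI adds a few more characters on top of this figure.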
Limit PDF Pages
PDFs are limited to 5 pages. For longer documents, split into multiple files or extract relevant pages before uploading.
Use Descriptive Filenames
Provide clear, descriptive filenames. The model uses filenames as context to better understand the content and purpose of each file.
High-Quality Images
Use clear, high-contrast images for best OCR accuracy. Avoid blurry or low-resolution images. Crop to relevant areas to reduce file size.
Choose the Right Model
Use selam-plus for documents, selam-coder for code review, and selam-thinking for complex analysis. Each model is optimized for different tasks.
Cache Extracted Text
If processing the same file multiple times, cache the extracted text locally to avoid repeated extraction and reduce API calls.
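One way to cache is to key results by a hash of the file content, so renamed copies of the same file still hit the cache. This is a sketch only: `extract` is a hypothetical callable standing in for whatever request you use to obtain the text (this page does not describe a dedicated extraction endpoint), and the in-memory dictionary would need to be replaced by a file or database to persist across runs.

```python
import hashlib
from typing import Callable, Dict

# In-memory cache: SHA-256 hex digest of the file content -> extracted text.
_cache: Dict[str, str] = {}


def cached_extract(file_bytes: bytes, extract: Callable[[bytes], str]) -> str:
    """Return extract(file_bytes), reusing the result for identical content."""
    key = hashlib.sha256(file_bytes).hexdigest()
    if key not in _cache:
        # Only call the (hypothetical) extraction function on a cache miss.
        _cache[key] = extract(file_bytes)
    return _cache[key]
```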
Handle Errors Gracefully
Implement error handling for file extraction failures. The API continues processing even if extraction fails, but your prompt may not have file context.
Verify MIME Types
Use the correct MIME type in each data URI: application/pdf for PDFs, text/plain for text files, image/jpeg for JPEG images, and so on.
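Python's standard `mimetypes` module can guess the MIME type from the file extension instead of hard-coding it. The helper below is a sketch with a hypothetical name; it builds a content part in the "file" structure shown earlier, and the `application/octet-stream` fallback is an assumption, since this page does not say how the API treats unrecognized types.

```python
import base64
import mimetypes
from pathlib import Path


def file_content_part(path: str, default_mime: str = "application/octet-stream") -> dict:
    """Build a "file" content part with a MIME type guessed from the extension."""
    mime, _ = mimetypes.guess_type(path)  # returns (None, None) if unknown
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "file",
        "file": {
            # Fallback MIME type is an assumption, not documented behavior.
            "url": f"data:{mime or default_mime};base64,{data}",
            "filename": Path(path).name,
        },
    }
```

The returned dictionary can be appended to the `content` list of a user message alongside a text part.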