Files and Images
Language models work with text, but users want to share documents and images. The SDK bridges the gap by automatically extracting text from attached files and sending it to the model as context.
Document Processing
The SDK extracts text from PDF (all pages), Word documents (raw text), Excel spreadsheets (structured JSON with sheet names), and ZIP archives (recursively processing files inside). Processing happens automatically when you attach files to a message. The extracted text is sent as context to the model, while original metadata is preserved for your UI.
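The routing described above can be pictured as a dispatch on file type. The sketch below is a hypothetical illustration, not the SDK's actual internals: the `strategyFor` function and the strategy names are invented here to show how an extension might map to an extraction behavior.

```typescript
// Hypothetical sketch (not the SDK's real implementation): route an
// attached file to an extraction strategy based on its extension.
type ExtractionStrategy =
  | "pdf-pages"      // text from all pages
  | "docx-raw-text"  // raw document text
  | "xlsx-json"      // structured JSON keyed by sheet name
  | "zip-recursive"  // unpack and process contents recursively
  | "none";          // no text extraction (e.g. images)

function strategyFor(filename: string): ExtractionStrategy {
  const ext = filename.slice(filename.lastIndexOf(".") + 1).toLowerCase();
  switch (ext) {
    case "pdf":
      return "pdf-pages";
    case "doc":
    case "docx":
      return "docx-raw-text";
    case "xls":
    case "xlsx":
      return "xlsx-json";
    case "zip":
      return "zip-recursive";
    default:
      return "none";
  }
}
```

In the real SDK this dispatch happens for you when a message with attachments is sent; the sketch only makes the mapping explicit.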
For more control over individual file types, see usePdf and useOCR. To manage file attachments directly, use useFiles.
Images
Images are sent directly to vision models without text extraction. Models can identify objects, read text in images, understand charts, and answer questions about visual content. If you need text extracted from an image, use useOCR directly.
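Conceptually, an image attachment travels alongside the prompt text rather than through the extraction pipeline. The shapes below are illustrative assumptions, not the SDK's actual wire format; `buildVisionContent` is a hypothetical helper.

```typescript
// Hypothetical sketch: an image is passed to the model as its own content
// part next to the user's text, instead of being converted to text first.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; url: string };

function buildVisionContent(prompt: string, imageUrls: string[]): ContentPart[] {
  return [
    { type: "text", text: prompt },
    ...imageUrls.map((url): ContentPart => ({ type: "image", url })),
  ];
}
```

The important contrast with the document path is that nothing is extracted here: the model receives the pixels (via the URL or bytes) and interprets them itself.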
With Chat
useChatStorage handles file processing automatically when you send messages with attachments. You can configure processing behavior:
const { sendMessage } = useChatStorage({
  database,
  getToken,
  fileProcessingOptions: {
    maxFileSizeBytes: 10 * 1024 * 1024, // 10MB
    keepOriginalFiles: true,
    onProgress: (current, total) => setProgress(current / total),
  },
});

Generated Content
When models generate images through the image generation tool, the SDK downloads them automatically, stores them encrypted locally, and persists them in conversation history. Temporary API URLs are replaced with permanent local storage, so images remain available even after the original URLs expire.
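The download-and-replace step can be sketched as follows. This is a simplified illustration under stated assumptions: `persistGeneratedImage` and `localStore` are invented names, the `Map` stands in for the SDK's encrypted local storage, and the downloader is injected rather than being a real network call.

```typescript
// Hypothetical sketch of the persistence step: fetch the temporary API URL
// once, write the bytes to local storage, and return a permanent key that
// replaces the URL in conversation history. A Map stands in for the SDK's
// encrypted store; no real encryption or network I/O happens here.
const localStore = new Map<string, Uint8Array>();

async function persistGeneratedImage(
  tempUrl: string,
  download: (url: string) => Promise<Uint8Array>, // injected fetcher
): Promise<string> {
  const bytes = await download(tempUrl);
  const key = `local-image-${localStore.size + 1}`;
  localStore.set(key, bytes); // the real SDK would encrypt before writing
  return key; // stored in history in place of the expiring tempUrl
}
```

Because the history references the local key rather than the API URL, the image stays renderable after the provider's URL expires.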