Working with Figures
TeXRA allows AI agents to analyze, reference, and generate figures within your documents.
Quick Task: Add a Figure Caption
- Select the polish agent from the dropdown
- Choose a vision-capable model (e.g., gemini25p, gpt4o)
- Select your figure in the Media section
- Enter instruction: "Write a detailed caption for this figure"
- Click Execute
The Media Section
The main TeXRA panel includes a "Media" section for managing figure files:
- Dropdown: Select a primary media file
- Multiple Toggle: Expand to select multiple files
- Auto Extract Dropdown: Configure automatic figure extraction
Supported File Types
Configurable via texra.files.included.mediaExtensions:
- Images: .png, .jpeg, .jpg, .gif, .heic, .heif, .webp
- Documents: .pdf (native or converted to images)
- Audio (experimental): .wav, .m4a, .mp3, .aiff, .aac, .ogg, .flac
PDFs are processed natively when supported (Anthropic/Gemini/OpenAI). Otherwise, TeXRA uses GraphicsMagick/ImageMagick + Ghostscript for conversion.
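As a rough illustration of how the extension groups above partition files, the following Python sketch classifies a path by its suffix. The dictionary layout and the `classify_media` helper are hypothetical names introduced here for illustration; they are not TeXRA's settings schema or internal code.

```python
from pathlib import Path
from typing import Optional

# Extension groups copied from the list above. The dictionary layout is an
# illustrative assumption, not the actual shape of the TeXRA setting.
MEDIA_EXTENSIONS = {
    "image": {".png", ".jpeg", ".jpg", ".gif", ".heic", ".heif", ".webp"},
    "document": {".pdf"},
    "audio": {".wav", ".m4a", ".mp3", ".aiff", ".aac", ".ogg", ".flac"},
}


def classify_media(path: str) -> Optional[str]:
    """Return 'image', 'document', or 'audio', or None for unsupported files."""
    suffix = Path(path).suffix.lower()  # compare case-insensitively
    for kind, extensions in MEDIA_EXTENSIONS.items():
        if suffix in extensions:
            return kind
    return None
```

For example, `classify_media("figs/plot.PNG")` falls in the image group, while a `.txt` file is not matched by any group and yields `None`.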
Clipboard Images
Paste images directly into the instruction area:
- Copy any image to clipboard
- Paste with Ctrl/Cmd+V
- Image is saved and referenced as [pasted_timestamp_hash.ext]
- Media Files list updates automatically
Pasted images are stored temporarily and cleaned up after 3 days.
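The three-day retention rule can be pictured with a small Python sketch. It assumes pasted files share the `pasted_` filename prefix shown above and that age is judged by file modification time; both are assumptions made for illustration, and `stale_pasted_images` is a hypothetical helper rather than part of TeXRA.

```python
import time
from pathlib import Path

RETENTION_SECONDS = 3 * 24 * 60 * 60  # three days, as stated above


def stale_pasted_images(folder, now=None):
    """List pasted_* files in `folder` last modified more than three days ago.

    Assumption: age is measured by modification time; TeXRA may track it differently.
    """
    now = time.time() if now is None else now
    return sorted(
        p
        for p in Path(folder).glob("pasted_*")
        if p.is_file() and now - p.stat().st_mtime > RETENTION_SECONDS
    )
```

A cleanup pass would then call `unlink()` on each returned path. If you need a pasted image for longer than three days, copy it into your project under a different name.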
Automatic Figure Extraction
Enable via the Auto-extract dropdown near the Media label:
- Figures: Extracts images from \includegraphics commands
- TikZ Figures: Extracts tikzpicture environments as .tikz files
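To give a sense of what the two extraction modes look for, here is a minimal Python sketch using regular expressions. It is a simplification written for this page, not TeXRA's extractor: it ignores commented-out lines, custom macros that wrap \includegraphics, and nested tikzpicture environments. The function names are hypothetical.

```python
import re

# Matches \includegraphics with an optional [options] group and captures the path.
INCLUDEGRAPHICS = re.compile(
    r"\\includegraphics\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}"
)
# Matches one whole tikzpicture environment, non-greedily, across lines.
TIKZPICTURE = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def find_graphics(tex):
    """Return the file paths referenced by \\includegraphics, in document order."""
    return [path.strip() for path in INCLUDEGRAPHICS.findall(tex)]


def find_tikz(tex):
    """Return the source text of each tikzpicture environment."""
    return TIKZPICTURE.findall(tex)
```

Each string returned by `find_tikz` corresponds to what would be written to a separate .tikz file; the paths from `find_graphics` still need to be resolved against the document's directory and graphics search path.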
Figure Extraction Tools
Tool-use agents can extract figures programmatically:
extract_figures
Scans for \includegraphics and returns referenced files (max 20 attachments):
{
  "name": "extract_figures",
  "arguments": { "texPath": "paper/main.tex" }
}
extract_tikz_figures
Extracts and optionally compiles TikZ environments (max 12 PDFs):
{
  "name": "extract_tikz_figures",
  "arguments": { "texPath": "paper/main.tex", "compile": true }
}
extract_bib_entries
Retrieves BibTeX records for citations.
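If you generate these calls from a script, the two payloads whose arguments are documented above can be assembled as in the Python sketch below. The `tool_call` helper is a hypothetical convenience; only the name/arguments shape and the texPath and compile arguments come from this page, and how the payload reaches the agent is not covered here.

```python
import json


def tool_call(name, **arguments):
    """Serialize a tool call in the name/arguments shape shown above."""
    return json.dumps({"name": name, "arguments": arguments}, indent=2)


# The two calls documented on this page.
print(tool_call("extract_figures", texPath="paper/main.tex"))
print(tool_call("extract_tikz_figures", texPath="paper/main.tex", compile=True))
```

Python's `True` is serialized as JSON `true`, matching the compile flag in the example above.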
Using Media Files
When you provide media files:
- Vision models (GPT-4o, Gemini): Images are encoded and included with the prompt
- Audio models: Audio files are uploaded for transcription
- Non-multimodal models: Filenames provide context
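The difference between the vision and non-multimodal cases can be sketched in Python as follows. The `prepare_media` function and the dictionary shape it returns are hypothetical and chosen only for illustration; real provider APIs each define their own message format, and audio upload is omitted.

```python
import base64
from pathlib import Path


def prepare_media(paths, supports_vision):
    """Build prompt parts: base64 payloads for vision models, filenames otherwise.

    The part dictionaries are an illustrative shape, not a provider's actual schema.
    """
    parts = []
    for path in map(Path, paths):
        if supports_vision:
            # Encode the raw bytes so they can travel inside a text-based request.
            data = base64.b64encode(path.read_bytes()).decode("ascii")
            parts.append({"type": "image", "name": path.name, "data": data})
        else:
            # Without vision support, only the filename is available as context.
            parts.append({"type": "text", "text": f"Attached file: {path.name}"})
    return parts
```

This is why descriptive filenames still help with non-multimodal models: the name is the only information about the figure that reaches the prompt.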
Common use cases:
- Write captions for images (polish agent)
- Verify text matches figures (correct agent)
- Generate text from images/PDFs (ocr agent)
- Transcribe audio (transcribe_audio agent)
For TikZ-specific workflows, see TikZ Figures.