Working with Figures
TeXRA lets AI agents analyse, reference, and generate figures inside your documents — from .png screenshots to embedded TikZ diagrams and PDFs.
CLI
This page covers the VS Code Media selector. From the texra CLI, figures referenced by your input documents are auto-extracted and attached for vision-capable models on workflow runs — or work with a figure conversationally in texra chat. (--context files are read as text, so they're for .tex and .bib sources, not images.)
Quick Task: Add a Figure Caption
- Select the polish agent from the agent dropdown.
- Pick a vision-capable model, e.g. gpt55, sonnet46T, or gemini31p.
- Select your figure in the Media section.
- Type the instruction: "Write a detailed caption for this figure."
- Click Execute.
In the terminal, the same five steps collapse to one command. Pick a vision-capable model, and the figures your document references are auto-extracted and attached for it:
- r0 — draft revision
- r1 — critique and revise
No media picker needed: on a vision model, round 0 extracts the figures referenced by draft.tex and sends them alongside the text.
The Media Section
The main TeXRA panel includes a Media section for figure files. Its header carries the inline Auto Extract dropdown plus a three-action toolbar, and each file is a row you can drag to reorder:
The Media group: the wand opens auto-extract options, the toolbar adds opened files / clears all / adds media files, and each row drags to reorder with a trailing trash icon.
- Auto Extract dropdown: configure automatic figure extraction.
- Add opened files: append every open editor tab whose extension is a configured media type.
- Clear all media files: empty the media list.
- Add media files: open a file picker to append figures.
- Drag and drop image, PDF, or audio files from anywhere onto the section.
Supported File Types
Configurable via texra.files.included.mediaExtensions:
- Images: encoded and attached for vision-capable models.
- PDFs: native multimodal on Anthropic / Gemini / OpenAI, otherwise rasterised.
- Audio: uploaded for transcription by audio-capable models.
Three media categories with their accepted extensions — images and audio carry several formats, while PDFs lean on native multimodal support where the provider offers it.
PDFs use native multimodal support when the provider offers it. Otherwise TeXRA converts them via GraphicsMagick / ImageMagick + Ghostscript; the status of those system dependencies is shown on Dashboard → LaTeX.
Clipboard Images
Paste images directly into the instruction area:
- Copy any image to the clipboard.
- Paste with Ctrl/Cmd+V.
- The image is saved and referenced as [pasted_timestamp_hash.ext].
- The Media Files list updates automatically.
Pasted images are stored temporarily and cleaned up after 3 days.
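As an illustration of the naming pattern and the 3-day retention window, here is a minimal Python sketch. It is not TeXRA's code: the Unix-seconds timestamp, the truncated SHA-256 digest, and the helper names `pasted_name` and `is_expired` are all assumptions made for the example.

```python
import hashlib

RETENTION_SECONDS = 3 * 24 * 60 * 60  # the 3-day window stated above


def pasted_name(data: bytes, ext: str, now: float) -> str:
    """Build a pasted_<timestamp>_<hash>.<ext> style name (format details assumed)."""
    digest = hashlib.sha256(data).hexdigest()[:8]  # assumed: short content hash
    return f"pasted_{int(now)}_{digest}.{ext}"


def is_expired(created: float, now: float) -> bool:
    """True once a pasted file is older than the retention window."""
    return now - created > RETENTION_SECONDS
```

Hashing the image bytes means the same image pasted twice differs only in its timestamp, which is one plausible reason to put both parts in the name.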
Automatic Figure Extraction
Enable via the Auto Extract dropdown near the Media label:
The lit wand button opens a checkbox menu — toggle Figures, TikZ Figures, and Compile Input PDF.
- Figures: extracts images from \includegraphics commands.
- TikZ Figures: extracts tikzpicture environments as .tikz files and compiles them.
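To make the Figures option concrete, the following Python sketch shows one way to collect the paths referenced by \includegraphics. It is a rough approximation under simple assumptions (one-line commands, no macro expansion, no \graphicspath lookup), not TeXRA's actual extractor; `find_graphics` is a hypothetical helper name.

```python
import re

# Matches \includegraphics[opts]{path}; the optional [..] argument is skipped.
INCLUDEGRAPHICS = re.compile(
    r"\\includegraphics\*?\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}"
)


def find_graphics(tex: str) -> list[str]:
    """Return graphics paths in document order, ignoring commented-out code."""
    paths = []
    for line in tex.splitlines():
        # Drop everything after an unescaped % (a LaTeX comment).
        code = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        paths.extend(p.strip() for p in INCLUDEGRAPHICS.findall(code))
    return paths


sample = r"""
\includegraphics[width=\linewidth]{figs/overview.png}
% \includegraphics{figs/old.png}
\includegraphics{figs/results}
"""
print(find_graphics(sample))
```

A path without an extension, such as figs/results above, would still need to be resolved against the extensions LaTeX tries before the file can be attached.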
Figure Extraction Tools
Tool-use agents can extract figures programmatically. These are part of the LaTeX Extraction built-in tool group — always available. In a run, an agent like research drives them one after another, attaching what it finds so a multimodal model can read it:
- extract_figures (paper/main.tex): found 9 \includegraphics, attached 9 files
- extract_bib_entries (paper/main.tex): returned BibTeX for every \cite key
- extract_tikz_figures (paper/main.tex, compile: true): latexmk renders 7 tikzpicture snippets, attaches 7 PDFs
The three LaTeX Extraction tools as they surface in the Progress view — each returns referenced files, BibTeX records, or compiled TikZ PDFs the model can read directly. The raw request form for each is below.
extract_figures
Scans for \includegraphics and returns referenced files (up to 20 attachments):
{
  "name": "extract_figures",
  "arguments": { "texPath": "paper/main.tex" }
}
extract_tikz_figures
Extracts and optionally compiles tikzpicture environments (up to 12 PDFs):
{
  "name": "extract_tikz_figures",
  "arguments": { "texPath": "paper/main.tex", "compile": true }
}
Setting compile: true runs latexmk/pdflatex on each snippet and attaches the resulting PDFs so multimodal models can read them directly.
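The extract-then-compile idea can be sketched in a few lines of Python. This is an illustrative approximation, not the tool's implementation: the `standalone` wrapper, the default preamble, and the helper names `extract_tikz` and `wrap_standalone` are assumptions, and the regex does not handle nested tikzpicture environments.

```python
import re

# Non-greedy match from each \begin{tikzpicture} to the next \end{tikzpicture}.
TIKZ_ENV = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def extract_tikz(tex: str, limit: int = 12) -> list[str]:
    """Return up to `limit` tikzpicture environments, in document order."""
    return TIKZ_ENV.findall(tex)[:limit]


def wrap_standalone(snippet: str, preamble: str = r"\usepackage{tikz}") -> str:
    """Wrap one snippet in a minimal document that can be compiled on its own."""
    return "\n".join([
        r"\documentclass[border=2pt]{standalone}",
        preamble,  # assumed: a real run would reuse the paper's own preamble
        r"\begin{document}",
        snippet,
        r"\end{document}",
    ])


doc = r"""
\begin{tikzpicture}\draw (0,0) -- (1,1);\end{tikzpicture}
Some text.
\begin{tikzpicture}\node {A};\end{tikzpicture}
"""
for i, snippet in enumerate(extract_tikz(doc)):
    print(f"figure {i}: {len(wrap_standalone(snippet))} characters of LaTeX")
```

Each wrapped snippet could then be written to its own .tex file and compiled, for example with `latexmk -pdf`, to produce one PDF per figure.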
extract_bib_entries
Retrieves BibTeX records for every citation key found in the document.
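A rough Python sketch of the underlying idea: collect the citation keys from the .tex source, then pull the matching records out of a .bib file. The helper names (`cite_keys`, `bib_entries`) and the parsing strategy are assumptions for illustration; the tool's own behaviour may differ, for instance in how it locates the bibliography file.

```python
import re

# \cite, \citep, \textcite, ... with any number of optional [..] arguments.
CITE = re.compile(r"\\[a-zA-Z]*cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]+)\}")
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def cite_keys(tex: str) -> list[str]:
    """Unique citation keys in order of first appearance."""
    seen: dict[str, None] = {}
    for group in CITE.findall(tex):
        for key in group.split(","):
            key = key.strip()
            if key and key != "*":  # skip the \nocite{*} wildcard
                seen.setdefault(key, None)
    return list(seen)


def bib_entries(bib: str) -> dict[str, str]:
    """Map each BibTeX key to its full record, found by balancing braces."""
    entries = {}
    for m in ENTRY_START.finditer(bib):
        depth = 0
        for j in range(bib.index("{", m.start()), len(bib)):
            if bib[j] == "{":
                depth += 1
            elif bib[j] == "}":
                depth -= 1
                if depth == 0:
                    entries[m.group(2)] = bib[m.start():j + 1]
                    break
    return entries


tex = r"As shown in \cite{knuth84, lamport94} and \citep[p.~3]{knuth84}."
bib = r"""
@book{knuth84,
  title = {The {\TeX}book},
  year = {1984}
}
@article{other99,
  title = {Unrelated}
}
"""
records = bib_entries(bib)
cited = [records[k] for k in cite_keys(tex) if k in records]
print(cite_keys(tex), len(cited))
```

Keys that are cited but missing from the .bib file (lamport94 here) simply produce no record in this sketch; a real tool would probably report them.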
Using Media Files with Models
When you provide media files, they're handed to the model according to its capabilities:
- Vision models (GPT-5.5, Claude Opus 4.8 / Sonnet 4.6, Gemini 3.1 Pro, …): images are encoded and attached to the prompt.
- Audio models: audio files are uploaded for transcription.
- Non-multimodal models: only filenames are passed as context.
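The routing above can be pictured as a small dispatch on file type and model capability. The Python sketch below is a simplified illustration only: the extension sets, the `Capabilities` class, and the returned dictionary shape are hypothetical and do not reflect TeXRA's internal API or any provider's request format.

```python
import base64
from dataclasses import dataclass
from pathlib import PurePath

IMAGE_EXTS = {".png", ".jpg", ".jpeg"}  # assumed subset; the real list is configurable
AUDIO_EXTS = {".mp3", ".wav"}           # assumed subset


@dataclass
class Capabilities:
    vision: bool = False
    audio: bool = False


def prepare_media(name: str, data: bytes, caps: Capabilities) -> dict:
    """Decide how one media file is handed to a model with the given capabilities."""
    ext = PurePath(name).suffix.lower()
    if ext in IMAGE_EXTS and caps.vision:
        # Vision model: encode the image so it can travel with the prompt.
        return {"kind": "image", "name": name,
                "base64": base64.b64encode(data).decode("ascii")}
    if ext in AUDIO_EXTS and caps.audio:
        # Audio model: pass the raw bytes along for transcription.
        return {"kind": "audio", "name": name, "bytes": data}
    # Fallback: the model only learns that the file exists.
    return {"kind": "filename", "name": name}


print(prepare_media("fig1.png", b"\x89PNG", Capabilities(vision=True))["kind"])
print(prepare_media("fig1.png", b"\x89PNG", Capabilities())["kind"])
```

The fallback branch mirrors the last bullet: a text-only model receives the filename as context and nothing else.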
Common use cases:
- Write captions for images (polish agent)
- Verify text matches figures (correct agent)
- Generate text from images/PDFs (ocr agent)
- Transcribe audio (transcribe_audio agent)
For TikZ-specific workflows, see TikZ Figures.