
Working with Figures

TeXRA lets AI agents analyse, reference, and generate figures inside your documents — from .png screenshots to embedded TikZ diagrams and PDFs.

CLI

This page covers the VS Code Media selector. From the texra CLI, figures referenced by your input documents are auto-extracted and attached for vision-capable models on workflow runs. To work with a figure conversationally, use texra chat. (--context files are read as text, so they suit .tex and .bib sources, not images.)

Quick Task: Add a Figure Caption

  1. Select the polish agent from the agent dropdown.
  2. Pick a vision-capable model, e.g. gpt55, sonnet46T, gemini31p.
  3. Select your figure in the Media section.
  4. Type the instruction: "Write a detailed caption for this figure."
  5. Click Execute.

In the terminal, the same five steps collapse to one command. Pick a vision-capable model, and the figures your document references are auto-extracted and attached for it:

texra run
$ texra run polish --input draft.tex --model sonnet46T --instruction "Write a detailed caption for the pipeline figure."
  • r0 — draft revision
  • r1 — critique and revise
.texra/runs/b81d4f29e6a3/r1/draft.tex

No media picker needed: on a vision model, round 0 extracts the figures referenced by draft.tex and sends them alongside the text.
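For batch work, the same command can be assembled from a script. The sketch below is one way to do that in Python. It uses only the flags shown in the example above (--input, --model, --instruction); the helper name texra_run_command is hypothetical and not part of TeXRA.

```python
import shlex


def texra_run_command(agent, input_path, model, instruction):
    """Build the argv list for `texra run`, using only the flags shown above."""
    return [
        "texra", "run", agent,
        "--input", input_path,
        "--model", model,
        "--instruction", instruction,
    ]


command = texra_run_command(
    "polish",
    "draft.tex",
    "sonnet46T",
    "Write a detailed caption for the pipeline figure.",
)
# shlex.join quotes the instruction so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list straight to subprocess.run(command, check=True) avoids shell quoting altogether.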

The Media Section

The main TeXRA panel includes a Media section for figure files. Its header carries the inline Auto Extract dropdown (a wand icon) plus a three-action toolbar, and each file (for example figure1.pdf, plot.png, or schematic.tikz) is a row you can drag to reorder, with a trailing trash icon.

  • Auto Extract dropdown: configure automatic figure extraction.
  • Add opened files: append every open editor tab whose extension is a configured media type.
  • Clear all media files: empty the media list.
  • Add media files: open a file picker to append figures.
  • Drag and drop: drop image, PDF, or audio files from anywhere onto the section.

Supported File Types

Configurable via texra.files.included.mediaExtensions:

  • Images (.png, .jpeg, .jpg, .gif, .heic, .heif, .webp): encoded and attached for vision-capable models.
  • Documents (.pdf): native multimodal on Anthropic / Gemini / OpenAI, otherwise rasterised.
  • Audio, experimental (.wav, .m4a, .mp3, .aiff, .aac, .ogg, .flac): uploaded for transcription by audio-capable models.

PDFs use native multimodal support when the provider offers it. Otherwise TeXRA converts them via GraphicsMagick / ImageMagick + Ghostscript. The status of those system dependencies is shown on Dashboard → LaTeX.
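To make the grouping concrete, here is a minimal sketch that maps a file name to one of the three categories above. The extension sets are copied from this page; the dictionary and the function name media_category are illustrative assumptions, not TeXRA internals, and a customised texra.files.included.mediaExtensions setting would change the lists.

```python
from pathlib import Path
from typing import Optional

# Extension lists copied from the categories on this page.
# The names below are hypothetical and exist only for this example.
MEDIA_CATEGORIES = {
    "image": {".png", ".jpeg", ".jpg", ".gif", ".heic", ".heif", ".webp"},
    "document": {".pdf"},
    "audio": {".wav", ".m4a", ".mp3", ".aiff", ".aac", ".ogg", ".flac"},
}


def media_category(path: str) -> Optional[str]:
    """Return 'image', 'document', or 'audio', or None if the extension is not listed."""
    suffix = Path(path).suffix.lower()  # compare case-insensitively
    for category, extensions in MEDIA_CATEGORIES.items():
        if suffix in extensions:
            return category
    return None
```

A call such as media_category("plot.PNG") returns "image", while a file with an unlisted extension returns None.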

Clipboard Images

Paste images directly into the instruction area:

  1. Copy any image to the clipboard.
  2. Paste with Ctrl/Cmd+V.
  3. The image is saved and referenced as [pasted_timestamp_hash.ext].
  4. The Media Files list updates automatically.

Pasted images are stored temporarily and cleaned up after 3 days.
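The three-day retention rule amounts to an age check on the stored files. The sketch below shows one way such a check could look. It is an assumption, not TeXRA's implementation: the storage directory is a parameter because this page does not say where pasted images are kept, and stale_pasted_images is a hypothetical helper.

```python
import time
from pathlib import Path


def stale_pasted_images(directory, max_age_days=3.0, now=None):
    """List pasted_* files whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60  # age limit in seconds
    return sorted(
        path
        for path in Path(directory).glob("pasted_*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

The function only reports candidates; deleting them (for example with Path.unlink) is left to the caller.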

Automatic Figure Extraction

Enable it from the Auto Extract dropdown (the wand button) near the Media label. The button opens a checkbox menu with three toggles: Figures, TikZ Figures, and Compile Input PDF.

  • Figures: extracts images from \includegraphics commands.
  • TikZ Figures: extracts tikzpicture environments as .tikz files and compiles them.

Figure Extraction Tools

Tool-use agents can extract figures programmatically through the built-in LaTeX Extraction tool group, which is always available. In a run, an agent such as research calls these tools one after another and attaches what it finds so a multimodal model can read it:

research: gathering a paper's figures · LaTeX Extraction · this run
  • extract_figures, paper/main.tex: found 9 \includegraphics, attached 9 files
  • extract_bib_entries, paper/main.tex: returned BibTeX for every \cite key
  • extract_tikz_figures, paper/main.tex: compile: true → latexmk renders 7 tikzpicture snippets, attaches 7 PDFs

The three LaTeX Extraction tools as they surface in the Progress view. Each returns referenced files, BibTeX records, or compiled TikZ PDFs that the model can read directly. Raw request forms for extract_figures and extract_tikz_figures follow.

extract_figures

Scans for \includegraphics and returns referenced files (up to 20 attachments):

json
{
  "name": "extract_figures",
  "arguments": { "texPath": "paper/main.tex" }
}
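To illustrate what a scan for \includegraphics involves, here is a rough Python approximation. It is not the tool's actual implementation: the regular expression is a simplification that does not skip commented-out lines or handle nested braces, and find_included_graphics is a hypothetical name. The default limit of 20 mirrors the attachment cap stated above.

```python
import re

# Matches \includegraphics, an optional star, optional [options], then {path}.
INCLUDEGRAPHICS = re.compile(
    r"\\includegraphics\*?\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}"
)


def find_included_graphics(tex_source, limit=20):
    """Return unique \\includegraphics paths in document order, capped at `limit`."""
    paths = []
    for match in INCLUDEGRAPHICS.finditer(tex_source):
        path = match.group(1).strip()
        if path not in paths:  # keep the first occurrence only
            paths.append(path)
        if len(paths) == limit:
            break
    return paths
```

Resolving each path against the document's directory and any \graphicspath setting would be a further step that this sketch leaves out.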

extract_tikz_figures

Extracts and optionally compiles tikzpicture environments (up to 12 PDFs):

json
{
  "name": "extract_tikz_figures",
  "arguments": { "texPath": "paper/main.tex", "compile": true }
}

Setting compile: true runs latexmk/pdflatex on each snippet and attaches the resulting PDFs so multimodal models can read them directly.
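The extract-then-compile flow can be sketched in two small steps: pull out each tikzpicture environment, then wrap it in a document that can be compiled on its own. This is a hedged illustration rather than TeXRA's code. The function names are hypothetical, the regular expression does not handle nested tikzpicture environments, and using the standalone class is an assumption about how a snippet might be compiled in isolation.

```python
import re

# Non-greedy match from \begin{tikzpicture} to the nearest \end{tikzpicture}.
TIKZPICTURE = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def extract_tikz_snippets(tex_source, limit=12):
    """Return up to `limit` tikzpicture environments, in document order."""
    return [m.group(0) for m in TIKZPICTURE.finditer(tex_source)][:limit]


def standalone_document(snippet, preamble=""):
    """Wrap one snippet so it can be compiled by itself (assumed approach)."""
    lines = [r"\documentclass[tikz]{standalone}"]
    if preamble:
        lines.append(preamble)  # e.g. \usetikzlibrary{...} lines from the paper
    lines += [r"\begin{document}", snippet, r"\end{document}"]
    return "\n".join(lines) + "\n"
```

Each wrapped snippet could then be written to its own .tex file and compiled with latexmk -pdf to produce one PDF per figure.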

extract_bib_entries

Retrieves BibTeX records for every citation key found in the document.
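The first half of that job, collecting citation keys, can be approximated with a regular expression. The sketch below is an assumption about the general approach, not the tool's implementation. It recognises common \cite variants (such as \citep or \parencite), skips the \nocite{*} wildcard, and uses the hypothetical name find_citation_keys. Looking the keys up in a .bib file is not shown.

```python
import re

# Any command containing "cite", an optional star, optional [..] arguments,
# then a brace group holding one or more comma-separated keys.
CITE_COMMAND = re.compile(
    r"\\[A-Za-z]*cite[A-Za-z]*\*?\s*(?:\[[^\]]*\]\s*)*\{([^}]*)\}"
)


def find_citation_keys(tex_source):
    """Return unique citation keys in order of first appearance."""
    keys = []
    for match in CITE_COMMAND.finditer(tex_source):
        for key in match.group(1).split(","):
            key = key.strip()
            if key and key != "*" and key not in keys:
                keys.append(key)
    return keys
```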

Using Media Files with Models

When you provide media files, they're handed to the model according to its capabilities:

  • Vision models (GPT-5.5, Claude Opus 4.8 / Sonnet 4.6, Gemini 3.1 Pro, …): images are encoded and attached to the prompt.
  • Audio models: audio files are uploaded for transcription.
  • Non-multimodal models: only filenames are passed as context.
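The capability rules above can be read as a small dispatch table. The following sketch encodes them under stated assumptions: ModelCapabilities and plan_media are hypothetical names, the extension sets come from the Supported File Types section, and two details are assumed rather than documented here: PDFs are grouped with vision input, and a file whose kind the model cannot consume falls back to filename-only.

```python
from dataclasses import dataclass
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpeg", ".jpg", ".gif", ".heic", ".heif", ".webp"}
AUDIO_EXTENSIONS = {".wav", ".m4a", ".mp3", ".aiff", ".aac", ".ogg", ".flac"}


@dataclass(frozen=True)
class ModelCapabilities:
    """Hypothetical capability flags for a model."""
    vision: bool = False
    audio: bool = False


def plan_media(files, caps):
    """Map each file name to how it would be handed to the model."""
    plan = {}
    for name in files:
        suffix = Path(name).suffix.lower()
        if caps.vision and suffix in IMAGE_EXTENSIONS:
            plan[name] = "attach encoded image"
        elif caps.vision and suffix == ".pdf":
            # Assumption: native PDF input where offered, otherwise rasterised.
            plan[name] = "attach PDF (native or rasterised)"
        elif caps.audio and suffix in AUDIO_EXTENSIONS:
            plan[name] = "upload for transcription"
        else:
            plan[name] = "pass filename only"
    return plan
```

With ModelCapabilities(vision=True), plot.png is attached while talk.mp3 is passed by name only. With the default flags, every file is passed by name only.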

Common use cases:

  • Write captions for images (polish agent)
  • Verify text matches figures (correct agent)
  • Generate text from images/PDFs (ocr agent)
  • Transcribe audio (transcribe_audio agent)

For TikZ-specific workflows, see TikZ Figures.