Skip to content

Workflow Agents: How They Work

Every time you click "Execute" in TeXRA, an agent takes your files and instructions, asks an AI model to do the work, and delivers the result. This page explains what happens under the hood—enough to understand the system, customize it, and troubleshoot when things go sideways.

When to use workflow mode

Workflow agents are built for deep, single-shot thinking—things like rewriting a whole section, deriving or checking equations step-by-step, converting a paper to slides, or merging edits. They plan in a <scratchpad>, produce a full XML-wrapped output, and optionally reflect on it for another round, so runs with frontier reasoning models can take 10–30 minutes to finish.

If you want a snappier turnaround (e.g. quick polishes, small corrections), pick a smaller or faster model in the model dropdown—output quality drops somewhat, but wall-clock time drops a lot. For short, conversational edits or read-only questions, use a tool-use agent (assistant, research, review) instead: those stream back in seconds and don't go through the full workflow pipeline.

The settings.agentCategory key decides which of these two modes an agent runs in:

workflow
Best for
Deep single-shot work: rewriting a section, deriving equations, paper → slides.
Output
XML-wrapped <document> + <scratchpad> reasoning, saved as versioned files.
Latency
10–30 min with frontier reasoning models.
Reflection
Automatic critique rounds (Round 1+).
Examples
polishcorrectpaper2slide
toolUse
Best for
Quick edits and read-only questions you iterate on conversationally.
Output
Streamed chat replies interleaved with tool calls.
Latency
Seconds — replies stream back as they generate.
Reflection
None — you steer it turn by turn.
Examples
chatresearchreview

Workflow agents reason once and write a versioned, diffable file; tool-use agents converse and call tools turn by turn—it is the first thing to pick for any task. The split maps one-to-one onto the CLI's two entry points: texra run polish … for workflow agents, texra chat --agent research for tool-use agents.

Agent Definition Files (.yaml)

Each agent is defined in a simple .yaml file that tells TeXRA what to say to the AI model and how to handle the response. You can browse and manage these files from the Agents tab in the TeXRA Dashboard, or create your own (see Custom Agents).

Understanding the YAML Structure

These .yaml files have two main parts (and thankfully, YAML is usually less prickly than XML or JSON):

polish.yamlagent definition
settings:how to run
agentCategory: workflowworkflow vs toolUse
prefills: <scratchpad>starts every response
documentTag: documentXML output wrapper
prompts:what to say to the LLM
systemPrompt: |the LLM's role
You are an expert LaTeX editor…
userPrefix: |your files + instruction
{{ INPUT_CONTENT }} · {{ INSTRUCTION }}
userRequest:array → reflection rounds
- Round 0 — write the revisionround 0
- Round 1 — critique & improvereflection

A settings block defines how the agent runs; a prompts block holds the templates—a userRequest array drives Round 0 plus reflection rounds.

  1. settings: Define general operational parameters. For example:
    • agentCategory: Is it a workflow agent (structured Chain-of-Thought reasoning with XML-wrapped output) or a toolUse agent (interactive conversation that can call tools like file editing, web search, etc.)?
    • prefills: Text the agent should automatically start its response with (e.g., <scratchpad>).
    • (Other settings control output format, inheritance, etc. See Configuration and Custom Agents for full details).
  2. prompts: Contain text templates that TeXRA fills with your specific context (input files, instructions) to guide the LLM at different stages:
    • systemPrompt: Sets the overall role and high-level instructions for the LLM.
    • userPrefix: Provides the main context, including your input file(s) (available via e.g., {{ INPUT_CONTENT }}) and the specific instruction you typed in the UI (available via {{ INSTRUCTION }}).
    • userRequest: Asks the LLM to perform the initial task (Round 0). Often instructs the LLM to think within <scratchpad> tags and then output the main content wrapped within the XML tags defined by settings.documentTag (e.g., <document>...</document>). You can also provide an array here: the first entry becomes the round 0 request, and any additional entries drive automatic reflection rounds (Round 1+). When a run consumes more rounds than entries you specify, the first reflection template is reused.

(Prompts use Jinja2 templating. For a detailed list of available variables like {{ INPUT_CONTENT }} and how to use them, see the Custom Agents guide.)

Transparency & Customization

The prompts described above (systemPrompt, userPrefix, etc.) represent TeXRA's structured approach to guiding the LLM. This structured, template-based system means the agent's behavior is transparent and highly customizable through the .yaml file, not a hidden black box.

Basic Execution Flow

When you click "Execute" in the TeXRA UI, TeXRA uses the selected agent's definition (.yaml) and your UI inputs to interact with the chosen LLM:

Key Stages:

  1. Initialization: TeXRA loads the agent definition and reads the files you selected.
  2. Prompt Construction: It combines the agent's systemPrompt, userPrefix (filled with your files and instruction), and userRequest templates into a full prompt for the LLM.
  3. LLM Interaction (Round 0): TeXRA sends the prompt to the selected LLM API. The LLM generates a response, typically including reasoning (<scratchpad>) and the final answer wrapped in XML tags (e.g., <document>...</document>).
  4. Processing: TeXRA saves the raw LLM response (often as an .xml file internally, e.g., r{round}/output.xml). It then parses this file, extracts the content from the primary XML tag (defined by settings.documentTag), and saves that extracted content to the final output file in task storage (e.g., r{round}/output.tex, so Round 0 is r0/output.tex, the first reflection is r1/output.tex, and so on). You can monitor this in the ProgressBoard. For LaTeX files, TeXRA can also automatically generate a latexdiff file comparing the output to the input, enhancing observability. See the LaTeX Diff guide for details.

Clicking Execute is not the only way in — the same load-definition → prompt → rounds → save-to-run-storage pipeline runs headlessly from the terminal:

texra run
$texra run polish --input paper.tex --output paper.polished.tex
  • r0 — draft revision
  • r1 — reflection pass
paper.polished.tex
Same agent definition, same rounds, same run storage — no UI attached.

Each round lands in its own folder under task storage:

task storageone folder per round
r0/Round 0 — draft
  • output.xmlraw LLM response
  • output.texextracted output extract <document>…</document>
  • output.diff.pdflatexdiff vs input
r1/reflection
  • output.xml
  • output.tex
  • output.diff.pdf
r2/reflection
  • output.xml
  • output.tex
  • output.diff.pdf

Every round saves the raw output.xml, the extracted output.tex, and an optional latexdiff PDF—r0/ is the draft; r1/ and later are reflection passes.

Continuation Handling: If the LLM response gets cut off due to output token limits before generating the required endTag, TeXRA automatically sends a continuation prompt. This prompt asks the model to resume generating exactly where it left off, ensuring complete outputs even for very long tasks. This happens seamlessly within a processing round.

What Goes Into the Prompt

TeXRA assembles a conversation from your agent's prompts and the content you selected in the UI. If you enabled Attach TeX Count or Attach Diagnostics, that information is included too. Figures and audio files are sent alongside the text for models that support them. The AI then reasons through the task and produces its output.

Reflection Rounds (Round 1+):

When an agent definition includes multiple userRequest entries (or increases settings.rounds), TeXRA automatically performs additional passes after Round 0 completes:

  1. Reflection Prompt: It renders the appropriate reflection template from subsequent userRequest entries to ask the LLM to critique and improve its own Round 0 output (which is included in the conversation history).
  2. LLM Interaction (Round 1): The LLM generates a revised response.
  3. Processing: TeXRA saves this refined output to a separate round path (r{round}/output.ext, e.g., r1/output.ext for the first reflection, r2/output.ext for the next).

You can control how many rounds execute by editing the agent YAML—either adjust settings.rounds for the maximum number of passes or add more entries to userRequest. The run stops early whenever the model signals it is finished or when no reflection prompt content is supplied.

This basic flow, potentially with the reflection rounds, allows TeXRA agents to perform targeted tasks based on their specific definitions and your instructions. For concrete examples of built-in agents, see the Built-in Agent Reference.

Potential XML Issues

Occasionally, LLMs might generate slightly malformed XML (e.g., missing closing tags), especially with very long or complex outputs. If TeXRA fails to extract content from an agent's raw XML output (any round's r{round}/output.xml, e.g., r0/output.xml), you might need to manually inspect the .xml file and correct any structural errors (like adding a missing </document> tag) before TeXRA can process it correctly. See the Troubleshooting guide for more details.

Reflection

After generating an initial output (Round 0), TeXRA agents that define reflection prompts evaluate and refine their work (Round 1):

View full example
Red strikethrough: Round 0 content revised in Round 1
Blue underlined: New/improved content added in Round 1