Workflow Agents: How They Work
Every time you click "Execute" in TeXRA, an agent takes your files and instructions, asks an AI model to do the work, and delivers the result. This page explains what happens under the hood—enough to understand the system, customize it, and troubleshoot when things go sideways.
When to use workflow mode
Workflow agents are built for deep, single-shot thinking—things like rewriting a whole section, deriving or checking equations step-by-step, converting a paper to slides, or merging edits. They plan in a <scratchpad>, produce a full XML-wrapped output, and optionally reflect on it for another round, so runs with frontier reasoning models can take 10–30 minutes to finish.
If you want a snappier turnaround (e.g. quick polishes, small corrections), pick a smaller or faster model in the model dropdown—output quality drops somewhat, but wall-clock time drops a lot. For short, conversational edits or read-only questions, use a tool-use agent (assistant, research, review) instead: those stream back in seconds and don't go through the full workflow pipeline.
The settings.agentCategory key decides which of these two modes an agent runs in:
- Best for
- Deep single-shot work: rewriting a section, deriving equations, paper → slides.
- Output
- XML-wrapped <document> + <scratchpad> reasoning, saved as versioned files.
- Latency
- 10–30 min with frontier reasoning models.
- Reflection
- Automatic critique rounds (Round 1+).
- Examples
- polishcorrectpaper2slide
- Best for
- Quick edits and read-only questions you iterate on conversationally.
- Output
- Streamed chat replies interleaved with tool calls.
- Latency
- Seconds — replies stream back as they generate.
- Reflection
- None — you steer it turn by turn.
- Examples
- chatresearchreview
Workflow agents reason once and write a versioned, diffable file; tool-use agents converse and call tools turn by turn—it is the first thing to pick for any task. The split maps one-to-one onto the CLI's two entry points: texra run polish … for workflow agents, texra chat --agent research for tool-use agents.
Agent Definition Files (.yaml)
Each agent is defined in a simple .yaml file that tells TeXRA what to say to the AI model and how to handle the response. You can browse and manage these files from the Agents tab in the TeXRA Dashboard, or create your own (see Custom Agents).
Understanding the YAML Structure
These .yaml files have two main parts (and thankfully, YAML is usually less prickly than XML or JSON):
settings:how to runagentCategory: workflowworkflow vs toolUseprefills: <scratchpad>starts every responsedocumentTag: documentXML output wrapperprompts:what to say to the LLMsystemPrompt: |the LLM's roleYou are an expert LaTeX editor…userPrefix: |your files + instruction{{ INPUT_CONTENT }} · {{ INSTRUCTION }}userRequest:array → reflection rounds- Round 0 — write the revisionround 0- Round 1 — critique & improvereflectionA settings block defines how the agent runs; a prompts block holds the templates—a userRequest array drives Round 0 plus reflection rounds.
settings: Define general operational parameters. For example:agentCategory: Is it aworkflowagent (structured Chain-of-Thought reasoning with XML-wrapped output) or atoolUseagent (interactive conversation that can call tools like file editing, web search, etc.)?prefills: Text the agent should automatically start its response with (e.g.,<scratchpad>).- (Other settings control output format, inheritance, etc. See Configuration and Custom Agents for full details).
prompts: Contain text templates that TeXRA fills with your specific context (input files, instructions) to guide the LLM at different stages:systemPrompt: Sets the overall role and high-level instructions for the LLM.userPrefix: Provides the main context, including your input file(s) (available via e.g.,{{ INPUT_CONTENT }}) and the specific instruction you typed in the UI (available via{{ INSTRUCTION }}).userRequest: Asks the LLM to perform the initial task (Round 0). Often instructs the LLM to think within<scratchpad>tags and then output the main content wrapped within the XML tags defined bysettings.documentTag(e.g.,<document>...</document>). You can also provide an array here: the first entry becomes the round 0 request, and any additional entries drive automatic reflection rounds (Round 1+). When a run consumes more rounds than entries you specify, the first reflection template is reused.
(Prompts use Jinja2 templating. For a detailed list of available variables like {{ INPUT_CONTENT }} and how to use them, see the Custom Agents guide.)
Transparency & Customization
The prompts described above (systemPrompt, userPrefix, etc.) represent TeXRA's structured approach to guiding the LLM. This structured, template-based system means the agent's behavior is transparent and highly customizable through the .yaml file, not a hidden black box.
Basic Execution Flow
When you click "Execute" in the TeXRA UI, TeXRA uses the selected agent's definition (.yaml) and your UI inputs to interact with the chosen LLM:
Key Stages:
- Initialization: TeXRA loads the agent definition and reads the files you selected.
- Prompt Construction: It combines the agent's
systemPrompt,userPrefix(filled with your files and instruction), anduserRequesttemplates into a full prompt for the LLM. - LLM Interaction (Round 0): TeXRA sends the prompt to the selected LLM API. The LLM generates a response, typically including reasoning (
<scratchpad>) and the final answer wrapped in XML tags (e.g.,<document>...</document>). - Processing: TeXRA saves the raw LLM response (often as an
.xmlfile internally, e.g.,r{round}/output.xml). It then parses this file, extracts the content from the primary XML tag (defined bysettings.documentTag), and saves that extracted content to the final output file in task storage (e.g.,r{round}/output.tex, so Round 0 isr0/output.tex, the first reflection isr1/output.tex, and so on). You can monitor this in the ProgressBoard. For LaTeX files, TeXRA can also automatically generate alatexdifffile comparing the output to the input, enhancing observability. See the LaTeX Diff guide for details.
Clicking Execute is not the only way in — the same load-definition → prompt → rounds → save-to-run-storage pipeline runs headlessly from the terminal:
- r0 — draft revision
- r1 — reflection pass
Each round lands in its own folder under task storage:
r0/Round 0 — draftoutput.xmlraw LLM responseoutput.texextracted output extract<document>…</document>output.diff.pdflatexdiff vs input
r1/reflectionoutput.xmloutput.texoutput.diff.pdf
r2/reflectionoutput.xmloutput.texoutput.diff.pdf
Every round saves the raw output.xml, the extracted output.tex, and an optional latexdiff PDF—r0/ is the draft; r1/ and later are reflection passes.
Continuation Handling: If the LLM response gets cut off due to output token limits before generating the required endTag, TeXRA automatically sends a continuation prompt. This prompt asks the model to resume generating exactly where it left off, ensuring complete outputs even for very long tasks. This happens seamlessly within a processing round.
What Goes Into the Prompt
TeXRA assembles a conversation from your agent's prompts and the content you selected in the UI. If you enabled Attach TeX Count or Attach Diagnostics, that information is included too. Figures and audio files are sent alongside the text for models that support them. The AI then reasons through the task and produces its output.
Reflection Rounds (Round 1+):
When an agent definition includes multiple userRequest entries (or increases settings.rounds), TeXRA automatically performs additional passes after Round 0 completes:
- Reflection Prompt: It renders the appropriate reflection template from subsequent
userRequestentries to ask the LLM to critique and improve its own Round 0 output (which is included in the conversation history). - LLM Interaction (Round 1): The LLM generates a revised response.
- Processing: TeXRA saves this refined output to a separate round path (
r{round}/output.ext, e.g.,r1/output.extfor the first reflection,r2/output.extfor the next).
You can control how many rounds execute by editing the agent YAML—either adjust settings.rounds for the maximum number of passes or add more entries to userRequest. The run stops early whenever the model signals it is finished or when no reflection prompt content is supplied.
This basic flow, potentially with the reflection rounds, allows TeXRA agents to perform targeted tasks based on their specific definitions and your instructions. For concrete examples of built-in agents, see the Built-in Agent Reference.
Potential XML Issues
Occasionally, LLMs might generate slightly malformed XML (e.g., missing closing tags), especially with very long or complex outputs. If TeXRA fails to extract content from an agent's raw XML output (any round's r{round}/output.xml, e.g., r0/output.xml), you might need to manually inspect the .xml file and correct any structural errors (like adding a missing </document> tag) before TeXRA can process it correctly. See the Troubleshooting guide for more details.
Reflection
After generating an initial output (Round 0), TeXRA agents that define reflection prompts evaluate and refine their work (Round 1):