Task Decomposition and Planning Summary

This summary captures how Zyra’s orchestrator should decompose user requests into structured, executable stage plans. It describes a hybrid architecture combining LLM-based reasoning with deterministic orchestration logic, Guardrails validation, and provenance tracking.

Overview

Zyra’s orchestrator shouldn’t be a “black box” that hides logic in LLM prompts. Instead, it should operate as a planning and coordination layer where the LLM interprets user intent and structured Python code builds, validates, and executes a dependency graph.

Key Insight

The orchestrator should be a planner, not a magician.
The LLM reasons about what to do, while the orchestration layer determines how and when to do it.


Architecture Breakdown

| Layer | Responsibility | Implementation |
| --- | --- | --- |
| LLM System Prompt | Interpret natural-language user intent; identify relevant Zyra stages; suggest clarifications. | Structured reasoning via JSON output. |
| Planner (Python) | Build the dependency graph based on stage relationships. | Deterministic DAG creation. |
| Validator (Guardrails / Pydantic) | Validate LLM-generated plans; ensure supported stages and proper inputs/outputs. | Schema enforcement and safety checks (sketched below). |
| Executor (Python) | Dispatch stage agents; manage retries and concurrency. | Code-based orchestration logic. |
| Provenance & Memory | Persist state, track user clarifications, and record reproducibility metadata. | Provenance database + session memory. |
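
A minimal sketch of the Validator layer, assuming Pydantic v2. The model and field names (StagePlan, PlanStep) and the supported-stage set are illustrative placeholders, not Zyra's actual schema:

from pydantic import BaseModel, ValidationError, field_validator

SUPPORTED_STAGES = {
    "import", "process", "simulate", "decide",
    "visualize", "narrate", "verify", "export",
}

class PlanStep(BaseModel):
    stage: str
    args: dict = {}

    @field_validator("stage")
    @classmethod
    def stage_must_be_supported(cls, value: str) -> str:
        # Reject any stage the capabilities registry does not know about.
        if value not in SUPPORTED_STAGES:
            raise ValueError(f"unsupported stage: {value!r}")
        return value

class StagePlan(BaseModel):
    steps: list[PlanStep]

# An LLM-proposed plan containing an unsupported stage is rejected before execution.
llm_output = {"steps": [{"stage": "import"}, {"stage": "teleport"}]}
try:
    StagePlan.model_validate(llm_output)
except ValidationError as exc:
    print(f"Plan rejected: {exc}")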


Task Decomposition Flow

  1. User Request → Orchestrator (LLM Reasoning Layer)
    The LLM parses user intent, identifies verbs and objects, and maps them to Zyra stages.

  2. Capability Validation (Python)
    The orchestrator verifies that requested actions exist in the capabilities registry.

  3. DAG Construction (Planner)
    Build a dependency graph based on supported stages and Zyra’s canonical order (see the sketch after this list):
    import → process → simulate → decide → visualize → narrate → verify → export

  4. Interactive Clarification Loop
    If information is missing (e.g., FTP path, dataset ID), the orchestrator pauses, prompts the user, validates input with Guardrails, and resumes execution.

  5. Execution & Provenance Logging
    Each stage logs its artifacts, metadata, and hash signatures to the Provenance Store.
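
A minimal sketch of deterministic DAG construction for step 3, assuming a simple linear dependency chain along the canonical order and a plain adjacency-map representation. build_dag and CANONICAL_ORDER are illustrative names, not Zyra internals:

CANONICAL_ORDER = [
    "import", "process", "simulate", "decide",
    "visualize", "narrate", "verify", "export",
]

def build_dag(stages: list[str]) -> dict[str, list[str]]:
    """Map each requested stage to its upstream dependencies, in canonical order."""
    ordered = sorted(set(stages), key=CANONICAL_ORDER.index)
    dag: dict[str, list[str]] = {}
    previous = None
    for stage in ordered:
        dag[stage] = [previous] if previous else []
        previous = stage
    return dag

print(build_dag(["visualize", "import", "process", "export"]))
# {'import': [], 'process': ['import'], 'visualize': ['process'], 'export': ['visualize']}

A richer planner could branch (for example, verify and narrate both downstream of process), but a linear chain is enough to show the deterministic construction.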


Real-Time DAG Modification

When a stage fails or new information becomes available, the orchestrator can modify the active DAG:

  • Recoverable Error: Retry or skip a failed node.

  • Missing Input: Insert a temporary clarification node to gather user input before resuming.

  • Validation Failure: Replace or re-run a stage using safer parameters.

Example modification:

{
  "dag_modification": {
    "action": "insert_node",
    "new_node": {
      "name": "clarify_input_source",
      "type": "user_interaction",
      "upstream": ["import"],
      "downstream": ["process"]
    }
  }
}

All DAG changes are logged to the Provenance Store for full transparency and reproducibility.
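
A minimal sketch of how the executor could apply the insert_node modification above to the adjacency-map DAG sketched earlier. insert_node is an illustrative helper, not an existing Zyra function:

def insert_node(dag: dict[str, list[str]], new_node: dict) -> dict[str, list[str]]:
    """Splice a new node between its declared upstream and downstream stages."""
    name = new_node["name"]
    upstream = new_node["upstream"]        # e.g. ["import"]
    downstream = new_node["downstream"]    # e.g. ["process"]

    dag[name] = list(upstream)             # the new node depends on its upstream stages
    for child in downstream:               # reroute downstream stages through the new node
        deps = [d for d in dag.get(child, []) if d not in upstream]
        dag[child] = deps + [name]
    return dag

dag = {"import": [], "process": ["import"], "visualize": ["process"]}
modification = {
    "name": "clarify_input_source",
    "type": "user_interaction",
    "upstream": ["import"],
    "downstream": ["process"],
}
print(insert_node(dag, modification))
# {'import': [], 'process': ['clarify_input_source'],
#  'visualize': ['process'], 'clarify_input_source': ['import']}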


Proactive Orchestration and Value Suggestion

Beyond reactive task handling, Zyra’s orchestrator can serve as a proactive research assistant, suggesting low-effort, high-value enhancements to user workflows.

Example:

User Request: “Generate a temperature anomaly map for 2020.”
Base Plan: import → process → visualize → export
Proactive Suggestions:

  • Add verify stage → “Check data coverage consistency before plotting.”

  • Add narrate stage → “Generate a short summary for documentation.”

  • Add quick diagnostics → “Create a QC scatter plot for raw vs processed data.”

Implementation Outline

| Component | Role |
| --- | --- |
| ValueEngine (new) | Evaluates DAG and provenance for potential low-cost augmentations. |
| Planner | Accepts optional “suggested nodes” from ValueEngine. |
| Validator | Ensures additions are safe and compatible. |
| User Interaction | Presents suggestions to the user for acceptance or dismissal. |
| Provenance | Logs which suggestions were made, accepted, or rejected. |

Example JSON Output:

{
  "augmentations": [
    {
      "stage": "verify",
      "description": "Add statistical verification of processed data",
      "confidence": 0.92
    },
    {
      "stage": "narrate",
      "description": "Generate a summary paragraph explaining the map",
      "confidence": 0.88
    }
  ]
}

This transforms the orchestrator into an active collaborator that adds scientific value without burdening the user.
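
A minimal sketch of how a ValueEngine could produce the augmentation suggestions shown above. The rules and confidence values are illustrative assumptions, and suggest_augmentations is not an existing Zyra API:

from dataclasses import dataclass

@dataclass
class Augmentation:
    stage: str
    description: str
    confidence: float

def suggest_augmentations(planned_stages: list[str]) -> list[Augmentation]:
    """Propose low-cost stages that are missing from the current plan."""
    suggestions = []
    if "process" in planned_stages and "verify" not in planned_stages:
        suggestions.append(Augmentation(
            "verify", "Add statistical verification of processed data", 0.92))
    if "visualize" in planned_stages and "narrate" not in planned_stages:
        suggestions.append(Augmentation(
            "narrate", "Generate a summary paragraph explaining the map", 0.88))
    return suggestions

print(suggest_augmentations(["import", "process", "visualize", "export"]))

Accepted suggestions would be handed to the Planner as optional nodes, validated like any other stage, and logged to provenance.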



Best Practices

  • Keep orchestration deterministic and auditable.

  • Restrict the LLM to interpretation, not execution.

  • Validate all plans before execution.

  • Support human-in-the-loop recovery for incomplete or ambiguous requests.

  • Use provenance and Guardrails to maintain safety and reproducibility.

  • Enable proactive, context-aware suggestions to add scientific value.