# Task Decomposition and Planning Summary

This summary captures **how Zyra’s orchestrator should decompose user requests into structured, executable stage plans.** It describes a hybrid architecture combining LLM-based reasoning with deterministic orchestration logic, Guardrails validation, and provenance tracking.

## Overview

Zyra’s orchestrator shouldn’t be a “black box” that hides logic in LLM prompts. Instead, it should operate as a **planning and coordination layer** where the LLM interprets user intent and structured Python code builds, validates, and executes a dependency graph.

### Key Insight

> The orchestrator should be a **planner**, not a **magician**.
> The LLM reasons about *what to do*, while the orchestration layer determines *how and when to do it*.

---

## Architecture Breakdown

| Layer | Responsibility | Implementation |
|-------|----------------|----------------|
| **LLM System Prompt** | Interpret natural-language user intent. Identify relevant Zyra stages. Suggest clarifications. | Structured reasoning via JSON output. |
| **Planner (Python)** | Build dependency graph based on stage relationships. | Deterministic DAG creation. |
| **Validator (Guardrails / Pydantic)** | Validate LLM-generated plans. Ensure supported stages and proper inputs/outputs. | Schema enforcement and safety checks. |
| **Executor (Python)** | Dispatch stage agents, manage retries and concurrency. | Code-based orchestration logic. |
| **Provenance & Memory** | Persist state, track user clarifications, record reproducibility. | Provenance database + session memory. |

---

## Task Decomposition Flow

1. **User Request → Orchestrator (LLM Reasoning Layer)**
   The LLM parses user intent, identifies verbs and objects, and maps them to Zyra stages.

2. **Capability Validation (Python)**
   The orchestrator verifies that requested actions exist in the capabilities registry.

3. **DAG Construction (Planner)**
   Build a dependency graph based on supported stages and Zyra’s canonical order:
   `import → process → simulate → decide → visualize → narrate → verify → export`

4. **Interactive Clarification Loop**
   If information is missing (e.g., FTP path, dataset ID), the orchestrator pauses, prompts the user, validates input with Guardrails, and resumes execution.

5. **Execution & Provenance Logging**
   Each stage logs its artifacts, metadata, and hash signatures to the Provenance Store.

---

## Real-Time DAG Modification

When a stage fails or new information becomes available, the orchestrator can modify the active DAG:

- **Recoverable Error:** Retry or skip a failed node.
- **Missing Input:** Insert a temporary *clarification node* to gather user input before resuming.
- **Validation Failure:** Replace or re-run a stage using safer parameters.

Example modification:

```json
{
  "dag_modification": {
    "action": "insert_node",
    "new_node": {
      "name": "clarify_input_source",
      "type": "user_interaction",
      "upstream": ["import"],
      "downstream": ["process"]
    }
  }
}
```

All DAG changes are logged to the Provenance Store for full transparency and reproducibility.

---

## Proactive Orchestration and Value Suggestion

Beyond reactive task handling, Zyra’s orchestrator can serve as a **proactive research assistant**, suggesting low-effort, high-value enhancements to user workflows.

### Example:

**User Request:** “Generate a temperature anomaly map for 2020.”

**Base Plan:** `import → process → visualize → export`

**Proactive Suggestions:**

- Add `verify` stage → “Check data coverage consistency before plotting.”
- Add `narrate` stage → “Generate a short summary for documentation.”
- Add quick diagnostics → “Create a QC scatter plot for raw vs processed data.”

### Implementation Outline

| Component | Role |
|------------|------|
| **ValueEngine (new)** | Evaluates DAG and provenance for potential low-cost augmentations. |
| **Planner** | Accepts optional “suggested nodes” from ValueEngine. |
| **Validator** | Ensures additions are safe and compatible. |
| **User Interaction** | Presents suggestions to user for acceptance or dismissal. |
| **Provenance** | Logs which suggestions were made, accepted, or rejected. |

**Example JSON Output:**

```json
{
  "augmentations": [
    {
      "stage": "verify",
      "description": "Add statistical verification of processed data",
      "confidence": 0.92
    },
    {
      "stage": "narrate",
      "description": "Generate a summary paragraph explaining the map",
      "confidence": 0.88
    }
  ]
}
```

This transforms the orchestrator into an **active collaborator** that enhances scientific value without user burden.

---

## Recommended System Prompt Structure

The LLM system prompt focuses solely on reasoning, not execution:

```json
{
  "intent": "",
  "stages": [
    {"name": "", "description": "", "inputs": [], "outputs": []}
  ],
  "dependencies": {"stage": ["upstream_stage"]},
  "clarifications_needed": [""]
}
```

- **Always returns JSON.**
- **No unstructured text or code.**
- **Clarifications trigger user feedback loops.**

---

## Best Practices

- Keep orchestration **deterministic** and **auditable**.
- Restrict the LLM to **interpretation**, not execution.
- Validate all plans before execution.
- Support human-in-the-loop recovery for incomplete or ambiguous requests.
- Use provenance and Guardrails to maintain safety and reproducibility.
- Enable proactive, context-aware suggestions to add scientific value.
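## Illustrative Planner Sketch

The deterministic half of the flow described in this summary (capability validation, DAG construction in canonical order, and insertion of a clarification node) could look roughly like the following. This is a minimal sketch under stated assumptions, not Zyra's actual implementation: the `Node` class and the functions `validate_stages`, `build_linear_dag`, `insert_node`, and `topological_order` are hypothetical names introduced here for illustration, and the sketch assumes a simple linear chain between the requested stages. Only the canonical stage order and the `clarify_input_source` node come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Canonical stage order, as stated in the summary.
CANONICAL_ORDER = [
    "import", "process", "simulate", "decide",
    "visualize", "narrate", "verify", "export",
]


@dataclass
class Node:
    """Hypothetical DAG node; field names mirror the JSON example."""
    name: str
    type: str = "stage"
    upstream: list[str] = field(default_factory=list)


def validate_stages(requested: list[str], registry: set[str]) -> None:
    """Reject any stage that the capabilities registry does not contain."""
    unknown = [s for s in requested if s not in registry]
    if unknown:
        raise ValueError(f"Unsupported stages: {unknown}")


def build_linear_dag(requested: list[str]) -> dict[str, Node]:
    """Sort stages into canonical order and chain each to its predecessor."""
    ordered = sorted(set(requested), key=CANONICAL_ORDER.index)
    dag: dict[str, Node] = {}
    previous = None
    for name in ordered:
        dag[name] = Node(name, upstream=[previous] if previous else [])
        previous = name
    return dag


def insert_node(dag: dict[str, Node], name: str, node_type: str,
                upstream: list[str], downstream: list[str]) -> None:
    """Splice a new node between the given upstream and downstream nodes."""
    dag[name] = Node(name, type=node_type, upstream=list(upstream))
    for child in downstream:
        node = dag[child]
        # Re-route the child so it waits on the new node instead.
        node.upstream = [u for u in node.upstream if u not in upstream] + [name]


def topological_order(dag: dict[str, Node]) -> list[str]:
    """Kahn-style ordering; raises ValueError if the graph has a cycle."""
    remaining = {n: set(node.upstream) for n, node in dag.items()}
    order: list[str] = []
    while remaining:
        ready = sorted(n for n, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("Cycle detected in DAG")
        for n in ready:
            order.append(n)
            del remaining[n]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


if __name__ == "__main__":
    plan = ["visualize", "import", "export", "process"]  # e.g. from LLM JSON
    validate_stages(plan, set(CANONICAL_ORDER))
    dag = build_linear_dag(plan)
    insert_node(dag, "clarify_input_source", "user_interaction",
                upstream=["import"], downstream=["process"])
    print(topological_order(dag))
```

In this sketch the LLM only supplies the list of stage names; ordering, validation, and graph edits all happen in plain Python, which keeps the plan auditable. A real planner would likely also log each `insert_node` call to the Provenance Store and validate the LLM output against a schema before it reaches this code.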