Changelog
=========

v3.5.2 (2026-05-06)
--------------------

**Marketplace and Suite Consistency Sync**

* **Version synchronization:** Enforced absolute version consistency (v3.5.2) across all 4 plugins (agent-core, dev-suite, science-suite, research-suite), the marketplace manifest, project metadata (pyproject.toml, Makefile), and the entire documentation suite.
* **Documentation audit:** All suite reference pages (agent-core.rst, dev-suite.rst, research-suite.rst, science-suite.rst), cheatsheets, and integration guides updated to reflect the latest release.
* **Metadata Alignment:** Synced version strings in ``pyproject.toml``, ``Makefile``, and ``marketplace.json`` to ensure unified deployment and update signaling.

**Science-Suite Hub Integrity (test-driven)**

* **Routing Decision Tree added to 7 science-suite skills** that contained ``../`` cross-references
  but were missing a ``## Routing Decision Tree`` code block (a hub-skill structural requirement):
  ``bayesian-ude-workflow``, ``equation-discovery``, ``md-simulation-setup``, ``neural-pde``,
  ``sciml-modern-stack``, ``self-improving-ai``, ``time-series-analysis``.
* **Checklist headings normalized to** ``## Checklist`` **in 3 skills** whose qualified headings
  (``## Forecasting Checklist``, ``## 5. Performance & Convergence Checklist``,
  ``## Performance & Optimization Checklist``) caused the cross-suite invariant tests to fail:
  ``time-series-analysis``, ``advanced-simulations``, ``parallel-computing``.
* **Pytest collection fixed:** Updated ``pyproject.toml`` to set ``testpaths = ["tools/tests"]``
  and exclude ``plugins/`` and ``test-corpus/`` from recursion, resolving an ``ImportError``
  during test discovery on reference projects.
* **Sphinx short version fixed:** ``conf.py`` ``version`` field updated from stale ``"3.4"``
  to ``"3.5"`` to match the ``release = "3.5.2"`` label.
* **Test suite:** 258/258 pass.

v3.5.1 (2026-05-06)
--------------------

**3-Layer Routing Audit (Codex-assisted)**

* Full 3-layer (router → hub → sub-hub) routing audit across all 4 plugins using Codex.
  Science-suite: **FAIL** (3 confirmed defects); agent-core, dev-suite, research-suite: **WARN**
  (34 missing fallback branches).
* **Wrong-route fixes (3):**

  - ``advanced-simulations`` self-routed rare-event sampling — corrected to cross-hub route
    ``statistical-physics-hub → rare-events-sampling``.
  - ``sciml-and-diffeq`` routed PINN/NeuralPDE to ``pinn-engineer`` agent instead of
    ``neural-pde`` skill — fixed; agent delegation annotated separately.
  - ``llm-and-ai`` routed scientific-pipeline automation to ``sci-workflow-engineer`` agent
    without annotation — annotated clearly as agent delegation.

* **Routing trees added to 5 sub-hub skills** that had Core Skills but no dispatch tree:
  ``julia-mastery``, ``machine-learning``, ``parallel-computing``, ``python-development``,
  ``statistical-physics``.
* **Fallback branches added to all 34 routing trees** across all 4 plugins. Each terminal
  branch now names the appropriate expert agent for open-ended triage instead of silently
  dropping unmatched queries.
* **Routing trigger/table expansions**: dev-hub, agent-hub, research-hub, and science-hub
  routing tables now enumerate concrete invocation phrases for every hub (e.g. control-theory
  and signal-processing corrected from ``research-and-domains`` to ``simulation-and-hpc``).

**Agent Skills Array Expansions**

* ``orchestrator`` (agent-core): added ``llm-engineering``, ``reasoning-and-memory`` — now
  preloads the full agent-core hub toolkit on activation.
* ``simulation-expert``: added ``statistical-physics-hub`` — aligns with the expert-agent
  relationship already documented in statistical-physics-hub.
* ``neural-network-master``: added ``ml-deployment`` — natural progression from architecture
  to production serving.
* ``sre-expert``: added ``ci-cd-pipelines`` — SRE and CI/CD share deployment pipeline context.
* ``systems-engineer``: added ``architecture-and-infra`` — systems programming and cloud infra
  are tightly coupled.
* ``documentation-expert``: added ``testing-and-quality`` — documentation standards align with
  code review and quality gates.

**Cross-Suite Delegation**

* ``jax-pro`` and ``julia-pro`` descriptions updated to document delegation boundary with
  dev-suite ``software-architect`` for productionization, REST API design, and deployment.
* Stale ``ai-engineer`` reference in ``ai-assistant`` command fixed to ``sci-workflow-engineer``.

**Knowledge Graph Update**

* Incremental graphify update: 3023 → **3238 nodes**, 4232 → **4569 edges**, 269 → **293 communities**.
* +215 new nodes from routing tree and fallback additions; +338 new edges (routing links);
  6 hyperedges extracted including Dev Hub Two-Tier Routing System and Research Spark
  Eight-Stage Pipeline.

**Validation**

* pytest: 258/258 passing (was 256; +2 from new ``TestDescriptionTrimming`` parametrizations).
* All 4 plugins validate clean post-audit.
* ``tools/README.md`` test count updated: 154 → 258.

v3.5.0 (2026-05-05)
--------------------

**Hub Architecture: Meta-Router Consolidation**

* Each suite now registers a **single meta-hub** in ``plugin.json`` (``science-hub``, ``dev-hub``,
  ``research-hub``, ``agent-hub``). Routing is now three-tier: meta-hub → domain hub → sub-skill.
  This eliminates the flat hub list that caused ambiguous routing when multiple hubs matched
  similar trigger phrases.
* Updated skill counts: **34 hubs → 189 sub-skills** across 223 SKILL.md files
  (was 31 hubs → 187 sub-skills in v3.4.1).

**Agent Tier Rebalancing**

* ``jax-pro``: sonnet → **opus** (GPU kernels, custom VJP/JVP, Pallas, XLA/HLO analysis)
* ``julia-pro``: sonnet → **opus** (full SciML depth: MTK, Turing, UDEs, sensitivity)
* ``ml-expert``: sonnet → **haiku** (fast-turnaround classical ML/MLOps; DL delegated to
  ``neural-network-master``)

**Agent Renames**

* ``ai-engineer`` → ``pinn-engineer`` (refocused on physics-informed neural networks:
  NeuralPDE.jl, DeepXDE, BPINN/BNNODE, inverse PDEs)
* ``prompt-engineer`` → ``sci-workflow-engineer`` (refocused on LLM-assisted scientific
  pipelines: JAX/Julia codegen, experiment templates, scientific RAG)

**New Registered Commands (+5, -2 net = 14 → 17)**

* **Research Suite** (new): ``/lit-review``, ``/paper-implement``, ``/replicate``
* **Science Suite** (new): ``/md-sim``, ``/benchmark``
* **Dev Suite** (retired): ``/commit``, ``/refactor-clean`` (moved to skill-invoked; use
  ``commit-commands:commit`` plugin instead)

**Synergy & Triggering Audit**

* 22 skill and agent files edited to eliminate role collisions and close routing gaps.
* Role collisions resolved: ``julia-pro`` vs ``julia-ml-hpc``, ``ml-expert`` vs
  ``neural-network-master``, ``context-specialist`` vs ``reasoning-engine``.
* Routing gaps closed: ``deep-learning`` and ``advanced-simulations`` gained explicit
  routing decision trees (upgraded from sub-skills to domain hubs).
* Trigger descriptions rewritten to enumerate concrete action verbs and domain terms
  instead of vague "Use when..." phrases.

**Skill Listing Budget**

* ``skillListingBudgetFraction`` set to **8%** in all four plugin ``settings.json`` files
  (was defaulting to 1%, causing 126 of 223 descriptions to be truncated at session start).
* Also fixed two YAML frontmatter parse failures (unquoted colons) in ``agent-hub/SKILL.md``
  and ``dev-hub/SKILL.md``.

**Graphify Knowledge Graph**

* Added ``graphify-out/`` with a semantic knowledge graph of the full codebase:
  3023 nodes, 4232 edges, 269 communities (generated by Gemini with 92% extracted /
  8% inferred edges).
* ``GRAPH_REPORT.md`` documents god nodes, community hubs, surprising cross-module
  connections, and suggested graph queries.

**New Skill: ai-pair (dev-suite)**

* Added ``ai-pair`` sub-skill under ``dev-workflows`` hub: multi-model review patterns
  (Claude + Codex + Gemini), AI pair-programming prompt templates, and structured
  debugging workflows.

**Test Corpus**

* Added representative code corpus under ``test-corpus/`` for ``skill_validator.py``
  (covers JAX, Julia Bayesian, Julia GNN, GitHub Actions CI, async Python, FastAPI).

**Validation**

* pytest: 258/258 passing.
* All 223 SKILL.md files within 2% context budget.
* ``skillListingBudgetFraction`` 8% confirmed sufficient for full skill listing.

v3.4.1 (2026-04-19)
--------------------

**Hotfix**

* **research-suite hook loading:** removed the redundant ``"hooks": "./hooks/hooks.json"`` key from ``plugins/research-suite/.claude-plugin/plugin.json``. The Claude Code harness auto-loads ``hooks/hooks.json`` from every plugin, so the explicit manifest reference caused a "Duplicate hooks file detected" load error on plugin reload. Matches the pattern used by agent-core / dev-suite / science-suite (none of which declare ``hooks`` in their manifest). Version bump forces plugin-cache refresh so users who installed v3.4.0 pick up the fix automatically on ``/plugin update``.

**Validation**

* All other v3.4.0 validator numbers unchanged: metadata 0/0 on all 4 suites; doc_checker 0 warnings across all 4 suites; xref 531/531 valid; context budget 217/217 fit 200K; pytest 188/188 passing.

v3.4.0 (2026-04-18)
--------------------

**New Plugin: research-suite**

* New 4th plugin suite extracted from ``science-suite``. Contains:

  - **2 agents:** ``research-expert`` (moved from science-suite, unified methodology specialist) and ``research-spark-orchestrator`` (new, drives the 8-stage refinement pipeline)
  - **3 workflow tracks:**

    - ``scientific-review`` — journal-ready peer review producing a ``.docx`` referee report with Six-Lens analysis and Confidential Comments to Editor (standalone skill).
    - ``research-spark`` — 8-stage artifact-gated refinement of a rough research idea into a fundable plan. Stages: ``spark-articulator`` → ``landscape-scanner`` → ``falsifiable-claim`` → ``theory-scaffold`` (Stages 4-5) → ``numerical-prototype`` → ``experiment-designer`` → ``premortem-critique``. State tracked in ``_state.yaml``.
    - ``research-practice`` — methodology hub routing to ``research-methodology``, ``research-quality-assessment``, ``research-paper-implementation``, ``scientific-communication``, ``evidence-synthesis`` (all moved from ``science-suite``'s ``research-and-domains`` hub).

  - **Three adversarial patterns enforced:** Reviewer 2 persona (Stages 2-3), stepwise derivation protocol (Stages 4-5), instrument capability 3× margin rule (Stage 7).
  - **0 registered commands** (workflows are skill-driven). The legacy ``/paper-review`` command was removed in favor of ``scientific-review``.

* **science-suite** now focuses purely on computational work (JAX, Julia, HPC, ML/DL, physics, nonlinear dynamics): 11 agents (was 12), 14 hubs → 112 sub-skills (was 117).

**research-suite optimization pass (same release)**

* **Description normalization:** 5 methodology skills and 1 agent converted from weak "Use when..." form to strong third-person "This skill should be used when..." with 8-10 verbatim trigger phrases each (evidence-synthesis, scientific-communication, research-paper-implementation, research-quality-assessment, research-methodology, research-practice, research-expert).
* **Non-standard frontmatter cleanup:** dropped ``maturity`` / ``specialization`` / inline ``Version:`` footers from research-paper-implementation and research-quality-assessment (version lives only in ``plugin.json`` per convention).
* **Cross-skill linkage:** every methodology skill now points to its research-spark pipeline counterpart (e.g., ``research-methodology`` ↔ Stage 7 ``experiment-designer``; ``evidence-synthesis`` ↔ Stage 2 ``landscape-scanner``). Added phase↔stage mapping table to ``research-practice`` hub.
* **Plugin metadata:** sharper ``plugin.json`` description; 10 new discoverability keywords (``power-analysis``, ``prisma``, ``grade``, ``consort``, ``strobe``, ``reproducibility``, ``paper-implementation``, ``statistical-rigor``, ``pre-registration``, ``doe``).

**Documentation**

* New ``docs/suites/research-suite.rst`` with full skill/agent coverage.
* Updated ``docs/index.rst``, ``docs/categories/index.rst``, ``docs/reference/agents.md``, ``docs/reference/commands.md``, ``docs/integration-map.rst``, ``docs/guides/scientific-workflows.rst``, and ``docs/suites/science-suite.rst`` to reflect the split.
* CLAUDE.md suite table updated: 3→4 suites, 24→25 agents, suite counts refreshed.

**Validation**

* metadata_validator: 0 errors on all 4 suites.
* xref_validator: 530/530 cross-references valid.
* doc_checker: 0 errors on research-suite.
* context_budget_checker: 217/217 skills fit 2% budget on both 200K and 1M context windows.
* pytest: 180/180 passing (was 154 in v3.3.0 — 26 new hook-integrity tests from the bandit/vulture/gitleaks audit addition).

**v3.4.0 polish (2026-04-19)**

* **research-suite hooks:** added 3 hook events (``SessionStart`` artifact-resume, ``TaskCompleted`` audit logging, ``SubagentStop`` prompt-based stage-artifact verification) + 2 command handler scripts. Brings hook event total across suites from 24 → 27.
* **Version consistency sweep:** ``pyproject.toml`` bumped 3.3.0 → 3.4.0; ``docs/conf.py`` release 3.3.0 → 3.4.0; ``Makefile`` header 3.0.0 → 3.4.0; README badges + overview prose synchronized; agent-core ``commands/team-assemble.md`` "MyClaude v3.3.0" → "v3.4.0". All 13 canonical version surfaces now match.
* **Trigger-phrase parity:** final ``SKILL.md`` (``science-suite/skills/research-and-domains``) gained "Use when..." trigger — 217/217 skills now conform.
* **Tooling polish:** ``skill_validator.py`` now reports ``n/a ⚪ no corpus`` instead of misleading ``0.0% ❌`` when no test corpus is loaded; ``doc_checker.py`` wired into ``make validate`` (per-plugin iteration).

v3.3.0 (2026-04-12)
--------------------

**CLI 2.1.104 Ecosystem Optimization**

* Agent maxTurns standardization: 10 agents raised to model tier targets (opus≥50, sonnet≥35)
* Tool list enrichment: EnterPlanMode/ExitPlanMode added to all 7 allowlist-based opus agents;
  CronCreate, ScheduleWakeup added to automation-engineer, devops-architect, sre-expert;
  Monitor added to smart-debug command
* Hook expansion: agent-core 12→17 events (+PreSubagentUse, ExecutionError, PermissionPrompt,
  ContextOverflow, CostThreshold), dev-suite 0→8 events (new hooks/ directory),
  science-suite 0→6 events (new hooks/ directory)
* Forward-looking hooks: ContextOverflow and CostThreshold handlers registered in agent-core
  for future CLI versions (will not fire on CLI 2.1.x)
* Broken references fixed: 5 phantom agent references in ultra-think and reflection commands
  (research-intelligence, hpc-numerical-coordinator, ai-software-architect)
* Settings harmonization: dev-suite default maxTurns 35→40

**Agent & Skill Polish**

* Agent color scheme: added ``color`` frontmatter field to all 24 agents for statusline
  differentiation (blue/cyan/green/magenta/red/yellow)
* Abstract model tiers: replaced hardcoded model version strings with abstract tier names
  (opus/sonnet/haiku) across agent frontmatter
* Hub skill triggers: improved skill description triggering patterns for better routing accuracy
* CodeRabbit cleanup: removed uninstalled coderabbit agent from inventory and quality gate references

**Testing & CI**

* Cross-suite invariant tests: 19 new tests in ``test_cross_suite_invariants.py`` covering
  7 coverage gaps (color field, model tier validity, hook event naming, skill budget,
  version sync, agent-skill cross-refs, command registration)
* CI quality gate: new GitHub Actions workflow (``ci.yml``) running pytest, ruff, and mypy on PRs
* Test count: 135→154 total tests across the suite

**Validator State**

* metadata_validator 0/0/0; xref_validator all valid;
  context_budget_checker 206/206 (no skill over 80%);
  pytest 154/154; ruff + mypy clean; pip-audit clean.

v3.2.0 (2026-04-12)
--------------------

**Skill Validator Vacuous-Pass Fix**

* Fixed a vacuous-truth bug in ``skill_validator.py`` where the Overall
  Assessment reported "EXCELLENT" when no test corpus was provided.
  With ``total_tests == 0``, all rate metrics returned ``0.0%`` via
  division guards, which satisfied the ``< 10%`` threshold — a classic
  vacuous pass. The validator now reports "NO DATA" when no corpus is
  configured, and gates action-required messages behind
  ``total_tests > 0``.
* Added two regression tests to ``test_skill_validator.py``:
  ``test_no_corpus_reports_no_data`` (end-to-end: load plugins without
  corpus, verify report contains "NO DATA" and not "EXCELLENT") and
  ``test_zero_tests_metrics_accuracy`` (unit: verify all
  ``SkillValidationMetrics`` properties return ``0.0`` when
  ``total_tests == 0``).

**Validator State**

* metadata_validator 0/0/0; xref_validator 523/523 valid;
  context_budget_checker 206/206 (no skill over 80%);
  skill_validator NO DATA (no corpus — by design); pytest 120/120
  (+2 regression tests); ruff + mypy clean; pip-audit clean.

v3.1.7 (2026-04-11)
-------------------

**Bayesian SINDy Extraction**

* **New skill** ``bayesian-sindy-workflow`` extracted from
  ``equation-discovery`` which was at 88% of its context budget after
  a prior external Bayesian SINDy section. Lands at ~68.55% budget
  with headroom for v3.1.8+ growth.
* Structure: when-to-prefer decision table (Bayesian vs classical
  SINDy); three routes (horseshoe+NUTS, ensemble SINDy, Julia
  UQ-SINDy); full 5-stage Lorenz-63 worked example —
  ``scipy.integrate.solve_ivp`` + noise + central-difference,
  10-term second-order polynomial library, NumPyro horseshoe prior +
  NUTS (4 chains, 1000 warmup, 2000 samples), ArviZ diagnostics
  (R-hat, ESS, PSIS-LOO), inclusion probabilities + credible
  intervals; prior-sensitivity analysis; Julia sidebar using
  Turing + DataDrivenDiffEq with ``truncated(...; lower=0)`` keyword
  form (Turing 0.37+ API drift caught via Context7).
* ``equation-discovery`` dropped from 3535 → 2984 tokens
  (88% → 74.6%, under the 75% Commit D gate).
* Science-suite now at **117 sub-skills** (from 116);
  ``bayesian-inference`` hub grows from 9 → 10 sub-skills.

**Composition Headers**

* Added ``## Composition with neighboring skills`` section headers to
  three science-suite skills that had prose cross-references but
  lacked the canonical header: ``stochastic-dynamics`` (5 bullets),
  ``non-equilibrium-theory`` (6 bullets), ``correlation-physical-systems``
  (5 bullets).

**freud IntermediateScattering Re-verification**

* Re-verified ``freud.density.IntermediateScattering`` against the
  current freud release via Context7 on 2026-04-11: still not
  shipped in the density module. The ``numpy.fft`` + MDAnalysis
  fallback remains the recommended approach. Inline tag updated to
  ``[re-verified absent 2026-04-11]``.

**Tooling — pip-audit**

* Added ``pip-audit 2.10+`` dev dependency for automated CVE scanning.
  Wired into the per-commit validator gate.
* **One HIGH finding ignored** — PYSEC-2022-42969 / CVE-2022-42969
  (ReDoS in ``py.path.svnwc.InfoSvnCommand`` regex, reaches repo via
  ``interrogate 1.7.0 → py 1.11.0`` transitive). The ``py`` library
  is archived and unmaintained since 2022; ``interrogate`` upstream
  tracks removal at ``econchick/interrogate#142``, unreleased.
  In-repo exposure is zero (``interrogate`` only imports ``py`` for
  file-path handling, never reaching the vulnerable ``svnwc`` code
  path). Revisit in v3.1.8+ when ``interrogate`` can be replaced.

**Validator State**

* metadata_validator 0/0/0; xref_validator 523/523 valid (+4 from the
  new skill); context_budget_checker 206/206 (``equation-discovery``
  74.6%, ``bayesian-sindy-workflow`` 68.55%, ``non-equilibrium-theory``
  78.98%, no skill over 80%); skill_validator EXCELLENT; pytest
  118/118; ruff + mypy clean; pip-audit clean with one ignore as
  documented above.

v3.1.6 (2026-04-11)
-------------------

**Julia ↔ Python Parity Polish**

* **Julia → Python handoff for nonlinear time-series tools.** New section
  in ``chaos-attractors`` covering ``nolds``, ``antropy``, ``IDTxl``,
  ``pyEDM``, ``pyunicorn``, ``teaspoon`` (no native Julia equivalents)
  via the canonical ``PythonCall.jl`` + ``CondaPkg.jl`` import pattern
  with a concrete ``lyap_r`` example and GIL-under-``@threads`` caveats.
  Pointer edits in ``nonlinear-dynamics`` hub ecosystem-selection table
  and ``time-series-analysis``. Completes the Julia ↔ Python interop
  story with v3.1.5's reciprocal ``juliacall`` bifurcation path.
* **BAR free-energy worked example** bridging Langevin ensemble and
  non-equilibrium theory. Added a 4-stage pipeline to
  ``non-equilibrium-theory``: (1) JAX Langevin ensemble + ``jax.lax.scan``
  to accumulate forward/reverse work samples, (2) BAR fit via
  ``pymbar.other_estimators.bar`` (Context7-verified — pymbar 4.0 moved
  it out of top level), (3) variance comparison vs Jarzynski cumulant
  expansion, (4) multi-state MBAR pointer with ``alchemlyb`` ecosystem
  wrapper. One-sentence cross-link added to ``stochastic-dynamics`` to
  preserve its 74% budget cap.
* **freud ecosystem for physical correlations.** Added a "Python freud
  ecosystem" section to ``correlation-physical-systems`` covering
  ``freud.density.RDF``, ``StaticStructureFactorDebye`` /
  ``StaticStructureFactorDirect`` (with API-drift warning:
  ``StaticStructureFactorDebye`` takes ``num_k_values``, not ``bins``),
  Steinhardt ``Q_l``, Hexatic, Nematic, and ``SolidLiquid`` phase
  classifier. ``freud.density.IntermediateScattering`` tagged
  ``[unverified]`` (absent in freud 3.5.0) with a ``numpy.fft`` +
  MDAnalysis fallback. Algorithmic notes in
  ``correlation-computational-methods`` (AABBQuery neighbor-list reuse,
  ``reset=False`` multi-frame averaging, CuPy breakeven N ≈ 10⁴,
  "MDAnalysis/MDTraj as iterator, freud as analyzer" production
  pattern). One-line hub pointer in ``correlation-analysis`` with a
  ``PythonCall.jl`` handoff note for Julia users.

**Deferred**

* Item A (ML-FF CLI spot-check) — explicitly deferred until the user
  resumes active MLIP training.

**Validator State**

* metadata_validator: 0/0/0 across all 3 plugins.
* xref_validator: 519/519 references valid.
* context_budget_checker: 205/205 skills fit.
  ``non-equilibrium-theory`` at 73.8% (under 75% Commit C gate),
  ``correlation-physical-systems`` at 74.45% (under 75% Commit D
  gate), ``chaos-attractors`` at 79% (under 80% at-risk line).
* skill_validator EXCELLENT; pytest 118/118; ruff clean; mypy 0 errors.

**Known forward items for v3.1.7+**

* ``equation-discovery`` at 88% — flagged for Bayesian SINDy
  extraction split.
* ``freud.density.IntermediateScattering`` presence in newer freud
  releases — re-verify when correlation skills are next touched.

v3.1.5 (2026-04-11)
-------------------

**Julia/Python Parity Pass**

* **Fokker-Planck direct PDE methods** in ``stochastic-dynamics``:
  finite-difference / spectral discretization, boundary-condition
  patterns, cross-links to Langevin sampling.
* **Python bifurcation continuation escape hatch** in
  ``bifurcation-analysis`` and ``nonlinear-dynamics``: documented the
  ``juliacall`` path to Julia bifurcation routines (since
  ``BifurcationKit.jl`` is blocked on Julia 1.12), plus PyDSTool and
  AUTO-07p as Python-native alternatives.
* **Modern ML force fields** expansion in ``ml-force-fields``:
  equivariant GNNs (NequIP, MACE, Allegro), Julia ACE stack
  (``ACEpotentials.jl``), training loops, active learning,
  energy-and-force loss balance. Budget-tight at 78%.
* **Julia Monte Carlo idioms** in ``statistical-physics``: Metropolis
  sampler patterns with ``@inbounds`` / ``@fastmath``, SIMD inner
  loops, parallel tempering via ``Distributed.jl``, tuning heuristics.

**Tooling**

* Added ``types-PyYAML`` dev dependency for mypy stubs.
* Tightened ``self-improving-ai`` triggers so they no longer overlap
  with ``dspy-basics``.
* Added ``tools/validation/command_file_linter.py`` — a targeted
  structural linter for Claude Code command files with 5 stable
  rule IDs (``fence-unbalanced``, ``heading-skip``, ``step-ref-broken``,
  ``trailing-whitespace``, ``heading-duplicate``). Importable API +
  standalone CLI, wired into ``make validate`` (errors block, warnings
  non-blocking). Caught a pre-existing duplicate ``## Metrics`` H2
  in ``dev-suite/commands/tech-debt.md``. 15 new tests.

**Validator State**

* metadata 0/0/0; xref 515/515 valid (+3 from v3.1.4); context budget
  204/204 (``bifurcation-analysis`` 79%, ``ml-force-fields`` 78%,
  both under 80%). pytest 103 passing, ruff clean, mypy 0 errors.

v3.1.4 (2026-04-11)
-------------------

**Research-Focus Optimization Pass (science-suite)**

* Aligned agents and skills with research in Bayesian MCMC
  (NUTS / Consensus MC / Pigeons), Universal Differential Equations,
  SINDy, nonlinear dynamics, time series, rare events / avalanche
  dynamics, non-equilibrium statistical physics, and point / jump
  processes.
* **9 new sub-skills** (hub-discovered, not registered in
  ``plugin.json``): ``consensus-mcmc-pigeons`` (non-reversible
  parallel tempering via Pigeons.jl, now distinguished from Scott-2016
  divide-and-conquer Consensus Monte Carlo), ``bayesian-ude-workflow``
  (Turing + DiffEq + Lux staged pipeline), ``bayesian-ude-jax``
  (Python/JAX counterpart via Diffrax + Equinox + NumPyro),
  ``bayesian-pinn`` (BNNODE/BayesianPINN extracted from ``neural-pde``
  which drops from 78% → 65% of budget), ``point-processes``
  (Hawkes / HSGP / Julia PointProcesses.jl), ``rare-events-sampling``
  (large-deviation / cloning / avalanche statistics),
  ``self-improving-ai`` (research overview), ``dspy-basics``
  (DSPy programmatic prompts depth-skill), ``rlaif-training``
  (Constitutional AI / RLAIF / DPO depth-skill).
* **1 new agent-core skill**: ``self-improving-agents`` under the
  ``reasoning-and-memory`` hub — operational counterpart to
  science-suite's ``self-improving-ai`` (agents inside Claude Code
  vs research framework overview). Covers closed-loop
  reflection-refine-validate, self-consistency ensembles, DSPy and
  TextGrad automatic prompt optimization, evolutionary prompt
  search, and constitutional self-critique.

**Research Audit Remediation**

* Added ``extreme-value-statistics`` skill (GEV/GPD/Hill/Pickands/POT,
  return levels, non-stationary EVT) and wired into
  ``statistical-physics-hub``.
* Wired the orphaned ``robust-testing`` sub-skill into the
  ``research-and-domains`` hub (was on disk but unreachable).
* Extended ``rare-events-sampling`` triggers to cover SOC, sandpile /
  Bak-Tang-Wiesenfeld, crackling noise, and avalanche-size
  distributions; cross-linked to ``extreme-value-statistics``.
* Resolved jump-diffusion routing: ``stochastic-dynamics`` owns
  general physics jump-diffusion SDEs (Lévy flights, shot noise,
  regime-switching Langevin); ``catalyst-reactions`` stays scoped
  to biochemical reaction networks.
* Added Bayesian SINDy coverage to ``equation-discovery``
  (horseshoe-prior NumPyro, ensemble SINDy, UQ-SINDy via Turing) —
  pushed to 88% budget and flagged for v3.1.7 extraction.
* Disambiguated ``sciml-modern-stack`` vs ``sciml-and-diffeq`` by
  rewriting hub routing without touching the frozen
  ``sciml-modern-stack`` body.
* Added missing trigger keywords (ADF, KPSS, Phillips-Perron, PELT,
  BinSeg, renewal processes, non-parametric Hawkes EM) to
  ``time-series-analysis`` and ``point-processes``.

**Agent Updates**

* ``julia-pro``: Bayesian stack upgraded to Turing + Pigeons;
  sensealg table rewritten with ``GaussAdjoint`` as modern default
  and the ForwardDiff-bypasses-sensealg factual fix; decision tree
  adds UDE and multimodal branches.
* ``julia-ml-hpc``, ``statistical-physicist``, ``ai-engineer``,
  ``jax-pro``, ``simulation-expert`` also aligned with research-focus
  delegation updates.

**Codebase-Aware /team-assemble (agent-core)**

* Major rework: static catalog → codebase-aware recommender /
  adapter / validator. **21 → 25 team templates.**
* **4 new teams**: ``nonlinear-dynamics`` (bifurcation, chaos,
  coupled oscillators, pattern formation — first wiring of
  ``nonlinear-dynamics-expert`` to its documented delegation
  targets), ``julia-ml`` (Lux.jl/Flux.jl/MLJ.jl + CUDA.jl/MPI.jl
  distributed training), ``multi-agent-systems``, ``sci-desktop``
  (PyQt6/PySide6 + JAX scientific desktop apps).
* ``ai-engineering`` team swaps ``reasoning-architect`` for
  ``context-architect`` as default 4th teammate.
* Closed drift: **0 unused local agents** (down from 4).
* New capabilities: Step 1.5 codebase detection (4-tier signal
  gathering with efficiency gates), Step 2.5 fingerprint table,
  Step 2.6 rule-based ranking with confidence labels, Step 2.6a/b
  validation + auto-fill, five new invocation modes.
* **Session cache**: Tier 0 cache at
  ``/tmp/team-assemble-cache/<sanitized-abspath>.json`` with
  mtime-based invalidation (15 min TTL); ``--no-cache`` bypass flag.
* **S1 prompt-injection safeguards** for README probes (HIGH):
  character neutralization, ``<untrusted_readme_excerpt>`` wrapping,
  9 refusal-trigger patterns. Non-English README hardening via
  ``language_hint`` classification and auto-fill trust tiers.

**Tooling Hardening**

* Added ``sys.path.insert`` to 5 validators so CLI invocation
  works without ``PYTHONPATH=.``.
* ``PluginLoader`` consolidates YAML frontmatter parsing via
  normalized component helpers.
* ``xref_validator`` gains disk-discovery of sub-skills so
  hub-architecture sub-skills no longer false-positive as broken
  references.
* Restored ``requires = ["maturin>=1.0,<2.0"]`` in rust-extensions
  scaffold (maturin is the real PEP 517 build backend for PyO3).
* ``pyproject.toml``: excluded ``test-corpus/`` from mypy and ruff.

**Validator State**

* metadata 0/0/0; xref 512/512 valid; context budget 204/204;
  pytest 60/60; ruff clean.

v3.1.3 (2026-04-10)
-------------------

**New Skill: thinkfirst (agent-core)**

* Added ``thinkfirst`` as a sub-skill under the ``llm-engineering`` hub.
  Interview-first workflow that clarifies vague user intent through
  a Seven Dimensions framework before any prompt is drafted.
* Positioned as the first branch in the ``llm-engineering`` routing
  tree so users with brain dumps hit clarification before reaching
  for production templates.
* Cross-linked with ``prompt-engineering-patterns``: ``thinkfirst``
  handles intent clarification, ``prompt-engineering-patterns``
  handles production-grade refinement.

v3.1.2 (2026-04-06)
-------------------

**Bug Fixes**

* Removed duplicate ``hooks`` manifest entries from agent-core and dev-suite
  ``plugin.json``. The ``hooks/hooks.json`` file is auto-discovered by convention;
  declaring it explicitly caused duplicate-load errors at startup.
* Fixed dev-suite ``.lsp.json`` structure to match expected schema.

**Documentation**

* Updated plugin READMEs to use hub→sub-skill notation matching CLAUDE.md.
* Rewrote ``tools/README.md`` to reflect current tooling structure.

v3.1.1 (2026-04-06)
-------------------

**Bug Fixes**

* Set ``strict: true`` in marketplace.json to resolve conflicting manifests
  when both marketplace.json and individual plugin.json files declare components.
* Fixed agent teams guide reference in README (34 → 21 teams).

v3.1.0 (2026-04-03)
-------------------

**Hub-Skill Architecture**

* Introduced :term:`Hub Skill` routing: 26 hub skills route to 167 :term:`sub-skills <Sub-Skill>` via
  :term:`routing decision trees <Routing Decision Tree>`. Hubs are declared in ``plugin.json``; sub-skills
  are discovered through hub routing. Eliminates ambiguous flat-list skill matching.
* agent-core: 3 hubs (agent-systems, reasoning-and-memory, llm-engineering) → 12 sub-skills.
* dev-suite: 9 hubs (backend-patterns, frontend-and-mobile, architecture-and-infra,
  testing-and-quality, ci-cd-pipelines, observability-and-sre, python-toolchain,
  data-and-security, dev-workflows) → 49 sub-skills.
* science-suite: 14 hubs → 106 sub-skills.
* Total: 24 agents, 14 registered commands, 26 hubs → 167 sub-skills (193 total).

**Knowledge Gap Closure (+28 skills, +3 commands)**

* Added 6 agent-core skills: prompt-engineering-patterns, memory-system-patterns,
  safety-guardrails, tool-use-patterns, agent-evaluation, knowledge-graph-patterns.
* Added 10 dev-suite skills: database-patterns, containerization-patterns,
  cloud-provider-patterns, message-queue-patterns, caching-patterns,
  graphql-patterns, accessibility-testing, websocket-patterns,
  search-patterns, mobile-testing-patterns.
* Added 12 science-suite skills: computer-vision, nlp-fundamentals,
  bioinformatics, time-series-analysis, control-theory, experiment-tracking,
  signal-processing, symbolic-math, reinforcement-learning, quantum-computing,
  federated-learning, advanced-optimization.
* Added 3 science-suite commands: ``run-experiment``, ``analyze-data``, ``paper-review``.
* Deduplicated ``prompt-engineering-patterns`` (science-suite copy removed, migrated to agent-core).

**Agent Optimization (24 agents)**

* Added ``background: true`` to 18 agents for parallel dispatch.
* Upgraded neural-network-master and simulation-expert to opus model tier.
* 9 opus agents: orchestrator, reasoning-engine, software-architect, debugger-pro,
  research-expert, statistical-physicist, nonlinear-dynamics-expert,
  neural-network-master, simulation-expert.
* Added "Use when..." activation triggers to all 24 agent descriptions.
* Fixed cross-suite delegation annotations and invalid agent references.

**Skill Quality & Integrity**

* All 193 skills have: trigger phrases, Expert Agent sections, and checklists.
* Fixed 235 broken relative links across 38 files.
* Zero orphaned skills — all 193 reachable via hub routing.
* Resolved routing overlaps (ml-and-data-science/ml-deployment/deep-learning-hub triangle).
* Refactored testing-patterns from 96% to under 75% context budget.
* 193/193 skills within 2% context budget.

**Security Fixes**

* Gated ``commit_fixes()`` behind ``--auto-commit`` flag (default: dry-run).
* Added package name validation regex for npm/pip subprocess calls.
* Replaced ``git add .`` with ``git add --update`` for safe staging.
* Anchored SessionStart hook matcher to ``^(startup|resume)$``.

**Documentation**

* Rewrote all reference docs for hub architecture.
* Added :term:`Hub Skill`, :term:`Sub-Skill`, :term:`Routing Decision Tree`, and
  :term:`Agent Team` to glossary.
* Updated all workflow guides with hub → sub notation.
* Docs build with zero warnings. 60/60 tests pass.

**Governance**

* Added skill size governance policy (>3000 bytes = review required).
* 14 commands intentionally registered; 22 skill-invoked by design.

v3.0.0 (2026-04-02)
-------------------

**Julia ML/DL/HPC Expansion**

* Added ``julia-ml-hpc`` agent (sonnet) for Julia ML, Deep Learning, and HPC.
  Covers Lux.jl/Flux.jl, MLJ.jl, CUDA.jl, MPI.jl, GraphNeuralNetworks.jl, and
  ReinforcementLearning.jl. Delegates SciML/ODE work to ``julia-pro``.
* Added 10 new Julia skills: ``julia-neural-networks``, ``julia-neural-architectures``,
  ``julia-training-diagnostics``, ``julia-ad-backends``, ``julia-ml-pipelines``,
  ``julia-gpu-kernels``, ``julia-hpc-distributed``, ``julia-model-deployment``,
  ``julia-graph-neural-networks``, ``julia-reinforcement-learning``.
* Updated 4 existing agents with ``julia-ml-hpc`` delegation rows.

**Nonlinear Dynamics Expansion (2026-03-31)**

* Added ``nonlinear-dynamics-expert`` agent (opus) for bifurcation theory,
  chaos analysis, network dynamics, and pattern formation.
* Added 8 nonlinear dynamics skills: bifurcation-analysis, chaos-attractors,
  pattern-formation, equation-discovery, network-coupled-dynamics, and more.

**Agent-Skill Synergy (100% coverage)**

* Added Expert Agent pointers to all 142 skills (47% → 100%).
* Dev-suite: 39/39 skills mapped to 9 domain agents.
* Science-suite: 29 orphan skills assigned to correct agents.

**Architecture Reorganization (5 → 3 suites)**

* Merged engineering-suite + infrastructure-suite + quality-suite into ``dev-suite``.
  Eliminates 27 cross-suite agent delegation edges.
* New structure: agent-core (3 meta-agents), dev-suite (9 agents, 27 commands,
  39 skills), science-suite (12 agents, 96 skills).

**v2.1.88 Spec Compliance**

* Migrated all manifests to ``.claude-plugin/plugin.json`` per official plugin spec.
* Removed non-spec ``version``/``color`` fields from all agent and command frontmatter.
* Version now lives only in ``plugin.json`` (single source of truth).

**Agent Hardening**

* Added ``effort``, ``memory``, ``tools``/``disallowedTools`` fields to all 24 agents.
* Added ``isolation: worktree`` to app-developer and automation-engineer.

**Model Tier Optimization**

* Assigned Opus to 6 deep-reasoning agents; Haiku to documentation-expert.
* Fixed neural-network-master from ``inherit`` to explicit ``sonnet``.

**Hook Expansion (3 → 10 events)**

* agent-core: Added PostToolUse, PostCompact, SubagentStop, PermissionDenied,
  TaskCompleted (3 → 8 events).
* dev-suite: Added PostToolUse and SubagentStop (2 events).

**Skill Consolidations (7 merges)**

* advanced-reasoning + structured-reasoning → reasoning-frameworks
* meta-cognitive-reflection + comprehensive-reflection-framework → reflection-framework
* ai-assisted-debugging + debugging-strategies → debugging-toolkit
* comprehensive-validation-framework → comprehensive-validation
* machine-learning-essentials → machine-learning
* parallel-computing-strategy → parallel-computing
* python-testing-patterns + javascript-testing-patterns → testing-patterns

v2.2.1 (2026-02-15)
-------------------

**Debugging Team Templates**

* Added 5 debugging :term:`agent teams <Agent Team>`: debug-gui, debug-numerical,
  debug-schema, debug-triage, and debug-full-audit.
* Teams use a Core Trio pattern (explorer → debugger → python-pro) plus rotating specialists.
* Consolidated from 35 to 21 team templates (40% reduction): merged 5 overlapping pairs
  (pr-review, quality-security, sci-compute, md-simulation, docs-publish), removed 7 niche
  teams, added alias table for backward compatibility. Total: 21 team templates.

**Agent Teams System**

* New ``/team-assemble`` command with pre-built team configurations.
* Teams span 5 categories: Development & Operations, Scientific Computing,
  Cross-Suite Specialized, Official Plugin Integration, and Debugging.
* Integrated 20 official plugin agents (pr-review-toolkit, feature-dev,
  coderabbit, plugin-dev, hookify, huggingface-skills, agent-sdk-dev, superpowers).
* Quality Gate Enhancers for adding review agents to any team.
* Comprehensive reference guide at ``docs/agent-teams-guide.md``.

**Agent Enhancements**

* Added adaptive thinking references to reasoning-engine agent.
* Integrated Agent Teams coordination into orchestrator agent.
* Added ``memory`` frontmatter to 11 key agents for persistent context.

**Hooks Infrastructure**

* Added hooks support to agent-core suite (``SessionStart``, ``PreToolUse``).
* New ``hooks/hooks.json`` configuration in agent-core plugin manifest.

**Tooling**

* Added context budget checker tool (``tools/validation/context_budget_checker.py``).

v2.2.0 (2026-02-14)
-------------------

* Added Agent Teams support with ``team-assemble`` command and guide.
* Updated all suites to v2.2.0 for Claude Opus 4.6 compatibility.
* Added context budget checker tool.

v2.1.0 (2026-01-20)
-------------------

**Suite Consolidation**

* Consolidated 31 legacy plugins into 5 suites: agent-core, engineering-suite,
  infrastructure-suite, quality-suite, science-suite.

**Flattened Skills Architecture**

* Restructured all skills to a flat directory structure for reliable auto-discovery.
* Science suite: 80 flattened skills for comprehensive coverage.

**Agent & Command Updates**

* Standardized agent metadata with consistent colors, versions, and examples.
* Renamed ``feature-dev`` to ``eng-feature-dev`` to prevent conflicts.

v2.0.0 (2025-12-15)
-------------------

* Initial release of the consolidated architecture.