Changelog ========= v3.5.2 (2026-05-06) -------------------- **Marketplace and Suite Consistency Sync** * **Version synchronization:** Enforced absolute version consistency (v3.5.2) across all 4 plugins (agent-core, dev-suite, science-suite, research-suite), the marketplace manifest, project metadata (pyproject.toml, Makefile), and the entire documentation suite. * **Documentation audit:** All suite reference pages (agent-core.rst, dev-suite.rst, research-suite.rst, science-suite.rst), cheatsheets, and integration guides updated to reflect the latest release. * **Metadata Alignment:** Synced version strings in ``pyproject.toml``, ``Makefile``, and ``marketplace.json`` to ensure unified deployment and update signaling. **Science-Suite Hub Integrity (test-driven)** * **Routing Decision Tree added to 7 science-suite skills** that contained ``../`` cross-references but were missing a ``## Routing Decision Tree`` code block (a hub-skill structural requirement): ``bayesian-ude-workflow``, ``equation-discovery``, ``md-simulation-setup``, ``neural-pde``, ``sciml-modern-stack``, ``self-improving-ai``, ``time-series-analysis``. * **Checklist headings normalized to** ``## Checklist`` **in 3 skills** whose qualified headings (``## Forecasting Checklist``, ``## 5. Performance & Convergence Checklist``, ``## Performance & Optimization Checklist``) caused the cross-suite invariant tests to fail: ``time-series-analysis``, ``advanced-simulations``, ``parallel-computing``. * **Pytest collection fixed:** Updated ``pyproject.toml`` to set ``testpaths = ["tools/tests"]`` and exclude ``plugins/`` and ``test-corpus/`` from recursion, resolving an ``ImportError`` during test discovery on reference projects. * **Sphinx short version fixed:** ``conf.py`` ``version`` field updated from stale ``"3.4"`` to ``"3.5"`` to match the ``release = "3.5.2"`` label. * **Test suite:** 258/258 pass. v3.5.1 (2026-05-06) -------------------- **3-Layer Routing Audit (Codex-assisted)** * Full 3-layer (router → hub → sub-hub) routing audit across all 4 plugins using Codex. Science-suite: **FAIL** (3 confirmed defects); agent-core, dev-suite, research-suite: **WARN** (34 missing fallback branches). * **Wrong-route fixes (3):** - ``advanced-simulations`` self-routed rare-event sampling — corrected to cross-hub route ``statistical-physics-hub → rare-events-sampling``. - ``sciml-and-diffeq`` routed PINN/NeuralPDE to ``pinn-engineer`` agent instead of ``neural-pde`` skill — fixed; agent delegation annotated separately. - ``llm-and-ai`` routed scientific-pipeline automation to ``sci-workflow-engineer`` agent without annotation — annotated clearly as agent delegation. * **Routing trees added to 5 sub-hub skills** that had Core Skills but no dispatch tree: ``julia-mastery``, ``machine-learning``, ``parallel-computing``, ``python-development``, ``statistical-physics``. * **Fallback branches added to all 34 routing trees** across all 4 plugins. Each terminal branch now names the appropriate expert agent for open-ended triage instead of silently dropping unmatched queries. * **Routing trigger/table expansions**: dev-hub, agent-hub, research-hub, and science-hub routing tables now enumerate concrete invocation phrases for every hub (e.g. control-theory and signal-processing corrected from ``research-and-domains`` to ``simulation-and-hpc``). **Agent Skills Array Expansions** * ``orchestrator`` (agent-core): added ``llm-engineering``, ``reasoning-and-memory`` — now preloads the full agent-core hub toolkit on activation. * ``simulation-expert``: added ``statistical-physics-hub`` — aligns with the expert-agent relationship already documented in statistical-physics-hub. * ``neural-network-master``: added ``ml-deployment`` — natural progression from architecture to production serving. * ``sre-expert``: added ``ci-cd-pipelines`` — SRE and CI/CD share deployment pipeline context. * ``systems-engineer``: added ``architecture-and-infra`` — systems programming and cloud infra are tightly coupled. * ``documentation-expert``: added ``testing-and-quality`` — documentation standards align with code review and quality gates. **Cross-Suite Delegation** * ``jax-pro`` and ``julia-pro`` descriptions updated to document delegation boundary with dev-suite ``software-architect`` for productionization, REST API design, and deployment. * Stale ``ai-engineer`` reference in ``ai-assistant`` command fixed to ``sci-workflow-engineer``. **Knowledge Graph Update** * Incremental graphify update: 3023 → **3238 nodes**, 4232 → **4569 edges**, 269 → **293 communities**. * +215 new nodes from routing tree and fallback additions; +338 new edges (routing links); 6 hyperedges extracted including Dev Hub Two-Tier Routing System and Research Spark Eight-Stage Pipeline. **Validation** * pytest: 258/258 passing (was 256; +2 from new ``TestDescriptionTrimming`` parametrizations). * All 4 plugins validate clean post-audit. * ``tools/README.md`` test count updated: 154 → 258. v3.5.0 (2026-05-05) -------------------- **Hub Architecture: Meta-Router Consolidation** * Each suite now registers a **single meta-hub** in ``plugin.json`` (``science-hub``, ``dev-hub``, ``research-hub``, ``agent-hub``). Routing is now three-tier: meta-hub → domain hub → sub-skill. This eliminates the flat hub list that caused ambiguous routing when multiple hubs matched similar trigger phrases. * Updated skill counts: **34 hubs → 189 sub-skills** across 223 SKILL.md files (was 31 hubs → 187 sub-skills in v3.4.1). **Agent Tier Rebalancing** * ``jax-pro``: sonnet → **opus** (GPU kernels, custom VJP/JVP, Pallas, XLA/HLO analysis) * ``julia-pro``: sonnet → **opus** (full SciML depth: MTK, Turing, UDEs, sensitivity) * ``ml-expert``: sonnet → **haiku** (fast-turnaround classical ML/MLOps; DL delegated to ``neural-network-master``) **Agent Renames** * ``ai-engineer`` → ``pinn-engineer`` (refocused on physics-informed neural networks: NeuralPDE.jl, DeepXDE, BPINN/BNNODE, inverse PDEs) * ``prompt-engineer`` → ``sci-workflow-engineer`` (refocused on LLM-assisted scientific pipelines: JAX/Julia codegen, experiment templates, scientific RAG) **New Registered Commands (+5, -2 net = 14 → 17)** * **Research Suite** (new): ``/lit-review``, ``/paper-implement``, ``/replicate`` * **Science Suite** (new): ``/md-sim``, ``/benchmark`` * **Dev Suite** (retired): ``/commit``, ``/refactor-clean`` (moved to skill-invoked; use ``commit-commands:commit`` plugin instead) **Synergy & Triggering Audit** * 22 skill and agent files edited to eliminate role collisions and close routing gaps. * Role collisions resolved: ``julia-pro`` vs ``julia-ml-hpc``, ``ml-expert`` vs ``neural-network-master``, ``context-specialist`` vs ``reasoning-engine``. * Routing gaps closed: ``deep-learning`` and ``advanced-simulations`` gained explicit routing decision trees (upgraded from sub-skills to domain hubs). * Trigger descriptions rewritten to enumerate concrete action verbs and domain terms instead of vague "Use when..." phrases. **Skill Listing Budget** * ``skillListingBudgetFraction`` set to **8%** in all four plugin ``settings.json`` files (was defaulting to 1%, causing 126 of 223 descriptions to be truncated at session start). * Also fixed two YAML frontmatter parse failures (unquoted colons) in ``agent-hub/SKILL.md`` and ``dev-hub/SKILL.md``. **Graphify Knowledge Graph** * Added ``graphify-out/`` with a semantic knowledge graph of the full codebase: 3023 nodes, 4232 edges, 269 communities (generated by Gemini with 92% extracted / 8% inferred edges). * ``GRAPH_REPORT.md`` documents god nodes, community hubs, surprising cross-module connections, and suggested graph queries. **New Skill: ai-pair (dev-suite)** * Added ``ai-pair`` sub-skill under ``dev-workflows`` hub: multi-model review patterns (Claude + Codex + Gemini), AI pair-programming prompt templates, and structured debugging workflows. **Test Corpus** * Added representative code corpus under ``test-corpus/`` for ``skill_validator.py`` (covers JAX, Julia Bayesian, Julia GNN, GitHub Actions CI, async Python, FastAPI). **Validation** * pytest: 258/258 passing. * All 223 SKILL.md files within 2% context budget. * ``skillListingBudgetFraction`` 8% confirmed sufficient for full skill listing. v3.4.1 (2026-04-19) -------------------- **Hotfix** * **research-suite hook loading:** removed the redundant ``"hooks": "./hooks/hooks.json"`` key from ``plugins/research-suite/.claude-plugin/plugin.json``. The Claude Code harness auto-loads ``hooks/hooks.json`` from every plugin, so the explicit manifest reference caused a "Duplicate hooks file detected" load error on plugin reload. Matches the pattern used by agent-core / dev-suite / science-suite (none of which declare ``hooks`` in their manifest). Version bump forces plugin-cache refresh so users who installed v3.4.0 pick up the fix automatically on ``/plugin update``. **Validation** * All other v3.4.0 validator numbers unchanged: metadata 0/0 on all 4 suites; doc_checker 0 warnings across all 4 suites; xref 531/531 valid; context budget 217/217 fit 200K; pytest 188/188 passing. v3.4.0 (2026-04-18) -------------------- **New Plugin: research-suite** * New 4th plugin suite extracted from ``science-suite``. Contains: - **2 agents:** ``research-expert`` (moved from science-suite, unified methodology specialist) and ``research-spark-orchestrator`` (new, drives the 8-stage refinement pipeline) - **3 workflow tracks:** - ``scientific-review`` — journal-ready peer review producing a ``.docx`` referee report with Six-Lens analysis and Confidential Comments to Editor (standalone skill). - ``research-spark`` — 8-stage artifact-gated refinement of a rough research idea into a fundable plan. Stages: ``spark-articulator`` → ``landscape-scanner`` → ``falsifiable-claim`` → ``theory-scaffold`` (Stages 4-5) → ``numerical-prototype`` → ``experiment-designer`` → ``premortem-critique``. State tracked in ``_state.yaml``. - ``research-practice`` — methodology hub routing to ``research-methodology``, ``research-quality-assessment``, ``research-paper-implementation``, ``scientific-communication``, ``evidence-synthesis`` (all moved from ``science-suite``'s ``research-and-domains`` hub). - **Three adversarial patterns enforced:** Reviewer 2 persona (Stages 2-3), stepwise derivation protocol (Stages 4-5), instrument capability 3× margin rule (Stage 7). - **0 registered commands** (workflows are skill-driven). The legacy ``/paper-review`` command was removed in favor of ``scientific-review``. * **science-suite** now focuses purely on computational work (JAX, Julia, HPC, ML/DL, physics, nonlinear dynamics): 11 agents (was 12), 14 hubs → 112 sub-skills (was 117). **research-suite optimization pass (same release)** * **Description normalization:** 5 methodology skills and 1 agent converted from weak "Use when..." form to strong third-person "This skill should be used when..." with 8-10 verbatim trigger phrases each (evidence-synthesis, scientific-communication, research-paper-implementation, research-quality-assessment, research-methodology, research-practice, research-expert). * **Non-standard frontmatter cleanup:** dropped ``maturity`` / ``specialization`` / inline ``Version:`` footers from research-paper-implementation and research-quality-assessment (version lives only in ``plugin.json`` per convention). * **Cross-skill linkage:** every methodology skill now points to its research-spark pipeline counterpart (e.g., ``research-methodology`` ↔ Stage 7 ``experiment-designer``; ``evidence-synthesis`` ↔ Stage 2 ``landscape-scanner``). Added phase↔stage mapping table to ``research-practice`` hub. * **Plugin metadata:** sharper ``plugin.json`` description; 10 new discoverability keywords (``power-analysis``, ``prisma``, ``grade``, ``consort``, ``strobe``, ``reproducibility``, ``paper-implementation``, ``statistical-rigor``, ``pre-registration``, ``doe``). **Documentation** * New ``docs/suites/research-suite.rst`` with full skill/agent coverage. * Updated ``docs/index.rst``, ``docs/categories/index.rst``, ``docs/reference/agents.md``, ``docs/reference/commands.md``, ``docs/integration-map.rst``, ``docs/guides/scientific-workflows.rst``, and ``docs/suites/science-suite.rst`` to reflect the split. * CLAUDE.md suite table updated: 3→4 suites, 24→25 agents, suite counts refreshed. **Validation** * metadata_validator: 0 errors on all 4 suites. * xref_validator: 530/530 cross-references valid. * doc_checker: 0 errors on research-suite. * context_budget_checker: 217/217 skills fit 2% budget on both 200K and 1M context windows. * pytest: 180/180 passing (was 154 in v3.3.0 — 26 new hook-integrity tests from the bandit/vulture/gitleaks audit addition). **v3.4.0 polish (2026-04-19)** * **research-suite hooks:** added 3 hook events (``SessionStart`` artifact-resume, ``TaskCompleted`` audit logging, ``SubagentStop`` prompt-based stage-artifact verification) + 2 command handler scripts. Brings hook event total across suites from 24 → 27. * **Version consistency sweep:** ``pyproject.toml`` bumped 3.3.0 → 3.4.0; ``docs/conf.py`` release 3.3.0 → 3.4.0; ``Makefile`` header 3.0.0 → 3.4.0; README badges + overview prose synchronized; agent-core ``commands/team-assemble.md`` "MyClaude v3.3.0" → "v3.4.0". All 13 canonical version surfaces now match. * **Trigger-phrase parity:** final ``SKILL.md`` (``science-suite/skills/research-and-domains``) gained "Use when..." trigger — 217/217 skills now conform. * **Tooling polish:** ``skill_validator.py`` now reports ``n/a ⚪ no corpus`` instead of misleading ``0.0% ❌`` when no test corpus is loaded; ``doc_checker.py`` wired into ``make validate`` (per-plugin iteration). v3.3.0 (2026-04-12) -------------------- **CLI 2.1.104 Ecosystem Optimization** * Agent maxTurns standardization: 10 agents raised to model tier targets (opus≥50, sonnet≥35) * Tool list enrichment: EnterPlanMode/ExitPlanMode added to all 7 allowlist-based opus agents; CronCreate, ScheduleWakeup added to automation-engineer, devops-architect, sre-expert; Monitor added to smart-debug command * Hook expansion: agent-core 12→17 events (+PreSubagentUse, ExecutionError, PermissionPrompt, ContextOverflow, CostThreshold), dev-suite 0→8 events (new hooks/ directory), science-suite 0→6 events (new hooks/ directory) * Forward-looking hooks: ContextOverflow and CostThreshold handlers registered in agent-core for future CLI versions (will not fire on CLI 2.1.x) * Broken references fixed: 5 phantom agent references in ultra-think and reflection commands (research-intelligence, hpc-numerical-coordinator, ai-software-architect) * Settings harmonization: dev-suite default maxTurns 35→40 **Agent & Skill Polish** * Agent color scheme: added ``color`` frontmatter field to all 24 agents for statusline differentiation (blue/cyan/green/magenta/red/yellow) * Abstract model tiers: replaced hardcoded model version strings with abstract tier names (opus/sonnet/haiku) across agent frontmatter * Hub skill triggers: improved skill description triggering patterns for better routing accuracy * CodeRabbit cleanup: removed uninstalled coderabbit agent from inventory and quality gate references **Testing & CI** * Cross-suite invariant tests: 19 new tests in ``test_cross_suite_invariants.py`` covering 7 coverage gaps (color field, model tier validity, hook event naming, skill budget, version sync, agent-skill cross-refs, command registration) * CI quality gate: new GitHub Actions workflow (``ci.yml``) running pytest, ruff, and mypy on PRs * Test count: 135→154 total tests across the suite **Validator State** * metadata_validator 0/0/0; xref_validator all valid; context_budget_checker 206/206 (no skill over 80%); pytest 154/154; ruff + mypy clean; pip-audit clean. v3.2.0 (2026-04-12) -------------------- **Skill Validator Vacuous-Pass Fix** * Fixed a vacuous-truth bug in ``skill_validator.py`` where the Overall Assessment reported "EXCELLENT" when no test corpus was provided. With ``total_tests == 0``, all rate metrics returned ``0.0%`` via division guards, which satisfied the ``< 10%`` threshold — a classic vacuous pass. The validator now reports "NO DATA" when no corpus is configured, and gates action-required messages behind ``total_tests > 0``. * Added two regression tests to ``test_skill_validator.py``: ``test_no_corpus_reports_no_data`` (end-to-end: load plugins without corpus, verify report contains "NO DATA" and not "EXCELLENT") and ``test_zero_tests_metrics_accuracy`` (unit: verify all ``SkillValidationMetrics`` properties return ``0.0`` when ``total_tests == 0``). **Validator State** * metadata_validator 0/0/0; xref_validator 523/523 valid; context_budget_checker 206/206 (no skill over 80%); skill_validator NO DATA (no corpus — by design); pytest 120/120 (+2 regression tests); ruff + mypy clean; pip-audit clean. v3.1.7 (2026-04-11) ------------------- **Bayesian SINDy Extraction** * **New skill** ``bayesian-sindy-workflow`` extracted from ``equation-discovery`` which was at 88% of its context budget after a prior external Bayesian SINDy section. Lands at ~68.55% budget with headroom for v3.1.8+ growth. * Structure: when-to-prefer decision table (Bayesian vs classical SINDy); three routes (horseshoe+NUTS, ensemble SINDy, Julia UQ-SINDy); full 5-stage Lorenz-63 worked example — ``scipy.integrate.solve_ivp`` + noise + central-difference, 10-term second-order polynomial library, NumPyro horseshoe prior + NUTS (4 chains, 1000 warmup, 2000 samples), ArviZ diagnostics (R-hat, ESS, PSIS-LOO), inclusion probabilities + credible intervals; prior-sensitivity analysis; Julia sidebar using Turing + DataDrivenDiffEq with ``truncated(...; lower=0)`` keyword form (Turing 0.37+ API drift caught via Context7). * ``equation-discovery`` dropped from 3535 → 2984 tokens (88% → 74.6%, under the 75% Commit D gate). * Science-suite now at **117 sub-skills** (from 116); ``bayesian-inference`` hub grows from 9 → 10 sub-skills. **Composition Headers** * Added ``## Composition with neighboring skills`` section headers to three science-suite skills that had prose cross-references but lacked the canonical header: ``stochastic-dynamics`` (5 bullets), ``non-equilibrium-theory`` (6 bullets), ``correlation-physical-systems`` (5 bullets). **freud IntermediateScattering Re-verification** * Re-verified ``freud.density.IntermediateScattering`` against the current freud release via Context7 on 2026-04-11: still not shipped in the density module. The ``numpy.fft`` + MDAnalysis fallback remains the recommended approach. Inline tag updated to ``[re-verified absent 2026-04-11]``. **Tooling — pip-audit** * Added ``pip-audit 2.10+`` dev dependency for automated CVE scanning. Wired into the per-commit validator gate. * **One HIGH finding ignored** — PYSEC-2022-42969 / CVE-2022-42969 (ReDoS in ``py.path.svnwc.InfoSvnCommand`` regex, reaches repo via ``interrogate 1.7.0 → py 1.11.0`` transitive). The ``py`` library is archived and unmaintained since 2022; ``interrogate`` upstream tracks removal at ``econchick/interrogate#142``, unreleased. In-repo exposure is zero (``interrogate`` only imports ``py`` for file-path handling, never reaching the vulnerable ``svnwc`` code path). Revisit in v3.1.8+ when ``interrogate`` can be replaced. **Validator State** * metadata_validator 0/0/0; xref_validator 523/523 valid (+4 from the new skill); context_budget_checker 206/206 (``equation-discovery`` 74.6%, ``bayesian-sindy-workflow`` 68.55%, ``non-equilibrium-theory`` 78.98%, no skill over 80%); skill_validator EXCELLENT; pytest 118/118; ruff + mypy clean; pip-audit clean with one ignore as documented above. v3.1.6 (2026-04-11) ------------------- **Julia ↔ Python Parity Polish** * **Julia → Python handoff for nonlinear time-series tools.** New section in ``chaos-attractors`` covering ``nolds``, ``antropy``, ``IDTxl``, ``pyEDM``, ``pyunicorn``, ``teaspoon`` (no native Julia equivalents) via the canonical ``PythonCall.jl`` + ``CondaPkg.jl`` import pattern with a concrete ``lyap_r`` example and GIL-under-``@threads`` caveats. Pointer edits in ``nonlinear-dynamics`` hub ecosystem-selection table and ``time-series-analysis``. Completes the Julia ↔ Python interop story with v3.1.5's reciprocal ``juliacall`` bifurcation path. * **BAR free-energy worked example** bridging Langevin ensemble and non-equilibrium theory. Added a 4-stage pipeline to ``non-equilibrium-theory``: (1) JAX Langevin ensemble + ``jax.lax.scan`` to accumulate forward/reverse work samples, (2) BAR fit via ``pymbar.other_estimators.bar`` (Context7-verified — pymbar 4.0 moved it out of top level), (3) variance comparison vs Jarzynski cumulant expansion, (4) multi-state MBAR pointer with ``alchemlyb`` ecosystem wrapper. One-sentence cross-link added to ``stochastic-dynamics`` to preserve its 74% budget cap. * **freud ecosystem for physical correlations.** Added a "Python freud ecosystem" section to ``correlation-physical-systems`` covering ``freud.density.RDF``, ``StaticStructureFactorDebye`` / ``StaticStructureFactorDirect`` (with API-drift warning: ``StaticStructureFactorDebye`` takes ``num_k_values``, not ``bins``), Steinhardt ``Q_l``, Hexatic, Nematic, and ``SolidLiquid`` phase classifier. ``freud.density.IntermediateScattering`` tagged ``[unverified]`` (absent in freud 3.5.0) with a ``numpy.fft`` + MDAnalysis fallback. Algorithmic notes in ``correlation-computational-methods`` (AABBQuery neighbor-list reuse, ``reset=False`` multi-frame averaging, CuPy breakeven N ≈ 10⁴, "MDAnalysis/MDTraj as iterator, freud as analyzer" production pattern). One-line hub pointer in ``correlation-analysis`` with a ``PythonCall.jl`` handoff note for Julia users. **Deferred** * Item A (ML-FF CLI spot-check) — explicitly deferred until the user resumes active MLIP training. **Validator State** * metadata_validator: 0/0/0 across all 3 plugins. * xref_validator: 519/519 references valid. * context_budget_checker: 205/205 skills fit. ``non-equilibrium-theory`` at 73.8% (under 75% Commit C gate), ``correlation-physical-systems`` at 74.45% (under 75% Commit D gate), ``chaos-attractors`` at 79% (under 80% at-risk line). * skill_validator EXCELLENT; pytest 118/118; ruff clean; mypy 0 errors. **Known forward items for v3.1.7+** * ``equation-discovery`` at 88% — flagged for Bayesian SINDy extraction split. * ``freud.density.IntermediateScattering`` presence in newer freud releases — re-verify when correlation skills are next touched. v3.1.5 (2026-04-11) ------------------- **Julia/Python Parity Pass** * **Fokker-Planck direct PDE methods** in ``stochastic-dynamics``: finite-difference / spectral discretization, boundary-condition patterns, cross-links to Langevin sampling. * **Python bifurcation continuation escape hatch** in ``bifurcation-analysis`` and ``nonlinear-dynamics``: documented the ``juliacall`` path to Julia bifurcation routines (since ``BifurcationKit.jl`` is blocked on Julia 1.12), plus PyDSTool and AUTO-07p as Python-native alternatives. * **Modern ML force fields** expansion in ``ml-force-fields``: equivariant GNNs (NequIP, MACE, Allegro), Julia ACE stack (``ACEpotentials.jl``), training loops, active learning, energy-and-force loss balance. Budget-tight at 78%. * **Julia Monte Carlo idioms** in ``statistical-physics``: Metropolis sampler patterns with ``@inbounds`` / ``@fastmath``, SIMD inner loops, parallel tempering via ``Distributed.jl``, tuning heuristics. **Tooling** * Added ``types-PyYAML`` dev dependency for mypy stubs. * Tightened ``self-improving-ai`` triggers so they no longer overlap with ``dspy-basics``. * Added ``tools/validation/command_file_linter.py`` — a targeted structural linter for Claude Code command files with 5 stable rule IDs (``fence-unbalanced``, ``heading-skip``, ``step-ref-broken``, ``trailing-whitespace``, ``heading-duplicate``). Importable API + standalone CLI, wired into ``make validate`` (errors block, warnings non-blocking). Caught a pre-existing duplicate ``## Metrics`` H2 in ``dev-suite/commands/tech-debt.md``. 15 new tests. **Validator State** * metadata 0/0/0; xref 515/515 valid (+3 from v3.1.4); context budget 204/204 (``bifurcation-analysis`` 79%, ``ml-force-fields`` 78%, both under 80%). pytest 103 passing, ruff clean, mypy 0 errors. v3.1.4 (2026-04-11) ------------------- **Research-Focus Optimization Pass (science-suite)** * Aligned agents and skills with research in Bayesian MCMC (NUTS / Consensus MC / Pigeons), Universal Differential Equations, SINDy, nonlinear dynamics, time series, rare events / avalanche dynamics, non-equilibrium statistical physics, and point / jump processes. * **9 new sub-skills** (hub-discovered, not registered in ``plugin.json``): ``consensus-mcmc-pigeons`` (non-reversible parallel tempering via Pigeons.jl, now distinguished from Scott-2016 divide-and-conquer Consensus Monte Carlo), ``bayesian-ude-workflow`` (Turing + DiffEq + Lux staged pipeline), ``bayesian-ude-jax`` (Python/JAX counterpart via Diffrax + Equinox + NumPyro), ``bayesian-pinn`` (BNNODE/BayesianPINN extracted from ``neural-pde`` which drops from 78% → 65% of budget), ``point-processes`` (Hawkes / HSGP / Julia PointProcesses.jl), ``rare-events-sampling`` (large-deviation / cloning / avalanche statistics), ``self-improving-ai`` (research overview), ``dspy-basics`` (DSPy programmatic prompts depth-skill), ``rlaif-training`` (Constitutional AI / RLAIF / DPO depth-skill). * **1 new agent-core skill**: ``self-improving-agents`` under the ``reasoning-and-memory`` hub — operational counterpart to science-suite's ``self-improving-ai`` (agents inside Claude Code vs research framework overview). Covers closed-loop reflection-refine-validate, self-consistency ensembles, DSPy and TextGrad automatic prompt optimization, evolutionary prompt search, and constitutional self-critique. **Research Audit Remediation** * Added ``extreme-value-statistics`` skill (GEV/GPD/Hill/Pickands/POT, return levels, non-stationary EVT) and wired into ``statistical-physics-hub``. * Wired the orphaned ``robust-testing`` sub-skill into the ``research-and-domains`` hub (was on disk but unreachable). * Extended ``rare-events-sampling`` triggers to cover SOC, sandpile / Bak-Tang-Wiesenfeld, crackling noise, and avalanche-size distributions; cross-linked to ``extreme-value-statistics``. * Resolved jump-diffusion routing: ``stochastic-dynamics`` owns general physics jump-diffusion SDEs (Lévy flights, shot noise, regime-switching Langevin); ``catalyst-reactions`` stays scoped to biochemical reaction networks. * Added Bayesian SINDy coverage to ``equation-discovery`` (horseshoe-prior NumPyro, ensemble SINDy, UQ-SINDy via Turing) — pushed to 88% budget and flagged for v3.1.7 extraction. * Disambiguated ``sciml-modern-stack`` vs ``sciml-and-diffeq`` by rewriting hub routing without touching the frozen ``sciml-modern-stack`` body. * Added missing trigger keywords (ADF, KPSS, Phillips-Perron, PELT, BinSeg, renewal processes, non-parametric Hawkes EM) to ``time-series-analysis`` and ``point-processes``. **Agent Updates** * ``julia-pro``: Bayesian stack upgraded to Turing + Pigeons; sensealg table rewritten with ``GaussAdjoint`` as modern default and the ForwardDiff-bypasses-sensealg factual fix; decision tree adds UDE and multimodal branches. * ``julia-ml-hpc``, ``statistical-physicist``, ``ai-engineer``, ``jax-pro``, ``simulation-expert`` also aligned with research-focus delegation updates. **Codebase-Aware /team-assemble (agent-core)** * Major rework: static catalog → codebase-aware recommender / adapter / validator. **21 → 25 team templates.** * **4 new teams**: ``nonlinear-dynamics`` (bifurcation, chaos, coupled oscillators, pattern formation — first wiring of ``nonlinear-dynamics-expert`` to its documented delegation targets), ``julia-ml`` (Lux.jl/Flux.jl/MLJ.jl + CUDA.jl/MPI.jl distributed training), ``multi-agent-systems``, ``sci-desktop`` (PyQt6/PySide6 + JAX scientific desktop apps). * ``ai-engineering`` team swaps ``reasoning-architect`` for ``context-architect`` as default 4th teammate. * Closed drift: **0 unused local agents** (down from 4). * New capabilities: Step 1.5 codebase detection (4-tier signal gathering with efficiency gates), Step 2.5 fingerprint table, Step 2.6 rule-based ranking with confidence labels, Step 2.6a/b validation + auto-fill, five new invocation modes. * **Session cache**: Tier 0 cache at ``/tmp/team-assemble-cache/.json`` with mtime-based invalidation (15 min TTL); ``--no-cache`` bypass flag. * **S1 prompt-injection safeguards** for README probes (HIGH): character neutralization, ```` wrapping, 9 refusal-trigger patterns. Non-English README hardening via ``language_hint`` classification and auto-fill trust tiers. **Tooling Hardening** * Added ``sys.path.insert`` to 5 validators so CLI invocation works without ``PYTHONPATH=.``. * ``PluginLoader`` consolidates YAML frontmatter parsing via normalized component helpers. * ``xref_validator`` gains disk-discovery of sub-skills so hub-architecture sub-skills no longer false-positive as broken references. * Restored ``requires = ["maturin>=1.0,<2.0"]`` in rust-extensions scaffold (maturin is the real PEP 517 build backend for PyO3). * ``pyproject.toml``: excluded ``test-corpus/`` from mypy and ruff. **Validator State** * metadata 0/0/0; xref 512/512 valid; context budget 204/204; pytest 60/60; ruff clean. v3.1.3 (2026-04-10) ------------------- **New Skill: thinkfirst (agent-core)** * Added ``thinkfirst`` as a sub-skill under the ``llm-engineering`` hub. Interview-first workflow that clarifies vague user intent through a Seven Dimensions framework before any prompt is drafted. * Positioned as the first branch in the ``llm-engineering`` routing tree so users with brain dumps hit clarification before reaching for production templates. * Cross-linked with ``prompt-engineering-patterns``: ``thinkfirst`` handles intent clarification, ``prompt-engineering-patterns`` handles production-grade refinement. v3.1.2 (2026-04-06) ------------------- **Bug Fixes** * Removed duplicate ``hooks`` manifest entries from agent-core and dev-suite ``plugin.json``. The ``hooks/hooks.json`` file is auto-discovered by convention; declaring it explicitly caused duplicate-load errors at startup. * Fixed dev-suite ``.lsp.json`` structure to match expected schema. **Documentation** * Updated plugin READMEs to use hub→sub-skill notation matching CLAUDE.md. * Rewrote ``tools/README.md`` to reflect current tooling structure. v3.1.1 (2026-04-06) ------------------- **Bug Fixes** * Set ``strict: true`` in marketplace.json to resolve conflicting manifests when both marketplace.json and individual plugin.json files declare components. * Fixed agent teams guide reference in README (34 → 21 teams). v3.1.0 (2026-04-03) ------------------- **Hub-Skill Architecture** * Introduced :term:`Hub Skill` routing: 26 hub skills route to 167 :term:`sub-skills ` via :term:`routing decision trees `. Hubs are declared in ``plugin.json``; sub-skills are discovered through hub routing. Eliminates ambiguous flat-list skill matching. * agent-core: 3 hubs (agent-systems, reasoning-and-memory, llm-engineering) → 12 sub-skills. * dev-suite: 9 hubs (backend-patterns, frontend-and-mobile, architecture-and-infra, testing-and-quality, ci-cd-pipelines, observability-and-sre, python-toolchain, data-and-security, dev-workflows) → 49 sub-skills. * science-suite: 14 hubs → 106 sub-skills. * Total: 24 agents, 14 registered commands, 26 hubs → 167 sub-skills (193 total). **Knowledge Gap Closure (+28 skills, +3 commands)** * Added 6 agent-core skills: prompt-engineering-patterns, memory-system-patterns, safety-guardrails, tool-use-patterns, agent-evaluation, knowledge-graph-patterns. * Added 10 dev-suite skills: database-patterns, containerization-patterns, cloud-provider-patterns, message-queue-patterns, caching-patterns, graphql-patterns, accessibility-testing, websocket-patterns, search-patterns, mobile-testing-patterns. * Added 12 science-suite skills: computer-vision, nlp-fundamentals, bioinformatics, time-series-analysis, control-theory, experiment-tracking, signal-processing, symbolic-math, reinforcement-learning, quantum-computing, federated-learning, advanced-optimization. * Added 3 science-suite commands: ``run-experiment``, ``analyze-data``, ``paper-review``. * Deduplicated ``prompt-engineering-patterns`` (science-suite copy removed, migrated to agent-core). **Agent Optimization (24 agents)** * Added ``background: true`` to 18 agents for parallel dispatch. * Upgraded neural-network-master and simulation-expert to opus model tier. * 9 opus agents: orchestrator, reasoning-engine, software-architect, debugger-pro, research-expert, statistical-physicist, nonlinear-dynamics-expert, neural-network-master, simulation-expert. * Added "Use when..." activation triggers to all 24 agent descriptions. * Fixed cross-suite delegation annotations and invalid agent references. **Skill Quality & Integrity** * All 193 skills have: trigger phrases, Expert Agent sections, and checklists. * Fixed 235 broken relative links across 38 files. * Zero orphaned skills — all 193 reachable via hub routing. * Resolved routing overlaps (ml-and-data-science/ml-deployment/deep-learning-hub triangle). * Refactored testing-patterns from 96% to under 75% context budget. * 193/193 skills within 2% context budget. **Security Fixes** * Gated ``commit_fixes()`` behind ``--auto-commit`` flag (default: dry-run). * Added package name validation regex for npm/pip subprocess calls. * Replaced ``git add .`` with ``git add --update`` for safe staging. * Anchored SessionStart hook matcher to ``^(startup|resume)$``. **Documentation** * Rewrote all reference docs for hub architecture. * Added :term:`Hub Skill`, :term:`Sub-Skill`, :term:`Routing Decision Tree`, and :term:`Agent Team` to glossary. * Updated all workflow guides with hub → sub notation. * Docs build with zero warnings. 60/60 tests pass. **Governance** * Added skill size governance policy (>3000 bytes = review required). * 14 commands intentionally registered; 22 skill-invoked by design. v3.0.0 (2026-04-02) ------------------- **Julia ML/DL/HPC Expansion** * Added ``julia-ml-hpc`` agent (sonnet) for Julia ML, Deep Learning, and HPC. Covers Lux.jl/Flux.jl, MLJ.jl, CUDA.jl, MPI.jl, GraphNeuralNetworks.jl, and ReinforcementLearning.jl. Delegates SciML/ODE work to ``julia-pro``. * Added 10 new Julia skills: ``julia-neural-networks``, ``julia-neural-architectures``, ``julia-training-diagnostics``, ``julia-ad-backends``, ``julia-ml-pipelines``, ``julia-gpu-kernels``, ``julia-hpc-distributed``, ``julia-model-deployment``, ``julia-graph-neural-networks``, ``julia-reinforcement-learning``. * Updated 4 existing agents with ``julia-ml-hpc`` delegation rows. **Nonlinear Dynamics Expansion (2026-03-31)** * Added ``nonlinear-dynamics-expert`` agent (opus) for bifurcation theory, chaos analysis, network dynamics, and pattern formation. * Added 8 nonlinear dynamics skills: bifurcation-analysis, chaos-attractors, pattern-formation, equation-discovery, network-coupled-dynamics, and more. **Agent-Skill Synergy (100% coverage)** * Added Expert Agent pointers to all 142 skills (47% → 100%). * Dev-suite: 39/39 skills mapped to 9 domain agents. * Science-suite: 29 orphan skills assigned to correct agents. **Architecture Reorganization (5 → 3 suites)** * Merged engineering-suite + infrastructure-suite + quality-suite into ``dev-suite``. Eliminates 27 cross-suite agent delegation edges. * New structure: agent-core (3 meta-agents), dev-suite (9 agents, 27 commands, 39 skills), science-suite (12 agents, 96 skills). **v2.1.88 Spec Compliance** * Migrated all manifests to ``.claude-plugin/plugin.json`` per official plugin spec. * Removed non-spec ``version``/``color`` fields from all agent and command frontmatter. * Version now lives only in ``plugin.json`` (single source of truth). **Agent Hardening** * Added ``effort``, ``memory``, ``tools``/``disallowedTools`` fields to all 24 agents. * Added ``isolation: worktree`` to app-developer and automation-engineer. **Model Tier Optimization** * Assigned Opus to 6 deep-reasoning agents; Haiku to documentation-expert. * Fixed neural-network-master from ``inherit`` to explicit ``sonnet``. **Hook Expansion (3 → 10 events)** * agent-core: Added PostToolUse, PostCompact, SubagentStop, PermissionDenied, TaskCompleted (3 → 8 events). * dev-suite: Added PostToolUse and SubagentStop (2 events). **Skill Consolidations (7 merges)** * advanced-reasoning + structured-reasoning → reasoning-frameworks * meta-cognitive-reflection + comprehensive-reflection-framework → reflection-framework * ai-assisted-debugging + debugging-strategies → debugging-toolkit * comprehensive-validation-framework → comprehensive-validation * machine-learning-essentials → machine-learning * parallel-computing-strategy → parallel-computing * python-testing-patterns + javascript-testing-patterns → testing-patterns v2.2.1 (2026-02-15) ------------------- **Debugging Team Templates** * Added 5 debugging :term:`agent teams `: debug-gui, debug-numerical, debug-schema, debug-triage, and debug-full-audit. * Teams use a Core Trio pattern (explorer → debugger → python-pro) plus rotating specialists. * Consolidated from 35 to 21 team templates (40% reduction): merged 5 overlapping pairs (pr-review, quality-security, sci-compute, md-simulation, docs-publish), removed 7 niche teams, added alias table for backward compatibility. Total: 21 team templates. **Agent Teams System** * New ``/team-assemble`` command with pre-built team configurations. * Teams span 5 categories: Development & Operations, Scientific Computing, Cross-Suite Specialized, Official Plugin Integration, and Debugging. * Integrated 20 official plugin agents (pr-review-toolkit, feature-dev, coderabbit, plugin-dev, hookify, huggingface-skills, agent-sdk-dev, superpowers). * Quality Gate Enhancers for adding review agents to any team. * Comprehensive reference guide at ``docs/agent-teams-guide.md``. **Agent Enhancements** * Added adaptive thinking references to reasoning-engine agent. * Integrated Agent Teams coordination into orchestrator agent. * Added ``memory`` frontmatter to 11 key agents for persistent context. **Hooks Infrastructure** * Added hooks support to agent-core suite (``SessionStart``, ``PreToolUse``). * New ``hooks/hooks.json`` configuration in agent-core plugin manifest. **Tooling** * Added context budget checker tool (``tools/validation/context_budget_checker.py``). v2.2.0 (2026-02-14) ------------------- * Added Agent Teams support with ``team-assemble`` command and guide. * Updated all suites to v2.2.0 for Claude Opus 4.6 compatibility. * Added context budget checker tool. v2.1.0 (2026-01-20) ------------------- **Suite Consolidation** * Consolidated 31 legacy plugins into 5 suites: agent-core, engineering-suite, infrastructure-suite, quality-suite, science-suite. **Flattened Skills Architecture** * Restructured all skills to a flat directory structure for reliable auto-discovery. * Science suite: 80 flattened skills for comprehensive coverage. **Agent & Command Updates** * Standardized agent metadata with consistent colors, versions, and examples. * Renamed ``feature-dev`` to ``eng-feature-dev`` to prevent conflicts. v2.0.0 (2025-12-15) ------------------- * Initial release of the consolidated architecture.