Changelog

v3.5.2 (2026-05-06)

Marketplace and Suite Consistency Sync

Version synchronization: Enforced absolute version consistency (v3.5.2) across all 4 plugins (agent-core, dev-suite, science-suite, research-suite), the marketplace manifest, project metadata (pyproject.toml, Makefile), and the entire documentation suite.
Documentation audit: All suite reference pages (agent-core.rst, dev-suite.rst, research-suite.rst, science-suite.rst), cheatsheets, and integration guides updated to reflect the latest release.
Metadata alignment: version strings in pyproject.toml, Makefile, and marketplace.json now match, so deployment and update signaling all report the same release.
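A minimal sketch of the kind of check that version synchronization implies. The function name and the snapshot dictionary are hypothetical; the real invariant tests are not reproduced here, and a real check would read each file rather than take a hand-written mapping.

```python
def find_version_drift(surfaces, expected):
    """Return every surface whose version string differs from `expected`."""
    return {name: found for name, found in surfaces.items() if found != expected}


# Hypothetical snapshot of version surfaces (values are illustrative only).
surfaces = {
    "plugins/agent-core": "3.5.2",
    "plugins/dev-suite": "3.5.2",
    "plugins/science-suite": "3.5.2",
    "plugins/research-suite": "3.5.2",
    "marketplace.json": "3.5.2",
    "pyproject.toml": "3.5.2",
    "Makefile": "3.5.1",  # deliberately stale so the check has something to report
}

drift = find_version_drift(surfaces, expected="3.5.2")
```

An empty result means every surface agrees; any entry names a file that still needs a bump.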

Science-Suite Hub Integrity (test-driven)

Routing Decision Tree added to 7 science-suite skills that contained ../ cross-references but were missing a ## Routing Decision Tree code block (a hub-skill structural requirement): bayesian-ude-workflow, equation-discovery, md-simulation-setup, neural-pde, sciml-modern-stack, self-improving-ai, time-series-analysis.
Checklist headings normalized to ## Checklist in 3 skills whose qualified headings (## Forecasting Checklist, ## 5. Performance & Convergence Checklist, ## Performance & Optimization Checklist) caused the cross-suite invariant tests to fail: time-series-analysis, advanced-simulations, parallel-computing.
Pytest collection fixed: Updated pyproject.toml to set testpaths = ["tools/tests"] and exclude plugins/ and test-corpus/ from recursion, resolving an ImportError during test discovery on reference projects.
Sphinx short version fixed: conf.py version field updated from stale "3.4" to "3.5" to match the release = "3.5.2" label.
Test suite: 258/258 pass.
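The checklist-heading normalization above could be sketched as follows. This is an illustration only: the regular expression and function name are assumptions, not the project's actual tooling, and the headings used are the three quoted in the entry.

```python
import re

# Matches an H2 heading whose text ends in "Checklist", with any qualifier before it.
QUALIFIED_CHECKLIST = re.compile(r"^##\s+.*\bChecklist\s*$")


def normalize_checklist_heading(line):
    """Collapse a qualified H2 checklist heading to the canonical form."""
    return "## Checklist" if QUALIFIED_CHECKLIST.match(line) else line


examples = [
    "## Forecasting Checklist",
    "## 5. Performance & Convergence Checklist",
    "## Performance & Optimization Checklist",
]
normalized = [normalize_checklist_heading(h) for h in examples]
```

Headings at other levels and unrelated H2 headings pass through unchanged.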

v3.5.1 (2026-05-06)

3-Layer Routing Audit (Codex-assisted)

Full 3-layer (router → hub → sub-hub) routing audit across all 4 plugins using Codex. Science-suite: FAIL (3 confirmed defects); agent-core, dev-suite, research-suite: WARN (34 missing fallback branches).
Wrong-route fixes (3):
- advanced-simulations self-routed rare-event sampling — corrected to cross-hub route statistical-physics-hub → rare-events-sampling.
- sciml-and-diffeq routed PINN/NeuralPDE to pinn-engineer agent instead of neural-pde skill — fixed; agent delegation annotated separately.
- llm-and-ai routed scientific-pipeline automation to sci-workflow-engineer agent without annotation — annotated clearly as agent delegation.
Routing trees added to 5 sub-hub skills that had Core Skills but no dispatch tree: julia-mastery, machine-learning, parallel-computing, python-development, statistical-physics.
Fallback branches added to all 34 routing trees across all 4 plugins. Each terminal branch now names the appropriate expert agent for open-ended triage instead of silently dropping unmatched queries.
Routing trigger/table expansions: dev-hub, agent-hub, research-hub, and science-hub routing tables now enumerate concrete invocation phrases for every hub. Misfiled entries were corrected along the way (e.g. control-theory and signal-processing moved from research-and-domains to simulation-and-hpc).
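To illustrate what a fallback branch buys, here is a toy dispatch function. It is a sketch under strong assumptions: the real routing trees are prose inside SKILL.md files, not Python, and the keyword matching and the tree contents below are hypothetical.

```python
# Hypothetical miniature routing tree for one hub.
STAT_PHYS_TREE = {
    "branches": [
        ("rare-event", "rare-events-sampling"),
        ("extreme value", "extreme-value-statistics"),
    ],
    # Fallback: an expert agent handles anything no branch matched.
    "fallback": "statistical-physicist",
}


def route(query, tree):
    """Return the first matching branch target, or the fallback agent."""
    text = query.lower()
    for keyword, target in tree["branches"]:
        if keyword in text:
            return target
    return tree["fallback"]


matched = route("Rare-event sampling for avalanche statistics", STAT_PHYS_TREE)
unmatched = route("Help me think about my model", STAT_PHYS_TREE)
```

Without the final `return tree["fallback"]`, an unmatched query would fall through and yield nothing, which is the silent drop the audit removed.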

Agent Skills Array Expansions

orchestrator (agent-core): added llm-engineering, reasoning-and-memory — now preloads the full agent-core hub toolkit on activation.
simulation-expert: added statistical-physics-hub — aligns with the expert-agent relationship already documented in statistical-physics-hub.
neural-network-master: added ml-deployment — natural progression from architecture to production serving.
sre-expert: added ci-cd-pipelines — SRE and CI/CD share deployment pipeline context.
systems-engineer: added architecture-and-infra — systems programming and cloud infra are tightly coupled.
documentation-expert: added testing-and-quality — documentation standards align with code review and quality gates.

Cross-Suite Delegation

jax-pro and julia-pro descriptions updated to document delegation boundary with dev-suite software-architect for productionization, REST API design, and deployment.
Stale ai-engineer reference in ai-assistant command fixed to sci-workflow-engineer.

Knowledge Graph Update

Incremental graphify update: 3023 → 3238 nodes, 4232 → 4569 edges, 269 → 293 communities.
+215 new nodes from routing tree and fallback additions; +337 new edges (routing links); 6 hyperedges extracted including Dev Hub Two-Tier Routing System and Research Spark Eight-Stage Pipeline.

Validation

pytest: 258/258 passing (was 256; +2 from new TestDescriptionTrimming parametrizations).
All 4 plugins validate clean post-audit.
tools/README.md test count updated: 154 → 258.

v3.5.0 (2026-05-05)

Hub Architecture: Meta-Router Consolidation

Each suite now registers a single meta-hub in plugin.json (science-hub, dev-hub, research-hub, agent-hub). Routing is now three-tier: meta-hub → domain hub → sub-skill. This eliminates the flat hub list that caused ambiguous routing when multiple hubs matched similar trigger phrases.
Updated skill counts: 34 hubs → 189 sub-skills across 223 SKILL.md files (was 31 hubs → 187 sub-skills in v3.4.1).

Agent Tier Rebalancing

jax-pro: sonnet → opus (GPU kernels, custom VJP/JVP, Pallas, XLA/HLO analysis)
julia-pro: sonnet → opus (full SciML depth: MTK, Turing, UDEs, sensitivity)
ml-expert: sonnet → haiku (fast-turnaround classical ML/MLOps; DL delegated to neural-network-master)

Agent Renames

ai-engineer → pinn-engineer (refocused on physics-informed neural networks: NeuralPDE.jl, DeepXDE, BPINN/BNNODE, inverse PDEs)
prompt-engineer → sci-workflow-engineer (refocused on LLM-assisted scientific pipelines: JAX/Julia codegen, experiment templates, scientific RAG)

New Registered Commands (+5, -2; net 14 → 17)

Research Suite (new): /lit-review, /paper-implement, /replicate
Science Suite (new): /md-sim, /benchmark
Dev Suite (retired): /commit, /refactor-clean (moved to skill-invoked; use commit-commands:commit plugin instead)

Synergy & Triggering Audit

22 skill and agent files edited to eliminate role collisions and close routing gaps.
Role collisions resolved: julia-pro vs julia-ml-hpc, ml-expert vs neural-network-master, context-specialist vs reasoning-engine.
Routing gaps closed: deep-learning and advanced-simulations gained explicit routing decision trees (upgraded from sub-skills to domain hubs).
Trigger descriptions rewritten to enumerate concrete action verbs and domain terms instead of vague “Use when…” phrases.

Skill Listing Budget

skillListingBudgetFraction set to 8% in all four plugin settings.json files (was defaulting to 1%, causing 126 of 223 descriptions to be truncated at session start).
Also fixed two YAML frontmatter parse failures (unquoted colons) in agent-hub/SKILL.md and dev-hub/SKILL.md.

Graphify Knowledge Graph

Added graphify-out/ with a semantic knowledge graph of the full codebase: 3023 nodes, 4232 edges, 269 communities (generated by Gemini with 92% extracted / 8% inferred edges).
GRAPH_REPORT.md documents god nodes, community hubs, surprising cross-module connections, and suggested graph queries.

New Skill: ai-pair (dev-suite)

Added ai-pair sub-skill under dev-workflows hub: multi-model review patterns (Claude + Codex + Gemini), AI pair-programming prompt templates, and structured debugging workflows.

Test Corpus

Added representative code corpus under test-corpus/ for skill_validator.py (covers JAX, Julia Bayesian, Julia GNN, GitHub Actions CI, async Python, FastAPI).

Validation

pytest: 258/258 passing.
All 223 SKILL.md files within 2% context budget.
skillListingBudgetFraction 8% confirmed sufficient for full skill listing.

v3.4.1 (2026-04-19)

Hotfix

research-suite hook loading: removed the redundant "hooks": "./hooks/hooks.json" key from plugins/research-suite/.claude-plugin/plugin.json. The Claude Code harness auto-loads hooks/hooks.json from every plugin, so the explicit manifest reference caused a “Duplicate hooks file detected” load error on plugin reload. Matches the pattern used by agent-core / dev-suite / science-suite (none of which declare hooks in their manifest). Version bump forces plugin-cache refresh so users who installed v3.4.0 pick up the fix automatically on /plugin update.

Validation

All other v3.4.0 validator numbers unchanged: metadata 0/0 on all 4 suites; doc_checker 0 warnings across all 4 suites; xref 531/531 valid; context budget 217/217 fit 200K; pytest 188/188 passing.

v3.4.0 (2026-04-18)

New Plugin: research-suite

New 4th plugin suite extracted from science-suite. Contains:
- 2 agents: research-expert (moved from science-suite, unified methodology specialist) and research-spark-orchestrator (new, drives the 8-stage refinement pipeline)
- 3 workflow tracks:
  - scientific-review — journal-ready peer review producing a .docx referee report with Six-Lens analysis and Confidential Comments to Editor (standalone skill).
  - research-spark — 8-stage artifact-gated refinement of a rough research idea into a fundable plan. Stages: spark-articulator → landscape-scanner → falsifiable-claim → theory-scaffold (Stages 4-5) → numerical-prototype → experiment-designer → premortem-critique. State tracked in _state.yaml.
  - research-practice — methodology hub routing to research-methodology, research-quality-assessment, research-paper-implementation, scientific-communication, evidence-synthesis (all moved from science-suite’s research-and-domains hub).
- Three adversarial patterns enforced: Reviewer 2 persona (Stages 2-3), stepwise derivation protocol (Stages 4-5), instrument capability 3× margin rule (Stage 7).
- 0 registered commands (workflows are skill-driven). The legacy /paper-review command was removed in favor of scientific-review.
science-suite now focuses purely on computational work (JAX, Julia, HPC, ML/DL, physics, nonlinear dynamics): 11 agents (was 12), 14 hubs → 112 sub-skills (was 117).

research-suite optimization pass (same release)

Description normalization: 5 methodology skills, the research-practice hub, and 1 agent converted from weak “Use when…” form to strong third-person “This skill should be used when…” with 8-10 verbatim trigger phrases each (evidence-synthesis, scientific-communication, research-paper-implementation, research-quality-assessment, research-methodology, research-practice, research-expert).
Non-standard frontmatter cleanup: dropped maturity / specialization / inline Version: footers from research-paper-implementation and research-quality-assessment (version lives only in plugin.json per convention).
Cross-skill linkage: every methodology skill now points to its research-spark pipeline counterpart (e.g., research-methodology ↔ Stage 7 experiment-designer; evidence-synthesis ↔ Stage 2 landscape-scanner). Added phase↔stage mapping table to research-practice hub.
Plugin metadata: sharper plugin.json description; 10 new discoverability keywords (power-analysis, prisma, grade, consort, strobe, reproducibility, paper-implementation, statistical-rigor, pre-registration, doe).

Documentation

New docs/suites/research-suite.rst with full skill/agent coverage.
Updated docs/index.rst, docs/categories/index.rst, docs/reference/agents.md, docs/reference/commands.md, docs/integration-map.rst, docs/guides/scientific-workflows.rst, and docs/suites/science-suite.rst to reflect the split.
CLAUDE.md suite table updated: 3→4 suites, 24→25 agents, suite counts refreshed.

Validation

metadata_validator: 0 errors on all 4 suites.
xref_validator: 530/530 cross-references valid.
doc_checker: 0 errors on research-suite.
context_budget_checker: 217/217 skills fit 2% budget on both 200K and 1M context windows.
pytest: 180/180 passing (was 154 in v3.3.0 — 26 new hook-integrity tests from the bandit/vulture/gitleaks audit addition).

v3.4.0 polish (2026-04-19)

research-suite hooks: added 3 hook events (SessionStart artifact-resume, TaskCompleted audit logging, SubagentStop prompt-based stage-artifact verification) + 2 command handler scripts. Brings the hook event total across suites from 24 to 27.
Version consistency sweep: pyproject.toml bumped 3.3.0 → 3.4.0; docs/conf.py release 3.3.0 → 3.4.0; Makefile header 3.0.0 → 3.4.0; README badges + overview prose synchronized; agent-core commands/team-assemble.md “MyClaude v3.3.0” → “v3.4.0”. All 13 canonical version surfaces now match.
Trigger-phrase parity: final SKILL.md (science-suite/skills/research-and-domains) gained “Use when…” trigger — 217/217 skills now conform.
Tooling polish: skill_validator.py now reports n/a ⚪ no corpus instead of misleading 0.0% ❌ when no test corpus is loaded; doc_checker.py wired into make validate (per-plugin iteration).

v3.3.0 (2026-04-12)

CLI 2.1.104 Ecosystem Optimization

Agent maxTurns standardization: 10 agents raised to model tier targets (opus≥50, sonnet≥35)
Tool list enrichment: EnterPlanMode/ExitPlanMode added to all 7 allowlist-based opus agents; CronCreate, ScheduleWakeup added to automation-engineer, devops-architect, sre-expert; Monitor added to smart-debug command
Hook expansion: agent-core 12→17 events (+PreSubagentUse, ExecutionError, PermissionPrompt, ContextOverflow, CostThreshold), dev-suite 0→8 events (new hooks/ directory), science-suite 0→6 events (new hooks/ directory)
Forward-looking hooks: ContextOverflow and CostThreshold handlers registered in agent-core for future CLI versions (will not fire on CLI 2.1.x)
Broken references fixed: 5 phantom agent references in ultra-think and reflection commands (research-intelligence, hpc-numerical-coordinator, ai-software-architect)
Settings harmonization: dev-suite default maxTurns 35→40

Agent & Skill Polish

Agent color scheme: added color frontmatter field to all 24 agents for statusline differentiation (blue/cyan/green/magenta/red/yellow)
Abstract model tiers: replaced hardcoded model version strings with abstract tier names (opus/sonnet/haiku) across agent frontmatter
Hub skill triggers: improved skill description triggering patterns for better routing accuracy
CodeRabbit cleanup: removed uninstalled coderabbit agent from inventory and quality gate references

Testing & CI

Cross-suite invariant tests: 19 new tests in test_cross_suite_invariants.py covering 7 coverage gaps (color field, model tier validity, hook event naming, skill budget, version sync, agent-skill cross-refs, command registration)
CI quality gate: new GitHub Actions workflow (ci.yml) running pytest, ruff, and mypy on PRs
Test count: 135→154 total tests across the suite

Validator State

metadata_validator 0/0/0; xref_validator all valid; context_budget_checker 206/206 (no skill over 80%); pytest 154/154; ruff + mypy clean; pip-audit clean.

v3.2.0 (2026-04-12)

Skill Validator Vacuous-Pass Fix

Fixed a vacuous-truth bug in skill_validator.py where the Overall Assessment reported “EXCELLENT” when no test corpus was provided. With total_tests == 0, all rate metrics returned 0.0% via division guards, which satisfied the < 10% threshold — a classic vacuous pass. The validator now reports “NO DATA” when no corpus is configured, and gates action-required messages behind total_tests > 0.
Added two regression tests to test_skill_validator.py: test_no_corpus_reports_no_data (end-to-end: load plugins without corpus, verify report contains “NO DATA” and not “EXCELLENT”) and test_zero_tests_metrics_accuracy (unit: verify all SkillValidationMetrics properties return 0.0 when total_tests == 0).
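The vacuous pass is easy to reproduce in miniature. The sketch below is not the code in skill_validator.py: `Metrics` is a hypothetical stand-in for SkillValidationMetrics, and the "NEEDS WORK" label and the single `failure_rate` metric are assumptions made for illustration. Only the 10% threshold, the division guard, and the "NO DATA" / "EXCELLENT" labels come from the entry above.

```python
from dataclasses import dataclass


@dataclass
class Metrics:  # hypothetical stand-in for SkillValidationMetrics
    total_tests: int = 0
    failures: int = 0

    @property
    def failure_rate(self):
        # Division guard: yields 0.0 when there is nothing to divide by.
        if self.total_tests == 0:
            return 0.0
        return 100.0 * self.failures / self.total_tests


def buggy_assessment(m, threshold=10.0):
    # 0.0 < 10.0 holds even when no test ran: the vacuous pass.
    return "EXCELLENT" if m.failure_rate < threshold else "NEEDS WORK"


def fixed_assessment(m, threshold=10.0):
    # Check for an empty corpus before looking at any rate.
    if m.total_tests == 0:
        return "NO DATA"
    return "EXCELLENT" if m.failure_rate < threshold else "NEEDS WORK"


empty = Metrics()
before, after = buggy_assessment(empty), fixed_assessment(empty)
```

The guard itself is harmless; the bug is comparing its placeholder value against a threshold as if it were a measurement.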

Validator State

metadata_validator 0/0/0; xref_validator 523/523 valid; context_budget_checker 206/206 (no skill over 80%); skill_validator NO DATA (no corpus — by design); pytest 120/120 (+2 regression tests); ruff + mypy clean; pip-audit clean.

v3.1.7 (2026-04-11)

Bayesian SINDy Extraction

New skill bayesian-sindy-workflow extracted from equation-discovery, which had reached 88% of its context budget after the Bayesian SINDy section added in v3.1.4. The new skill lands at ~68.55% of budget, with headroom for v3.1.8+ growth.
Structure: when-to-prefer decision table (Bayesian vs classical SINDy); three routes (horseshoe+NUTS, ensemble SINDy, Julia UQ-SINDy); full 5-stage Lorenz-63 worked example — scipy.integrate.solve_ivp + noise + central-difference, 10-term second-order polynomial library, NumPyro horseshoe prior + NUTS (4 chains, 1000 warmup, 2000 samples), ArviZ diagnostics (R-hat, ESS, PSIS-LOO), inclusion probabilities + credible intervals; prior-sensitivity analysis; Julia sidebar using Turing + DataDrivenDiffEq with truncated(...; lower=0) keyword form (Turing 0.37+ API drift caught via Context7).
equation-discovery dropped from 3535 → 2984 tokens (88% → 74.6%, under the 75% Commit D gate).
Science-suite now at 117 sub-skills (from 116); bayesian-inference hub grows from 9 → 10 sub-skills.
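The "10-term second-order polynomial library" in the Lorenz-63 example follows from counting monomials of degree 0 to 2 in three state variables. A small sketch of that count, using only the standard library (the function name is hypothetical, and this is not the skill's own library-construction code):

```python
from itertools import combinations_with_replacement


def polynomial_library_terms(variables, max_degree):
    """List every monomial in `variables` up to `max_degree`, as strings."""
    terms = []
    for degree in range(max_degree + 1):
        for combo in combinations_with_replacement(variables, degree):
            # The empty combination (degree 0) is the constant term.
            terms.append("*".join(combo) if combo else "1")
    return terms


terms = polynomial_library_terms(("x", "y", "z"), 2)
# 1 constant + 3 linear + 6 quadratic monomials
```

The Lorenz-63 right-hand side only uses a handful of these (linear terms plus x*y and x*z), which is why a sparsity-inducing prior such as the horseshoe is a natural fit.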

Composition Headers

Added ## Composition with neighboring skills section headers to three science-suite skills that had prose cross-references but lacked the canonical header: stochastic-dynamics (5 bullets), non-equilibrium-theory (6 bullets), correlation-physical-systems (5 bullets).

freud IntermediateScattering Re-verification

Re-verified freud.density.IntermediateScattering against the current freud release via Context7 on 2026-04-11: still not shipped in the density module. The numpy.fft + MDAnalysis fallback remains the recommended approach. Inline tag updated to [re-verified absent 2026-04-11].

Tooling — pip-audit

Added pip-audit 2.10+ dev dependency for automated CVE scanning. Wired into the per-commit validator gate.
One HIGH finding ignored — PYSEC-2022-42969 / CVE-2022-42969 (ReDoS in py.path.svnwc.InfoSvnCommand regex, reaches repo via interrogate 1.7.0 → py 1.11.0 transitive). The py library is archived and unmaintained since 2022; interrogate upstream tracks removal at econchick/interrogate#142, unreleased. In-repo exposure is zero (interrogate only imports py for file-path handling, never reaching the vulnerable svnwc code path). Revisit in v3.1.8+ when interrogate can be replaced.

Validator State

metadata_validator 0/0/0; xref_validator 523/523 valid (+4 from the new skill); context_budget_checker 206/206 (equation-discovery 74.6%, bayesian-sindy-workflow 68.55%, non-equilibrium-theory 78.98%, no skill over 80%); skill_validator EXCELLENT; pytest 118/118; ruff + mypy clean; pip-audit clean with one ignore as documented above.
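The budget percentages quoted in this release are consistent with a per-skill budget of 4000 tokens, i.e. 2% of a 200K context window. That figure is inferred from the numbers in this changelog (2984 tokens reported as 74.6%), not taken from context_budget_checker itself, so treat the constants below as assumptions.

```python
CONTEXT_WINDOW = 200_000  # assumption: the 200K window mentioned elsewhere in this changelog
BUDGET_TOKENS = CONTEXT_WINDOW * 2 // 100  # 2% budget, inferred to be 4000 tokens


def budget_usage(tokens):
    """Percentage of the per-skill budget consumed by `tokens`."""
    return round(100.0 * tokens / BUDGET_TOKENS, 2)


after_split = budget_usage(2984)   # equation-discovery after the extraction
before_split = budget_usage(3535)  # equation-discovery before the extraction
under_gate = after_split < 75.0    # the 75% gate cited above
```

On these assumptions the before/after token counts reproduce the reported 88% and 74.6% figures.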

v3.1.6 (2026-04-11)

Julia ↔ Python Parity Polish

Julia → Python handoff for nonlinear time-series tools. New section in chaos-attractors covering nolds, antropy, IDTxl, pyEDM, pyunicorn, teaspoon (no native Julia equivalents) via the canonical PythonCall.jl + CondaPkg.jl import pattern with a concrete lyap_r example and GIL-under-@threads caveats. Pointer edits in nonlinear-dynamics hub ecosystem-selection table and time-series-analysis. Completes the Julia ↔ Python interop story with v3.1.5’s reciprocal juliacall bifurcation path.
BAR free-energy worked example bridging Langevin ensemble and non-equilibrium theory. Added a 4-stage pipeline to non-equilibrium-theory: (1) JAX Langevin ensemble + jax.lax.scan to accumulate forward/reverse work samples, (2) BAR fit via pymbar.other_estimators.bar (Context7-verified — pymbar 4.0 moved it out of top level), (3) variance comparison vs Jarzynski cumulant expansion, (4) multi-state MBAR pointer with alchemlyb ecosystem wrapper. One-sentence cross-link added to stochastic-dynamics to preserve its 74% budget cap.
freud ecosystem for physical correlations. Added a “Python freud ecosystem” section to correlation-physical-systems covering freud.density.RDF, StaticStructureFactorDebye / StaticStructureFactorDirect (with API-drift warning: StaticStructureFactorDebye takes num_k_values, not bins), Steinhardt Q_l, Hexatic, Nematic, and SolidLiquid phase classifier. freud.density.IntermediateScattering tagged [unverified] (absent in freud 3.5.0) with a numpy.fft + MDAnalysis fallback. Algorithmic notes in correlation-computational-methods (AABBQuery neighbor-list reuse, reset=False multi-frame averaging, CuPy breakeven N ≈ 10⁴, “MDAnalysis/MDTraj as iterator, freud as analyzer” production pattern). One-line hub pointer in correlation-analysis with a PythonCall.jl handoff note for Julia users.

Deferred

Item A (ML-FF CLI spot-check) — explicitly deferred until the user resumes active MLIP training.

Validator State

metadata_validator: 0/0/0 across all 3 plugins.
xref_validator: 519/519 references valid.
context_budget_checker: 205/205 skills fit. non-equilibrium-theory at 73.8% (under 75% Commit C gate), correlation-physical-systems at 74.45% (under 75% Commit D gate), chaos-attractors at 79% (under 80% at-risk line).
skill_validator EXCELLENT; pytest 118/118; ruff clean; mypy 0 errors.

Known forward items for v3.1.7+

equation-discovery at 88% — flagged for Bayesian SINDy extraction split.
freud.density.IntermediateScattering presence in newer freud releases — re-verify when correlation skills are next touched.

v3.1.5 (2026-04-11)

Julia/Python Parity Pass

Fokker-Planck direct PDE methods in stochastic-dynamics: finite-difference / spectral discretization, boundary-condition patterns, cross-links to Langevin sampling.
Python bifurcation continuation escape hatch in bifurcation-analysis and nonlinear-dynamics: documented the juliacall path to Julia bifurcation routines (since BifurcationKit.jl is blocked on Julia 1.12), plus PyDSTool and AUTO-07p as Python-native alternatives.
Modern ML force fields expansion in ml-force-fields: equivariant GNNs (NequIP, MACE, Allegro), Julia ACE stack (ACEpotentials.jl), training loops, active learning, energy-and-force loss balance. Budget-tight at 78%.
Julia Monte Carlo idioms in statistical-physics: Metropolis sampler patterns with @inbounds / @fastmath, SIMD inner loops, parallel tempering via Distributed.jl, tuning heuristics.

Tooling

Added types-PyYAML dev dependency for mypy stubs.
Tightened self-improving-ai triggers so they no longer overlap with dspy-basics.
Added tools/validation/command_file_linter.py — a targeted structural linter for Claude Code command files with 5 stable rule IDs (fence-unbalanced, heading-skip, step-ref-broken, trailing-whitespace, heading-duplicate). Importable API + standalone CLI, wired into make validate (errors block, warnings non-blocking). Caught a pre-existing duplicate ## Metrics H2 in dev-suite/commands/tech-debt.md. 15 new tests.
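As a rough illustration of what two of the five rules might check, here is a sketch covering fence-unbalanced and heading-duplicate. It is not the implementation in command_file_linter.py: the function name, message strings, and detection logic are all assumptions.

```python
from collections import Counter

FENCE = "`" * 3  # built this way so the sketch can itself sit inside a fenced block


def lint_command_file(text):
    """Return (rule_id, message) findings for two illustrative rules."""
    findings = []
    in_fence = False
    h2_counts = Counter()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence  # each fence line opens or closes a block
            continue
        if not in_fence and stripped.startswith("## "):
            h2_counts[stripped[3:].strip()] += 1
    if in_fence:
        findings.append(("fence-unbalanced", "code fence opened but never closed"))
    for title, count in h2_counts.items():
        if count > 1:
            findings.append(("heading-duplicate", f"'## {title}' appears {count} times"))
    return findings


duplicate_doc = "# Tech debt\n## Metrics\nfirst\n## Metrics\nsecond\n"
unclosed_doc = "## Usage\n" + FENCE + "python\nx = 1\n"
dup_findings = lint_command_file(duplicate_doc)
fence_findings = lint_command_file(unclosed_doc)
```

Headings inside a fenced block are skipped so that Markdown shown as sample code does not trigger heading-duplicate.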

Validator State

metadata 0/0/0; xref 515/515 valid (+3 from v3.1.4); context budget 204/204 (bifurcation-analysis 79%, ml-force-fields 78%, both under 80%). pytest 103 passing, ruff clean, mypy 0 errors.

v3.1.4 (2026-04-11)

Research-Focus Optimization Pass (science-suite)

Aligned agents and skills with research in Bayesian MCMC (NUTS / Consensus MC / Pigeons), Universal Differential Equations, SINDy, nonlinear dynamics, time series, rare events / avalanche dynamics, non-equilibrium statistical physics, and point / jump processes.
9 new sub-skills (hub-discovered, not registered in plugin.json): consensus-mcmc-pigeons (non-reversible parallel tempering via Pigeons.jl, now distinguished from Scott-2016 divide-and-conquer Consensus Monte Carlo), bayesian-ude-workflow (Turing + DiffEq + Lux staged pipeline), bayesian-ude-jax (Python/JAX counterpart via Diffrax + Equinox + NumPyro), bayesian-pinn (BNNODE/BayesianPINN extracted from neural-pde which drops from 78% → 65% of budget), point-processes (Hawkes / HSGP / Julia PointProcesses.jl), rare-events-sampling (large-deviation / cloning / avalanche statistics), self-improving-ai (research overview), dspy-basics (DSPy programmatic prompts depth-skill), rlaif-training (Constitutional AI / RLAIF / DPO depth-skill).
1 new agent-core skill: self-improving-agents under the reasoning-and-memory hub — operational counterpart to science-suite’s self-improving-ai (agents inside Claude Code vs research framework overview). Covers closed-loop reflection-refine-validate, self-consistency ensembles, DSPy and TextGrad automatic prompt optimization, evolutionary prompt search, and constitutional self-critique.

Research Audit Remediation

Added extreme-value-statistics skill (GEV/GPD/Hill/Pickands/POT, return levels, non-stationary EVT) and wired into statistical-physics-hub.
Wired the orphaned robust-testing sub-skill into the research-and-domains hub (was on disk but unreachable).
Extended rare-events-sampling triggers to cover SOC, sandpile / Bak-Tang-Wiesenfeld, crackling noise, and avalanche-size distributions; cross-linked to extreme-value-statistics.
Resolved jump-diffusion routing: stochastic-dynamics owns general physics jump-diffusion SDEs (Lévy flights, shot noise, regime-switching Langevin); catalyst-reactions stays scoped to biochemical reaction networks.
Added Bayesian SINDy coverage to equation-discovery (horseshoe-prior NumPyro, ensemble SINDy, UQ-SINDy via Turing) — pushed to 88% budget and flagged for v3.1.7 extraction.
Disambiguated sciml-modern-stack vs sciml-and-diffeq by rewriting hub routing without touching the frozen sciml-modern-stack body.
Added missing trigger keywords (ADF, KPSS, Phillips-Perron, PELT, BinSeg, renewal processes, non-parametric Hawkes EM) to time-series-analysis and point-processes.

Agent Updates

julia-pro: Bayesian stack upgraded to Turing + Pigeons; sensealg table rewritten with GaussAdjoint as modern default and the ForwardDiff-bypasses-sensealg factual fix; decision tree adds UDE and multimodal branches.
julia-ml-hpc, statistical-physicist, ai-engineer, jax-pro, simulation-expert also aligned with research-focus delegation updates.

Codebase-Aware /team-assemble (agent-core)

Major rework: static catalog → codebase-aware recommender / adapter / validator. 21 → 25 team templates.
4 new teams: nonlinear-dynamics (bifurcation, chaos, coupled oscillators, pattern formation — first wiring of nonlinear-dynamics-expert to its documented delegation targets), julia-ml (Lux.jl/Flux.jl/MLJ.jl + CUDA.jl/MPI.jl distributed training), multi-agent-systems, sci-desktop (PyQt6/PySide6 + JAX scientific desktop apps).
ai-engineering team swaps reasoning-architect for context-architect as default 4th teammate.
Closed drift: 0 unused local agents (down from 4).
New capabilities: Step 1.5 codebase detection (4-tier signal gathering with efficiency gates), Step 2.5 fingerprint table, Step 2.6 rule-based ranking with confidence labels, Step 2.6a/b validation + auto-fill, five new invocation modes.
Session cache: Tier 0 cache at /tmp/team-assemble-cache/<sanitized-abspath>.json with mtime-based invalidation (15 min TTL); --no-cache bypass flag.
S1 prompt-injection safeguards for README probes (HIGH): character neutralization, <untrusted_readme_excerpt> wrapping, 9 refusal-trigger patterns. Non-English README hardening via language_hint classification and auto-fill trust tiers.
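The session cache's freshness rule (mtime-based invalidation with a 15 min TTL) can be sketched as below. The function name is hypothetical, and how the real command builds the sanitized cache path is not shown in this changelog, so the sketch takes the path as given and uses a temporary file for the demonstration.

```python
import os
import tempfile
import time

TTL_SECONDS = 15 * 60  # the 15 min TTL from the entry above


def is_cache_fresh(cache_path, now=None):
    """True if the cache file exists and was modified less than TTL_SECONDS ago."""
    try:
        mtime = os.path.getmtime(cache_path)
    except OSError:
        return False  # a missing or unreadable cache counts as stale
    now = time.time() if now is None else now
    return (now - mtime) < TTL_SECONDS


with tempfile.NamedTemporaryFile(suffix=".json") as fh:
    written_at = os.path.getmtime(fh.name)
    fresh_soon = is_cache_fresh(fh.name, now=written_at + 60)
    fresh_later = is_cache_fresh(fh.name, now=written_at + 16 * 60)
missing = is_cache_fresh(fh.name)  # the temp file is removed when the with-block exits
```

Passing `now` explicitly keeps the check testable without sleeping; a --no-cache style bypass would simply skip this call.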

Tooling Hardening

Added sys.path.insert to 5 validators so CLI invocation works without PYTHONPATH=..
PluginLoader consolidates YAML frontmatter parsing via normalized component helpers.
xref_validator gains disk-discovery of sub-skills so hub-architecture sub-skills no longer false-positive as broken references.
Restored requires = ["maturin>=1.0,<2.0"] in rust-extensions scaffold (maturin is the real PEP 517 build backend for PyO3).
pyproject.toml: excluded test-corpus/ from mypy and ruff.

Validator State

metadata 0/0/0; xref 512/512 valid; context budget 204/204; pytest 60/60; ruff clean.

v3.1.3 (2026-04-10)

New Skill: thinkfirst (agent-core)

Added thinkfirst as a sub-skill under the llm-engineering hub. Interview-first workflow that clarifies vague user intent through a Seven Dimensions framework before any prompt is drafted.
Positioned as the first branch in the llm-engineering routing tree so users with brain dumps hit clarification before reaching for production templates.
Cross-linked with prompt-engineering-patterns: thinkfirst handles intent clarification, prompt-engineering-patterns handles production-grade refinement.

v3.1.2 (2026-04-06)

Bug Fixes

Removed duplicate hooks manifest entries from agent-core and dev-suite plugin.json. The hooks/hooks.json file is auto-discovered by convention; declaring it explicitly caused duplicate-load errors at startup.
Fixed dev-suite .lsp.json structure to match expected schema.

Documentation

Updated plugin READMEs to use hub→sub-skill notation matching CLAUDE.md.
Rewrote tools/README.md to reflect current tooling structure.

v3.1.1 (2026-04-06)

Bug Fixes

Set strict: true in marketplace.json to resolve conflicting manifests when both marketplace.json and individual plugin.json files declare components.
Fixed agent teams guide reference in README (34 → 21 teams).

v3.1.0 (2026-04-03)

Hub-Skill Architecture

Introduced Hub Skill routing: 26 hub skills route to 167 sub-skills via routing decision trees. Hubs are declared in plugin.json; sub-skills are discovered through hub routing. Eliminates ambiguous flat-list skill matching.
agent-core: 3 hubs (agent-systems, reasoning-and-memory, llm-engineering) → 12 sub-skills.
dev-suite: 9 hubs (backend-patterns, frontend-and-mobile, architecture-and-infra, testing-and-quality, ci-cd-pipelines, observability-and-sre, python-toolchain, data-and-security, dev-workflows) → 49 sub-skills.
science-suite: 14 hubs → 106 sub-skills.
Total: 24 agents, 14 registered commands, 26 hubs → 167 sub-skills (193 total).

Knowledge Gap Closure (+28 skills, +3 commands)

Added 6 agent-core skills: prompt-engineering-patterns, memory-system-patterns, safety-guardrails, tool-use-patterns, agent-evaluation, knowledge-graph-patterns.
Added 10 dev-suite skills: database-patterns, containerization-patterns, cloud-provider-patterns, message-queue-patterns, caching-patterns, graphql-patterns, accessibility-testing, websocket-patterns, search-patterns, mobile-testing-patterns.
Added 12 science-suite skills: computer-vision, nlp-fundamentals, bioinformatics, time-series-analysis, control-theory, experiment-tracking, signal-processing, symbolic-math, reinforcement-learning, quantum-computing, federated-learning, advanced-optimization.
Added 3 science-suite commands: run-experiment, analyze-data, paper-review.
Deduplicated prompt-engineering-patterns (science-suite copy removed, migrated to agent-core).

Agent Optimization (24 agents)

Added background: true to 18 agents for parallel dispatch.
Upgraded neural-network-master and simulation-expert to opus model tier.
9 opus agents: orchestrator, reasoning-engine, software-architect, debugger-pro, research-expert, statistical-physicist, nonlinear-dynamics-expert, neural-network-master, simulation-expert.
Added “Use when…” activation triggers to all 24 agent descriptions.
Fixed cross-suite delegation annotations and invalid agent references.

Skill Quality & Integrity

All 193 skills have: trigger phrases, Expert Agent sections, and checklists.
Fixed 235 broken relative links across 38 files.
Zero orphaned skills — all 193 reachable via hub routing.
Resolved routing overlaps (ml-and-data-science/ml-deployment/deep-learning-hub triangle).
Refactored testing-patterns from 96% to under 75% context budget.
193/193 skills within 2% context budget.

Security Fixes

Gated commit_fixes() behind --auto-commit flag (default: dry-run).
Added package name validation regex for npm/pip subprocess calls.
Replaced git add . with git add --update for safe staging.
Anchored SessionStart hook matcher to ^(startup|resume)$.
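Why anchoring the SessionStart matcher matters can be shown with Python's `re` module standing in for whatever engine the harness uses (an assumption). The unanchored pattern and the source string "startup-extra" are hypothetical, chosen only to show the difference.

```python
import re

UNANCHORED = re.compile(r"startup|resume")   # hypothetical pre-fix form
ANCHORED = re.compile(r"^(startup|resume)$")  # the matcher named in the entry above


def matches(pattern, source):
    """True if `pattern` is found anywhere in `source`."""
    return pattern.search(source) is not None


loose_hit = matches(UNANCHORED, "startup-extra")  # substring match slips through
strict_hit = matches(ANCHORED, "startup-extra")   # anchors require the whole string
exact_hits = [matches(ANCHORED, s) for s in ("startup", "resume")]
```

With the anchors in place, only the two exact source names trigger the hook.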

Documentation

Rewrote all reference docs for hub architecture.
Added Hub Skill, Sub-Skill, Routing Decision Tree, and Agent Team to glossary.
Updated all workflow guides with hub → sub notation.
Docs build with zero warnings. 60/60 tests pass.

Governance

Added skill size governance policy (>3000 bytes = review required).
14 commands intentionally registered; 22 skill-invoked by design.

v3.0.0 (2026-04-02)

Julia ML/DL/HPC Expansion

Added julia-ml-hpc agent (sonnet) for Julia ML, Deep Learning, and HPC. Covers Lux.jl/Flux.jl, MLJ.jl, CUDA.jl, MPI.jl, GraphNeuralNetworks.jl, and ReinforcementLearning.jl. Delegates SciML/ODE work to julia-pro.
Added 10 new Julia skills: julia-neural-networks, julia-neural-architectures, julia-training-diagnostics, julia-ad-backends, julia-ml-pipelines, julia-gpu-kernels, julia-hpc-distributed, julia-model-deployment, julia-graph-neural-networks, julia-reinforcement-learning.
Updated 4 existing agents with julia-ml-hpc delegation rows.

Nonlinear Dynamics Expansion (2026-03-31)

Added nonlinear-dynamics-expert agent (opus) for bifurcation theory, chaos analysis, network dynamics, and pattern formation.
Added 8 nonlinear dynamics skills: bifurcation-analysis, chaos-attractors, pattern-formation, equation-discovery, network-coupled-dynamics, and more.

Agent-Skill Synergy (100% coverage)

Added Expert Agent pointers to all 142 skills (47% → 100%).
Dev-suite: 39/39 skills mapped to 9 domain agents.
Science-suite: 29 orphan skills assigned to correct agents.

Architecture Reorganization (5 → 3 suites)

Merged engineering-suite + infrastructure-suite + quality-suite into dev-suite. Eliminates 27 cross-suite agent delegation edges.
New structure: agent-core (3 meta-agents), dev-suite (9 agents, 27 commands, 39 skills), science-suite (12 agents, 96 skills).

v2.1.88 Spec Compliance

Migrated all manifests to .claude-plugin/plugin.json per official plugin spec.
Removed non-spec version/color fields from all agent and command frontmatter.
Version now lives only in plugin.json (single source of truth).
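
With the version kept only in the manifest, tooling can read it from one place. The sketch below assumes the manifest is a JSON object with a top-level "version" key; the helper names are hypothetical, and only the .claude-plugin/plugin.json path comes from the entries above.

```python
import json
from pathlib import Path
from typing import Iterable

# Manifest location named in the changelog entry.
MANIFEST_RELATIVE_PATH = Path(".claude-plugin") / "plugin.json"


def read_plugin_version(plugin_root: Path) -> str:
    """Read the version string from a plugin's manifest.

    Assumes a top-level "version" key in the manifest.
    """
    manifest_path = plugin_root / MANIFEST_RELATIVE_PATH
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return manifest["version"]


def versions_agree(plugin_roots: Iterable[Path], expected: str) -> bool:
    """Check that every plugin manifest reports the expected version."""
    return all(read_plugin_version(root) == expected for root in plugin_roots)
```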

Agent Hardening

Added effort, memory, tools/disallowedTools fields to all 24 agents.
Added isolation: worktree to app-developer and automation-engineer.

Model Tier Optimization

Assigned Opus to 6 deep-reasoning agents; Haiku to documentation-expert.
Changed neural-network-master from inherit to an explicit sonnet tier.

Hook Expansion (3 → 10 events)

agent-core: Added PostToolUse, PostCompact, SubagentStop, PermissionDenied, TaskCompleted (3 → 8 events).
dev-suite: Added PostToolUse and SubagentStop (2 events).

Skill Consolidations (7 merges)

advanced-reasoning + structured-reasoning → reasoning-frameworks
meta-cognitive-reflection + comprehensive-reflection-framework → reflection-framework
ai-assisted-debugging + debugging-strategies → debugging-toolkit
comprehensive-validation-framework → comprehensive-validation
machine-learning-essentials → machine-learning
parallel-computing-strategy → parallel-computing
python-testing-patterns + javascript-testing-patterns → testing-patterns
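
For anyone updating references to the old skill names, the consolidations above can be expressed as a lookup table. This is an illustrative sketch: the mapping is copied from the list above, while SKILL_RENAMES and resolve_skill_name() are hypothetical names, not part of the project.

```python
# Old skill name -> consolidated skill name, as listed in the changelog.
SKILL_RENAMES = {
    "advanced-reasoning": "reasoning-frameworks",
    "structured-reasoning": "reasoning-frameworks",
    "meta-cognitive-reflection": "reflection-framework",
    "comprehensive-reflection-framework": "reflection-framework",
    "ai-assisted-debugging": "debugging-toolkit",
    "debugging-strategies": "debugging-toolkit",
    "comprehensive-validation-framework": "comprehensive-validation",
    "machine-learning-essentials": "machine-learning",
    "parallel-computing-strategy": "parallel-computing",
    "python-testing-patterns": "testing-patterns",
    "javascript-testing-patterns": "testing-patterns",
}


def resolve_skill_name(name: str) -> str:
    """Map a pre-3.0.0 skill name to its replacement; other names pass through."""
    return SKILL_RENAMES.get(name, name)
```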

v2.2.1 (2026-02-15)

Debugging Team Templates

Added 5 debugging agent teams: debug-gui, debug-numerical, debug-schema, debug-triage, and debug-full-audit.
Teams use a Core Trio pattern (explorer → debugger → python-pro) plus rotating specialists.
Consolidated team templates from 35 to 21 (40% reduction): merged 5 overlapping pairs (pr-review, quality-security, sci-compute, md-simulation, docs-publish), removed 7 niche teams, and added an alias table for backward compatibility.

Agent Teams System

New /team-assemble command with pre-built team configurations.
Teams span 5 categories: Development & Operations, Scientific Computing, Cross-Suite Specialized, Official Plugin Integration, and Debugging.
Integrated 20 official plugin agents (pr-review-toolkit, feature-dev, coderabbit, plugin-dev, hookify, huggingface-skills, agent-sdk-dev, superpowers).
Quality Gate Enhancers for adding review agents to any team.
Comprehensive reference guide at docs/agent-teams-guide.md.

Agent Enhancements

Added adaptive thinking references to reasoning-engine agent.
Integrated Agent Teams coordination into orchestrator agent.
Added memory frontmatter to 11 key agents for persistent context.

Hooks Infrastructure

Added hooks support to agent-core suite (SessionStart, PreToolUse).
New hooks/hooks.json configuration in agent-core plugin manifest.
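
As a rough illustration of what a hooks/hooks.json covering these two events might contain, the sketch below builds a configuration in Python and serializes it. The nested layout is an assumption based on the general Claude Code hook configuration format, and the matchers and commands are placeholders; none of it is taken from the agent-core plugin itself.

```python
import json


def build_hooks_config() -> dict:
    """Build a placeholder hook configuration for the two events named above.

    The structure (event -> list of matcher groups -> list of command hooks)
    is assumed, and the commands are stand-ins for real scripts.
    """
    return {
        "hooks": {
            "SessionStart": [
                {
                    "matcher": "startup",
                    "hooks": [{"type": "command", "command": "echo session-start"}],
                }
            ],
            "PreToolUse": [
                {
                    "matcher": "Bash",
                    "hooks": [{"type": "command", "command": "echo pre-tool-use"}],
                }
            ],
        }
    }


def render_hooks_json() -> str:
    """Serialize the configuration as it would be written to hooks/hooks.json."""
    return json.dumps(build_hooks_config(), indent=2)
```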

Tooling

Added context budget checker tool (tools/validation/context_budget_checker.py).

v2.2.0 (2026-02-14)

Added Agent Teams support with team-assemble command and guide.
Updated all suites to v2.2.0 for Claude Opus 4.6 compatibility.
Added context budget checker tool.

v2.1.0 (2026-01-20)

Suite Consolidation

Consolidated 31 legacy plugins into 5 suites: agent-core, engineering-suite, infrastructure-suite, quality-suite, science-suite.

Flattened Skills Architecture

Restructured all skills to a flat directory structure for reliable auto-discovery.
Science suite: 80 flattened skills for comprehensive coverage.

Agent & Command Updates

Standardized agent metadata with consistent colors, versions, and examples.
Renamed feature-dev to eng-feature-dev to prevent conflicts.

v2.0.0 (2025-12-15)

Initial release of the consolidated architecture.