Research Suite

Scientific research workflows: peer review, idea-to-plan refinement, and methodology orchestration. Three complementary tracks — scientific-review (manuscripts from other authors → .docx referee report), research-spark (own rough idea → 8-stage artifact-gated plan), and research-practice (general methodology hub).

Version: 3.5.2 | 2 Agents | 3 Registered Commands | 3 Hubs (research-hub → research-spark + research-practice + _research-commons) → 14 sub-skills | 3 Hook Events

Created in v3.4.0 by extracting research-expert plus 5 methodology skills from science-suite and adding the research-spark pipeline (new 8-stage orchestrator + 7 stage-specialist skills + _research-commons resource hub).

Agents

Agent: research-expert

Unified specialist for research methodology, evidence synthesis (PRISMA/GRADE), statistical-rigor assessment, IMRaD structuring, paper-to-code reproduction, and publication-quality visualization. For one-off methodology tasks, not pipeline-driven work.

Model: opus

Version: 3.5.2

Agent: research-spark-orchestrator

Autonomous driver for the 8-stage research-spark refinement pipeline. Owns _state.yaml, enforces the artifact contract, fans out to parallel sub-agents at Stage 2 (literature layers), Stage 6 (validation passes), and Stage 8 (reviewer archetypes).

Model: opus

Version: 3.5.2

Commands

Three slash commands registered in v3.5.2:

Command: lit-review

Systematic literature review with PRISMA-compliant search, evidence synthesis, and gap analysis.

Command: paper-implement

Reproduce a research paper end-to-end — theory parsing, code implementation, and validation against published results.

Command: replicate

Computational replication of published experiments with statistical deviation analysis and reproducibility report.

The legacy /paper-review command was removed in v3.4.0 because scientific-review produces a strictly better .docx deliverable (journal-adapted Six-Lens analysis with Confidential Comments to Editor).

Stand-alone Workflows

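Two top-level skills that are neither hubs nor stage specialists: scientific-review is a complete workflow, and _research-commons is the shared resource library that the research-spark stack draws on.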

Skill: scientific-review

Peer review of a single manuscript (PDF / DOCX / pasted text) with six competencies: domain expertise, methodological rigor, critical thinking, constructive communication, ethical integrity, and time-efficient delivery. Produces a downloadable .docx referee report (markdown fallback if python-docx is unavailable). If the user names a target journal, performs a live web search for that journal’s reviewer guidelines before structuring the output.

Skill: _research-commons

Shared assets for the research-spark skill stack — not a standalone workflow. Other skills reference files here for writing style (style/writing_constraints.md with banned-vocabulary list), code architecture rules (code_architecture/jax_first_rules.md), shared templates (templates/heilmeier.md, templates/reviewer2_persona.md, templates/abstract.md, templates/onepage.md, templates/project_log.md), and utility scripts (scripts/style_lint.py, scripts/formalism_code_reconcile.py, scripts/concept_extractor.py, scripts/latex_sanity.py, scripts/artifact_diff.py).

research-spark Pipeline (8 stages)

Pipeline orchestrator plus seven stage-specialist skills. Each stage writes one canonical artifact at a canonical path; the next stage consumes it as authoritative input. State lives in _state.yaml at the project root.

Hub: research-spark

Orchestrator for the 8-stage refinement pipeline. Detects current stage from _state.yaml + user cue, loads the right stage specialist, enforces canonical paths, preserves prior-stage artifacts when a completed stage is re-entered, and logs depth-gate overrides (e.g., the 8-steelmanned-papers rule in Stage 2).

  • spark-articulator: Stage 1. Rough idea → 3-5 sentence articulation naming the spark, its novelty, and the observation that would confirm it. Writes 01_spark.md.

  • landscape-scanner: Stage 2. Three-layer literature scan (foundational / recent / adjacent), steelmanning each paper, gap matrix, Reviewer 2 adversarial pass. Writes 02_landscape.md.

  • falsifiable-claim: Stage 3. Claim + Heilmeier catechism + kill criterion + Reviewer 2 challenge. Writes 03_claim.md.

  • theory-scaffold: Stages 4-5 (merged). Stepwise derivation protocol → LaTeX formalism. Blocks multi-step symbolic leaps, identifies governing dimensionless groups, checks known limits. Writes 04_theory.md + 05_formalism.tex.

  • numerical-prototype: Stage 6. JAX-based solver + three validation passes (analytic-limit recovery, synthetic benchmark, convergence study) → concrete predicted observable. Writes 06_prototype.md + code/.

  • experiment-designer: Stage 7. Instrument capability map (3× margin rule per dimension), DoE matrix, formal power analysis, pre-registered success metrics, risk register. Writes 07_plan.md.

  • premortem-critique: Stage 8. Failure narrative, root-cause clustering, cheapest early-warning signals fed back into Stage 7, simulated-reviewer critique across archetypes. Writes 08_premortem.md.
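The artifact contract above lends itself to a simple on-disk check. The following is a minimal sketch, not the orchestrator's actual logic: it assumes the artifacts sit in one directory under the file names listed above, treats a stage as complete when all of its files exist, and ignores both _state.yaml (whose schema is not shown here) and the code/ directory. The function name latest_completed_stage is hypothetical.

```python
from pathlib import Path

# (stage number reached, files that stage must have written).
# File names come from the stage list above; merged Stages 4-5 count as 5.
STAGE_ARTIFACTS = [
    (1, ["01_spark.md"]),
    (2, ["02_landscape.md"]),
    (3, ["03_claim.md"]),
    (5, ["04_theory.md", "05_formalism.tex"]),
    (6, ["06_prototype.md"]),  # Stage 6 also writes code/, not checked here
    (7, ["07_plan.md"]),
    (8, ["08_premortem.md"]),
]


def latest_completed_stage(artifacts_dir: Path) -> int:
    """Return the highest stage whose files, and all earlier stages' files, exist.

    Returns 0 when the Stage 1 artifact is missing.
    """
    completed = 0
    for stage, names in STAGE_ARTIFACTS:
        if all((artifacts_dir / name).is_file() for name in names):
            completed = stage
        else:
            break  # stop at the first gap; later files have no valid input
    return completed
```

Stopping at the first gap reflects the rule that each stage consumes the previous artifact as authoritative input: a 04_theory.md without a 03_claim.md is not counted.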

research-practice Hub (methodology)

For free-form methodology questions that are neither structured pipelines nor peer reviews.

Hub: research-practice (5 sub-skills)

Meta-orchestrator for the research lifecycle. Routes to the appropriate phase-aligned specialist.

  • research-methodology: Design phase. Hypothesis formulation, power analysis, sample-size justification, ablation planning, statistical-test selection before data collection.

  • research-quality-assessment: Evaluate phase. Score existing work against CONSORT/STROBE/PRISMA/MOOSE, detect red flags (p-hacking, HARKing, selective reporting, circular analysis). Does not produce a .docx deliverable; use scientific-review for that.

  • research-paper-implementation: Reproduce phase. Translate a published paper’s methods + appendix into runnable code.

  • scientific-communication: Write-up phase. IMRaD structure, abstracts (see _research-commons/templates/abstract.md), posters, technical reports.

  • evidence-synthesis: Synthesize phase. PRISMA systematic reviews, meta-analysis (effect-size pooling, I²/Q heterogeneity), GRADE evidence grading.
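To make the effect-size pooling and I²/Q heterogeneity terms concrete, here is a minimal sketch of fixed-effect inverse-variance pooling with Cochran's Q and I². It illustrates the standard formulas only; whether the evidence-synthesis skill uses a fixed-effect or random-effects model is not stated here, and the function name pool_fixed_effect is hypothetical.

```python
import math


def pool_fixed_effect(effects, variances):
    """Inverse-variance pooled estimate with Cochran's Q and I^2 (percent)."""
    weights = [1.0 / v for v in variances]  # w_i = 1 / var_i
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)  # standard error of the pooled estimate
    # Q = sum of weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "Q": q, "df": df, "I2": i2}
```

For two studies with effects 0.0 and 1.0 and equal variances of 0.01, the pooled estimate is 0.5, Q is 50 on 1 degree of freedom, and I² is 98%, which signals that almost all observed variation is between-study rather than sampling noise.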

Phase ↔ research-spark stage mapping

When the user is inside an active research-spark project, the pipeline stage supersedes the generic methodology sub-skill — the stage version enforces tighter artifact contracts.

| research-practice phase | research-spark stage |
| --- | --- |
| Design (research-methodology) | Stage 7: experiment-designer |
| Reproduce (research-paper-implementation) | Stage 6: numerical-prototype |
| Synthesize (evidence-synthesis) | Stage 2: landscape-scanner |
| Write-up (scientific-communication) | Stage 1: spark-articulator |
| Evaluate (research-quality-assessment) | Stage 8: premortem-critique |

Three adversarial patterns worth knowing

These exist because they catch failures the non-adversarial workflow misses.

  • Reviewer 2 persona (Stages 2-3). Adversarial reviewer argues the gap isn’t real / tractable / impact-bearing, or the claim is physically impossible / mathematically unsound / already solved. Each rebuttal must cite a specific paper.

  • Stepwise derivation protocol (Stages 4-5). One conceptual step per invocation, with a verification pass (dimensional check, limit check, sanity argument) before the next step. Blocks multi-step symbolic leaps.

  • Instrument capability margin (Stage 7). For each measurable quantity, compute the margin between predicted signal and instrument capability on each axis. Margin < 3× → high-risk measurement requiring explicit mitigation.
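The 3× margin rule can be sketched as a ratio check per axis. This is an illustration under stated assumptions, not the experiment-designer implementation: it assumes the margin is a dimensionless ratio in which larger is safer, so it is predicted/limit when the instrument limit is a floor (noise floor, resolution) and limit/predicted when the limit is a ceiling (maximum range). The Axis class and high_risk_axes function are hypothetical names, and the numbers in the usage note are made up for illustration.

```python
from dataclasses import dataclass

MARGIN_THRESHOLD = 3.0  # the 3x rule stated above


@dataclass(frozen=True)
class Axis:
    name: str
    predicted: float  # predicted signal on this axis
    limit: float      # instrument capability on the same axis, same units
    limit_is_floor: bool = True  # True: the signal must exceed the limit

    @property
    def margin(self) -> float:
        """Ratio by which the prediction clears the instrument limit."""
        if self.limit_is_floor:
            return self.predicted / self.limit
        return self.limit / self.predicted


def high_risk_axes(axes):
    """Names of axes whose margin falls below the 3x threshold."""
    return [a.name for a in axes if a.margin < MARGIN_THRESHOLD]
```

With Axis("amplitude", 12.0, 2.0), Axis("time resolution", 5e-3, 2e-3), and Axis("dynamic range", 40.0, 100.0, limit_is_floor=False), the amplitude axis clears the rule at 6×, while the other two sit near 2.5× and would be flagged as needing explicit mitigation.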

Style enforcement

Every emitted markdown artifact must pass _research-commons/scripts/style_lint.py: no em dashes, no banned vocabulary (innovative, state-of-the-art, transformative, novel, groundbreaking, cutting-edge), and quantified language preferred over qualitative claims.
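The two hard rules above are easy to picture as a line-by-line scan. The sketch below is not the contents of style_lint.py, which is not shown here; it is a hypothetical minimal linter that covers only the em-dash and banned-vocabulary checks, using the word list quoted above and whole-word, case-insensitive matching as an assumption.

```python
import re

# Word list taken from the style rule above.
BANNED = (
    "innovative",
    "state-of-the-art",
    "transformative",
    "novel",
    "groundbreaking",
    "cutting-edge",
)
EM_DASH = "\u2014"

# Whole-word match: the neighbours must not be word characters or hyphens,
# so "novel" is flagged but "novelty" is not.
_BANNED_RE = re.compile(
    r"(?<![\w-])(" + "|".join(re.escape(w) for w in BANNED) + r")(?![\w-])",
    re.IGNORECASE,
)


def lint(text: str):
    """Return (line_number, message) pairs for each style violation."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if EM_DASH in line:
            findings.append((lineno, "em dash"))
        for match in _BANNED_RE.finditer(line):
            findings.append((lineno, f"banned word: {match.group(1).lower()}"))
    return findings
```

An empty result means the text passes both checks. The preference for quantified language is a judgment call and is deliberately left out of this sketch.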

Cross-suite delegation

The research-spark-orchestrator delegates across suite boundaries at natural fan-out points:

| Delegate to (suite) | When |
| --- | --- |
| jax-pro (science-suite) | Stage 6 JAX implementation details (JIT, vmap, integrator choice) |
| julia-pro (science-suite) | Stage 6 SciML/DifferentialEquations.jl, SINDy, stiff-ODE alternatives |
| nonlinear-dynamics-expert (science-suite) | Stages 4-5 when theory involves bifurcation, chaos, pattern formation |
| statistical-physicist (science-suite) | Stages 4-5 for correlation functions, Langevin/Fokker-Planck, critical phenomena |
| simulation-expert (science-suite) | Stage 6 when the prototype is MD or Monte Carlo |

Hooks

3 hook events supporting the research-spark pipeline:

  • SessionStart — Detect research-spark stage artifacts (01_spark.md through 08_premortem.md) and resume at the latest completed stage

  • TaskCompleted — Log research tasks to .research-log.jsonl (audit trail) and prompt stage-artifact commit before advancing

  • SubagentStop (prompt-based) — LLM-driven verification that stage artifacts (research-spark) or referee-report sections (scientific-review) are present before the orchestrator advances
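The audit trail written on TaskCompleted is a JSON Lines file, one record per line. The sketch below shows that append-only pattern only. The record fields (timestamp, task, stage), the function names, and the choice of the project root as the log location are all assumptions for illustration; the actual hook's schema is not documented here.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

LOG_NAME = ".research-log.jsonl"  # file name from the hook description


def log_task(project_root: Path, task: str, stage: Optional[int] = None) -> dict:
    """Append one JSON record to the audit log and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # hypothetical field
        "task": task,                                         # hypothetical field
        "stage": stage,                                       # hypothetical field
    }
    # Append mode keeps earlier records untouched, which suits an audit trail.
    with (project_root / LOG_NAME).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_log(project_root: Path) -> list:
    """Read every record back, skipping blank lines."""
    path = project_root / LOG_NAME
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

One record per line means the log can be tailed, grepped, or partially read without parsing the whole file.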

Beyond these, adversarial patterns and style linting remain enforced inside skill workflows (_research-commons/scripts/style_lint.py), so they run deterministically without depending on CLI event schemas.

Requirements

scientific-review

python-docx for .docx generation (markdown fallback), pandoc for DOCX ingestion, pdftotext / pypdf / pymupdf for PDF.

research-spark stack

Python 3.12+, uv for dependency resolution. Stage-specific: sympy (theory-scaffold), scipy + pyyaml (experiment-designer), jax + jaxlib (numerical-prototype), pdflatex (latex_compile_check.sh). All scripts install locally via uv add, never globally.

Project directory layout (research-spark)

<workspace>/<idea-slug>/
├── _state.yaml
├── project_log.md
├── artifacts/
│   ├── 01_spark.md
│   └── ... 08_premortem.md
└── code/                    # emerges at Stage 6
    ├── pyproject.toml
    ├── src/<slug>/
    └── tests/