Research Suite
Scientific research workflows: peer review, idea-to-plan refinement, and methodology orchestration. Three complementary tracks — scientific-review (manuscripts from other authors → .docx referee report), research-spark (own rough idea → 8-stage artifact-gated plan), and research-practice (general methodology hub).
Version: 3.5.2 | 2 Agents | 3 Registered Commands | 3 Hubs (research-hub → research-spark + research-practice + _research-commons) → 14 sub-skills | 3 Hook Events
Created in v3.4.0 by extracting research-expert plus 5 methodology skills from science-suite and adding the research-spark pipeline (new 8-stage orchestrator + 7 stage-specialist skills + _research-commons resource hub).
Agents
Agent: research-expert
Unified specialist for research methodology, evidence synthesis (PRISMA/GRADE), statistical-rigor assessment, IMRaD structuring, paper-to-code reproduction, and publication-quality visualization. For one-off methodology tasks, not pipeline-driven work.
Model: opus
Version: 3.5.2
Agent: research-spark-orchestrator
Autonomous driver for the 8-stage research-spark refinement pipeline. Owns _state.yaml, enforces the artifact contract, and fans out to parallel sub-agents at Stage 2 (literature layers), Stage 6 (validation passes), and Stage 8 (reviewer archetypes).
Model: opus
Version: 3.5.2
Commands
Three slash commands registered in v3.5.2:
Command: lit-review
Systematic literature review with PRISMA-compliant search, evidence synthesis, and gap analysis.
Command: paper-implement
Reproduce a research paper end-to-end — theory parsing, code implementation, and validation against published results.
Command: replicate
Computational replication of published experiments with statistical deviation analysis and reproducibility report.
The legacy /paper-review command was removed in v3.4.0 because scientific-review produces a strictly better .docx deliverable (journal-adapted Six-Lens analysis with Confidential Comments to Editor).
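Stand-alone Workflows
Two top-level skills that are not stage specialists: scientific-review is a complete workflow, and _research-commons is the shared-asset bundle that the research-spark stack draws on.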
Skill: scientific-review
Peer review of a single manuscript (PDF / DOCX / pasted text) with six competencies: domain expertise, methodological rigor, critical thinking, constructive communication, ethical integrity, and time-efficient delivery. Produces a downloadable .docx referee report (markdown fallback if python-docx is unavailable). If the user names a target journal, performs a live web search for that journal’s reviewer guidelines before structuring the output.
Skill: _research-commons
Shared assets for the research-spark skill stack — not a standalone workflow. Other skills reference files here for writing style (style/writing_constraints.md with banned-vocabulary list), code architecture rules (code_architecture/jax_first_rules.md), shared templates (templates/heilmeier.md, templates/reviewer2_persona.md, templates/abstract.md, templates/onepage.md, templates/project_log.md), and utility scripts (scripts/style_lint.py, scripts/formalism_code_reconcile.py, scripts/concept_extractor.py, scripts/latex_sanity.py, scripts/artifact_diff.py).
research-spark Pipeline (8 stages)
Pipeline orchestrator plus seven stage-specialist skills. Each stage writes one canonical artifact at a canonical path; the next stage consumes it as authoritative input. State lives in _state.yaml at the project root.
Hub: research-spark
Orchestrator for the 8-stage refinement pipeline. Detects current stage from _state.yaml + user cue, loads the right stage specialist, enforces canonical paths, preserves prior-stage artifacts when a completed stage is re-entered, and logs depth-gate overrides (e.g., the 8-steelmanned-papers rule in Stage 2).
- spark-articulator: Stage 1. Rough idea → 3-5 sentence articulation naming the spark, its novelty, and the observation that would confirm it. Writes 01_spark.md.
- landscape-scanner: Stage 2. Three-layer literature scan (foundational / recent / adjacent), steelmanning each paper, gap matrix, Reviewer 2 adversarial pass. Writes 02_landscape.md.
- falsifiable-claim: Stage 3. Claim + Heilmeier catechism + kill criterion + Reviewer 2 challenge. Writes 03_claim.md.
- theory-scaffold: Stages 4-5 (merged). Stepwise derivation protocol → LaTeX formalism. Blocks multi-step symbolic leaps, identifies governing dimensionless groups, checks known limits. Writes 04_theory.md + 05_formalism.tex.
- numerical-prototype: Stage 6. JAX-based solver + three validation passes (analytic-limit recovery, synthetic benchmark, convergence study) → concrete predicted observable. Writes 06_prototype.md + code/.
- experiment-designer: Stage 7. Instrument capability map (3× margin rule per dimension), DoE matrix, formal power analysis, pre-registered success metrics, risk register. Writes 07_plan.md.
- premortem-critique: Stage 8. Failure narrative, root-cause clustering, cheapest early-warning signals fed back into Stage 7, simulated-reviewer critique across archetypes. Writes 08_premortem.md.
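The convergence study named in the Stage 6 validation passes can be illustrated with a small order-of-accuracy check. This is a minimal sketch in plain Python rather than JAX, so it runs without extra dependencies; the test problem (forward Euler on dy/dt = -y) and the function names euler_error and observed_order are assumptions chosen for illustration, not part of the numerical-prototype skill.

```python
import math


def euler_error(n_steps):
    """Global error at t = 1 of forward Euler on dy/dt = -y, y(0) = 1."""
    h = 1.0 / n_steps
    y = 1.0
    for _ in range(n_steps):
        y += h * (-y)
    return abs(y - math.exp(-1.0))


def observed_order(error_fn, n_coarse):
    """Estimate the convergence order from errors at step h and h/2."""
    e_coarse = error_fn(n_coarse)
    e_fine = error_fn(2 * n_coarse)
    return math.log2(e_coarse / e_fine)


order = observed_order(euler_error, 200)
print(f"observed order: {order:.3f}")  # close to 1 for a first-order method
```

An observed order that drifts away from the scheme's nominal order as the step is refined is the kind of signal a convergence pass is meant to surface before a predicted observable is reported.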
research-practice Hub (methodology)
For free-form methodology questions that are neither structured pipelines nor peer reviews.
Hub: research-practice (5 sub-skills)
Meta-orchestrator for the research lifecycle. Routes to the appropriate phase-aligned specialist.
- research-methodology: Design phase. Hypothesis formulation, power analysis, sample-size justification, ablation planning, statistical-test selection before data collection.
- research-quality-assessment: Evaluate phase. Score existing work against CONSORT/STROBE/PRISMA/MOOSE, detect red flags (p-hacking, HARKing, selective reporting, circular analysis). Not a .docx deliverable; use scientific-review for that.
- research-paper-implementation: Reproduce phase. Translate a published paper’s methods + appendix into runnable code.
- scientific-communication: Write-up phase. IMRaD structure, abstracts (see _research-commons/templates/abstract.md), posters, technical reports.
- evidence-synthesis: Synthesize phase. PRISMA systematic reviews, meta-analysis (effect-size pooling, I²/Q heterogeneity), GRADE evidence grading.
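The effect-size pooling and I²/Q heterogeneity statistics listed under evidence-synthesis can be sketched as follows. This is a minimal fixed-effect, inverse-variance example, not the skill's own implementation; the helper name pool_fixed_effect and the input numbers are invented for illustration.

```python
from math import sqrt


def pool_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    Hypothetical helper for illustration; not part of the research suite.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    se = sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "Q": q, "df": df, "I2_percent": i2}


# Made-up effect sizes and variances, for illustration only.
result = pool_fixed_effect([0.30, 0.45, 0.10], [0.01, 0.02, 0.015])
print(result)
```

A random-effects model would add a between-study variance term to each weight; the fixed-effect form is shown here only because it is the shortest way to make Q and I² concrete.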
Phase ↔ research-spark stage mapping
When the user is inside an active research-spark project, the pipeline stage supersedes the generic methodology sub-skill — the stage version enforces tighter artifact contracts.
| research-practice phase | research-spark stage |
|---|---|
| Design (research-methodology) | Stage 7 (experiment-designer) |
| Reproduce (paper-impl) | Stage 6 (numerical-prototype) |
| Synthesize (evidence-synth) | Stage 2 (landscape-scanner) |
| Write-up (sci-comm) | Stage 1 (spark-articulator) |
| Evaluate (quality-assessment) | Stage 8 (premortem-critique) |
Three adversarial patterns worth knowing
These exist because they catch failures the non-adversarial workflow misses.
Reviewer 2 persona (Stages 2-3). Adversarial reviewer argues the gap isn’t real / tractable / impact-bearing, or the claim is physically impossible / mathematically unsound / already solved. Each rebuttal must cite a specific paper.
Stepwise derivation protocol (Stages 4-5). One conceptual step per invocation, with a verification pass (dimensional check, limit check, sanity argument) before the next step. Blocks multi-step symbolic leaps.
Instrument capability margin (Stage 7). For each measurable quantity, compute the margin between predicted signal and instrument capability on each axis. Margin < 3× → high-risk measurement requiring explicit mitigation.
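The 3× margin rule can be made concrete with a short calculation. The sketch below assumes the margin is a simple ratio between predicted signal and instrument limit, taken in whichever direction makes "larger is safer"; the Axis structure, the floor/ceiling distinction, and all numbers are hypothetical and are not taken from the experiment-designer skill.

```python
from dataclasses import dataclass

MARGIN_THRESHOLD = 3.0  # the 3x rule stated in the text


@dataclass(frozen=True)
class Axis:
    """One measurement dimension (hypothetical structure for illustration)."""
    name: str
    predicted: float  # predicted signal on this axis
    limit: float      # instrument capability, same units as `predicted`
    # "floor": limit is the smallest resolvable value;
    # "ceiling": limit is the largest measurable value.
    kind: str


def margin(axis: Axis) -> float:
    """Ratio by which the prediction clears the instrument limit."""
    if axis.predicted <= 0 or axis.limit <= 0:
        raise ValueError(f"{axis.name}: values must be positive")
    if axis.kind == "floor":
        return axis.predicted / axis.limit
    if axis.kind == "ceiling":
        return axis.limit / axis.predicted
    raise ValueError(f"{axis.name}: unknown kind {axis.kind!r}")


def high_risk_axes(axes):
    """Return (name, margin) for every axis whose margin is below 3x."""
    return [(a.name, margin(a)) for a in axes if margin(a) < MARGIN_THRESHOLD]


# Invented numbers, for illustration only.
axes = [
    Axis("displacement_nm", predicted=12.0, limit=2.0, kind="floor"),    # 6.0x
    Axis("bandwidth_kHz", predicted=40.0, limit=100.0, kind="ceiling"),  # 2.5x
]
print(high_risk_axes(axes))  # [('bandwidth_kHz', 2.5)]
```

Each axis returned by high_risk_axes would be a candidate entry for the risk register, with an explicit mitigation attached.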
Style enforcement
Every emitted markdown artifact passes _research-commons/scripts/style_lint.py: no em dashes, no banned vocabulary (innovative, state-of-the-art, transformative, novel, groundbreaking, cutting-edge), quantified language preferred.
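The two mechanical checks described above (em dashes and banned vocabulary) can be sketched in a few lines. This is an illustration of the rule, not the contents of style_lint.py, whose actual interface is not shown here; the function name lint and the message strings are assumptions.

```python
import re

EM_DASH = "\u2014"
# Banned words as listed in the text above.
BANNED = (
    "innovative", "state-of-the-art", "transformative",
    "novel", "groundbreaking", "cutting-edge",
)
# Whole-word match, so "novel" does not fire on "novelty".
_BANNED_RE = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(w) for w in BANNED) + r")(?!\w)",
    re.IGNORECASE,
)


def lint(text):
    """Return a list of (line_number, message) style violations."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if EM_DASH in line:
            problems.append((lineno, "em dash"))
        for match in _BANNED_RE.finditer(line):
            problems.append((lineno, f"banned word: {match.group(1).lower()}"))
    return problems


sample = "A novel solver.\nThe novelty is a 12% lower error \u2014 see Table 2."
print(lint(sample))  # [(1, 'banned word: novel'), (2, 'em dash')]
```

Whole-word matching is a design choice of this sketch: Stage 1 asks the articulation to name the spark's novelty, so flagging "novelty" as a hit for "novel" would make that stage impossible to pass. The "quantified language preferred" rule is not covered here because it is not a simple pattern match.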
Cross-suite delegation
The research-spark-orchestrator delegates across suite boundaries at natural fan-out points:
| Delegate to (suite) | When |
|---|---|
| | Stage 6 JAX implementation details (JIT, vmap, integrator choice) |
| | Stage 6 SciML/DifferentialEquations.jl, SINDy, stiff-ODE alternatives |
| | Stages 4-5 when theory involves bifurcation, chaos, pattern formation |
| | Stages 4-5 for correlation functions, Langevin/Fokker-Planck, critical phenomena |
| | Stage 6 when the prototype is MD or Monte Carlo |
Hooks
3 hook events supporting the research-spark pipeline:
- SessionStart: Detect research-spark stage artifacts (01_spark.md through 08_premortem.md) and resume at the latest completed stage.
- TaskCompleted: Log research tasks to .research-log.jsonl (audit trail) and prompt stage-artifact commit before advancing.
- SubagentStop (prompt-based): LLM-driven verification that stage artifacts (research-spark) or referee-report sections (scientific-review) are present before the orchestrator advances.
Beyond these, adversarial patterns and style linting remain enforced inside skill workflows (_research-commons/scripts/style_lint.py), so they run deterministically without depending on CLI event schemas.
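The SessionStart and TaskCompleted behaviors can be pictured with a short sketch. It assumes artifacts live under artifacts/ as in the project layout, treats a non-empty file as a completed stage, and uses invented record fields for the JSONL log; none of these details are confirmed by the hook definitions, and the function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Canonical artifact names, one per stage, taken from the stage list.
ARTIFACTS = [
    "01_spark.md", "02_landscape.md", "03_claim.md", "04_theory.md",
    "05_formalism.tex", "06_prototype.md", "07_plan.md", "08_premortem.md",
]


def latest_completed_stage(artifacts_dir):
    """Highest stage N such that artifacts 1..N all exist and are non-empty.

    Returns 0 when no stage is complete. Treating "non-empty file" as
    "completed" is an assumption of this sketch.
    """
    completed = 0
    for stage, name in enumerate(ARTIFACTS, start=1):
        path = Path(artifacts_dir) / name
        if not (path.is_file() and path.stat().st_size > 0):
            break
        completed = stage
    return completed


def log_task(log_path, task, stage):
    """Append one JSON record per line (JSONL). Field names are hypothetical."""
    record = {"task": task, "stage": stage}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


# Demo in a throwaway directory with stages 1-3 present.
with tempfile.TemporaryDirectory() as tmp:
    artifacts = Path(tmp) / "artifacts"
    artifacts.mkdir()
    for name in ARTIFACTS[:3]:
        (artifacts / name).write_text("placeholder\n", encoding="utf-8")
    stage = latest_completed_stage(artifacts)
    log_task(Path(tmp) / ".research-log.jsonl", "resume", stage)
    print(stage)  # 3
```

Stopping at the first missing artifact, rather than taking the highest-numbered file present, keeps a stray later artifact from masking a gap earlier in the pipeline.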
Requirements
- scientific-review: python-docx for .docx generation (markdown fallback), pandoc for DOCX ingestion, pdftotext / pypdf / pymupdf for PDF.
- research-spark stack: Python 3.12+, uv for dependency resolution. Stage-specific: sympy (theory-scaffold), scipy + pyyaml (experiment-designer), jax + jaxlib (numerical-prototype), pdflatex (latex_compile_check.sh). All scripts install locally via uv add, never globally.
Project directory layout (research-spark)
<workspace>/<idea-slug>/
├── _state.yaml
├── project_log.md
├── artifacts/
│ ├── 01_spark.md
│ └── ... 08_premortem.md
└── code/ # emerges at Stage 6
├── pyproject.toml
├── src/<slug>/
└── tests/