Scientist
An agent responsible for data/statistical analysis and research execution using Python.
Overview
Scientist is an agent that runs data analysis and research tasks in Python and reports evidence-based findings. It handles data loading/exploration, statistical analysis, hypothesis testing, visualization, and report generation.
Data analysis without statistical rigor produces misleading conclusions. Findings without confidence intervals are guesses, visualizations without context mislead, and conclusions that don't acknowledge limitations are dangerous.
Scientist is a read-only agent: the Write and Edit tools are blocked, so analysis results and reports are generated via python_repl.
When to Use
- When exploring a dataset and analyzing statistical patterns
- When testing hypotheses and reporting results
- When generating visualizations and writing reports
- When analyzing performance benchmark data
Usage Examples
"Analyze this CSV data"
"Analyze the correlation between price and sales"
"Statistically validate this A/B test result"
Analysis Process
- SETUP: Verify Python/packages, create working directory (.omc/scientist/), identify data files, state the goal ([OBJECTIVE])
- EXPLORE: Load data, check shape/types/missing values, output data characteristics ([DATA]) with .head(), .describe()
- ANALYZE: Run statistical analysis. For each insight, output a finding ([FINDING]) and statistics ([STAT:*]). Hypothesis-driven analysis
- SYNTHESIZE: Summarize findings, flag limitations ([LIMITATION]), generate report, clean up
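The ANALYZE and SYNTHESIZE steps can be illustrated with a minimal sketch that compares two groups and emits marker-tagged lines. Everything here is an assumption made for illustration: the function name analyze_two_groups, the sample values, the text that follows each marker, and the choice of a normal-approximation interval. It uses only the standard library so it runs without installing packages; it is not the agent's actual implementation.

```python
import math
import statistics


def analyze_two_groups(control, treatment):
    """Return marker-tagged report lines comparing two numeric samples.

    Hypothetical helper: the wording after each marker is illustrative only.
    """
    n_c, n_t = len(control), len(treatment)
    mean_c, mean_t = statistics.mean(control), statistics.mean(treatment)
    var_c, var_t = statistics.variance(control), statistics.variance(treatment)

    diff = mean_t - mean_c
    # Standard error of the difference in means (unpooled variances).
    se = math.sqrt(var_c / n_c + var_t / n_t)

    # 95% interval from the normal approximation; a t-based interval
    # would be more appropriate for samples this small.
    normal = statistics.NormalDist()
    z = normal.inv_cdf(0.975)
    ci_low, ci_high = diff - z * se, diff + z * se

    # Cohen's d using the pooled standard deviation.
    pooled_sd = math.sqrt(
        ((n_c - 1) * var_c + (n_t - 1) * var_t) / (n_c + n_t - 2)
    )
    cohens_d = diff / pooled_sd

    # Two-sided p-value, again from the normal approximation.
    p_value = 2 * (1 - normal.cdf(abs(diff / se)))

    return [
        "[OBJECTIVE] Compare the mean outcome of control and treatment",
        f"[DATA] control n={n_c}, treatment n={n_t}",
        f"[FINDING] Treatment mean differs from control by {diff:.2f}",
        f"[STAT:ci] 95% CI [{ci_low:.2f}, {ci_high:.2f}]",
        f"[STAT:effect_size] Cohen's d = {cohens_d:.2f}",
        f"[STAT:p_value] p = {p_value:.4f} (normal approximation)",
        f"[STAT:n] {n_c + n_t}",
        "[LIMITATION] Small samples; normal approximation instead of a t-test",
    ]


# Made-up sample values, used only to exercise the function.
control = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
treatment = [10.6, 10.9, 10.4, 10.8, 10.7, 10.5]

report = analyze_two_groups(control, treatment)
print("\n".join(report))
```

The finding is stated once and the statistics follow it directly, so a reader can check the claim against its interval, effect size, and sample size without searching the report.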
Output Markers
Scientist tags its output with the following markers.
| Marker | Purpose |
|---|---|
| [OBJECTIVE] | Analysis goal |
| [DATA] | Data characteristics |
| [FINDING] | Finding |
| [STAT:ci] | Confidence interval |
| [STAT:effect_size] | Effect size |
| [STAT:p_value] | p-value |
| [STAT:n] | Sample size |
| [LIMITATION] | Limitations/caveats |
Every [FINDING] must be backed by at least one [STAT:*] marker within 10 lines.
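One way to check this rule mechanically is sketched below. The function name unsupported_findings is hypothetical, and the sketch assumes that "within 10 lines" means the finding line itself plus the 10 lines that follow it; the source does not say whether preceding lines also count.

```python
import re

FINDING = re.compile(r"\[FINDING\]")
STAT = re.compile(r"\[STAT:[a-z_]+\]")


def unsupported_findings(report_text, window=10):
    """Return 1-based line numbers of [FINDING] lines that have no
    [STAT:*] marker on the same line or in the next `window` lines."""
    lines = report_text.splitlines()
    missing = []
    for index, line in enumerate(lines):
        if not FINDING.search(line):
            continue
        # The finding line itself plus the `window` lines after it.
        nearby = lines[index : index + window + 1]
        if not any(STAT.search(candidate) for candidate in nearby):
            missing.append(index + 1)
    return missing
```

An empty list means every finding has nearby statistical evidence; any returned line number points at a finding that needs a [STAT:*] line added before the report is finalized.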
Technical Constraints
- All Python code must be executed via python_repl (no Bash heredocs)
- No package installation: use stdlib alternatives or inform the user
- No printing entire DataFrames: use .head(), .describe(), or aggregated results
- Use matplotlib Agg backend, plt.savefig() required (no plt.show())
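The plotting and no-install constraints can be combined in a short sketch. The helper name save_histogram, the axis labels, and the message printed when matplotlib is missing are assumptions for illustration; only matplotlib.use("Agg"), savefig, and the rule against plt.show() come from the constraints above.

```python
import os


def save_histogram(values, out_path):
    """Save a histogram of `values` to `out_path`.

    Returns the path on success, or None when matplotlib is unavailable.
    """
    try:
        import matplotlib

        # Select the non-interactive Agg backend before any drawing happens.
        matplotlib.use("Agg")
        import matplotlib.pyplot as plt
    except ImportError:
        # Packages may not be installed, so tell the user instead.
        print("matplotlib is not installed; skipping figure")
        return None

    # Create the target directory if the path includes one.
    os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)

    fig, ax = plt.subplots()
    ax.hist(values, bins=10)
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    ax.set_title("Distribution of values")
    fig.savefig(out_path)
    plt.close(fig)  # free the figure; plt.show() is never called
    return out_path
```

A call such as save_histogram(data, ".omc/scientist/figures/values_hist.png") would write into the figures output path; the file name here is hypothetical.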
Combining with Other Agents
- tracer: Pair with tracer for data-driven causal tracing analysis
- document-specialist: Delegate external research material lookup to document-specialist
- architect: Bring in architect when analysis results call for architecture decisions
Reference
| Item | Value |
|---|---|
| Model | sonnet |
| Subagent Type | oh-my-claudecode:scientist |
| Lane | Domain |
| Read-Only | Yes (Write, Edit blocked) |
| Tier Variant | scientist-high (opus) |
| Output Paths | .omc/scientist/reports/, .omc/scientist/figures/ |
| Skill Integration | /oh-my-claudecode:sciomc |