AutoResearch Experiment Setup is an automation Claude skill built by Alireza Rezvani. Best for: ML engineers and researchers who need to initialize optimization experiments with standardized evaluation frameworks.
- What it does
- Set up new autoresearch experiments interactively with domain, target file, evaluation command, and metrics configuration.
- Category
- automation
- Created by
- Alireza Rezvani
- Last updated
AutoResearch Experiment Setup
Skill instructions
name: "setup"
description: "Set up a new autoresearch experiment interactively. Collects domain, target file, eval command, metric, direction, and evaluator."
command: /ar:setup
/ar:setup — Create New Experiment
Set up a new autoresearch experiment with all required configuration.
Usage
/ar:setup # Interactive mode
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list # Show existing experiments
/ar:setup --list-evaluators # Show available evaluators
What It Does
If arguments provided
Pass them directly to the setup script:
python {skill_path}/scripts/setup_experiment.py \
--domain {domain} --name {name} \
--target {target} --eval "{eval_cmd}" \
--metric {metric} --direction {direction} \
[--evaluator {evaluator}] [--scope {scope}]
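The mapping from the positional `/ar:setup` arguments to the script flags above can be sketched in Python. This is a minimal illustration, not the skill's own code: the helper name `build_setup_command`, the placeholder skill path, and the check that direction is either `lower` or `higher` are assumptions.

```python
import shlex
from typing import List, Optional


def build_setup_command(
    skill_path: str,
    domain: str,
    name: str,
    target: str,
    eval_cmd: str,
    metric: str,
    direction: str,
    evaluator: Optional[str] = None,
    scope: Optional[str] = None,
) -> List[str]:
    """Return the argv list for setup_experiment.py (hypothetical helper)."""
    # Assumption: the script accepts only "lower" or "higher" for --direction.
    if direction not in ("lower", "higher"):
        raise ValueError(f"direction must be 'lower' or 'higher', got {direction!r}")

    cmd = [
        "python", f"{skill_path}/scripts/setup_experiment.py",
        "--domain", domain, "--name", name,
        "--target", target, "--eval", eval_cmd,
        "--metric", metric, "--direction", direction,
    ]
    # The last two flags are optional, so append them only when given.
    if evaluator:
        cmd += ["--evaluator", evaluator]
    if scope:
        cmd += ["--scope", scope]
    return cmd


# Mirrors the usage example:
# /ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
cmd = build_setup_command(
    "/path/to/skill",  # placeholder for {skill_path}
    "engineering", "api-speed", "src/api.py", "pytest bench.py", "p50_ms", "lower",
)
print(shlex.join(cmd))
```

Keeping the command as a list means the eval command stays a single argument even though it contains a space; `shlex.join` is used only to display it.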
If no arguments (interactive mode)
Collect each parameter one at a time:
- Domain — Ask: "What domain? (engineering, marketing, content, prompts, custom)"
- Name — Ask: "Experiment name? (e.g., api-speed, blog-titles)"
- Target file — Ask: "Which file to optimize?" Verify it exists.
- Eval command — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"
- Metric — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"
- Direction — Ask: "Is lower or higher better?"
- Evaluator (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"
- Scope — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"
Then run setup_experiment.py with the collected parameters.
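The interactive flow above can be sketched as a small prompt loop. This is an illustration under stated assumptions: the function names are hypothetical, re-asking after an invalid answer is an assumed behavior (the instructions only say to verify that the target file exists), and a blank evaluator answer is treated as "none".

```python
from pathlib import Path
from typing import Callable, Dict


def ask_until(ask: Callable[[str], str], question: str,
              is_valid: Callable[[str], bool]) -> str:
    """Repeat a question until the answer passes validation (assumed behavior)."""
    while True:
        answer = ask(question).strip()
        if is_valid(answer):
            return answer


def collect_parameters(ask: Callable[[str], str] = input) -> Dict[str, str]:
    """Collect the setup parameters one at a time, in the documented order."""
    params: Dict[str, str] = {}
    params["domain"] = ask(
        "What domain? (engineering, marketing, content, prompts, custom) ").strip()
    params["name"] = ask("Experiment name? (e.g., api-speed, blog-titles) ").strip()
    # "Verify it exists": only accept a path that points at a real file.
    params["target"] = ask_until(
        ask, "Which file to optimize? ", lambda p: Path(p).is_file())
    params["eval_cmd"] = ask(
        "How to measure it? (e.g., pytest bench.py, python evaluate.py) ").strip()
    params["metric"] = ask(
        "What metric does the eval output? (e.g., p50_ms, ctr_score) ").strip()
    params["direction"] = ask_until(
        ask, "Is lower or higher better? ", lambda d: d in ("lower", "higher"))
    # The evaluator is optional; a blank answer leaves it out (assumption).
    evaluator = ask("Use a built-in evaluator, or your own? (blank to skip) ").strip()
    if evaluator:
        params["evaluator"] = evaluator
    params["scope"] = ask_until(
        ask, "Store in project (.autoresearch/) or user (~/.autoresearch/)? ",
        lambda s: s in ("project", "user"))
    return params


def scope_root(scope: str) -> Path:
    """Map the scope answer to the storage directory named in the prompt."""
    return Path(".autoresearch") if scope == "project" else Path.home() / ".autoresearch"
```

Passing `ask` as a parameter keeps the flow testable: a script can supply canned answers instead of reading from the terminal.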
Listing
# Show existing experiments
python {skill_path}/scripts/setup_experiment.py --list
# Show available evaluators
python {skill_path}/scripts/setup_experiment.py --list-evaluators
Built-in Evaluators
| Name | Metric | Use Case |
|------|--------|----------|
| benchmark_speed | p50_ms (lower) | Function/API execution time |
| benchmark_size | size_bytes (lower) | File, bundle, Docker image size |
| test_pass_rate | pass_rate (higher) | Test suite pass percentage |
| build_speed | build_seconds (lower) | Build/compile/Docker build time |
| memory_usage | peak_mb (lower) | Peak memory during execution |
| llm_judge_content | ctr_score (higher) | Headlines, titles, descriptions |
| llm_judge_prompt | quality_score (higher) | System prompts, agent instructions |
| llm_judge_copy | engagement_score (higher) | Social posts, ad copy, emails |
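The table can be read as a lookup from evaluator name to metric and direction. The sketch below encodes it as a Python dictionary and shows how the direction decides whether a new measurement counts as an improvement; the names `BUILTIN_EVALUATORS` and `is_improvement` are hypothetical, and the sample numbers are made up for illustration.

```python
from typing import Dict, Tuple

# Evaluator name -> (metric, direction), copied from the table above.
BUILTIN_EVALUATORS: Dict[str, Tuple[str, str]] = {
    "benchmark_speed": ("p50_ms", "lower"),
    "benchmark_size": ("size_bytes", "lower"),
    "test_pass_rate": ("pass_rate", "higher"),
    "build_speed": ("build_seconds", "lower"),
    "memory_usage": ("peak_mb", "lower"),
    "llm_judge_content": ("ctr_score", "higher"),
    "llm_judge_prompt": ("quality_score", "higher"),
    "llm_judge_copy": ("engagement_score", "higher"),
}


def is_improvement(evaluator: str, baseline: float, candidate: float) -> bool:
    """Compare two measurements using the evaluator's direction."""
    _metric, direction = BUILTIN_EVALUATORS[evaluator]
    if direction == "lower":
        return candidate < baseline
    return candidate > baseline


# Illustrative values only: a lower p50 latency is better, a lower pass rate is not.
print(is_improvement("benchmark_speed", 120.0, 95.0))
print(is_improvement("test_pass_rate", 0.90, 0.80))
```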
After Setup
Report to the user:
- Experiment path and branch name
- Whether the eval command worked and the baseline metric
- Suggest: "Run /ar:run {domain}/{name} to start iterating, or /ar:loop {domain}/{name} for autonomous mode."
Use this skill
Most skills are portable instruction packages. Claude Code supports SKILL.md directly. Other agents can use adapted files like AGENTS.md, .cursorrules, and GEMINI.md.
Claude Code
Save SKILL.md into your Claude Skills folder, then restart Claude Code.
mkdir -p ~/.claude/skills/autoresearch-experiment-setup && curl -L "https://raw.githubusercontent.com/alirezarezvani/claude-skills/HEAD/engineering/autoresearch-agent/skills/setup/SKILL.md" -o ~/.claude/skills/autoresearch-experiment-setup/SKILL.md
Installs to ~/.claude/skills/autoresearch-experiment-setup/SKILL.md.
Use cases
ML engineers and researchers use this to initialize optimization experiments with standardized evaluation frameworks.