Back to Skills

Senior Data Scientist

World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipeline…

pythontesting
By alirezarezvani
19k2.7kUpdated 3 days agoPythonMIT

Skill Content

# Senior Data Scientist

World-class senior data scientist skill for production-grade AI/ML/Data systems.

## Core Workflows

### 1. Design an A/B Test

```python
import numpy as np
from scipy import stats

def calculate_sample_size(baseline_rate, mde, alpha=0.05, power=0.8):
    """
    Calculate required sample size per variant.
    baseline_rate: current conversion rate (e.g. 0.10)
    mde: minimum detectable effect (relative, e.g. 0.05 = 5% lift)
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + mde)
    effect_size = abs(p2 - p1) / np.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_beta = stats.norm.ppf(power)
    n = ((z_alpha + z_beta) / effect_size) ** 2
    return int(np.ceil(n))

def analyze_experiment(control, treatment, alpha=0.05):
    """
    Run two-proportion z-test and return structured results.
    control/treatment: dicts with 'conversions' and 'visitors'.
    """
    p_c = control["conversions"] / control["visitors"]
    p_t = treatment["conversions"] / treatment["visitors"]
    pooled = (control["conversions"] + treatment["conversions"]) / (control["visitors"] + treatment["visitors"])
    se = np.sqrt(pooled * (1 - pooled) * (1 / control["visitors"] + 1 / treatment["visitors"]))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))
    ci_low = (p_t - p_c) - stats.norm.ppf(1 - alpha / 2) * se
    ci_high = (p_t - p_c) + stats.norm.ppf(1 - alpha / 2) * se
    return {
        "lift": (p_t - p_c) / p_c,
        "p_value": p_value,
        "significant": p_value < alpha,
        "ci_95": (ci_low, ci_high),
    }

# --- Experiment checklist ---
# 1. Define ONE primary metric and pre-register secondary metrics.
# 2. Calculate sample size BEFORE starting: calculate_sample_size(0.10, 0.05)
# 3. Randomise at the user (not session) level to avoid leakage.
# 4. Run for at least 1 full business cycle (typically 2 weeks).
# 5. Check for sample ratio mismatch: abs(n_control - n_treatment) / expected < 0.01
# 6. Analyze with analyze_experiment() and report lift + CI, not just p-value.
# 7. Apply Bonferroni correction if testing multiple metrics: alpha / n_metrics
```

### 2. Build a Feature Engineering Pipeline

```python
import pandas as pd
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer

def build_feature_pipeline(numeric_cols, categorical_cols, date_cols=None):
    """
    Returns a fitted-ready ColumnTransformer for structured tabular data.
    """
    numeric_pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale",  StandardScaler()),
    ])
    categorical_pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore", sparse_output=False)),
    ])
    transformers = [
        ("num", numeric_pipeline, numeric_cols),
        ("cat", categorical_pipeline, categorical_cols),
    ]
    return ColumnTransformer(transformers, remainder="drop")

def add_time_features(df, date_col):
    """Extract cyclical and lag features from a datetime column."""
    df = df.copy()
    df[date_col] = pd.to_datetime(df[date_col])
    df["dow_sin"] = np.sin(2 * np.pi * df[date_col].dt.dayofweek / 7)
    df["dow_cos"] = np.cos(2 * np.pi * df[date_col].dt.dayofweek / 7)
    df["month_sin"] = np.sin(2 * np.pi * df[date_col].dt.month / 12)
    df["month_cos"] = np.cos(2 * np.pi * df[date_col].dt.month / 12)
    df["is_weekend"] = (df[date_col].dt.dayofweek >= 5).astype(int)
    return df

# --- Feature engineering checklist ---
# 1. Never fit transformers on the full dataset — fit on train, transform test.
# 2. Log-transform right-skewed numeric features before scaling.
# 3. For high-cardinality categoricals (>50 levels), use target encoding or embeddings.
# 4. Generate lag/rolling features BEFORE the train/test split to avoid leakage.
# 5. Document each feature's business meaning alongside its code.
```

### 3. Train, Evaluate, and Select a Prediction Model

```python
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import make_scorer, roc_auc_score, average_precision_score
import xgboost as xgb
import mlflow

SCORERS = {
    "roc_auc":  make_scorer(roc_auc_score, needs_proba=True),
    "avg_prec": make_scorer(average_precision_score, needs_proba=True),
}

def evaluate_model(model, X, y, cv=5):
    """
    Cross-validate and return mean ± std for each scorer.
    Use StratifiedKFold for classification to preserve class balance.
    """
    cv_results = cross_validate(
        model, X, y,
        cv=StratifiedKFold(n_splits=cv, shuffle=True, random_state=42),
        scoring=SCORERS,
        return_train_score=True,
    )
    summary = {}
    for metric in SCORERS:
        test_scores = cv_results[f"test_{metric}"]
        summary[metric] = {"mean": test_scores.mean(), "std": test_scores.std()}
        # Flag overfitting: large gap between train and test score
        train_mean = cv_results[f"train_{metric}"].mean()
        summary[metric]["overfit_gap"] = train_mean - test_scores.mean()
    return summary

def train_and_log(model, X_train, y_train, X_test, y_test, run_name):
    """Train model and log all artefacts to MLflow."""
    with mlflow.start_run(run_name=run_name):
        model.fit(X_train, y_train)
        proba = model.predict_proba(X_test)[:, 1]
        metrics = {
            "roc_auc":  roc_auc_score(y_test, proba),
            "avg_prec": average_precision_score(y_test, proba),
        }
        mlflow.log_params(model.get_params())
        mlflow.log_metrics(metrics)
        mlflow.sklearn.log_model(model, "model")
        return metrics

# --- Model evaluation checklist ---
# 1. Always report AUC-PR alongside AUC-ROC for imbalanced datasets.
# 2. Check overfit_gap > 0.05 as a warning sign of overfitting.
# 3. Calibrate probabilities (Platt scaling / isotonic) before production use.
# 4. Compute SHAP values to validate feature importance makes business sense.
# 5. Run a baseline (e.g. DummyClassifier) and verify the model beats it.
# 6. Log every run to MLflow — never rely on notebook output for comparison.
```

### 4. Causal Inference: Difference-in-Differences

```python
import statsmodels.formula.api as smf

def diff_in_diff(df, outcome, treatment_col, post_col, controls=None):
    """
    Estimate ATT via OLS DiD with optional covariates.
    df must have: outcome, treatment_col (0/1), post_col (0/1).
    Returns the interaction coefficient (treatment × post) and its p-value.
    """
    covariates = " + ".join(controls) if controls else ""
    formula = (
        f"{outcome} ~ {treatment_col} * {post_col}"
        + (f" + {covariates}" if covariates else "")
    )
    result = smf.ols(formula, data=df).fit(cov_type="HC3")
    interaction = f"{treatment_col}:{post_col}"
    return {
        "att":     result.params[interaction],
        "p_value": result.pvalues[interaction],
        "ci_95":   result.conf_int().loc[interaction].tolist(),
        "summary": result.summary(),
    }

# --- Causal inference checklist ---
# 1. Validate parallel trends in pre-period before trusting DiD estimates.
# 2. Use HC3 robust standard errors to handle heteroskedasticity.
# 3. For panel data, cluster SEs at the unit level (add groups= param to fit).
# 4. Consider propensity score matching if groups differ at baseline.
# 5. Report the ATT with confidence interval, not just statistical significance.
```

## Reference Documentation

- **Statistical Methods:** `references/statistical_methods_advanced.md`
- **Experiment Design Frameworks:** `references/experiment_design_frameworks.md`
- **Feature Engineering Patterns:** `references/feature_engineering_patterns.md`

## Common Commands

```bash
# Testing & linting
python -m pytest tests/ -v --cov=src/
python -m black src/ && python -m pylint src/

# Bundled pipeline scaffolds (stdlib runners — extend the process() body with project logic)
python3 scripts/experiment_designer.py --input experiment_spec.json --output experiment_design.json
python3 scripts/feature_engineering_pipeline.py --input raw_features.json --output features.json
python3 scripts/model_evaluation_suite.py --input model_predictions.json --output evaluation.json
# Each prints a JSON run report ({status, processed_items, start/end_time}); any status other
# than "completed" means the stage failed — fix before moving to the next pipeline stage.
```

How to use

  1. Copy the skill content above
  2. Create a .claude/skills directory in your project
  3. Save as .claude/skills/claude-skills-senior-data-scientist.md
  4. Use /claude-skills-senior-data-scientist in Claude Code to invoke this skill

Claude Code Skills & Plugins — Agent Skills for Every Coding Tool

345 production-ready Claude Code skills, plugins, and agent skills for 13 AI coding tools.

The most comprehensive open-source library of Claude Code skills and agent plugins — also works with OpenAI Codex, Gemini CLI, Cursor, and 9 more coding agents. Reusable expertise packages covering engineering, DevOps, marketing (incl. AEO — Answer Engine Optimization for LLM citation), security (PreToolUse hooks), compliance, C-level advisory (incl. founder-mode CFO/CMO/CRO/CPO/COO/CHRO/CISO/GC/CDO/CAIO/CCO/VPE personas + 21 /cs:* slash commands), productivity (capture/email/reflect), an academic research stack (litreview/grants/dossier/patent/syllabus/pulse/notebooklm + hybrid router), and enterprise Research Operations (clinical-research/research-finance/market-research/product-research, v2.9.0).

Works with: Claude Code · OpenAI Codex · Gemini CLI · OpenClaw · Hermes Agent1 · Mistral Vibe2 · Cursor · Aider · Windsurf · Kilo Code · OpenCode · Augment · Antigravity

License: MIT Skills Agents Personas Commands Stars SkillCheck Validated

5,200+ GitHub stars — the most comprehensive open-source Claude Code skills & agent plugins library.


What Are Claude Code Skills & Agent Plugins?

Claude Code skills (also called agent skills or coding agent plugins) are modular instruction packages that give AI coding agents domain expertise they don't have out of the box. Each skill includes:

  • SKILL.md — structured instructions, workflows, and decision frameworks
  • Python tools — 579 CLI scripts (all stdlib-only, zero pip installs)
  • Reference docs — 702 templates, checklists, and domain-specific knowledge files

One repo, thirteen platforms. Works natively as Claude Code plugins, Codex agent skills, Gemini CLI skills, Hermes Agent skills, Mistral Vibe skills, and converts to more tools via scripts/convert.sh. All 579 Python tools run anywhere Python runs.

Skills vs Agents vs Personas

SkillsAgentsPersonas
PurposeHow to execute a taskWhat task to doWho is thinking
ScopeSingle domainSingle domainCross-domain
VoiceNeutralProfessionalPersonality-driven
Example"Follow these steps for SEO""Run a security audit""Think like a startup CTO"

All three work together. See Orchestration for how to combine them.


Quick Install

Gemini CLI (New)

# Clone the repository
git clone https://github.com/alirezarezvani/claude-skills.git
cd claude-skills

# Run the setup script
./scripts/gemini-install.sh

# Start using skills
> activate_skill(name="senior-architect")

Claude Code (Recommended)

# Add the marketplace
/plugin marketplace add alirezarezvani/claude-skills

# Install by domain
/plugin install engineering-skills@claude-code-skills          # 24 core engineering
/plugin install engineering-advanced-skills@claude-code-skills  # 25 POWERFUL-tier
/plugin install product-skills@claude-code-skills               # 12 product skills
/plugin install marketing-skills@claude-code-skills             # 43 marketing skills
/plugin install ra-qm-skills@claude-code-skills                 # 12 regulatory/quality
/plugin install pm-skills@claude-code-skills                    # 6 project management
/plugin install c-level-skills@claude-code-skills               # 28 C-level advisory (full C-suite)
/plugin install business-growth-skills@claude-code-skills       # 4 business & growth
/plugin install finance-skills@claude-code-skills               # 2 finance (analyst + SaaS metrics)

# Or install individual skills
/plugin install skill-security-auditor@claude-code-skills       # Security scanner
/plugin install playwright-pro@claude-code-skills                  # Playwright testing toolkit
/plugin install self-improving-agent@claude-code-skills         # Auto-memory curation
/plugin install content-creator@claude-code-skills              # Single skill

OpenAI Codex

npx agent-skills-cli add alirezarezvani/claude-skills --agent codex
# Or: git clone + ./scripts/codex-install.sh

OpenClaw

bash <(curl -s https://raw.githubusercontent.com/alirezarezvani/claude-skills/main/scripts/openclaw-install.sh)

Manual Installation

git clone https://github.com/alirezarezvani/claude-skills.git
# Copy any skill folder to ~/.claude/skills/ (Claude Code) or ~/.codex/skills/ (Codex)

Multi-Tool Support (New)

Convert all 345 skills to 9 AI coding tools with a single script:

ToolFormatInstall
Cursor.mdc rules./scripts/install.sh --tool cursor --target .
AiderCONVENTIONS.md./scripts/install.sh --tool aider --target .
Kilo Code.kilocode/rules/./scripts/install.sh --tool kilocode --target .
Windsurf.windsurf/skills/./scripts/install.sh --tool windsurf --target .
OpenCode.opencode/skills/./scripts/install.sh --tool opencode --target .
Augment.augment/rules/./scripts/install.sh --tool augment --target .
Antigravity~/.gemini/antigravity/skills/./scripts/install.sh --tool antigravity
Hermes Agent~/.hermes/skills/python scripts/sync-hermes-skills.py --verbose
Mistral Vibe~/.vibe/skills/./scripts/vibe-install.sh

How it works:

# 1. Convert all skills to all tools (takes ~15 seconds)
./scripts/convert.sh --tool all

# 2. Install into your project (with confirmation)
./scripts/install.sh --tool cursor --target /path/to/project

# Or use --force to skip confirmation:
./scripts/install.sh --tool aider --target . --force

# 3. Verify
find .cursor/rules -name "*.mdc" | wc -l  # Should show 346

Each tool gets:

  • ✅ All 345 skills converted to native format
  • ✅ Per-tool README with install/verify/update steps
  • ✅ Support for scripts, references, templates where applicable
  • ✅ Zero manual conversion work

Run ./scripts/convert.sh --tool all to generate tool-specific outputs locally.


Skills Overview

345 skills across 17 domains:

DomainSkillsHighlightsDetails
🔧 Engineering — Core51Architecture, frontend, backend, fullstack, QA, DevOps, SecOps, AI/ML, data, Playwright Pro (test gen, flaky fix, migrations), self-improving agent (auto-memory curation), security suite, a11y auditengineering-team/
⚡ Engineering — POWERFUL78Agent designer, RAG architect, database designer, CI/CD builder, security auditor, MCP builder, AgentHub, Helm charts, Terraform, self-eval, llm-wiki, tc-tracker, autoresearch-agent, reliability portfolio (feature-flags-architect, kubernetes-operator, chaos-engineering, slo-architect), ship-gate, security-guidance PreToolUse hook, Matt Pocock skills (write-a-skill, caveman, grill-me, handoff, grill-with-docs)engineering/
🎯 Product17Product manager, agile PO, strategist, UX researcher, UI design, landing pages, SaaS scaffolder, analytics, experiment designer, discovery, roadmap communicator, code-to-prd, apple-hig-expertproduct-team/
📣 Marketing468 pods: Content, SEO + AEO (aeo — E-E-A-T audit, citation tracking across 5 LLMs), CRO, Channels, Growth, Intelligence, Sales + context foundation + orchestration routermarketing-skill/
🚀 Productivity6capture (brain-dump-to-action), email pair (inbox-setup + inbox-triage), reflect (journal), handoff (Matt Pocock-inspired), andreessen (market-first decision mode)productivity/
🎨 Marketing (top-level)1landing — single-file HTML landing-page generator (4 design styles, GSAP patterns, brand palette validator)marketing/
🔬 Research (academic)8research orchestrator (hybrid router + fallback) + 7 specialists: pulse, litreview, grants (NIH), dossier, patent, syllabus, notebooklmresearch/
🧪 Research Operations ✨v2.9.05Enterprise/cross-functional research: orchestrator + clinical-research (study design), research-finance (R&D program finance), market-research (sizing/survey/segmentation), product-research (user research) — each with onboarding + customization + opt-in autoresearch bridgeresearch-ops/
📋 Project Management9Senior PM, scrum master, Jira, Confluence, Atlassian admin, templates + bundled Atlassian Remote MCPproject-management/
🏥 Regulatory & QM18ISO 13485, MDR 2017/745, FDA, ISO 27001, GDPR, SOC 2, CAPA, risk managementra-qm-team/
🛡️ Compliance OS9Compliance operating system — controls, evidence, audit-readiness workflowscompliance-os/
💼 C-Level Advisory66Full C-suite (CEO/CTO/CFO/CMO/CRO/CPO/COO/CHRO/CISO/GC/CDO/CAIO/CCO/VPE) + founder-mode agents + orchestration + board meetings + culture & collaborationc-level-advisor/
📈 Business & Growth5Customer success, sales engineer, revenue ops, contracts & proposals, BizDev toolkitbusiness-growth/
🏭 Business Operations7Orchestrator + process-mapper, vendor-management, capacity-planner, internal-comms, knowledge-ops, procurement-optimizerbusiness-operations/
🤝 Commercial8Orchestrator + pricing-strategist, deal-desk, partnerships-architect, channel-economics, commercial-policy, rfp-responder, commercial-forecastercommercial/
💰 Finance4Financial analyst (DCF, budgeting, forecasting), SaaS metrics coach, business investment advisorfinance/

Personas

Pre-configured agent identities with curated skill loadouts, workflows, and distinct communication styles. Personas go beyond "use these skills" — they define how an agent thinks, prioritizes, and communicates.

PersonaDomainBest For
Startup CTOEngineering + StrategyArchitecture decisions, tech stack selection, team building, technical due diligence
Growth MarketerMarketing + GrowthContent-led growth, launch strategy, channel optimization, bootstrapped marketing
Solo FounderCross-domainOne-person sta

Footnotes

  1. Hermes Agent is BYO-sync tier: the repo ships a pre-generated .hermes/skills/claude-skills/ tree, but you run python scripts/sync-hermes-skills.py once locally to install into ~/.hermes/skills/. Uses the same agentskills.io SKILL.md standard — no format conversion.

  2. Mistral Vibe is also BYO-sync tier: the repo ships a pre-generated .vibe/skills/claude-skills/ tree, run ./scripts/vibe-install.sh once locally to install into ~/.vibe/skills/. Same agentskills.io SKILL.md standard — no format conversion. Docs: https://docs.mistral.ai/mistral-vibe/agents-skills.

View source on GitHub