Quickstart Guide

Get BioMapper running in 5 minutes.

Prerequisites

Python 3.11+
Poetry package manager
Git

Installation

# Clone and install
git clone https://github.com/arpanauts/biomapper.git
cd biomapper
poetry install --with dev,docs,api
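# Optional: activate the virtualenv. On Poetry 2.x this requires the
# poetry-plugin-shell plugin; the "poetry run" commands below work without it.
poetry shell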

Start the API

# Start from project root (recommended)
poetry run uvicorn src.api.main:app --reload --port 8000

# Optional: check API status from a second terminal (uvicorn keeps the first one busy)
poetry run biomapper health

The API will be available at:

Interactive docs: http://localhost:8000/api/docs
Health endpoint: http://localhost:8000/api/health
Root endpoint: http://localhost:8000/
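
The endpoints listed above can also be checked from Python. The following is a minimal sketch using only the standard library; the helper names api_urls and is_healthy are illustrative and not part of BioMapper, and the sketch assumes only that the health endpoint answers with HTTP 200 when the server is up.

```python
import urllib.error
import urllib.request


def api_urls(base_url="http://localhost:8000"):
    """Build the three quickstart URLs from a base URL (hypothetical helper)."""
    base = base_url.rstrip("/")
    return {
        "docs": f"{base}/api/docs",
        "health": f"{base}/api/health",
        "root": f"{base}/",
    }


def is_healthy(base_url="http://localhost:8000", timeout=2.0):
    """Return True if the health endpoint answers with HTTP 200, else False."""
    health_url = api_urls(base_url)["health"]
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts all derive
        # from OSError, so any of them is reported as "not healthy".
        return False
```

Once uvicorn is running, calling is_healthy() gives a quick yes/no answer without parsing the response body, whose exact shape is not assumed here.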

Your First Strategy

Create a YAML strategy (test_strategy.yaml):

name: protein_harmonization
description: Harmonize protein identifiers

parameters:
  input_file: "/data/proteins.csv"
  output_dir: "/results"

steps:
  - name: load_proteins
    action:
      type: LOAD_DATASET_IDENTIFIERS
      params:
        file_path: "${parameters.input_file}"
        identifier_column: "uniprot_id"
        output_key: "proteins"

  - name: export_results
    action:
      type: EXPORT_DATASET
      params:
        input_key: "proteins"
        output_file: "${parameters.output_dir}/harmonized.csv"
        format: "csv"
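
To make the ${parameters.*} references in the strategy above concrete, here is a small standalone sketch of how such placeholders could be expanded. It is an illustration only: the resolve function and the regular expression are assumptions, and BioMapper's own resolver may behave differently (for example, with other namespaces or default values). The strategy is written as a Python dict that mirrors the YAML, so no YAML library is needed.

```python
import re

# Matches placeholders such as ${parameters.input_file}
PLACEHOLDER = re.compile(r"\$\{parameters\.(\w+)\}")


def resolve(value, parameters):
    """Recursively replace ${parameters.<name>} in strings, dicts and lists.

    Hypothetical helper: an unknown parameter name raises KeyError.
    """
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(parameters[m.group(1)]), value)
    if isinstance(value, dict):
        return {key: resolve(item, parameters) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item, parameters) for item in value]
    return value


# Same content as test_strategy.yaml, expressed as Python data
parameters = {"input_file": "/data/proteins.csv", "output_dir": "/results"}
steps = [
    {
        "name": "load_proteins",
        "action": {
            "type": "LOAD_DATASET_IDENTIFIERS",
            "params": {
                "file_path": "${parameters.input_file}",
                "identifier_column": "uniprot_id",
                "output_key": "proteins",
            },
        },
    },
    {
        "name": "export_results",
        "action": {
            "type": "EXPORT_DATASET",
            "params": {
                "input_key": "proteins",
                "output_file": "${parameters.output_dir}/harmonized.csv",
                "format": "csv",
            },
        },
    },
]

resolved_steps = resolve(steps, parameters)
print(resolved_steps[0]["action"]["params"]["file_path"])    # /data/proteins.csv
print(resolved_steps[1]["action"]["params"]["output_file"])  # /results/harmonized.csv
```

Note how the second step's input_key matches the first step's output_key: that shared key ("proteins") is what connects the two steps.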

Execute it with the Python client:

from src.client.client_v2 import BiomapperClient

# Simple synchronous execution
client = BiomapperClient(base_url="http://localhost:8000")
result = client.run("protein_harmonization", parameters={
    "input_file": "/path/to/your/data.csv",
    "output_dir": "/path/to/output"
})
print(f"Success: {result.success}")  # StrategyResult object
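
To try the call above without real data, a small input file is enough. The sketch below writes a three-row CSV with the uniprot_id column that the strategy expects; the gene_symbol column, the helper name write_sample_proteins and the temporary directory are illustrative choices, not BioMapper requirements.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_proteins(directory):
    """Write a tiny proteins.csv with a uniprot_id column and return its path."""
    path = Path(directory) / "proteins.csv"
    rows = [
        {"uniprot_id": "P04637", "gene_symbol": "TP53"},
        {"uniprot_id": "P38398", "gene_symbol": "BRCA1"},
        {"uniprot_id": "P00533", "gene_symbol": "EGFR"},
    ]
    # newline="" lets the csv module control line endings itself
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["uniprot_id", "gene_symbol"])
        writer.writeheader()
        writer.writerows(rows)
    return path


sample_dir = tempfile.mkdtemp(prefix="biomapper_quickstart_")
sample_path = write_sample_proteins(sample_dir)
print(f"Sample input written to {sample_path}")
```

The printed path can then be passed as the input_file parameter, with any writable directory as output_dir.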

The CLI can be used to inspect the installation and the available strategies:

# Check available CLI commands
poetry run biomapper --help

# List available strategies
poetry run biomapper strategies

# Verify CLI installation
poetry run biomapper health

Verify Installation

# Test CLI installation
poetry run biomapper health
poetry run biomapper test-import

# Run tests with coverage
poetry run pytest --cov=src

# Quick unit tests only
poetry run pytest tests/unit/

# Check API health (if API server is running)
curl http://localhost:8000/api/health

# View interactive API docs (macOS; on Linux use xdg-open, or paste the URL into a browser)
open http://localhost:8000/api/docs

Common Actions

LOAD_DATASET_IDENTIFIERS - Load biological identifiers from CSV/TSV
PROTEIN_EXTRACT_UNIPROT_FROM_XREFS - Extract UniProt IDs from reference fields
PROTEIN_NORMALIZE_ACCESSIONS - Standardize protein accession formats
MERGE_DATASETS - Combine multiple datasets with deduplication
FILTER_DATASET - Apply filtering criteria to datasets
CUSTOM_TRANSFORM_EXPRESSION - Apply Python expressions to data
EXPORT_DATASET - Export results to various formats
SYNC_TO_GOOGLE_DRIVE_V2 - Upload results to Google Drive
SEMANTIC_METABOLITE_MATCH - AI-powered metabolite matching
NIGHTINGALE_NMR_MATCH - Nightingale NMR platform matching
CHEMISTRY_FUZZY_TEST_MATCH - Fuzzy matching for clinical tests
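
As a rough illustration of what "standardizing accession formats" can involve, the sketch below trims whitespace, upper-cases the value, drops an isoform suffix such as "-2", and validates the result against the accession pattern published by UniProt. It is not the implementation of PROTEIN_NORMALIZE_ACCESSIONS, whose exact rules are not described here; the function name normalize_accession is hypothetical.

```python
import re

# Accession pattern as documented by UniProt (6- or 10-character accessions)
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def normalize_accession(raw):
    """Return a cleaned UniProt accession, or None if the value is not one.

    Hypothetical helper: strips whitespace, upper-cases, and removes an
    isoform suffix (everything after the first "-").
    """
    candidate = raw.strip().upper().split("-", 1)[0]
    return candidate if UNIPROT_ACCESSION.fullmatch(candidate) else None


print(normalize_accession(" p04637-2 "))   # P04637
print(normalize_accession("A0A024R161"))   # A0A024R161
print(normalize_accession("not_an_id"))    # None
```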

Next Steps

Installation Guide - Detailed setup instructions
Usage Guide - Advanced usage patterns
Configuration Guide - Strategy configuration
Actions Reference - Complete action reference


Verification Sources

Last verified: 2025-08-22

This documentation was verified against the following project resources:

/biomapper/pyproject.toml (Python 3.11+ requirement, GitHub repository URL, src-layout structure)
/biomapper/CLAUDE.md (Essential commands and environment setup procedures)
/biomapper/src/api/main.py (FastAPI application with correct import paths and endpoint structure)
/biomapper/src/client/client_v2.py (BiomapperClient with run() method returning StrategyResult objects)
/biomapper/src/cli/minimal.py (CLI commands including health and test-import)
/biomapper/src/actions/ (Action registry and organized entity-based action structure)