Creating New Actions
This guide walks through creating new actions for BioMapper using Test-Driven Development (TDD).
Overview
BioMapper actions are self-registering components that process biological data. Each action:
- Inherits from TypedStrategyAction
- Uses Pydantic models for parameters
- Self-registers via the @register_action decorator
- Modifies a shared execution context
Step 1: Write Tests First (TDD)
Always start by writing tests:
# tests/unit/core/strategy_actions/test_my_action.py
import pytest
from pydantic import ValidationError

from biomapper.core.strategy_actions.my_action import (
    MyAction,
    MyActionParams,
)


@pytest.mark.asyncio
async def test_my_action_basic():
    """Test basic functionality."""
    # Arrange
    params = MyActionParams(
        input_key="test_data",
        threshold=0.8,
        output_key="filtered",
    )
    context = {
        "datasets": {
            "test_data": [
                {"id": "1", "score": 0.9},
                {"id": "2", "score": 0.7},
                {"id": "3", "score": 0.85},
            ]
        }
    }

    # Act: pass every argument that execute_typed declares (see Step 3)
    action = MyAction()
    result = await action.execute_typed(
        current_identifiers=[],
        current_ontology_type="",
        params=params,
        source_endpoint=None,
        target_endpoint=None,
        context=context,
    )

    # Assert: only the two items scoring >= 0.8 are kept
    assert result.success
    assert "filtered" in context["datasets"]
    assert len(context["datasets"]["filtered"]) == 2


@pytest.mark.asyncio
async def test_my_action_validation():
    """Test parameter validation."""
    with pytest.raises(ValidationError):
        MyActionParams(
            input_key="",  # Empty key should fail
            threshold=1.5,  # Out of range
        )
Step 2: Define Parameters
Create Pydantic models for type-safe parameters:
# biomapper/core/strategy_actions/my_action.py
from pydantic import BaseModel, Field, field_validator


class MyActionParams(BaseModel):
    """Parameters for MyAction."""

    input_key: str = Field(
        ...,
        description="Key to input dataset in context"
    )
    threshold: float = Field(
        0.8,
        ge=0.0,
        le=1.0,
        description="Score threshold for filtering"
    )
    output_key: str = Field(
        "filtered_output",
        description="Key for output dataset"
    )
    include_metadata: bool = Field(
        True,
        description="Include metadata in output"
    )

    @field_validator("input_key")
    @classmethod
    def validate_input_key(cls, v: str) -> str:
        # Reject empty or whitespace-only keys and normalize the rest
        if not v or not v.strip():
            raise ValueError("Input key cannot be empty")
        return v.strip()
Step 3: Implement the Action
from biomapper.actions.typed_base import TypedStrategyAction
from biomapper.actions.registry import register_action
from biomapper.core.models.action_results import ActionResult
from biomapper.core.models.execution_context import StrategyExecutionContext
from typing import Any, List, Type
import logging

logger = logging.getLogger(__name__)


@register_action("MY_ACTION")
class MyAction(TypedStrategyAction[MyActionParams, ActionResult]):
    """
    Filter biological data based on score threshold.

    This action filters items from an input dataset based on a
    configurable score threshold and stores results in the context.

    Example:
        Input: [{"id": "A", "score": 0.9}, {"id": "B", "score": 0.6}]
        Threshold: 0.8
        Output: [{"id": "A", "score": 0.9}]
    """

    def get_params_model(self) -> Type[MyActionParams]:
        """Return the parameters model class."""
        return MyActionParams

    def get_result_model(self) -> Type[ActionResult]:
        """Return the result model class."""
        return ActionResult

    async def execute_typed(
        self,
        current_identifiers: List[str],
        current_ontology_type: str,
        params: MyActionParams,
        source_endpoint: Any,
        target_endpoint: Any,
        context: StrategyExecutionContext
    ) -> ActionResult:
        """Execute the filtering action."""
        try:
            # Get input data
            if params.input_key not in context.get("datasets", {}):
                return ActionResult(
                    success=False,
                    message=f"Input key '{params.input_key}' not found"
                )
            input_data = context["datasets"][params.input_key]
            logger.info(f"Processing {len(input_data)} items")

            # Apply filtering; copy each kept item so that adding metadata
            # below does not modify the input dataset
            filtered = [
                dict(item) for item in input_data
                if item.get("score", 0) >= params.threshold
            ]

            # Add metadata if requested
            if params.include_metadata:
                for item in filtered:
                    item["_metadata"] = {
                        "filtered_by": "score",
                        "threshold": params.threshold
                    }

            # Store results
            if "datasets" not in context:
                context["datasets"] = {}
            context["datasets"][params.output_key] = filtered

            # Update statistics (guard against an empty input dataset)
            if "statistics" not in context:
                context["statistics"] = {}
            context["statistics"][params.output_key] = {
                "total_input": len(input_data),
                "total_output": len(filtered),
                "filter_rate": len(filtered) / len(input_data) if input_data else 0.0
            }

            logger.info(f"Filtered {len(input_data)} to {len(filtered)} items")
            return ActionResult(
                success=True,
                message=f"Filtered {len(filtered)} items with threshold {params.threshold}",
                data={
                    "input_count": len(input_data),
                    "output_count": len(filtered),
                    "removed_count": len(input_data) - len(filtered)
                }
            )
        except Exception as e:
            # Report failures through the result instead of raising
            logger.error(f"Error in MyAction: {str(e)}")
            return ActionResult(
                success=False,
                message=f"Action failed: {str(e)}"
            )
Step 4: Choose Action Location
Place your action in the appropriate directory:
actions/
├── entities/ # Entity-specific actions
│ ├── proteins/ # Protein processing
│ ├── metabolites/ # Metabolite processing
│ └── chemistry/ # Clinical chemistry
├── utils/ # General utilities
│ └── data_processing/
├── io/ # Input/output actions
└── algorithms/ # Reusable algorithms
Step 5: Register the Action
The @register_action decorator automatically registers your action. No manual registration needed!
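For intuition, a decorator-based registry can be sketched in a few lines. This is a minimal illustration of the general pattern, not BioMapper's actual implementation: the ACTION_REGISTRY dict below is a local stand-in, the duplicate-name check is an assumption, and DemoAction is a hypothetical class.

```python
from typing import Callable, Dict

# Local stand-in for a registry; BioMapper's real registry may differ.
ACTION_REGISTRY: Dict[str, type] = {}


def register_action(name: str) -> Callable[[type], type]:
    """Return a class decorator that stores the class under `name`."""
    def decorator(cls: type) -> type:
        if name in ACTION_REGISTRY:
            # Assumed behavior for this sketch: refuse duplicate names.
            raise ValueError(f"Action '{name}' is already registered")
        ACTION_REGISTRY[name] = cls
        return cls  # Return the class unchanged so it stays usable.
    return decorator


@register_action("DEMO_ACTION")
class DemoAction:
    """Hypothetical action used only for this illustration."""


# The decorator ran when the class statement executed, so the name is present.
print(sorted(ACTION_REGISTRY))
```

With this pattern, registration happens when the defining module is imported, so a module that is never imported contributes nothing to the registry.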
Step 6: Use in YAML Strategy
name: filter_example
description: Example using custom filter action
parameters:
  input_file: "/data/scores.csv"
  score_threshold: 0.75
steps:
  - name: load_data
    action:
      type: LOAD_DATASET_IDENTIFIERS
      params:
        file_path: "${parameters.input_file}"
        output_key: "raw_data"
  - name: filter_high_scores
    action:
      type: MY_ACTION
      params:
        input_key: "raw_data"
        threshold: "${parameters.score_threshold}"
        output_key: "high_scores"
        include_metadata: true
  - name: export_results
    action:
      type: EXPORT_DATASET_V2
      params:
        input_key: "high_scores"
        output_file: "/results/filtered.csv"
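The ${parameters.…} placeholders in the strategy refer to entries in the parameters block. How BioMapper resolves them internally is not covered here; the sketch below assumes plain substitution and uses a hypothetical helper, resolve_placeholders, purely to illustrate the idea.

```python
import re
from typing import Any, Dict

# Matches ${parameters.name}; the accepted name syntax is an assumption.
_PLACEHOLDER = re.compile(r"\$\{parameters\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value: Any, parameters: Dict[str, Any]) -> Any:
    """Recursively replace ${parameters.name} references in step params."""
    if isinstance(value, dict):
        return {k: resolve_placeholders(v, parameters) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(v, parameters) for v in value]
    if not isinstance(value, str):
        return value
    whole = _PLACEHOLDER.fullmatch(value)
    if whole:
        # A value that is exactly one placeholder keeps the parameter's type,
        # so a float threshold stays a float instead of becoming a string.
        return parameters[whole.group(1)]
    # Placeholders embedded in longer strings are substituted as text.
    return _PLACEHOLDER.sub(lambda m: str(parameters[m.group(1)]), value)


parameters = {"input_file": "/data/scores.csv", "score_threshold": 0.75}
step_params = {
    "input_key": "raw_data",
    "threshold": "${parameters.score_threshold}",
    "output_key": "high_scores",
}
resolved = resolve_placeholders(step_params, parameters)
print(resolved)
```

Keeping the original type matters for MY_ACTION because its threshold field is declared as a float with range constraints.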
Best Practices
1. Always Use TDD
- Write tests first
- Test edge cases
- Test error conditions
2. Parameter Validation
- Use Pydantic Field constraints
- Add custom validators for complex logic
- Provide clear descriptions
3. Error Handling
- Return ActionResult with success=False on errors
- Log errors with context
- Don't raise exceptions
4. Documentation
- Add docstrings with examples
- Document parameters clearly
- Include usage in docstring
5. Performance
- Process large datasets in chunks
- Use efficient data structures
- Consider memory usage
6. Testing Checklist
- ✅ Unit tests pass
- ✅ Parameter validation tested
- ✅ Error cases handled
- ✅ Integration with context tested
- ✅ Performance acceptable
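The error-handling guidance above (return a failed result, log with context, do not raise) can be sketched without any BioMapper imports. DemoResult is a hypothetical stand-in for ActionResult with the same three fields used in this guide, and filter_by_score is an illustrative function, not part of the library.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict

logger = logging.getLogger(__name__)


@dataclass
class DemoResult:
    """Hypothetical stand-in for ActionResult (success, message, data)."""
    success: bool
    message: str
    data: Dict[str, Any] = field(default_factory=dict)


def filter_by_score(context: Dict[str, Any], input_key: str, threshold: float) -> DemoResult:
    """Filter a dataset, reporting problems through the result object."""
    datasets = context.get("datasets", {})
    if input_key not in datasets:
        # Expected failure: report it without raising.
        return DemoResult(False, f"Input key '{input_key}' not found")
    try:
        kept = [row for row in datasets[input_key] if row.get("score", 0) >= threshold]
    except (AttributeError, TypeError) as exc:
        # Unexpected data shape: log with context, then return a failure.
        logger.error("Filtering '%s' failed: %s", input_key, exc)
        return DemoResult(False, f"Action failed: {exc}")
    return DemoResult(True, f"Kept {len(kept)} items", {"output_count": len(kept)})


ok = filter_by_score({"datasets": {"d": [{"score": 0.9}, {"score": 0.1}]}}, "d", 0.8)
missing = filter_by_score({"datasets": {}}, "nope", 0.8)
malformed = filter_by_score({"datasets": {"d": [{"score": "high"}]}}, "d", 0.8)
print(ok.success, missing.success, malformed.success)
```

The three calls correspond to the cases a test suite should cover: a normal run, a missing input key, and malformed data.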
Common Patterns
Reading from Context:
# Safe context access
datasets = context.get("datasets", {})
input_data = datasets.get(params.input_key, [])
Writing to Context:
# Ensure datasets exists
if "datasets" not in context:
    context["datasets"] = {}
context["datasets"][params.output_key] = result
Updating Statistics:
# Track metrics
if "statistics" not in context:
    context["statistics"] = {}
context["statistics"][self.__class__.__name__] = {
    "processed": len(data),
    "runtime": elapsed_time
}
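The elapsed_time value in the statistics snippet has to be measured by the action itself. One way to do that is with the standard library's time.perf_counter, sketched here; record_stats, the "DemoAction" key, and the sample workload are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Dict, List


def record_stats(context: Dict[str, Any], name: str, data: List[Any], started: float) -> None:
    """Store item count and wall-clock runtime under context['statistics'][name]."""
    elapsed_time = time.perf_counter() - started
    # setdefault creates the 'statistics' dict on first use.
    context.setdefault("statistics", {})[name] = {
        "processed": len(data),
        "runtime": elapsed_time,
    }


context: Dict[str, Any] = {}
started = time.perf_counter()
data = [x * x for x in range(1000)]  # Placeholder for the real work.
record_stats(context, "DemoAction", data, started)
print(context["statistics"]["DemoAction"]["processed"])
```

perf_counter is a monotonic clock intended for measuring intervals, which makes it a safer choice than time.time for runtimes.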
Chunked Processing:
from biomapper.core.utils import chunk_list

CHUNK_SIZE = 10000
results = []
for chunk in chunk_list(input_data, CHUNK_SIZE):
    chunk_result = process_chunk(chunk)
    results.extend(chunk_result)
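To see what a helper like chunk_list does, or to experiment outside the project, an equivalent can be written with the standard library. The version below is a stand-in written for this guide and may not match BioMapper's implementation; process_chunk is a placeholder for real per-chunk work.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_list(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def process_chunk(chunk: List[int]) -> List[int]:
    """Placeholder per-chunk work: keep even numbers."""
    return [value for value in chunk if value % 2 == 0]


results: List[int] = []
for chunk in chunk_list(list(range(10)), 4):
    results.extend(process_chunk(chunk))

print(results)
```

Because the helper is a generator, only one chunk is materialized at a time, which is the point of chunking large datasets.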
Debugging Tips
Enable Debug Logging:
import logging
logging.basicConfig(level=logging.DEBUG)
Test Locally First:
poetry run pytest tests/unit/core/strategy_actions/test_my_action.py -xvs
Use Print Debugging in Tests:
print(f"Context after action: {context}")
assert result.success
Check Action Registration:
from biomapper.actions.registry import ACTION_REGISTRY
print(ACTION_REGISTRY.keys())
Need Help?
- Check existing actions in biomapper/actions/
- Review tests in tests/unit/actions/
- See CLAUDE.md for AI assistance with development
Verification Sources
Last verified: 2025-08-17
This documentation was verified against the following project resources:
- /biomapper/src/actions/typed_base.py (TypedStrategyAction base class with execute_typed signature requiring StrategyExecutionContext)
- /biomapper/src/actions/registry.py (self-registering action system with @register_action decorator)
- /biomapper/src/core/models/action_results.py (ActionResult model for return values)
- /biomapper/src/core/models/execution_context.py (StrategyExecutionContext for typed context)
- /biomapper/src/actions/ (current action directory structure under src/)
- /biomapper/CLAUDE.md (action organization and development patterns)
- /biomapper/src/configs/strategies/ (YAML strategy examples and usage)