Anomaly Detection Rules
Automatically detect outliers and unusual patterns in your AI agents' behavior.
What are Anomalies?
Anomalies are individual data points or patterns that deviate significantly from expected behavior. Unlike drift (which is a gradual change), anomalies are point-in-time outliers that may indicate:
- Errors - Unexpected failures or exceptions
- Attacks - Prompt injection or abuse attempts
- Edge Cases - Unusual inputs the agent struggles with
- Resource Issues - Rate limits, timeouts, out-of-memory (OOM) errors
- Data Quality - Malformed or unexpected inputs
Creating Anomaly Rules via UI
Step 1: Navigate to Anomalies
Go to Controls → Anomalies in the sidebar.
Step 2: Create New Rule
Click Create Rule and configure:
- Name - Descriptive name (e.g., "Latency Spike Detector")
- Workflow - Select specific workflow or "All Workflows"
- Metric - Metric to monitor for anomalies
- Detection Method - Z-Score, IQR, or Isolation Forest
- Threshold - Anomaly score threshold
- Minimum Samples - Data points needed before detection
Step 3: Configure Alerts
- Severity - Warning or Critical
- Auto-create Incident - Toggle for automatic incidents
- Alert Channels - Select notification channels
Creating Anomaly Rules via API
create_anomaly_rule.py

```python
import requests

# Create an anomaly detection rule
response = requests.post(
    "https://api.turingpulse.ai/api/v1/config/anomaly-rules",
    headers={"Authorization": "Bearer sk_live_..."},
    json={
        "name": "Latency Spike Detector",
        "workflow_id": "customer-support",
        "metric": "latency_ms",
        "method": "zscore",        # Detection method
        "threshold": 3.0,          # Z-score threshold
        "window": "1h",            # Lookback window
        "min_samples": 50,         # Minimum samples
        "severity": "warning",
        "auto_create_incident": False,
        "alert_channels": ["slack://alerts"],
        "enabled": True,
    },
)
```

Detection Methods
Z-Score (zscore)
Measures how many standard deviations a value is from the mean. Best for normally distributed metrics.
- Threshold 2.0 - ~5% of data flagged (lenient)
- Threshold 3.0 - ~0.3% of data flagged (standard)
- Threshold 4.0 - ~0.01% of data flagged (strict)
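The z-score check itself is simple. Here is a minimal sketch of the flagging logic (standalone, not TuringPulse's internal implementation), using only the standard library:

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

latencies = [120, 115, 130, 125, 118, 122, 900]  # the 900 ms run is a spike
print(zscore_anomalies(latencies, threshold=2.0))  # → [900]
```

Note that an extreme outlier inflates the mean and standard deviation of its own baseline, which is one reason a lenient threshold like 2.0 can still be useful on small windows.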
Interquartile Range (iqr)
Uses quartiles to identify outliers. More robust to non-normal distributions.
- Threshold 1.5 - Standard outlier detection
- Threshold 3.0 - Extreme outliers only
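The IQR rule flags any value outside [Q1 − k·IQR, Q3 + k·IQR]. A standalone sketch (again, not TuringPulse's internal code):

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

token_counts = [410, 395, 430, 420, 405, 415, 400, 3200]
print(iqr_outliers(token_counts))  # → [3200]
```

Because quartiles ignore the magnitude of extreme points, a single huge value barely shifts the fences, which is what makes IQR robust on skewed metrics like token counts.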
Isolation Forest (isolation_forest)
Machine learning-based detection. Best for complex, multi-dimensional anomalies.
- Threshold 0.1 - ~10% flagged as anomalies
- Threshold 0.05 - ~5% flagged
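For intuition, here is how an isolation forest behaves on multi-dimensional run data, sketched with scikit-learn (an assumption — TuringPulse's internal model and features may differ). The `contamination` parameter plays the role of the threshold: the expected fraction of anomalies.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# 2-D feature vectors per run: (latency_ms, tokens)
rng = np.random.default_rng(42)
normal = rng.normal(loc=[120, 500], scale=[10, 50], size=(100, 2))
spike = np.array([[900, 4000]])  # one clearly anomalous run
X = np.vstack([normal, spike])

clf = IsolationForest(contamination=0.05, random_state=0)
labels = clf.fit_predict(X)  # -1 = anomaly, 1 = normal
print(labels[-1])  # → -1 (the spike is flagged)
```

Unlike z-score or IQR, this catches runs that are only unusual as a *combination* (e.g., low latency but enormous token count), at the cost of interpretability.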
| Method | Best For | Pros | Cons |
|---|---|---|---|
| zscore | Normal distributions | Simple, interpretable | Assumes normality |
| iqr | Skewed distributions | Robust to outliers | Less sensitive |
| isolation_forest | Complex patterns | Multi-dimensional | Less interpretable |
Anomaly Rule Configuration
| Option | Type | Description |
|---|---|---|
| name | str | Human-readable name |
| workflow_id | str | Workflow to monitor, or "*" for all |
| metric | str | Metric to monitor (latency_ms, tokens, cost, etc.) |
| method | str | zscore, iqr, or isolation_forest |
| threshold | float | Detection threshold (method-specific) |
| window | str | Lookback window for baseline (e.g., "1h", "24h") |
| min_samples | int | Minimum samples before detection |
| severity | str | warning or critical |
| auto_create_incident | bool | Create an incident on detection |
| alert_channels | list[str] | Notification channels to alert |
| enabled | bool | Whether the rule is active |
Viewing Anomalies
When anomalies are detected:
- Anomaly events appear in Operations → Overview → Anomalies tab
- Each anomaly shows the metric value, expected range, and severity
- Click on an anomaly to see the affected run details
- Notifications are sent to configured alert channels
Anomaly Event Details
- Metric - Which metric triggered the anomaly
- Value - The anomalous value
- Expected Range - Normal range based on baseline
- Anomaly Score - How anomalous (higher = more unusual)
- Run ID - Link to the affected run
- Timestamp - When the anomaly occurred
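Concretely, an anomaly event carries fields like the following — the names and values below are illustrative, mirroring the list above rather than an exact API schema:

```python
anomaly_event = {
    "metric": "latency_ms",             # which metric triggered
    "value": 4200,                      # the anomalous value
    "expected_range": [80, 450],        # normal range from the baseline window
    "anomaly_score": 4.7,               # higher = more unusual
    "run_id": "run_abc123",             # illustrative run identifier
    "timestamp": "2024-01-15T09:32:00Z",
}

lo, hi = anomaly_event["expected_range"]
print(anomaly_event["value"] < lo or anomaly_event["value"] > hi)  # → True
```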
Anomaly Clustering
TuringPulse automatically clusters related anomalies:
- Time-based - Anomalies occurring close together
- Metric-based - Same metric across workflows
- Cause-based - Similar root causes
💡 Incident Creation
When multiple anomalies are clustered, a single incident is created to avoid alert fatigue.
Best Practices
- Start with Z-Score - Simple and effective for most metrics. Switch to IQR or Isolation Forest if you see too many false positives.
- Set appropriate windows - Use 1h for real-time detection, 24h for more stable baselines.
- Tune thresholds - Start lenient and tighten based on false positive rates.
- Monitor multiple metrics - Create rules for latency, tokens, errors, and custom metrics.
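Following the last tip, rule creation is easy to script against the endpoint shown earlier. A sketch that builds one zscore rule payload per metric — the helper name and defaults here are illustrative, not part of the API:

```python
def make_rule(metric, threshold=3.0):
    """Build a zscore rule payload for one metric (fields as in the config table)."""
    return {
        "name": f"{metric} anomaly detector",
        "workflow_id": "*",          # all workflows
        "metric": metric,
        "method": "zscore",
        "threshold": threshold,
        "window": "24h",             # stable baseline
        "min_samples": 50,
        "severity": "warning",
        "auto_create_incident": False,
        "alert_channels": ["slack://alerts"],
        "enabled": True,
    }

payloads = [make_rule(m) for m in ("latency_ms", "tokens", "cost")]
# POST each payload to /api/v1/config/anomaly-rules, as in the earlier example.
```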