Anomaly Detection Rules
Automatically detect outliers and unusual patterns in your AI agents' behavior.
What are Anomalies?
Anomalies are individual data points or patterns that deviate significantly from expected behavior. Unlike drift (which is a gradual change), anomalies are point-in-time outliers that may indicate:
- Errors - Unexpected failures or exceptions
- Attacks - Prompt injection or abuse attempts
- Edge Cases - Unusual inputs the agent struggles with
- Resource Issues - Rate limits, timeouts, out-of-memory (OOM) errors
- Data Quality - Malformed or unexpected inputs
Creating Anomaly Rules via UI
Step 1: Navigate to Anomalies
Go to Controls → Anomalies in the sidebar.
Step 2: Create New Rule
Click Create Rule and configure:
- Name - Descriptive name (e.g., "Latency Spike Detector")
- Workflow - Select specific workflow or "All Workflows"
- Metric - Metric to monitor for anomalies
- Detection Method - Z-Score, IQR, or Isolation Forest
- Threshold - Anomaly score threshold
- Minimum Samples - Data points needed before detection
Step 3: Configure Alerts
- Severity - Warning or Critical
- Auto-create Incident - Toggle for automatic incidents
- Alert Channels - Select notification channels
Creating Anomaly Rules via API
create_anomaly_rule.py

```python
import requests

# Create an anomaly detection rule
response = requests.post(
    "https://api.turingpulse.ai/api/v1/config/anomaly-rules",
    headers={"Authorization": "Bearer sk_live_..."},
    json={
        "name": "Latency Spike Detector",
        "workflow_id": "customer-support",
        "metric": "latency_ms",
        "method": "zscore",        # Detection method
        "threshold": 3.0,          # Z-score threshold
        "window": "1h",            # Lookback window
        "min_samples": 50,         # Minimum samples
        "severity": "warning",
        "auto_create_incident": False,
        "alert_channels": ["slack://alerts"],
        "enabled": True,
    },
)
```

Detection Methods
Z-Score (zscore)
Measures how many standard deviations a value is from the mean. Best for normally distributed metrics.
- Threshold 2.0 - ~5% of data flagged (lenient)
- Threshold 3.0 - ~0.3% of data flagged (standard)
- Threshold 4.0 - ~0.01% of data flagged (strict)
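The z-score check itself is simple. Here is a minimal sketch of the flagging logic (standalone, not TuringPulse's internal implementation), using only the standard library:

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

latencies = [120, 115, 130, 125, 118, 122, 900]  # the 900 ms run is a spike
print(zscore_anomalies(latencies, threshold=2.0))  # → [900]
```

Note that an extreme outlier inflates the mean and standard deviation of its own baseline, which is one reason a lenient threshold like 2.0 can still be useful on small windows.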
Interquartile Range (iqr)
Uses quartiles to identify outliers. More robust to non-normal distributions.
- Threshold 1.5 - Standard outlier detection
- Threshold 3.0 - Extreme outliers only
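The IQR rule flags any value outside [Q1 − k·IQR, Q3 + k·IQR]. A standalone sketch (again, not TuringPulse's internal code):

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

token_counts = [410, 395, 430, 420, 405, 415, 400, 3200]
print(iqr_outliers(token_counts))  # → [3200]
```

Because quartiles ignore the magnitude of extreme points, a single huge value barely shifts the fences, which is what makes IQR robust on skewed metrics like token counts.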
Isolation Forest (isolation_forest)
Machine learning-based detection. Best for complex, multi-dimensional anomalies.
- Threshold 0.1 - ~10% flagged as anomalies
- Threshold 0.05 - ~5% flagged
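For intuition, here is how an isolation forest behaves on multi-dimensional run data, sketched with scikit-learn (an assumption — TuringPulse's internal model and features may differ). The `contamination` parameter plays the role of the threshold: the expected fraction of anomalies.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# 2-D feature vectors per run: (latency_ms, tokens)
rng = np.random.default_rng(42)
normal = rng.normal(loc=[120, 500], scale=[10, 50], size=(100, 2))
spike = np.array([[900, 4000]])  # one clearly anomalous run
X = np.vstack([normal, spike])

clf = IsolationForest(contamination=0.05, random_state=0)
labels = clf.fit_predict(X)  # -1 = anomaly, 1 = normal
print(labels[-1])  # → -1 (the spike is flagged)
```

Unlike z-score or IQR, this catches runs that are only unusual as a *combination* (e.g., low latency but enormous token count), at the cost of interpretability.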
| Method | Best For | Pros | Cons |
|---|---|---|---|
| zscore | Normal distributions | Simple, interpretable | Assumes normality |
| iqr | Skewed distributions | Robust to outliers | Less sensitive |
| isolation_forest | Complex patterns | Multi-dimensional | Less interpretable |
Anomaly Rule Configuration
| Option | Type | Description |
|---|---|---|
| name | str | Human-readable name |
| workflow_id | str | Workflow to monitor, or "*" for all |
| metric | str | Metric to monitor (latency_ms, tokens, cost, etc.) |
| method | str | zscore, iqr, or isolation_forest |
| threshold | float | Detection threshold (method-specific) |
| window | str | Lookback window for baseline (e.g., "1h", "24h") |
| min_samples | int | Minimum samples before detection |
| severity | str | warning or critical |
| auto_create_incident | bool | Create an incident on detection |
| alert_channels | list[str] | Notification channels to alert |
| enabled | bool | Whether the rule is active |
Viewing Anomalies
When anomalies are detected:
- Anomaly events appear in Operations → Overview → Anomalies tab
- Each anomaly shows the metric value, expected range, and severity
- Click on an anomaly to see the affected run details
- Notifications are sent to configured alert channels
Anomaly Event Details
- Metric - Which metric triggered the anomaly
- Value - The anomalous value
- Expected Range - Normal range based on baseline
- Anomaly Score - How anomalous (higher = more unusual)
- Run ID - Link to the affected run
- Timestamp - When the anomaly occurred
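Concretely, an anomaly event carries fields like the following — the names and values below are illustrative, mirroring the list above rather than an exact API schema:

```python
anomaly_event = {
    "metric": "latency_ms",             # which metric triggered
    "value": 4200,                      # the anomalous value
    "expected_range": [80, 450],        # normal range from the baseline window
    "anomaly_score": 4.7,               # higher = more unusual
    "run_id": "run_abc123",             # illustrative run identifier
    "timestamp": "2024-01-15T09:32:00Z",
}

lo, hi = anomaly_event["expected_range"]
print(anomaly_event["value"] < lo or anomaly_event["value"] > hi)  # → True
```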
Anomaly Clustering
TuringPulse automatically clusters related anomalies:
- Time-based - Anomalies occurring close together
- Metric-based - Same metric across workflows
- Cause-based - Similar root causes
💡 Incident Creation
When multiple anomalies are clustered, a single incident is created to avoid alert fatigue.
Best Practices
- Start with Z-Score - Simple and effective for most metrics. Switch to IQR or Isolation Forest if you see too many false positives.
- Set appropriate windows - Use 1h for real-time detection, 24h for more stable baselines.
- Tune thresholds - Start lenient and tighten based on false positive rates.
- Monitor multiple metrics - Create rules for latency, tokens, errors, and custom metrics.
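Following the last tip, rule creation is easy to script against the endpoint shown earlier. A sketch that builds one zscore rule payload per metric — the helper name and defaults here are illustrative, not part of the API:

```python
def make_rule(metric, threshold=3.0):
    """Build a zscore rule payload for one metric (fields as in the config table)."""
    return {
        "name": f"{metric} anomaly detector",
        "workflow_id": "*",          # all workflows
        "metric": metric,
        "method": "zscore",
        "threshold": threshold,
        "window": "24h",             # stable baseline
        "min_samples": 50,
        "severity": "warning",
        "auto_create_incident": False,
        "alert_channels": ["slack://alerts"],
        "enabled": True,
    }

payloads = [make_rule(m) for m in ("latency_ms", "tokens", "cost")]
# POST each payload to /api/v1/config/anomaly-rules, as in the earlier example.
```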