Mistral AI Integration
Full observability for the Mistral AI API. Capture chat completions, function calling, and streaming responses with automatic instrumentation.
Mistral SDK >= 0.1.0 · Mistral Large · Mixtral · Function Calling
Installation
Terminal

```shell
pip install turingpulse_sdk turingpulse_sdk_mistral mistralai
```

Quick Start
1. Initialize & Instrument
setup.py

```python
from turingpulse_sdk import init, TuringPulseConfig
from turingpulse_sdk_mistral import patch_mistral

# Initialize TuringPulse
init(TuringPulseConfig(
    api_key="sk_live_your_api_key",
    workflow_name="my-project",
))

# Enable auto-instrumentation for Mistral
patch_mistral()
```

2. Use Mistral Normally
main.py

```python
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

# Chat with Mistral Large - traces are captured automatically
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[
        {"role": "user", "content": "Explain how neural networks learn"},
    ],
)

print(response.choices[0].message.content)
```

ℹ️ Zero Code Changes
Once auto-instrumentation is enabled, all Mistral API calls, including chat completions and streaming responses, are traced automatically.
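Under the hood, auto-instrumentation of this kind usually works by wrapping the SDK's methods at runtime. Below is a minimal, library-agnostic sketch of the idea only; it is not TuringPulse's actual implementation, and `FakeChat`, `patch_method`, and the `events` list are purely illustrative:

```python
import functools
import time

def patch_method(cls, method_name, events):
    """Wrap a method so each call records its name and duration."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        events.append({
            "method": method_name,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })
        return result

    setattr(cls, method_name, wrapper)

# Stand-in for an SDK client class, used only to demonstrate the wrapping.
class FakeChat:
    def complete(self, model, messages):
        return f"reply to {messages[-1]['content']}"

events = []
patch_method(FakeChat, "complete", events)
reply = FakeChat().complete(model="m", messages=[{"role": "user", "content": "hi"}])
```

Because the wrapper preserves the original method's signature and return value, calling code does not change, which is why no edits to your application are needed.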
What Gets Captured
| Data Point | Description | Example |
|---|---|---|
| Chat Completions | Model, messages, and completion with metadata | mistral-large-latest, tokens: 380 |
| Function Calling | Tool definitions, function calls, and results | get_weather(location='Paris') |
| Streaming | Time-to-first-token and full streaming trace | ttfb: 180ms, total: 2100ms |
| Token Usage | Input and output token counts per call | prompt: 200, completion: 180 |
| Model Parameters | Temperature, top_p, max_tokens, and other settings | temp=0.7, top_p=0.95, max=1024 |
| Latency | End-to-end and per-call timing | total: 1600ms |
| Errors | API errors with status codes and retry information | RateLimitError: 429 Too Many Requests |
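As a rough illustration of how these data points might be consumed, the sketch below builds a record with hypothetical field names (not TuringPulse's actual trace schema) and checks it against the example alert thresholds used in the Advanced Configuration section:

```python
# Hypothetical shape of a captured trace record; field names are
# illustrative, not TuringPulse's real schema.
record = {
    "model": "mistral-large-latest",
    "prompt_tokens": 200,
    "completion_tokens": 180,
    "latency_ms": 1600,
    "ttft_ms": 180,
}

def total_tokens(rec):
    """Total token usage for one call: prompt plus completion tokens."""
    return rec["prompt_tokens"] + rec["completion_tokens"]

def breaches(rec, *, max_latency_ms=5000, max_tokens=4000):
    """Return which illustrative alert thresholds the record exceeds."""
    alerts = []
    if rec["latency_ms"] > max_latency_ms:
        alerts.append("latency_ms")
    if total_tokens(rec) > max_tokens:
        alerts.append("tokens")
    return alerts
```

For the example record above, `total_tokens` is 380 and no threshold is breached.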
Advanced Configuration
config.py

```python
from mistralai import Mistral
from turingpulse_sdk import instrument, KPIConfig
from turingpulse_sdk_mistral import patch_mistral

patch_mistral(name="mistral-service")
client = Mistral(api_key="your-mistral-key")

@instrument(
    name="mistral-agent",
    kpis=[
        KPIConfig(kpi_id="latency_ms", use_duration=True, alert_threshold=5000),
        KPIConfig(kpi_id="tokens", alert_threshold=4000, comparator="gt"),
    ],
)
def my_agent(query: str):
    return client.chat.complete(
        model="mistral-large-latest",
        messages=[{"role": "user", "content": query}],
    )
```
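Conceptually, `use_duration=True` times the decorated call and compares the elapsed time to `alert_threshold`. The following simplified stand-in illustrates that behavior only; it is not the real `@instrument` implementation, and `instrument_sketch` and `last_kpi` are invented names:

```python
import functools
import time

def instrument_sketch(name, latency_threshold_ms):
    """Illustrative stand-in for a duration KPI: times the wrapped
    function and flags calls that exceed the threshold."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            duration_ms = (time.perf_counter() - start) * 1000
            # Record the most recent measurement on the wrapper itself.
            wrapper.last_kpi = {
                "name": name,
                "latency_ms": duration_ms,
                "alert": duration_ms > latency_threshold_ms,
            }
            return result
        return wrapper
    return decorator

@instrument_sketch("mistral-agent", latency_threshold_ms=5000)
def my_agent(query):
    return f"answer: {query}"
```

In the real SDK the measurement is reported to TuringPulse rather than stored on the function, but the timing-and-threshold logic is the same idea.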
Function Calling

functions.py

```python
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The city name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    },
]

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)

# Function calls are automatically captured in the trace
```
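When the model decides to call a tool, the response carries the function name and JSON-encoded arguments (in the mistralai SDK, typically under `response.choices[0].message.tool_calls`). The sketch below shows one way to dispatch such a call to a local function; it uses a hard-coded arguments string in place of a live response, and `get_weather`, `TOOLS`, and `dispatch` are illustrative names:

```python
import json

def get_weather(city, unit="celsius"):
    # Stubbed result; a real implementation would call a weather API.
    return {"city": city, "unit": unit, "forecast": "sunny"}

# Local registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(name, arguments_json):
    """Decode a tool call's JSON arguments and invoke the matching function."""
    kwargs = json.loads(arguments_json)
    return TOOLS[name](**kwargs)

result = dispatch("get_weather", '{"city": "Paris"}')
```

The tool's result would then be appended to the conversation as a tool message for a follow-up completion; that round trip is also captured in the trace.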
Streaming

streaming.py

```python
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

# Streaming is automatically tracked
stream = client.chat.stream(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Write a poem about coding"}],
)

for chunk in stream:
    if chunk.data.choices[0].delta.content:
        print(chunk.data.choices[0].delta.content, end="")

# Trace captures time-to-first-token and full response
```
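If you want to sanity-check the timings TuringPulse reports, time-to-first-token can be measured by hand. A small sketch using a fake generator in place of a live stream (`consume_stream` and `fake_stream` are illustrative helpers, not part of any SDK):

```python
import time

def consume_stream(chunks):
    """Collect streamed text while recording time-to-first-token
    and total elapsed time, both in milliseconds."""
    start = time.perf_counter()
    ttft_ms = None
    parts = []
    for chunk in chunks:
        if ttft_ms is None:
            ttft_ms = (time.perf_counter() - start) * 1000
        parts.append(chunk)
    total_ms = (time.perf_counter() - start) * 1000
    return "".join(parts), ttft_ms, total_ms

def fake_stream():
    # Stand-in for the text deltas a real stream would yield.
    for token in ["Code ", "flows ", "like ", "verse."]:
        yield token

text, ttft_ms, total_ms = consume_stream(fake_stream())
```

With a real stream, you would pass each chunk's `delta.content` into the same loop; time-to-first-token is the gap between issuing the request and the first non-empty delta.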
💡 Model Comparison

TuringPulse tracks performance across Mistral Large, Mixtral, and other models, helping you compare cost, quality, and latency.