Mistral AI Integration
Full observability for the Mistral AI API. Capture chat completions, function calling, and streaming responses with automatic instrumentation.
Mistral SDK >= 0.1.0 · Mistral Large · Mixtral · Function Calling
Installation
Terminal

```shell
pip install turingpulse_sdk turingpulse_sdk_mistral mistralai
```

Quick Start
1. Initialize & Instrument
setup.py

```python
from turingpulse_sdk import init, TuringPulseConfig
from turingpulse_sdk_mistral import patch_mistral

# Initialize TuringPulse
init(TuringPulseConfig(
    api_key="sk_live_your_api_key",
    workflow_name="my-project",
))

# Enable auto-instrumentation for Mistral
patch_mistral()
```

2. Use Mistral Normally
main.py

```python
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

# Chat with Mistral Large - traces are captured automatically
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[
        {"role": "user", "content": "Explain how neural networks learn"},
    ],
)

print(response.choices[0].message.content)
```

ℹ️ Zero Code Changes
Once auto-instrumentation is enabled, all Mistral API calls, including chat completions and streaming responses, are traced automatically.
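Under the hood, auto-instrumentation of this kind usually works by wrapping the SDK's methods at runtime. Below is a minimal, library-agnostic sketch of the idea only; it is not TuringPulse's actual implementation, and `FakeChat`, `patch_method`, and the `events` list are purely illustrative:

```python
import functools
import time

def patch_method(cls, method_name, events):
    """Wrap a method so each call records its name and duration."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        events.append({
            "method": method_name,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })
        return result

    setattr(cls, method_name, wrapper)

# Stand-in for an SDK client class, used only to demonstrate the wrapping.
class FakeChat:
    def complete(self, model, messages):
        return f"reply to {messages[-1]['content']}"

events = []
patch_method(FakeChat, "complete", events)
reply = FakeChat().complete(model="m", messages=[{"role": "user", "content": "hi"}])
```

Because the wrapper preserves the original method's signature and return value, calling code does not change, which is why no edits to your application are needed.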
What Gets Captured
| Data Point | Description | Example |
|---|---|---|
| Chat Completions | Model, messages, and completion with metadata | mistral-large-latest, tokens: 380 |
| Function Calling | Tool definitions, function calls, and results | get_weather(location='Paris') |
| Streaming | Time-to-first-token and full streaming trace | ttfb: 180ms, total: 2100ms |
| Token Usage | Input and output token counts per call | prompt: 200, completion: 180 |
| Model Parameters | Temperature, top_p, max_tokens, and other settings | temp=0.7, top_p=0.95, max=1024 |
| Latency | End-to-end and per-call timing | total: 1600ms |
| Errors | API errors with status codes and retry information | RateLimitError: 429 Too Many Requests |
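As a rough illustration of how these data points might be consumed, the sketch below builds a record with hypothetical field names (not TuringPulse's actual trace schema) and checks it against the example alert thresholds used in the Advanced Configuration section:

```python
# Hypothetical shape of a captured trace record; field names are
# illustrative, not TuringPulse's real schema.
record = {
    "model": "mistral-large-latest",
    "prompt_tokens": 200,
    "completion_tokens": 180,
    "latency_ms": 1600,
    "ttft_ms": 180,
}

def total_tokens(rec):
    """Total token usage for one call: prompt plus completion tokens."""
    return rec["prompt_tokens"] + rec["completion_tokens"]

def breaches(rec, *, max_latency_ms=5000, max_tokens=4000):
    """Return which illustrative alert thresholds the record exceeds."""
    alerts = []
    if rec["latency_ms"] > max_latency_ms:
        alerts.append("latency_ms")
    if total_tokens(rec) > max_tokens:
        alerts.append("tokens")
    return alerts
```

For the example record above, `total_tokens` is 380 and no threshold is breached.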
Advanced Configuration
config.py

```python
from mistralai import Mistral
from turingpulse_sdk import instrument, KPIConfig
from turingpulse_sdk_mistral import patch_mistral

patch_mistral(name="mistral-service")
client = Mistral(api_key="your-mistral-key")

@instrument(
    name="mistral-agent",
    kpis=[
        KPIConfig(kpi_id="latency_ms", use_duration=True, alert_threshold=5000),
        KPIConfig(kpi_id="tokens", alert_threshold=4000, comparator="gt"),
    ],
)
def my_agent(query: str):
    return client.chat.complete(
        model="mistral-large-latest",
        messages=[{"role": "user", "content": query}],
    )
```
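Conceptually, `use_duration=True` times the decorated call and compares the elapsed time to `alert_threshold`. The following simplified stand-in illustrates that behavior only; it is not the real `@instrument` implementation, and `instrument_sketch` and `last_kpi` are invented names:

```python
import functools
import time

def instrument_sketch(name, latency_threshold_ms):
    """Illustrative stand-in for a duration KPI: times the wrapped
    function and flags calls that exceed the threshold."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            duration_ms = (time.perf_counter() - start) * 1000
            # Record the most recent measurement on the wrapper itself.
            wrapper.last_kpi = {
                "name": name,
                "latency_ms": duration_ms,
                "alert": duration_ms > latency_threshold_ms,
            }
            return result
        return wrapper
    return decorator

@instrument_sketch("mistral-agent", latency_threshold_ms=5000)
def my_agent(query):
    return f"answer: {query}"
```

In the real SDK the measurement is reported to TuringPulse rather than stored on the function, but the timing-and-threshold logic is the same idea.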
Function Calling

functions.py

```python
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "The city name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    },
]

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)

# Function calls are automatically captured in the trace
```
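When the model decides to call a tool, the response carries the function name and JSON-encoded arguments (in the mistralai SDK, typically under `response.choices[0].message.tool_calls`). The sketch below shows one way to dispatch such a call to a local function; it uses a hard-coded arguments string in place of a live response, and `get_weather`, `TOOLS`, and `dispatch` are illustrative names:

```python
import json

def get_weather(city, unit="celsius"):
    # Stubbed result; a real implementation would call a weather API.
    return {"city": city, "unit": unit, "forecast": "sunny"}

# Local registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(name, arguments_json):
    """Decode a tool call's JSON arguments and invoke the matching function."""
    kwargs = json.loads(arguments_json)
    return TOOLS[name](**kwargs)

result = dispatch("get_weather", '{"city": "Paris"}')
```

The tool's result would then be appended to the conversation as a tool message for a follow-up completion; that round trip is also captured in the trace.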
Streaming

streaming.py

```python
from mistralai import Mistral

client = Mistral(api_key="your-mistral-key")

# Streaming is automatically tracked
stream = client.chat.stream(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Write a poem about coding"}],
)

for chunk in stream:
    if chunk.data.choices[0].delta.content:
        print(chunk.data.choices[0].delta.content, end="")

# Trace captures time-to-first-token and full response
```
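If you want to sanity-check the timings TuringPulse reports, time-to-first-token can be measured by hand. A small sketch using a fake generator in place of a live stream (`consume_stream` and `fake_stream` are illustrative helpers, not part of any SDK):

```python
import time

def consume_stream(chunks):
    """Collect streamed text while recording time-to-first-token
    and total elapsed time, both in milliseconds."""
    start = time.perf_counter()
    ttft_ms = None
    parts = []
    for chunk in chunks:
        if ttft_ms is None:
            ttft_ms = (time.perf_counter() - start) * 1000
        parts.append(chunk)
    total_ms = (time.perf_counter() - start) * 1000
    return "".join(parts), ttft_ms, total_ms

def fake_stream():
    # Stand-in for the text deltas a real stream would yield.
    for token in ["Code ", "flows ", "like ", "verse."]:
        yield token

text, ttft_ms, total_ms = consume_stream(fake_stream())
```

With a real stream, you would pass each chunk's `delta.content` into the same loop; time-to-first-token is the gap between issuing the request and the first non-empty delta.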
💡 Model Comparison

TuringPulse tracks performance across Mistral Large, Mixtral, and other models, helping you compare cost, quality, and latency.