Cohere Integration
Full observability for the Cohere API. Capture chat, embed, and rerank calls with tool use tracking and automatic instrumentation.
Cohere SDK >= 5.0 · Command R+ · Embed v3 · Rerank
Installation
Terminal
pip install turingpulse_sdk turingpulse_sdk_cohere cohere
Quick Start
1. Initialize & Instrument
setup.py
from turingpulse_sdk import init, TuringPulseConfig
from turingpulse_sdk_cohere import patch_cohere
# Initialize TuringPulse
init(TuringPulseConfig(
    api_key="sk_live_your_api_key",
    workflow_name="my-project",
))
# Enable auto-instrumentation for Cohere
patch_cohere()
2. Use Cohere Normally
main.py
import cohere
client = cohere.ClientV2(api_key="your-cohere-key")
# Chat with Command R+ - traces are captured automatically
response = client.chat(
    model="command-r-plus",
    messages=[
        {"role": "user", "content": "Explain the theory of relativity"},
    ],
)
print(response.message.content[0].text)
Zero Code Changes
Once auto-instrumentation is enabled, all Cohere API calls, including chat, embed, and rerank, are traced automatically.
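Streaming chat is consumed normally as well (we assume streamed calls are instrumented like regular ones). A small helper, ours rather than part of either SDK, that accumulates text from Cohere V2 `content-delta` stream events:

```python
def collect_stream_text(events):
    # Accumulate text deltas from a Cohere V2 chat stream into one string.
    chunks = []
    for event in events:
        if event.type == "content-delta":
            chunks.append(event.delta.message.content.text)
    return "".join(chunks)

# Hypothetical usage once patch_cohere() is active:
# stream = client.chat_stream(
#     model="command-r-plus",
#     messages=[{"role": "user", "content": "Hello"}],
# )
# print(collect_stream_text(stream))
```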
What Gets Captured
| Data Point | Description | Example |
|---|---|---|
| Chat Calls | Model, messages, and completion with metadata | command-r-plus, tokens: 420 |
| Embed Calls | Embedding model, input texts, and dimensions | embed-v3.0, 10 texts, 1024 dims |
| Rerank Calls | Query, documents, and relevance scores | rerank-v3.0, 25 docs, top=5 |
| Tool Use | Tool calls with arguments and results | search_db(query='revenue Q4') |
| Token Usage | Input and output token counts per call | prompt: 280, completion: 140 |
| Latency | End-to-end and per-call timing | total: 1800ms, chat: 1500ms |
| Errors | API errors with status codes and context | TooManyRequestsError: rate limited |
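Errors recorded on the trace still propagate to your code. For transient failures like the rate-limit example above, a small retry wrapper keeps the failed attempts visible in TuringPulse while your application recovers. This is a generic sketch; the helper name is ours, and in practice you would catch the Cohere SDK's specific rate-limit error rather than a bare `Exception`:

```python
import time

def call_with_retry(fn, retries=3, backoff=1.0):
    # Retry a callable with exponential backoff; errors raised on the
    # final attempt propagate, so instrumentation still records them.
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (2 ** attempt))

# Hypothetical usage:
# response = call_with_retry(
#     lambda: client.chat(model="command-r-plus", messages=messages)
# )
```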
Advanced Configuration
config.py
from turingpulse_sdk import instrument, KPIConfig
from turingpulse_sdk_cohere import patch_cohere
import cohere
patch_cohere(name="cohere-service")
client = cohere.ClientV2(api_key="your-cohere-key")
@instrument(
    name="cohere-agent",
    kpis=[
        KPIConfig(kpi_id="latency_ms", use_duration=True, alert_threshold=5000),
        KPIConfig(kpi_id="tokens", alert_threshold=4000, comparator="gt"),
    ],
)
def my_agent(query: str):
    return client.chat(
        model="command-r-plus",
        messages=[{"role": "user", "content": query}],
    )
Tool Use
tools.py
import cohere
client = cohere.ClientV2(api_key="your-cohere-key")
tools = [
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Query the sales database",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "SQL query to execute"},
                },
                "required": ["query"],
            },
        },
    },
]
response = client.chat(
    model="command-r-plus",
    messages=[{"role": "user", "content": "What were our Q4 sales?"}],
    tools=tools,
)
# Tool calls are automatically captured in the trace
Embeddings & Rerank
embed-rerank.py
import cohere
client = cohere.ClientV2(api_key="your-cohere-key")
# Embeddings - batch size and dimensions are tracked
embed_response = client.embed(
    texts=["Hello world", "Machine learning is fascinating"],
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float"],
)
# Rerank - query, documents, and scores are tracked
rerank_response = client.rerank(
    query="What is machine learning?",
    documents=["ML is a subset of AI...", "Deep learning uses neural networks..."],
    model="rerank-english-v3.0",
    top_n=3,
)
# Both embed and rerank calls are captured with full metadata
RAG Pipeline Tracking
Combine Cohere embed, rerank, and chat tracing to get full visibility into your RAG pipeline performance.
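A minimal sketch of such a pipeline, built from the calls shown above. The `rag_answer` helper and its prompt format are ours, not part of either SDK, and a production pipeline would first select candidate documents via an embed-backed vector search; once `patch_cohere()` is active, each stage is traced individually, giving per-stage latency and token usage:

```python
def rag_answer(client, query, documents, top_n=3, model="command-r-plus"):
    # Stage 1: rerank candidate documents by relevance to the query.
    reranked = client.rerank(
        query=query,
        documents=documents,
        model="rerank-english-v3.0",
        top_n=top_n,
    )
    # Each rerank result carries the index of the original document.
    context = "\n".join(documents[r.index] for r in reranked.results)
    # Stage 2: answer the question grounded in the top-ranked context.
    response = client.chat(
        model=model,
        messages=[{
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {query}",
        }],
    )
    return response.message.content[0].text
```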