Cohere Integration

Full observability for the Cohere API. Capture chat, embed, and rerank calls with tool use tracking and automatic instrumentation.

Cohere SDK >= 5.0 | Command R+ | Embed v3 | Rerank

Installation

Terminal
pip install turingpulse_sdk turingpulse_sdk_cohere cohere

Quick Start

1. Initialize & Instrument

setup.py
from turingpulse_sdk import init, TuringPulseConfig
from turingpulse_sdk_cohere import patch_cohere

# Initialize TuringPulse
init(TuringPulseConfig(
    api_key="sk_live_your_api_key",
    workflow_name="my-project",
))

# Enable auto-instrumentation for Cohere
patch_cohere()

2. Use Cohere Normally

main.py
import cohere

client = cohere.ClientV2(api_key="your-cohere-key")

# Chat with Command R+ - traces are captured automatically
response = client.chat(
    model="command-r-plus",
    messages=[
        {"role": "user", "content": "Explain the theory of relativity"},
    ],
)
print(response.message.content[0].text)
ℹ️
Zero Code Changes
Once auto-instrumentation is enabled, all Cohere API calls, including chat, embed, and rerank, are traced automatically.
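
To give a feel for what auto-instrumentation of this kind typically involves, here is a minimal sketch of method patching in plain Python. It is not TuringPulse's implementation: `FakeClient`, `patch_method`, and the `TRACES` list are hypothetical names used only for illustration. The sketch wraps a method so that each call records its name, duration, and any raised error, while the calling code stays unchanged.

```python
import functools
import time

TRACES = []  # hypothetical in-memory trace sink


class FakeClient:
    """Stand-in for an API client; this is not the Cohere SDK."""

    def chat(self, model, messages):
        if not messages:
            raise ValueError("messages must not be empty")
        return {"model": model, "text": "ok"}


def patch_method(cls, method_name):
    """Replace cls.method_name with a wrapper that records one trace per call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return original(self, *args, **kwargs)
        except Exception as exc:
            # Record the error type, then let the exception propagate.
            error = type(exc).__name__
            raise
        finally:
            TRACES.append({
                "call": method_name,
                "duration_ms": (time.perf_counter() - start) * 1000,
                "error": error,
            })

    setattr(cls, method_name, wrapper)


patch_method(FakeClient, "chat")

client = FakeClient()
client.chat(model="demo-model", messages=[{"role": "user", "content": "hi"}])
print(TRACES[0]["call"], TRACES[0]["error"])
```

Because the wrapper is installed on the class, every instance created before or after patching goes through it, which is why application code needs no changes in this sketch.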

What Gets Captured

Data Point   | Description                                    | Example
Chat Calls   | Model, messages, and completion with metadata  | command-r-plus, tokens: 420
Embed Calls  | Embedding model, input texts, and dimensions   | embed-v3.0, 10 texts, 1024 dims
Rerank Calls | Query, documents, and relevance scores         | rerank-v3.0, 25 docs, top=5
Tool Use     | Tool calls with arguments and results          | search_db(query='revenue Q4')
Token Usage  | Input and output token counts per call         | prompt: 280, completion: 140
Latency      | End-to-end and per-call timing                 | total: 1800ms, chat: 1500ms
Errors       | API errors with status codes and context       | TooManyRequestsError: rate limited

Advanced Configuration

config.py
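import cohere
from turingpulse_sdk import instrument, KPIConfig
from turingpulse_sdk_cohere import patch_cohere

patch_cohere(name="cohere-service")

# Cohere client used by the instrumented function below
client = cohere.ClientV2(api_key="your-cohere-key")

@instrument(
    name="cohere-agent",
    kpis=[
        KPIConfig(kpi_id="latency_ms", use_duration=True, alert_threshold=5000),
        KPIConfig(kpi_id="tokens", alert_threshold=4000, comparator="gt"),
    ],
)
def my_agent(query: str):
    # ClientV2.chat expects a list of messages, as in the Quick Start example
    return client.chat(
        model="command-r-plus",
        messages=[{"role": "user", "content": query}],
    )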

Tool Use

tools.py
import cohere

client = cohere.ClientV2(api_key="your-cohere-key")

tools = [
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Query the sales database",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "SQL query to execute"},
                },
                "required": ["query"],
            },
        },
    },
]

response = client.chat(
    model="command-r-plus",
    messages=[{"role": "user", "content": "What were our Q4 sales?"}],
    tools=tools,
)

# Tool calls are automatically captured in the trace
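
The request above only asks the model which tools to call; running them is left to the application. The following is a minimal sketch of that dispatch step, under two assumptions: each entry in `response.message.tool_calls` exposes `id`, `function.name`, and `function.arguments`, and the arguments arrive as a JSON-encoded string. `TOOL_REGISTRY`, `run_tool_calls`, and the body of `query_database` are hypothetical, and a `SimpleNamespace` stand-in replaces the real response so the example runs offline.

```python
import json
from types import SimpleNamespace


def query_database(query: str) -> dict:
    """Hypothetical local implementation of the `query_database` tool."""
    # Placeholder result; a real implementation would execute the query.
    return {"query": query, "rows": []}


# Maps tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"query_database": query_database}


def run_tool_calls(tool_calls):
    """Execute each requested tool and return (tool_call_id, result) pairs."""
    results = []
    for call in tool_calls or []:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            raise KeyError(f"unknown tool: {call.function.name}")
        # Assumption: arguments are a JSON-encoded string.
        kwargs = json.loads(call.function.arguments)
        results.append((call.id, func(**kwargs)))
    return results


# Stand-in for response.message.tool_calls so the sketch runs without an API key.
fake_tool_calls = [
    SimpleNamespace(
        id="call_0",
        function=SimpleNamespace(
            name="query_database",
            arguments='{"query": "SELECT SUM(amount) FROM sales"}',
        ),
    )
]

for call_id, result in run_tool_calls(fake_tool_calls):
    print(call_id, result)
```

In a real application you would pass `response.message.tool_calls` instead of `fake_tool_calls` and send the results back to the model in a follow-up `client.chat` call.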

Embeddings & Rerank

embed-rerank.py
import cohere

client = cohere.ClientV2(api_key="your-cohere-key")

# Embeddings - batch size and dimensions are tracked
embed_response = client.embed(
    texts=["Hello world", "Machine learning is fascinating"],
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float"],
)

# Rerank - query, documents, and scores are tracked
rerank_response = client.rerank(
    query="What is machine learning?",
    documents=["ML is a subset of AI...", "Deep learning uses neural networks..."],
    model="rerank-english-v3.0",
    top_n=3,
)

# Both embed and rerank calls are captured with full metadata
💡
RAG Pipeline Tracking
Combine Cohere embed, rerank, and chat tracing to get full visibility into your RAG pipeline performance.
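
As a rough illustration of the rerank-then-chat part of such a pipeline, the sketch below reorders candidate documents and passes the top results to the chat model as context. `answer_with_rerank` and `StubClient` are hypothetical helpers, not part of either SDK. The stub mimics only the response fields the function reads (`results[i].index` and `message.content[0].text`) so the example runs offline; with a patched `cohere.ClientV2` passed in instead, both calls would be expected to appear in the trace.

```python
from types import SimpleNamespace


def answer_with_rerank(client, question, documents, top_n=2):
    """Rerank documents, then ask the chat model using the top results."""
    reranked = client.rerank(
        query=question,
        documents=documents,
        model="rerank-english-v3.0",
        top_n=top_n,
    )
    # Each rerank result refers back to its input document by index.
    context = [documents[r.index] for r in reranked.results]
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    response = client.chat(
        model="command-r-plus",
        messages=[{"role": "user", "content": prompt}],
    )
    return context, response.message.content[0].text


class StubClient:
    """Offline stand-in that mimics only the response fields used above."""

    def rerank(self, query, documents, model, top_n):
        # Pretend later documents are more relevant.
        order = list(range(len(documents)))[::-1][:top_n]
        return SimpleNamespace(
            results=[SimpleNamespace(index=i, relevance_score=1.0) for i in order]
        )

    def chat(self, model, messages):
        text = f"stub answer based on {len(messages[0]['content'])} characters"
        return SimpleNamespace(
            message=SimpleNamespace(content=[SimpleNamespace(text=text)])
        )


docs = ["ML is a subset of AI...", "Deep learning uses neural networks..."]
context, answer = answer_with_rerank(StubClient(), "What is machine learning?", docs)
print(context)
print(answer)
```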

Next Steps