SDK Quickstart

Build with Canary in minutes

Transparent, evals-first AI monitoring. Framework-agnostic — works with OpenAI, Anthropic, Cohere, or any LLM. Three API calls to catch hallucinations before your users do.

1. Install

Add the Canary SDK to your project:

bash
# npm
npm install canary-ai

# yarn
yarn add canary-ai

# or use the REST API directly — no SDK required

2. Connect

Initialize the client with your API key. Get yours at canary.ai/signup — takes 30 seconds.

javascript
import { Canary } from 'canary-ai';

// Initialize with your API key
const canary = new Canary({
  apiKey: 'can_your_key_here',
});

// Register your LLM provider (one-time setup)
const monitor = await canary.connect({
  provider: 'openai',
  model:    'gpt-4o',
  name:     'production',
});

// monitor.id — use this to evaluate responses
💡 Framework-agnostic. Works with any LLM provider — OpenAI, Anthropic, Cohere, Mistral, or a self-hosted model. Just pass the provider name you want to track.
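
For example, monitoring a second provider is just another connect() call with a different provider name (a sketch; the model and monitor names below are illustrative, not taken from Canary's docs):

javascript
// Illustrative: track an Anthropic model alongside the OpenAI monitor above
const anthropicMonitor = await canary.connect({
  provider: 'anthropic',
  model:    'claude-3-5-sonnet',  // illustrative model name
  name:     'anthropic-production',
});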

3. Evaluate

Call evaluate() after every LLM response. Canary runs dual-layer scoring — heuristics catch obvious failures instantly, then the AI layer scores coherence, factuality, and relevance.

javascript
// Wrap your LLM call
const llmResponse = await openai.chat.completions.create({
  model:    'gpt-4o',
  messages: [{ role: 'user', content: userInput }],
});

const output = llmResponse.choices[0].message.content;

// Evaluate the response
const result = await canary.evaluate({
  input:  userInput,
  output: output,
});

// Act on the result
if (!result.passed) {
  console.warn('Quality issue detected:', result.flags);
  // fallback logic here
}

Example Response
{
  "passed":    true,
  "score":     94,
  "flags":     [],
  "breakdown": {
    "coherence":   97,
    "factuality":  91,
    "relevance":   95
  }
}

When a hallucination is detected:

Hallucination Detected
{
  "passed":    false,
  "score":     19,
  "flags":     ["likely_hallucination", "overconfidence"],
  "breakdown": {
    "coherence":   45,
    "factuality":   8,
    "relevance":   22
  }
}

Authentication

All API requests require an API key passed in the X-API-Key header. Keys are prefixed with can_.

bash — curl example
curl -X POST https://canary.polsia.app/api/monitor/evaluate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: can_your_key_here" \
  -d '{ "input": "What is 2+2?", "output": "4" }'

REST API Reference

POST /api/monitor/connect

Register a new LLM monitor. Returns an API key for evaluations.

Parameter   Type     Required    Description
provider    string   required    LLM provider (e.g., openai, anthropic, cohere)
model       string   optional    Model name (e.g., gpt-4o). Defaults to "default".
name        string   optional    Friendly name for this monitor (e.g., production)
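
A minimal sketch of calling this endpoint directly (Node 18+ global fetch; field values are placeholders):

javascript
// Register a monitor over REST instead of through the SDK
const res = await fetch('https://canary.polsia.app/api/monitor/connect', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'can_your_key_here',
  },
  body: JSON.stringify({
    provider: 'openai',
    model:    'gpt-4o',
    name:     'production',
  }),
});
const monitor = await res.json();  // includes the API key used for evaluations
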
POST /api/monitor/evaluate

Evaluate an LLM response. Requires X-API-Key header.

Parameter   Type     Required    Description
input       string   required    The user prompt or input sent to the LLM
output      string   required    The LLM's response to evaluate
context     string   optional    System prompt or context provided to the LLM
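
The curl call under Authentication covers the required fields; here is a fetch sketch that also passes the optional context field (values are illustrative):

javascript
// Evaluate a response over REST, including the optional system-prompt context
const res = await fetch('https://canary.polsia.app/api/monitor/evaluate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'can_your_key_here',
  },
  body: JSON.stringify({
    input:   'What is 2+2?',
    output:  '4',
    context: 'You are a concise math tutor.',  // illustrative system prompt
  }),
});
const result = await res.json();  // { passed, score, flags, breakdown }
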
GET /api/monitor/status

Get quality metrics for your monitor. Returns rolling score, hallucination count, and recent alerts. Requires X-API-Key header.
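
A quick sketch of polling this endpoint (the exact response fields beyond those listed above are not spelled out here):

javascript
// Fetch rolling quality metrics for this monitor
const res = await fetch('https://canary.polsia.app/api/monitor/status', {
  headers: { 'X-API-Key': 'can_your_key_here' },
});
const status = await res.json();
console.log(status);  // rolling score, hallucination count, recent alerts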

Response Format

All evaluation responses include:

  • passed — boolean, overall pass/fail
  • score — 0–100 quality score
  • flags — array of detected issues (e.g., likely_hallucination, identity_leak, repetition, overconfidence, refusal)
  • breakdown — per-dimension scores (coherence, factuality, relevance)
🐤 Threshold: Responses with score ≥ 70 pass by default. Any flagged anomaly (likely_hallucination, identity_leak) auto-fails regardless of score.
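
Canary applies these rules server-side and reports the outcome in result.passed, so you normally don't re-check them yourself. If you want the decision logic spelled out, a sketch of the documented defaults:

javascript
// Mirrors the documented defaults: score >= 70 passes, anomaly flags auto-fail.
// Illustrative only; rely on result.passed in real code.
const AUTO_FAIL_FLAGS = ['likely_hallucination', 'identity_leak'];

function passesDefaults(result) {
  if (result.flags.some((flag) => AUTO_FAIL_FLAGS.includes(flag))) return false;
  return result.score >= 70;
}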

Error Codes

Status   Meaning
400      Missing required parameters
401      Invalid or missing API key
429      Rate limit exceeded
500      Internal evaluation error
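
When calling the REST API directly, these map onto ordinary HTTP status handling; a brief sketch (the retry suggestion is an assumption, not part of the API):

javascript
// Basic status-code handling for direct REST calls
const res = await fetch('https://canary.polsia.app/api/monitor/evaluate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'X-API-Key': 'can_your_key_here' },
  body: JSON.stringify({ input: userInput, output }),
});

if (res.status === 429) {
  // Rate limited: back off and retry (retry strategy is up to you)
} else if (res.status === 401) {
  throw new Error('Invalid or missing API key');
} else if (!res.ok) {
  throw new Error(`Evaluation failed with status ${res.status}`);
}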

Ready to stop babysitting your LLM?

Free to start. No credit card. API key in 30 seconds.

Get your API key →