April 23rd, 2026
New · Digital Humans

DTMF tool for Digital Humans

Digital humans can now be configured to send DTMF tones during a call via a new allow_dtmf_tool field, which is stored and returned the same way as allow_end_call_tool and allow_silence_tool. Create and bulk-create default to allow_dtmf_tool: true when the field is omitted; updates treat it as optional (omit it to leave the value unchanged). The OpenAPI schemas DigitalHumanRequestData, DigitalHumanResponseData, and UpdateDigitalHumanRequest include the new property.
Create digital human · Update digital human
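The default and update rules above can be modeled locally. The following is a minimal sketch, not Bluejay's server code: the helper names and the name field are hypothetical, and only allow_dtmf_tool and its default of true come from this entry.

```python
from typing import Any, Dict

# Hypothetical local model of the documented semantics; not Bluejay's implementation.
CREATE_DEFAULTS: Dict[str, Any] = {"allow_dtmf_tool": True}


def apply_create_defaults(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a create payload with omitted fields set to their defaults."""
    return {**CREATE_DEFAULTS, **payload}


def apply_update(existing: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the stored record, changing only fields present in the patch."""
    return {**existing, **patch}


# "name" is an illustrative field, assumed for this example only.
created = apply_create_defaults({"name": "Caller A"})
updated = apply_update(created, {"name": "Caller B"})         # allow_dtmf_tool left unchanged
disabled = apply_update(updated, {"allow_dtmf_tool": False})  # explicit opt-out
```

Under these assumptions, created and updated both carry allow_dtmf_tool set to true, and only the explicit patch turns it off.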
April 23rd, 2026
New · Simulations

Runs per Digital Human

Simulations now support executing multiple runs for each selected digital human in a single batch. Set runs_per_digital_human on the simulation (persisted in experiments.settings) to control the default, or pass an explicit per-run override when queuing a run. Values must be positive integers; when unset, each digital human runs once. The Bluejay dashboard exposes the new control on the simulation settings page and in the “Create simulation” and “Create new run” dialogs.
Simulations overview
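The precedence described above (per-run override, then simulation default, then a single run) can be sketched as follows. The function names and the digital human IDs are hypothetical; only runs_per_digital_human and the positive-integer rule come from this entry.

```python
from typing import List, Optional, Tuple


def resolve_runs(simulation_default: Optional[int], override: Optional[int] = None) -> int:
    """Pick the run count: per-run override, else simulation default, else 1."""
    value = override if override is not None else simulation_default
    if value is None:
        return 1
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError("runs_per_digital_human must be a positive integer")
    return value


def expand_batch(digital_human_ids: List[str], runs: int) -> List[Tuple[str, int]]:
    """List one (digital_human_id, run_index) pair per run in the batch."""
    return [(dh_id, i) for dh_id in digital_human_ids for i in range(1, runs + 1)]


# Two hypothetical digital humans with a simulation default of 3 runs each.
batch = expand_batch(["dh_1", "dh_2"], resolve_runs(simulation_default=3))
```

Here batch holds six entries, three per digital human.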
April 23rd, 2026
Improvement · Digital Humans

Digital Human intelligence upgrade

We upgraded the reasoning quality of digital humans during both simulations and observability replays. Expect more coherent multi-turn behavior, better adherence to persona and objectives, and steadier tool-use decisions across longer conversations. No configuration changes are required; the upgrade applies automatically to all digital humans.
Digital humans overview
April 23rd, 2026
Improvement · Workflows

Scripted silence in workflows

When a digital human is running in workflow mode, the silence tool now evaluates only at end-of-turn instead of firing from timeout-based checks mid-turn. This removes a class of false silence triggers on scripted workflow steps and keeps silence decisions aligned with the workflow’s turn boundaries. Non-workflow digital humans are unchanged.
Workflows overview
April 23rd, 2026
Performance · Workflows

Workflow latency fix

Fixed a latency regression in workflow-mode conversations where session-handler state could delay turn transitions. Workflow runs now advance between agent and user turns without the extra wait, improving perceived responsiveness on branching paths.
Workflows overview
April 14th, 2026
Improvement · API

Digital Human: silence tool fields

Digital human create, read, update, delete, and bulk APIs now expose allow_silence_tool and silence_tool_instructions, which are stored and returned the same way as allow_end_call_tool and hangup_instructions. Create and bulk-create default to allow_silence_tool: false and silence_tool_instructions: "default" when omitted; updates treat both fields as optional (omit a field to leave it unchanged). The OpenAPI schemas DigitalHumanRequestData, DigitalHumanResponseData, and UpdateDigitalHumanRequest include the new properties.
Create digital human · Update digital human
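Because omitted fields are left unchanged on update, a client can build its update body from only the values it wants to change. This is a minimal client-side sketch; the helper name is hypothetical, and it deliberately stops short of sending a request, since no endpoint URL is given here.

```python
import json
from typing import Any, Dict, Optional


def build_update_body(
    allow_silence_tool: Optional[bool] = None,
    silence_tool_instructions: Optional[str] = None,
) -> Dict[str, Any]:
    """Include only the fields the caller set, so omitted ones stay unchanged."""
    candidates = {
        "allow_silence_tool": allow_silence_tool,
        "silence_tool_instructions": silence_tool_instructions,
    }
    # Compare against None rather than truthiness so an explicit False is kept.
    return {key: value for key, value in candidates.items() if value is not None}


body = build_update_body(allow_silence_tool=True)
print(json.dumps(body))
```

Filtering on None rather than on truthiness matters here: passing allow_silence_tool=False must still reach the API as an explicit change.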
April 3rd, 2026
Improvement · API

Redesigned Workflows

Workflows are a structured way to test your agent along a conversation path.
  • Agent and user turns: you define what the agent should say or do (so simulations can check it) and what the digital human says on each step.
  • Branching: options nodes capture the different things the caller might do next (speech, DTMF, silence).
  • Coverage: each distinct path through the graph becomes its own digital human, so every branch gets exercised.
  • Docs: the cookbook and API reference are refreshed; older workflow endpoints are listed under Deprecated.
Create workflow · Cookbook
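The coverage rule above, one digital human per distinct path, amounts to enumerating every start-to-leaf path in the workflow graph. The sketch below assumes a simple adjacency-list representation with hypothetical node names; it illustrates the idea and is not Bluejay's workflow format or algorithm.

```python
from typing import Dict, List

# Hypothetical workflow graph: node -> next nodes. Leaves have no successors.
Graph = Dict[str, List[str]]


def enumerate_paths(graph: Graph, start: str) -> List[List[str]]:
    """Return every path from start to a leaf, using an iterative depth-first search."""
    paths: List[List[str]] = []
    stack: List[List[str]] = [[start]]
    while stack:
        path = stack.pop()
        successors = graph.get(path[-1], [])
        if not successors:
            paths.append(path)
            continue
        # Push in reverse so paths come out in the order branches were declared.
        for nxt in reversed(successors):
            if nxt in path:
                raise ValueError(f"cycle detected at {nxt!r}")
            stack.append(path + [nxt])
    return paths


workflow: Graph = {
    "greeting": ["options"],
    "options": ["speech", "dtmf", "silence"],
    "speech": ["confirm"],
    "dtmf": ["confirm"],
    "silence": [],
    "confirm": [],
}
paths = enumerate_paths(workflow, "greeting")
```

In this example the options node fans out three ways, so paths contains three entries, each of which would correspond to one digital human.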
March 22nd, 2026
Improvement · Docs

Revamped documentation

We completely overhauled the Bluejay docs with improved navigation, expanded guides, and new content across the board. Highlights include:
  • Restructured navigation with dedicated tabs for Documentation, API Reference, Bluejay University, and Changelog
  • Bluejay University — a new learning track with guided lessons on simulations, observability, metrics, and the API
  • Expanded integration guides covering all supported providers and simulation transports
  • Cookbook recipes for common workflows like GitHub Actions CI, API-driven evaluations, and webhook setup
Explore the docs
March 20th, 2026
New · Simulations

SMS simulation support

Bluejay now supports SMS-based simulations, enabling you to test text-based agent flows end-to-end. Configure SMS simulations the same way you configure voice simulations: define digital humans, set up custom metrics, and run batch evaluations against your SMS agent. SMS simulations support all existing integrations, including telephony providers and HTTP webhooks.
Read the docs
March 15th, 2026
New · Alerts

Threshold alarms

Set threshold-based alarms on any custom metric to get alerted when agent performance degrades. Define upper or lower bounds, choose your notification channel (Slack, email, or webhook), and Bluejay will trigger alerts automatically when production metrics cross your thresholds. Alarms work across both observability and simulation metrics, so you can catch regressions in production and in testing.
Read the docs
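The bound check behind a threshold alarm can be sketched in a few lines. The Threshold class is hypothetical, and treating a value exactly on the bound as not crossed is an assumption; this entry does not say whether bounds are inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Hypothetical alarm definition: either bound may be left unset."""

    lower: Optional[float] = None
    upper: Optional[float] = None

    def breached(self, value: float) -> bool:
        """True when the value is below the lower bound or above the upper bound."""
        if self.lower is not None and value < self.lower:
            return True
        if self.upper is not None and value > self.upper:
            return True
        return False


# Assumed example: alert when a metric score drops below 0.8.
alarm = Threshold(lower=0.8)
should_alert = alarm.breached(0.72)
```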
March 14th, 2026
Performance · Observability

Faster call log evaluation

We made significant performance improvements to the observability evaluation pipeline:
  • 3x faster evaluation for call logs with custom metrics
  • Parallel metric execution — multiple custom metrics now evaluate concurrently instead of sequentially
  • Reduced API latency for the /evaluate and /re-evaluate endpoints by approximately 40%
These improvements apply automatically to all existing observability configurations.
Read the docs
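To illustrate the difference between sequential and concurrent metric execution, here is a minimal sketch using Python's standard thread pool. The metric functions are stand-ins invented for the example; how Bluejay actually schedules metric evaluation is not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Metric = Callable[[str], float]


def evaluate_concurrently(transcript: str, metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Run every metric on the same transcript in a thread pool and collect scores by name."""
    if not metrics:
        return {}
    with ThreadPoolExecutor(max_workers=len(metrics)) as pool:
        # Submit all metrics first so they run side by side, then gather results.
        futures = {name: pool.submit(fn, transcript) for name, fn in metrics.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-in metrics; real custom metrics would typically call out to an evaluator.
metrics: Dict[str, Metric] = {
    "word_count": lambda text: float(len(text.split())),
    "question_count": lambda text: float(text.count("?")),
}
results = evaluate_concurrently("Hi there. Can you help?", metrics)
```

Threads suit this shape of work when each metric spends most of its time waiting on I/O, such as a network call to an evaluator.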
March 10th, 2026
New · Integration

Miro integration for simulation visualization

You can now connect Bluejay to Miro to automatically generate visual conversation flow diagrams from your simulation results. Each simulation run produces a Miro board showing the conversation tree, branching paths, and metric outcomes. Connect your Miro workspace from the Integrations page in your Bluejay dashboard.
Read the docs
March 3rd, 2026
Improvement · Workflows

Workflow scheduling improvements

Workflows now support cron-based scheduling with finer granularity. You can schedule simulation runs and observability evaluations to execute at specific intervals: hourly, daily, or on a custom cron expression.
Additional improvements include:
  • Retry logic for failed workflow steps
  • Execution history with detailed step-level logs
  • Webhook notifications on workflow completion or failure
Read the docs
February 24th, 2026
New · Simulations

Community-based simulation runs

You can now run simulations against an entire community of digital humans in a single batch. Previously, simulations ran against individual digital humans or manually selected groups. Community-based runs let you test your agent against a diverse, pre-configured population in one click. Combine communities with custom metrics to get aggregate performance scores across demographic segments, persona types, or behavioral profiles.
Read the docs
February 17th, 2026
Improvement · Metrics

Custom metric formulas

Custom metrics now support formula-based definitions in addition to LLM-as-a-Judge evaluations. Define metrics using arithmetic expressions over existing metric scores, enabling composite metrics like weighted averages or pass/fail thresholds without writing evaluation prompts. Formula metrics evaluate instantly and do not consume LLM credits.
Read the docs
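As an illustration of the arithmetic a formula metric expresses, the sketch below computes a weighted average over existing scores and applies a pass/fail threshold. The metric names, weights, and threshold are made up for the example, and this is plain Python rather than Bluejay's formula syntax, which is not shown here.

```python
from typing import Dict


def weighted_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine existing metric scores into one composite score."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


def passes(score: float, threshold: float) -> bool:
    """Pass/fail rule: the composite must meet or exceed the threshold."""
    return score >= threshold


# Hypothetical metric names and weights.
scores = {"accuracy": 0.9, "empathy": 0.7, "resolution": 0.8}
weights = {"accuracy": 2.0, "empathy": 1.0, "resolution": 1.0}
composite = weighted_average(scores, weights)  # (1.8 + 0.7 + 0.8) / 4, about 0.825
passed = passes(composite, threshold=0.8)
```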
February 10th, 2026
New · Integration

Pipecat integration

Bluejay now integrates natively with Pipecat for running simulations against Pipecat-powered voice agents. Connect your Pipecat pipeline endpoint and Bluejay will handle session orchestration, audio transport, and evaluation.
Read the docs
February 3rd, 2026
New · API

Knowledge base versioning API

The new Knowledge Base API lets you manage versioned knowledge base snapshots for your agents. Create versions, apply labels, and roll back to previous versions, all through the API. Knowledge base versions integrate with simulations so you can A/B test agent behavior across different knowledge configurations.
Read the docs
January 27th, 2026
Improvement · Dashboard

Dashboard redesign

We redesigned the Bluejay dashboard with a focus on surfacing actionable insights. The new layout includes:
  • At-a-glance health scores for each agent across simulation and production metrics
  • Trend sparklines showing metric performance over time
  • Alert badges highlighting agents that need attention
  • Quick-launch actions for running simulations and viewing recent call logs
The redesign is live for all users.
Read the docs
January 20th, 2026
New · Integration

ElevenLabs observability integration

Bluejay now supports direct observability integration with ElevenLabs Conversational AI. Connect your ElevenLabs account to automatically ingest call logs, evaluate them with custom metrics, and surface quality issues in your dashboard.
Read the docs
January 13th, 2026
Improvement · Simulations

Digital human generation improvements

The digital human generation engine has been upgraded with better persona diversity and more realistic conversational styles:
  • Expanded trait library with 40+ new customer traits including emotional tone, technical proficiency, and communication preferences
  • Scenario-aware generation — digital humans now adapt their behavior based on the simulation scenario context
  • Bulk generation — generate up to 100 digital humans in a single API call
Read the docs
January 6th, 2026
New · Integration

Slack alerting integration

Connect Bluejay to Slack to receive real-time alerts when production metrics drop below thresholds or simulation runs complete. Configure per-channel routing so the right team gets the right alerts.
Read the docs
December 16th, 2025
New · Simulations

WebSocket simulation support

Bluejay now supports WebSocket-based simulations for testing real-time, bidirectional agent communication. Configure your WebSocket endpoint, define the message protocol, and run simulations with full transcript capture and metric evaluation.
Read the docs
December 9th, 2025
Improvement · API

Prompt versioning and labels

The Prompt API now supports versioning and labeling. Create multiple versions of a prompt, tag them with labels like production or staging, and reference them by label in your agent configuration. Roll back to any previous version instantly.
Read the docs
December 2nd, 2025
New · Observability

Webhook-based log ingestion

You can now send call logs to Bluejay via webhook for evaluation. Configure a webhook endpoint in your dashboard, point your agent platform at it, and Bluejay will automatically ingest, evaluate, and store the results. This is the fastest way to get observability running if your platform isn’t covered by a native integration.
Read the docs
November 25th, 2025
New · Integration

LiveKit simulation integration

Run simulations against LiveKit-powered voice agents. Bluejay connects to your LiveKit room, manages participant sessions, and captures full audio transcripts for evaluation.
Read the docs
November 18th, 2025
New · Metrics

Metrics Lab

Introducing Metrics Lab, an interactive environment for prototyping and testing custom metrics before deploying them. Write evaluation prompts, test them against sample transcripts, and iterate on scoring criteria without affecting production data.
Read the docs
November 11th, 2025
Improvement · Dashboard

Folder-based agent organization

Agents can now be organized into folders for better workspace management. Create folders, move agents between them, and filter your agent list by folder. Folders are available in both the dashboard and the API.
Read the docs
November 4th, 2025
New · Integration

Retell observability integration

Bluejay now integrates directly with Retell for production call monitoring. Connect your Retell account to automatically pull call logs, run evaluations, and track agent performance over time.
Read the docs
October 21st, 2025
New · Integration

Vapi observability integration

Bluejay now integrates with Vapi for production call monitoring. Connect your Vapi account to automatically ingest call logs, run custom metric evaluations, and track agent quality over time.
Read the docs
October 7th, 2025
New · Integration

Bland observability integration

You can now connect Bluejay to Bland for production observability. Call logs from your Bland-powered agents are automatically ingested, evaluated against your custom metrics, and surfaced in the dashboard.
Read the docs
September 29th, 2025
New · API

Communities and workflows API endpoints

New API endpoint groups for managing communities and workflows:
  • Communities — create, update, add members, list, and delete communities programmatically
  • Workflows — define, schedule, and manage automation workflows through the API
Read the docs
September 22nd, 2025
New · Integration

SIP simulation integration

Bluejay now supports SIP-based simulations. Connect your SIP trunk and run simulations directly over the SIP protocol, enabling testing for enterprise telephony deployments and contact center agents.
Read the docs
September 15th, 2025
New · Integration

Telephony simulation support

Run simulations over PSTN by connecting your telephony provider to Bluejay. Dial into your agent’s phone number, capture the full conversation, and evaluate it with custom metrics, all without changing your agent’s infrastructure.
Read the docs
September 1st, 2025
New · API

Observability and evaluation API endpoints

New API endpoint groups for observability and evaluation workflows:
  • Observability — evaluate and re-evaluate call logs, manage call log lifecycle
  • Custom Metrics — create, bulk-create, update, list, and delete custom metrics via API
Read the docs
August 18th, 2025
New · API

Digital humans and simulation runs API endpoints

New API endpoint groups for simulation orchestration:
  • Digital Humans — create, generate, update, list, and manage digital human personas
  • Simulation Runs — queue voice and SMS runs, retrieve results, and manage active conversations
Read the docs
August 4th, 2025
New · API

Agents and simulations API endpoints

The first set of public API endpoints is now available:
  • Agents — create, update, list, move, and delete agents
  • Simulations — create, configure, list, and manage simulations programmatically
These endpoints form the foundation of the Bluejay API and enable full automation of your testing pipeline.
Read the docs
July 21st, 2025
New · Simulations

Digital human personas

Introducing digital humans: synthetic customer personas that power Bluejay simulations. Define demographic profiles, personality traits, communication styles, and scenario-specific behaviors to create realistic test conversations at scale.
Read the docs
July 7th, 2025
New · Simulations

Simulation engine

The Bluejay simulation engine is live. Run synthetic conversations against your voice agents to validate behavior before production. Define scenarios, assign digital humans, and evaluate performance with custom metrics.
Read the docs
June 23rd, 2025
New · Observability

Observability pipeline

Bluejay’s observability pipeline is now available. Ingest production call logs, evaluate them against custom metrics, and surface quality trends in the dashboard. Supports both API-based and webhook-based log ingestion.
Read the docs
June 9th, 2025
New · Metrics

Custom metrics engine

Define custom evaluation criteria tailored to your use case. Bluejay’s custom metrics engine supports LLM-as-a-Judge evaluations with configurable scoring rubrics, pass/fail thresholds, and dynamic variables that adapt to conversation context.
Read the docs
May 19th, 2025
New · Dashboard

Agent management and dashboard

The Bluejay dashboard is live. Create and manage your conversational AI agents, view performance summaries, and navigate your workspace from a central hub.
Read the docs
April 28th, 2025
New · Platform

Evaluation framework

Bluejay’s evaluation framework is ready. Score agent conversations using structured rubrics, capture per-turn and per-call metrics, and generate evaluation reports. This framework underpins both simulation testing and production observability.
April 7th, 2025
New · Platform

Core platform infrastructure

The foundational Bluejay platform is up and running — authentication, workspace management, and the base API layer. This milestone sets the stage for all product features to follow.