Run Bluejay simulations in GitHub Actions and fail CI if your agent doesn’t meet quality standards. Just like unit tests, but for AI agents.
Key capabilities:
  • 🚨 Fail CI if score drops below your threshold
  • 🔍 Automatically test every PR and commit
  • ⚡ Zero-install; runs on GitHub-hosted runners
  • 🎯 Override prompts, knowledge bases, and digital humans per run

Before Starting

You’ll need:
  1. Bluejay API Key – Get yours from the Bluejay dashboard
  2. Simulation ID – Create a simulation in Bluejay first with your test scenarios and digital humans. The simulation defines what conversations your agent will be tested on.

Quick Start

1

Add your API key and variables

Go to: Settings → Secrets and variables → Actions
Add a Secret:
  • Click New repository secret
  • Name: BLUEJAY_API_KEY
  • Value: Your API key from the developers page
Add Variables:
  • Click the Variables tab
  • Click New repository variable
  • Add the following:
| Variable Name | Value | Required |
| --- | --- | --- |
| BLUEJAY_SIMULATION_ID | Your simulation ID (e.g., sim_12345) | ✅ Yes |
| BLUEJAY_MIN_SCORE | Minimum passing score (e.g., 80) | No |
| BLUEJAY_PROMPT_ID | Prompt override ID | No |
| BLUEJAY_KB_ID | Knowledge base override ID | No |
| BLUEJAY_DIGITAL_HUMAN_IDS | Comma-separated Digital Human IDs | No |
| BLUEJAY_PHONE_NUMBER | Phone number override | No |
| BLUEJAY_SIP_URI | SIP URI override | No |
2

Create your workflow

Add .github/workflows/bluejay-tests.yml to your repo:
name: Agent Tests

on:
  workflow_dispatch:
    inputs:
      simulation_id:
        description: 'Bluejay Simulation ID to run'
        required: false
        type: string
      prompt_id:
        description: 'Optional Prompt ID override'
        required: false
        type: string
      knowledge_base_id:
        description: 'Optional Knowledge Base ID override'
        required: false
        type: string
      digital_human_ids:
        description: 'Comma-separated Digital Human IDs (e.g. "dh_1,dh_2")'
        required: false
        type: string
      phone_number:
        description: 'Optional phone number to use for this run'
        required: false
        type: string
      sip_uri:
        description: 'Optional SIP URI to use for this run'
        required: false
        type: string
      min_score:
        description: 'Minimum required score (0–100)'
        required: false
        type: string
        default: '80'
  push:
    branches: [main]
  pull_request:
    types: [opened, synchronize]

jobs:
  run-bluejay-tests:
    runs-on: ubuntu-latest

    steps:
      - name: Run Bluejay Tests
        uses: bluejay-ai-dev/bluejay-github-actions@v1
        with:
          api_key: ${{ secrets.BLUEJAY_API_KEY }}
          # Required: simulation id (manual input OR repo variable)
          simulation_id: ${{ inputs.simulation_id || vars.BLUEJAY_SIMULATION_ID }}
          # Optional overrides (manual input OR repo variable)
          prompt_id: ${{ inputs.prompt_id || vars.BLUEJAY_PROMPT_ID }}
          knowledge_base_id: ${{ inputs.knowledge_base_id || vars.BLUEJAY_KB_ID }}
          digital_human_ids: ${{ inputs.digital_human_ids || vars.BLUEJAY_DIGITAL_HUMAN_IDS }}
          phone_number: ${{ inputs.phone_number || vars.BLUEJAY_PHONE_NUMBER }}
          sip_uri: ${{ inputs.sip_uri || vars.BLUEJAY_SIP_URI }}
          # Behavior controls
          wait_for_results: 'true'
          min_score: ${{ inputs.min_score || vars.BLUEJAY_MIN_SCORE || '80' }}
          poll_interval_seconds: '10'
          timeout_seconds: '1500'
3

Trigger a simulation

Make changes to your codebase and open a pull request. With the workflow above, Bluejay tests run automatically whenever a pull request is opened or updated, and on every push to main.
4

Monitor your simulation

Open the Actions tab in your GitHub repository to follow the simulation run in real time. The status and score appear once the simulation completes.

Inputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| api_key | ✅ Yes | | Your Bluejay API key. |
| simulation_id | ✅ Yes | | ID of the simulation to run. |
| prompt_id | No | | Override prompt for this run. |
| knowledge_base_id | No | | Override knowledge base for this run. |
| digital_human_ids | No | | Comma-separated list of Digital Human IDs. |
| phone_number | No | | Phone number override for the run. |
| sip_uri | No | | SIP URI override for the run. |
| wait_for_results | No | true | Wait for simulation to finish. |
| min_score | No | 80 | Required overall score (0–100). |
| poll_interval_seconds | No | 10 | Polling frequency in seconds. |
| timeout_seconds | No | 1500 | Timeout (25 minutes). |
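
As a rough sketch of how the behavior inputs combine, the workflow below raises the passing threshold and allows a longer run than the defaults. The values 90, 30, and 3600 are illustrative assumptions, not recommendations from Bluejay. The job-level timeout-minutes key is a standard GitHub Actions setting, added here only as a backstop.

```yaml
name: Agent Tests (strict threshold)

on:
  push:
    branches: [main]

jobs:
  run-bluejay-tests:
    runs-on: ubuntu-latest
    # GitHub-level backstop, slightly longer than the action's own timeout (assumed value)
    timeout-minutes: 65
    steps:
      - name: Run Bluejay Tests
        uses: bluejay-ai-dev/bluejay-github-actions@v1
        with:
          api_key: ${{ secrets.BLUEJAY_API_KEY }}
          simulation_id: ${{ vars.BLUEJAY_SIMULATION_ID }}
          wait_for_results: 'true'
          min_score: '90'               # illustrative: stricter than the default of 80
          poll_interval_seconds: '30'   # illustrative: poll less often than the default 10 s
          timeout_seconds: '3600'       # illustrative: 60 minutes instead of the default 25
```

All values are quoted because the action's inputs are passed as strings, matching the Quick Start workflow.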

Outputs

| Output | Description |
| --- | --- |
| simulation-run-id | The ID of the queued simulation run. |
| final-status | Final simulation status: completed, failed, cancelled, etc. |
| score | Overall numeric score from the simulation. |
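
One way to consume these outputs is to give the Bluejay step an id and read the values in a later step. The sketch below writes them to the job summary. The step id bluejay and the summary layout are hypothetical choices, and it assumes the action still sets its outputs when the score falls below the threshold, which this page does not confirm.

```yaml
name: Agent Tests (with summary)

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  run-bluejay-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Run Bluejay Tests
        id: bluejay   # hypothetical id, used below to reference the outputs
        uses: bluejay-ai-dev/bluejay-github-actions@v1
        with:
          api_key: ${{ secrets.BLUEJAY_API_KEY }}
          simulation_id: ${{ vars.BLUEJAY_SIMULATION_ID }}

      - name: Summarize results
        # Run even if the previous step failed the score threshold
        if: always()
        env:
          RUN_ID: ${{ steps.bluejay.outputs.simulation-run-id }}
          FINAL_STATUS: ${{ steps.bluejay.outputs.final-status }}
          SCORE: ${{ steps.bluejay.outputs.score }}
        run: |
          {
            echo "### Bluejay simulation"
            echo "- Run ID: ${RUN_ID}"
            echo "- Status: ${FINAL_STATUS}"
            echo "- Score: ${SCORE}"
          } >> "$GITHUB_STEP_SUMMARY"
```

Passing the outputs through env rather than interpolating them directly into the script keeps the shell step safe if a value ever contains unexpected characters.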

Advanced Usage

Customize When Tests Run

You can customize when your Bluejay tests run by modifying the on: section of your workflow.
For example, to run tests only on pushes to specific branches:
name: Agent Tests

on:
  push:
    branches:
      - main
      - production
      - staging

jobs:
  run-bluejay-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Run Bluejay Tests
        uses: bluejay-ai-dev/bluejay-github-actions@v1
        with:
          api_key: ${{ secrets.BLUEJAY_API_KEY }}
          simulation_id: ${{ vars.BLUEJAY_SIMULATION_ID }}
          wait_for_results: 'true'
          min_score: '80'
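
Another pattern, sketched here under assumptions, limits pull request runs to changes in agent-related files and adds a nightly run. The paths agent/** and prompts/** and the 06:00 UTC schedule are hypothetical; substitute the directories and timing that fit your repository.

```yaml
name: Agent Tests

on:
  pull_request:
    types: [opened, synchronize]
    paths:
      - 'agent/**'      # hypothetical: directory containing agent code
      - 'prompts/**'    # hypothetical: directory containing prompt files
  schedule:
    - cron: '0 6 * * *' # every day at 06:00 UTC

jobs:
  run-bluejay-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Run Bluejay Tests
        uses: bluejay-ai-dev/bluejay-github-actions@v1
        with:
          api_key: ${{ secrets.BLUEJAY_API_KEY }}
          simulation_id: ${{ vars.BLUEJAY_SIMULATION_ID }}
          wait_for_results: 'true'
          min_score: '80'
```

Scheduled runs execute against the latest commit on the default branch. Repository secrets are not passed to workflows triggered by pull requests from forks, so BLUEJAY_API_KEY will be empty for those runs.
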
Want more control? For a complete list of events and advanced trigger configurations, see the GitHub Actions documentation on workflow triggers.