Error Reference Guide

This document catalogs errors you may encounter when using the Neon agent evaluation platform, organized by component. Each entry includes the error message, what causes it, and how to resolve it.


CLI Errors

The Neon CLI (agent-eval) uses exit code 0 for success and 1 for any failure. User cancellations raise typer.Abort().

Exit Codes

Exit Code | Meaning
0 | Success, or all test cases passed
1 | Error occurred, or one or more test cases failed

File & Validation Errors

Error Message | Cause | Resolution
File not found: {file} | The specified suite file does not exist | Check the file path and ensure the YAML file exists
Suite must be a YAML file: {suite} | A non-YAML file was provided as a suite | Provide a .yaml or .yml file
Suite not found: {name} | No suite with the given name exists | Run agent-eval suite list to see available suites
Empty or invalid YAML file: {path} | The YAML file is empty or cannot be parsed | Check that the file has valid YAML content
YAML syntax error: {details} | The YAML file has a syntax error | Fix the YAML syntax (check indentation, colons, quotes)
Invalid suite file: {errors} | The suite file fails schema validation | Review the validation errors and fix the suite definition
Validation failed with N error(s) | One or more fields fail validation | Review the listed errors and correct the suite file

Agent & Execution Errors

Error Message | Cause | Resolution
Agent is required in local mode. Use --agent <module:function> | No agent was specified for local execution | Provide the --agent flag with a module:function path
Failed to load agent: {details} | The agent module or function could not be imported | Verify the module path exists and the function is exported
Failed to load suite: {details} | The suite could not be loaded | Check the suite file path and contents
Run not found: {run_id} | The specified run ID does not exist | Verify the run ID with agent-eval run list
sqlite3 is required for local mode | The sqlite3 module is missing | Install Python with sqlite3 support (usually built-in)
mlflow is required for local mode. Install with: pip install mlflow>=3.7 | MLflow dependency is missing | Run pip install "mlflow>=3.7" (quote the specifier so the shell does not treat >= as a redirect)

Authentication Errors

Error Message | Cause | Resolution
Not authenticated | No API credentials are configured | Run agent-eval auth login to authenticate
Failed to verify credentials: {details} | Credential verification failed | Re-authenticate with agent-eval auth login
Failed to revoke key: {key_id} | API key revocation request failed | Verify the key ID and try again
Unknown action: {action} | An invalid auth action was provided | Use one of the supported auth actions

Comparison Errors

Error Message | Cause | Resolution
Not enough runs to compare | Fewer than two runs are available | Complete at least two runs before comparing
Not enough local runs to compare | Fewer than two local runs exist | Run evaluations locally at least twice
Failed to compare runs | The comparison operation failed | Check that both run IDs are valid and have results

Warnings (Non-Fatal)

These are displayed in yellow and do not cause the CLI to exit with an error:

Warning | Meaning
Directory already initialized: {path} | The suites directory already exists
No suites found | No suite files exist in the suites directory
No API keys found | No API keys are configured
No local runs found | No local run results exist
No runs found | No runs are available

API Errors

The Neon API returns JSON error responses with an error field and an HTTP status code.

Error Response Format

{
  "error": "Human-readable error description",
  "details": "Additional context (on 500/503 errors)",
  "hint": "Suggestion for fixing the issue (on 401/403 errors)",
  "status": 400
}
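
Client code usually wants to turn this body into a single log line. The sketch below types the documented fields and formats them; the names NeonApiErrorBody, isNeonApiErrorBody, and formatApiError are hypothetical and not part of the SDK, and treating every field except error as optional is an assumption based on the notes in the example above.

```typescript
// Shape of an error body, based on the documented example.
// Only `error` is assumed to be always present.
interface NeonApiErrorBody {
  error: string;
  details?: string; // documented for 500/503 responses
  hint?: string; // documented for 401/403 responses
  status?: number;
}

// Type guard: true when a parsed JSON value carries a string `error` field.
function isNeonApiErrorBody(value: unknown): value is NeonApiErrorBody {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { error?: unknown }).error === "string"
  );
}

// Collapse the body into one log-friendly line.
function formatApiError(httpStatus: number, body: unknown): string {
  if (!isNeonApiErrorBody(body)) {
    return `HTTP ${httpStatus}: unrecognized error body`;
  }
  const parts = [`HTTP ${httpStatus}: ${body.error}`];
  if (body.details) parts.push(`details: ${body.details}`);
  if (body.hint) parts.push(`hint: ${body.hint}`);
  return parts.join(" | ");
}
```

Passing the HTTP status separately keeps the formatter usable even when the body omits its own status field.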

400 Bad Request

Returned when the request is missing required fields or contains invalid data.

Error Message | Endpoint | Cause
Workspace context required | Multiple | No workspace_id provided via header or query parameter
name is required | POST /api/suites, POST /api/prompts | The name field is missing from the request body
type must be "text" or "chat" | POST /api/prompts | Invalid prompt type
template is required for text prompts | POST /api/prompts | Text prompt missing template
messages are required for chat prompts | POST /api/prompts | Chat prompt missing messages
Prompt is required | POST /api/feedback/comparisons | Missing prompt field
Feedback type is required | POST /api/feedback | Missing feedback type
Preference data is required for preference feedback | POST /api/feedback | Preference feedback missing data
Correction data is required for correction feedback | POST /api/feedback | Correction feedback missing data
trace_id is required | POST /api/scores | Missing trace_id
baseline_run_id is required | POST /api/compare | Missing baseline run ID
candidate_run_id is required | POST /api/compare | Missing candidate run ID
baseline_run_id and candidate_run_id must be different | POST /api/compare | Same ID used for both runs
agentId is required | POST /api/runs | Missing agent ID for eval run
dataset.items is required and must not be empty | POST /api/runs | Empty or missing dataset
scorers is required and must not be empty | POST /api/runs | No scorers specified
Invalid suite ID format | /api/suites/[id] | Suite ID is not valid
Invalid action (Action must be one of: pause, resume, cancel) | POST /api/runs/[id]/control | Unrecognized control action
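
Many of these 400 responses can be caught before the request is sent. As an illustration, the sketch below pre-checks a POST /api/compare body against the three documented messages. CompareRequest and checkCompareRequest are hypothetical names, and the order in which the checks run is an assumption; the server may evaluate them differently.

```typescript
// Request body for POST /api/compare, limited to the two documented fields.
interface CompareRequest {
  baseline_run_id?: string;
  candidate_run_id?: string;
}

// Returns the documented 400 message the body would likely trigger,
// or null when the body passes all three checks.
function checkCompareRequest(req: CompareRequest): string | null {
  if (!req.baseline_run_id) return "baseline_run_id is required";
  if (!req.candidate_run_id) return "candidate_run_id is required";
  if (req.baseline_run_id === req.candidate_run_id) {
    return "baseline_run_id and candidate_run_id must be different";
  }
  return null;
}
```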

401 Unauthorized

Returned when authentication is missing or invalid.

Error Message | Details | Resolution
Unauthorized | Valid authentication required | Provide a valid Authorization: Bearer <token> or X-API-Key header

Supported authentication methods:

  • JWT Bearer Token: Authorization: Bearer <token>
  • API Key: X-API-Key: ae_<env>_<key>
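
The two methods map onto request headers as sketched below. NeonCredentials and buildAuthHeaders are hypothetical helpers, not SDK exports. The check for the ae_ prefix is an assumption derived from the key format shown above, and the optional X-Workspace-Id header is the one named in the 403 section.

```typescript
// Either a JWT or an API key; the discriminant field `kind` is illustrative.
type NeonCredentials =
  | { kind: "jwt"; token: string }
  | { kind: "apiKey"; key: string };

// Build the headers for one request. Only the documented header names are used.
function buildAuthHeaders(
  creds: NeonCredentials,
  workspaceId?: string
): Record<string, string> {
  const headers: Record<string, string> = {};
  if (creds.kind === "jwt") {
    headers["Authorization"] = `Bearer ${creds.token}`;
  } else {
    // Assumption: keys follow ae_<env>_<key>, so only the prefix is checked.
    if (!creds.key.startsWith("ae_")) {
      throw new Error('API key is expected to start with "ae_"');
    }
    headers["X-API-Key"] = creds.key;
  }
  if (workspaceId) headers["X-Workspace-Id"] = workspaceId;
  return headers;
}
```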

403 Forbidden

Returned when the authenticated user lacks permission for the requested operation.

Error Message | Details | Resolution
Forbidden | Workspace context required for this operation | Provide workspace_id via header (X-Workspace-Id) or query parameter
Forbidden | Missing permission: {permission} | The API key or user does not have the required permission

404 Not Found

Returned when the requested resource does not exist. For security, 404 is also returned when the user lacks access to the resource (to prevent enumeration).

Error Message | Endpoint
Suite not found | /api/suites/[id]
Eval run not found | /api/runs/[id]
Span not found | /api/spans/[id]
Prompt "{id}" not found | /api/prompts/[id]
Baseline run {id} not found | POST /api/compare
Candidate run {id} not found | POST /api/compare

422 Unprocessable Entity

Returned when request body validation fails (Zod schema validation).

{
  "error": "Validation error",
  "code": "VALIDATION_ERROR",
  "message": "Field validation failed",
  "details": [
    { "field": "field_name", "message": "error message" }
  ]
}

Common causes: missing required fields, invalid data types, invalid enum values.
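
When surfacing a 422 to users, it often helps to group the details entries by field. A minimal sketch, assuming the body always matches the example above; ValidationIssue, ValidationErrorBody, and groupIssuesByField are hypothetical names.

```typescript
interface ValidationIssue {
  field: string;
  message: string;
}

// Mirrors the documented 422 body.
interface ValidationErrorBody {
  error: string;
  code: string;
  message: string;
  details: ValidationIssue[];
}

// Collect all messages reported for each field, preserving their order.
function groupIssuesByField(body: ValidationErrorBody): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  for (const issue of body.details) {
    const messages = grouped.get(issue.field) ?? [];
    messages.push(issue.message);
    grouped.set(issue.field, messages);
  }
  return grouped;
}
```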

500 Internal Server Error

Returned when an unexpected error occurs. The details field contains the error message.

Error Pattern | Endpoint
Failed to list prompts | GET /api/prompts
Failed to create prompt | POST /api/prompts
Failed to get prompt | GET /api/prompts/[id]
Failed to update prompt | PUT /api/prompts/[id]
Failed to create suite | POST /api/suites
Failed to fetch suite | GET /api/suites/[id]
Failed to update suite | PUT /api/suites/[id]
Failed to delete suite | DELETE /api/suites/[id]
Failed to create eval run | POST /api/runs
Failed to get eval run | GET /api/runs/[id]
Failed to control eval run | POST /api/runs/[id]/control
Failed to create score | POST /api/scores
Failed to fetch scores | GET /api/scores
Failed to insert span | POST /api/spans
Failed to get span details | GET /api/spans/[id]
Failed to submit feedback | POST /api/feedback
Failed to fetch feedback | GET /api/feedback
Failed to compare runs | POST /api/compare

Alert Rule API Errors

Error Message | Endpoint | Cause
name is required | POST /api/alerts/rules | Missing rule name
metric is required | POST /api/alerts/rules | Missing metric field
threshold is required | POST /api/alerts/rules | Missing threshold
operator is required | POST /api/alerts/rules | Missing operator
operator must be one of: gt, gte, lt, lte, eq | POST /api/alerts/rules | Invalid operator
severity must be one of: critical, warning, info | POST /api/alerts/rules | Invalid severity
Alert rule not found | DELETE /api/alerts/rules | Rule ID doesn’t exist (404)
id query parameter is required | DELETE /api/alerts/rules | Missing ID parameter (400)

503 Service Unavailable

Returned when an external dependency (database, workflow engine) is not reachable.

Error Message | Details | Resolution
ClickHouse service unavailable | The database is not reachable. | Ensure ClickHouse is running: docker compose up -d
Database not available | PostgreSQL is not reachable. | Ensure PostgreSQL is running: docker compose up -d
Temporal service unavailable | The workflow engine is not reachable. | Ensure Temporal is running: docker compose --profile temporal up -d

Connection error detection: The API checks for ECONNREFUSED, ETIMEDOUT, UNAVAILABLE, and connect errors in exception messages to distinguish infrastructure issues from application errors.
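
The detection described above amounts to a substring check on the exception message. The sketch below is an illustration, not the API's actual code: CONNECTION_ERROR_MARKERS, isConnectionError, and statusForError are hypothetical names, and case-sensitive matching is an assumption.

```typescript
// The four markers listed in the paragraph above.
const CONNECTION_ERROR_MARKERS = ["ECONNREFUSED", "ETIMEDOUT", "UNAVAILABLE", "connect"];

// True when the error message contains any marker (case-sensitive, by assumption).
function isConnectionError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return CONNECTION_ERROR_MARKERS.some((marker) => message.includes(marker));
}

// Infrastructure failures map to 503; everything else to 500.
function statusForError(err: unknown): 500 | 503 {
  return isConnectionError(err) ? 503 : 500;
}
```

Because "connect" is a plain substring, an application message such as "Not connected" would also be classified as an infrastructure issue under this sketch.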


SDK Errors

TypeScript SDK (@neon/sdk)

Custom Error Classes

CloudSyncError: thrown when cloud sync operations fail (network issues, authentication problems, timeouts).

class CloudSyncError extends Error {
  statusCode?: number;
  cause?: unknown;
}

Scenario | Resolution
Network timeout | Check connectivity to the Neon API
HTTP 401 | Verify your API key is valid
HTTP 403 | Verify workspace permissions
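
A caller can pick the resolution from statusCode. The sketch below works on a structural stand-in (CloudSyncErrorLike) rather than importing the SDK class, and suggestCloudSyncFix is a hypothetical helper. Treating a missing statusCode as a network-level failure is an assumption.

```typescript
// Structural stand-in with the fields shown in the class declaration above.
interface CloudSyncErrorLike {
  message: string;
  statusCode?: number;
  cause?: unknown;
}

// Map an error to the resolution text from the scenario table.
function suggestCloudSyncFix(err: CloudSyncErrorLike): string {
  switch (err.statusCode) {
    case 401:
      return "Verify your API key is valid";
    case 403:
      return "Verify workspace permissions";
    case undefined:
      // Assumption: no HTTP status means the request never got a response.
      return "Check connectivity to the Neon API";
    default:
      return `Unexpected HTTP ${err.statusCode}: ${err.message}`;
  }
}
```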

CorrelationAnalysisError: thrown when ClickHouse correlation queries fail.

class CorrelationAnalysisError extends Error {
  code: CorrelationErrorCode;
  cause?: unknown;
}

Error Code | Meaning | Resolution
QUERY_FAILED | ClickHouse query execution failed | Check ClickHouse logs and query syntax
QUERY_TIMEOUT | Query exceeded the timeout | Reduce the data range or increase the timeout
PARSE_ERROR | Failed to parse query results | This may indicate a schema mismatch; check ClickHouse table schema
CONNECTION_ERROR | Failed to connect to ClickHouse | Ensure ClickHouse is running and accessible
INVALID_INPUT | Invalid input parameters | Check the parameters passed to the analysis function
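
The error code can also drive a retry decision. In the sketch below, modelling CorrelationErrorCode as a string union is an assumption (the SDK may define an enum), and the choice of which codes count as transient, the helper name shouldRetryCorrelation, and the default of three attempts are all illustrative.

```typescript
// Assumed representation of the five documented codes.
type CorrelationErrorCode =
  | "QUERY_FAILED"
  | "QUERY_TIMEOUT"
  | "PARSE_ERROR"
  | "CONNECTION_ERROR"
  | "INVALID_INPUT";

// Codes that may succeed on a later attempt without changing the request.
const TRANSIENT_CODES: readonly CorrelationErrorCode[] = [
  "QUERY_TIMEOUT",
  "CONNECTION_ERROR",
];

// `attempt` is 1-based: the number of attempts already made.
function shouldRetryCorrelation(
  code: CorrelationErrorCode,
  attempt: number,
  maxAttempts = 3
): boolean {
  return attempt < maxAttempts && TRANSIENT_CODES.includes(code);
}
```

INVALID_INPUT, PARSE_ERROR, and QUERY_FAILED are left out on the reasoning that they need a change to the input, the schema, or the query before a retry can help.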

Client Errors

Error Message | Cause | Resolution
Neon API error: {status} {message} | API request returned a non-OK status | Check the status code and message for details
Evaluation run failed: {message} | An eval run completed with errors | Check the run details for per-case error messages

Validation Errors

Error Message | Module | Resolution
llmJudge requires a prompt string | scorers/llm-judge | Provide a prompt parameter to llmJudge()
Threshold value cannot be empty | threshold | Provide a non-empty threshold value
Invalid threshold value: "{input}" | threshold | Use a valid numeric or percentage threshold
Threshold must be positive | threshold | Use a positive number
Threshold cannot exceed 100% | threshold | Use a value of 100% or less
Embedding dimension mismatch | analysis/pattern-detector | Ensure embeddings have consistent dimensions
Invalid experiment variants: {errors} | comparison/experiment | Fix the variant configuration
Experiment must have exactly 1 control and at least 1 treatment | comparison/experiment | Define one control and one or more treatment variants
Variant allocation must be between 0 and 100 | comparison/variant | Set allocation percentage between 0 and 100
Percentile must be between 0 and 100 | comparison/statistics | Use a percentile value in the valid range
Confidence level must be between 0 and 1 | comparison/statistics | Use a confidence level between 0 and 1

Debug Client Errors

Error Message | Cause | Resolution
DebugClient: url is required | No URL provided to debug client | Pass a URL when constructing DebugClient
DebugClient: traceId is required | No trace ID provided | Provide a valid trace ID
Not connected to debug server | Client not connected | Call connect() before using the client
HTTP {status}: {statusText} | Debug server returned an error | Check the debug server is running and accessible

Export Errors

Error Message | Cause | Resolution
Export format '{name}' is already registered | Duplicate format registration | Use a unique format name
Unknown export format '{name}' | Unrecognized export format | Use a registered format (json, csv, etc.)
Format '{name}' does not support parsing | Format is export-only | Use a format that supports parsing

Python SDK (neon-sdk)

Import Errors (Optional Dependencies)

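Error Message | Resolution
Temporal support requires the 'temporal' extra. Install with: pip install neon-sdk[temporal] | Install the temporal extra: pip install "neon-sdk[temporal]"
ClickHouse extra not installed | Install the clickhouse extra: pip install "neon-sdk[clickhouse]"

Quoting the package name keeps shells such as zsh from interpreting the square brackets as a glob pattern.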

Client Errors

Error Message | Cause | Resolution
Evaluation run failed: {message} | Eval run completed with errors | Check run details for per-case errors
Neon API error: {status} {message} | API returned non-OK status | Check the status code and response
Not connected. Call connect() first. | Temporal client not connected | Call await client.connect() before operations

Validation Errors

Error Message | Cause | Resolution
llm_judge requires a prompt string | Missing prompt for LLM judge | Provide a prompt parameter to llm_judge()

Infrastructure Errors

Temporal Workflow Errors

Worker Connection

Error | Cause | Resolution
Worker fails to connect after 10 retries | Temporal server unreachable | Ensure Temporal is running: docker compose --profile temporal up -d
Worker exits with code 1 | Fatal error during startup or shutdown | Check worker logs for the specific error

The worker reconnects with exponential backoff. Both retry settings can be overridden through environment variables:

  • Max reconnect attempts: 10 (override with MAX_RECONNECT_ATTEMPTS env var)
  • Reconnect delay: 5000ms (override with RECONNECT_DELAY_MS env var)
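
The two settings could be read and applied roughly as follows. This is a sketch, not the worker's actual code: ReconnectConfig, readReconnectConfig, and reconnectDelayMs are hypothetical names, falling back to the defaults on unparsable values is an assumption, and so is the doubling factor, since the text above only says the backoff is exponential.

```typescript
interface ReconnectConfig {
  maxAttempts: number;
  baseDelayMs: number;
}

// Read both overrides from an environment-like object (for example process.env).
// Missing or non-positive values fall back to the documented defaults.
function readReconnectConfig(env: Record<string, string | undefined>): ReconnectConfig {
  const parsePositiveInt = (raw: string | undefined, fallback: number): number => {
    const parsed = Number.parseInt(raw ?? "", 10);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    maxAttempts: parsePositiveInt(env.MAX_RECONNECT_ATTEMPTS, 10),
    baseDelayMs: parsePositiveInt(env.RECONNECT_DELAY_MS, 5000),
  };
}

// Delay before the given 1-based attempt, assuming the delay doubles each time.
function reconnectDelayMs(config: ReconnectConfig, attempt: number): number {
  return config.baseDelayMs * 2 ** (attempt - 1);
}
```

Under these assumptions the default configuration waits 5000 ms before the first reconnect attempt and 20000 ms before the third.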

Activity Timeouts

Activities have configured timeouts and retry policies. If an activity exceeds its timeout, Temporal will retry according to the retry policy.

Activity | Timeout | Max Retries | Retry Interval
Agent execution | 5 minutes | 5 | 1s - 30s
Score trace (LLM judges) | 10 minutes | 3 | 2s - 1m
Emit span | 1 minute | 3 | 1s - 10s
LLM call | 5 minutes | 5 | 1s - 30s
Tool execution | 5 minutes | 5 | 1s - 30s
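
To see what a retry interval range means in practice, the sketch below computes the wait before each retry. It reads "1s - 30s" as an initial interval and a maximum interval, which is an assumption, and uses a backoff coefficient of 2, which is Temporal's default; the workers may configure a different value. ActivityRetrySettings, ACTIVITY_RETRY_SETTINGS, its keys, and retryDelaysMs are hypothetical names.

```typescript
interface ActivityRetrySettings {
  timeoutMs: number;
  maxRetries: number;
  initialIntervalMs: number;
  maxIntervalMs: number;
}

// The table above, transcribed under the interpretation stated in the lead-in.
const ACTIVITY_RETRY_SETTINGS: Record<string, ActivityRetrySettings> = {
  agentExecution: { timeoutMs: 300_000, maxRetries: 5, initialIntervalMs: 1_000, maxIntervalMs: 30_000 },
  scoreTrace: { timeoutMs: 600_000, maxRetries: 3, initialIntervalMs: 2_000, maxIntervalMs: 60_000 },
  emitSpan: { timeoutMs: 60_000, maxRetries: 3, initialIntervalMs: 1_000, maxIntervalMs: 10_000 },
  llmCall: { timeoutMs: 300_000, maxRetries: 5, initialIntervalMs: 1_000, maxIntervalMs: 30_000 },
  toolExecution: { timeoutMs: 300_000, maxRetries: 5, initialIntervalMs: 1_000, maxIntervalMs: 30_000 },
};

// Wait before each retry: grows by the coefficient, capped at the maximum interval.
function retryDelaysMs(settings: ActivityRetrySettings, backoffCoefficient = 2): number[] {
  const delays: number[] = [];
  let next = settings.initialIntervalMs;
  for (let retry = 0; retry < settings.maxRetries; retry++) {
    delays.push(Math.min(next, settings.maxIntervalMs));
    next *= backoffCoefficient;
  }
  return delays;
}
```

With these numbers the agent execution activity would wait 1, 2, 4, 8, and 16 seconds across its five retries, so the 30-second cap only comes into play if more retries are configured.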

Workflow-Level Errors

Error Scenario | Behavior
Individual test case fails | Case recorded as failed with error message; other cases continue
Scorer throws an exception | Score recorded as 0 with error reason; other scorers continue
Notification delivery fails | Error logged; workflow does not fail
Workflow cancelled via signal | All pending cases return cancelled status
Workflow paused | Execution pauses for up to 24 hours; resume via signal

Workflow Control Signals

Signal | Effect | Timeout
cancelRunSignal | Cancels the entire eval run | Immediate
pauseSignal | Pauses/resumes workflow execution | 24 hours max pause
approvalSignal | Provides human approval for sensitive tools | 7 days max wait
cancelSignal | Cancels an individual eval case | Immediate

LLM Provider Errors

Thrown when LLM provider SDKs are missing or misconfigured in the Temporal worker.

Provider | Error Message | Resolution
Anthropic | Anthropic provider requires the "@anthropic-ai/sdk" package | Run bun add @anthropic-ai/sdk
OpenAI | OpenAI provider requires the "openai" package | Run bun add openai
Vertex AI | Vertex AI provider requires a GCP project ID | Set GOOGLE_CLOUD_PROJECT env var
Vertex AI | Vertex AI provider requires the "@google-cloud/vertexai" package | Run bun add @google-cloud/vertexai
Vertex Claude | Vertex Claude provider requires a GCP project ID | Set GOOGLE_CLOUD_PROJECT env var
Vertex Claude | Vertex Claude provider requires the "@anthropic-ai/vertex-sdk" package | Run bun add @anthropic-ai/vertex-sdk
Factory | Unknown provider: {name} | Use a supported provider name

Health Check API

The GET /api/health endpoint returns the overall system status:

HTTP Status | Response | Meaning
200 | { "status": "healthy" } | All services operational
200 | { "status": "degraded" } | Some services unavailable (e.g., ClickHouse down)
503 | { "status": "unhealthy" } | No backend services available
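
Because a degraded system still answers with HTTP 200, a monitor has to read the body as well as the status code. A minimal sketch, with HealthStatus and classifyHealth as hypothetical names; treating any unexpected combination as unhealthy is a conservative assumption.

```typescript
type HealthStatus = "healthy" | "degraded" | "unhealthy";

// Combine the HTTP status and the body's status field into one verdict.
function classifyHealth(httpStatus: number, body: { status?: string }): HealthStatus {
  if (httpStatus === 200 && body.status === "healthy") return "healthy";
  if (httpStatus === 200 && body.status === "degraded") return "degraded";
  // 503, or any combination the table does not list.
  return "unhealthy";
}
```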

ClickHouse Errors

Symptom | Cause | Resolution
ECONNREFUSED on trace queries | ClickHouse is not running | Run docker compose up -d to start ClickHouse
Slow trace queries | Large data volume without partition pruning | Add time-range filters to queries
503 ClickHouse service unavailable from API | ClickHouse server is down or unreachable | Check ClickHouse container logs: docker compose logs clickhouse

PostgreSQL Errors

Symptom | Cause | Resolution
ECONNREFUSED on suite operations | PostgreSQL is not running | Run docker compose up -d to start PostgreSQL
"does not exist" errors | Database tables not created | Run database migrations
ETIMEDOUT | Database overloaded or network issue | Check PostgreSQL container health
503 Database not available from API | PostgreSQL server is down | Check container logs: docker compose logs postgres

Troubleshooting

Services Won’t Start

  1. Check Docker is running: docker ps
  2. Start all infrastructure: docker compose up -d
  3. Start with Temporal: docker compose --profile temporal up -d
  4. Check container logs: docker compose logs <service-name>
  5. Verify ports are free: ClickHouse (8123), PostgreSQL (5432), Temporal (7233)

Eval Runs Stuck in “Running” State

  1. Check the Temporal UI (default: http://localhost:8233) for workflow status
  2. Verify the Temporal worker is running: bun run workers
  3. Check worker logs for activity failures
  4. If needed, cancel the run: POST /api/runs/{id}/control with {"action": "cancel"}
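
The cancel call in step 4 can be assembled as shown below. buildControlRequest and isControlAction are hypothetical helpers; the list of allowed actions and the error text come from the 400 Bad Request table. The sketch only builds the request, so sending it (with authentication headers and the API base URL) is left to the caller.

```typescript
// The three actions the control endpoint documents.
const CONTROL_ACTIONS = ["pause", "resume", "cancel"] as const;
type ControlAction = (typeof CONTROL_ACTIONS)[number];

function isControlAction(value: string): value is ControlAction {
  return (CONTROL_ACTIONS as readonly string[]).includes(value);
}

// Build method, path, and JSON body for POST /api/runs/{id}/control.
function buildControlRequest(runId: string, action: string) {
  if (!isControlAction(action)) {
    throw new Error("Invalid action (Action must be one of: pause, resume, cancel)");
  }
  return {
    method: "POST" as const,
    path: `/api/runs/${encodeURIComponent(runId)}/control`,
    body: JSON.stringify({ action }),
  };
}
```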

Scores Are All Zero

  1. Verify the scorer is configured correctly in the suite definition
  2. Check if the agent is producing output (non-empty responses)
  3. Review trace spans to confirm the agent executed
  4. For LLM judges, ensure the ANTHROPIC_API_KEY environment variable is set
  5. Check scorer error reasons in the run results

API Returns 503 Errors

  1. Identify which service is unavailable from the error message
  2. Check Docker container status: docker compose ps
  3. Restart the failing service: docker compose restart <service>
  4. Review container logs for startup errors

CLI Authentication Issues

  1. Run agent-eval auth status to check current credentials
  2. Re-authenticate: agent-eval auth login
  3. Verify the API URL is correct in your configuration
  4. Check that your API key has not expired

Suite Validation Fails

  1. Validate YAML syntax with a YAML linter
  2. Ensure all required fields are present (name, cases, scorers)
  3. Check that scorer names match available scorers
  4. Run agent-eval suite validate <file> for detailed validation errors
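
For step 2, a quick pre-check on an already parsed suite can flag missing required fields before running the CLI validator. The sketch below assumes the suite has been loaded into a plain object by a YAML parser of your choice; missingSuiteFields is a hypothetical helper, and counting empty lists and empty strings as missing is an assumption. It does not replace agent-eval suite validate, which checks the full schema.

```typescript
// Required top-level fields named in step 2.
const REQUIRED_SUITE_FIELDS = ["name", "cases", "scorers"] as const;

// Return the required fields that are absent or empty in a parsed suite object.
function missingSuiteFields(suite: Record<string, unknown>): string[] {
  return REQUIRED_SUITE_FIELDS.filter((field) => {
    const value = suite[field];
    if (value === undefined || value === null) return true;
    if (Array.isArray(value)) return value.length === 0;
    return value === "";
  });
}
```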