The Evaluations tab is where you test whether a user’s natural language prompt is correctly translated into the action you’ve defined, and whether the right input parameters are extracted based on the schema. Use it to:
- Confirm the AI routes the user intent to the correct action.
- Check that the schema fields (parameters) are filled correctly from the prompt.
- Validate how the agent handles the server response using the mock response.
Generating Test Prompts
- Open the Evaluations tab.
- Click Generate Prompts to auto-create a set of test prompts based on your action schema.
Example (for Create Campaign action):
- “Create a new campaign named ‘Summer Sale’ starting from July 1 to July 31 with a budget of 5000 and status ACTIVE.”
- “I want to create a campaign called ‘Holiday Promo’.”
- “Create campaign starting on August 1 with a budget of 2000 and status PAUSED.”
You can also click + Add New Test to write your own custom prompt.
Running a Test
- Click Run next to a test prompt.
- The agent will process the input and attempt to:
  - Match the correct action.
  - Extract values for each field in the Schema.
  - Return the Response Mock you defined.
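The three steps above can be read as three separate checks on a single test run. The sketch below is only an illustration of that idea: `EvalRun`, `check_run`, and their fields are hypothetical names, not part of the product, and it assumes the mock response comes back verbatim, which may not match how the agent actually presents it.

```python
from dataclasses import dataclass, field


@dataclass
class EvalRun:
    """Hypothetical record of one evaluation run (not a product API)."""
    prompt: str
    matched_action: str                            # action the agent selected
    arguments: dict = field(default_factory=dict)  # values extracted for the schema
    response: str = ""                             # text returned to the user


def check_run(run: EvalRun, expected_action: str,
              required_fields: list[str], mock_response: str) -> dict[str, bool]:
    """Return one pass/fail flag per step the agent attempts."""
    return {
        "action_matched": run.matched_action == expected_action,
        "fields_extracted": all(name in run.arguments for name in required_fields),
        # Assumption: the mock is returned unchanged.
        "mock_returned": run.response == mock_response,
    }


run = EvalRun(
    prompt="I want to create a campaign called 'Holiday Promo'.",
    matched_action="Create Campaign",
    arguments={"campaignName": "Holiday Promo"},
    response="Campaign Created: Holiday Promo",
)
result = check_run(run, "Create Campaign", ["campaignName"],
                   "Campaign Created: Holiday Promo")
print(result)
```

Keeping the three flags separate makes it easier to tell a routing problem from an extraction problem when a test fails.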
Reviewing Results
On the right-hand side, you’ll see the Agent Testing output.
Example:
Campaign Created: Spring Sale
Name: Spring Sale
Start Date: 2024-05-01
End Date: 2024-05-31
Budget: 10,000
Status: ACTIVE
In the Agent Testing panel, click the action link (e.g., Create Campaign was executed).
This opens the Arguments view, which shows the raw schema extraction:
{
  "startDate": "2024-05-01",
  "endDate": "2024-05-31",
  "budget": 10000,
  "status": "ACTIVE",
  "campaignName": "Spring Sale"
}
This lets you confirm that user language (e.g., “budget of 10k”) is mapped into structured schema fields.
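If you copy the JSON out of the Arguments view, a short script can compare it with the values you expected. This is a minimal sketch done outside the product: `diff_arguments` and the `expected` dictionary are hypothetical helpers written for this example, and the field names are taken from the sample extraction above.

```python
import json

# Raw extraction copied from the Arguments view in the example above.
arguments_json = """
{ "startDate": "2024-05-01", "endDate": "2024-05-31", "budget": 10000,
  "status": "ACTIVE", "campaignName": "Spring Sale" }
"""

# Values you expect for the prompt under test (hypothetical expectations).
expected = {
    "campaignName": "Spring Sale",
    "startDate": "2024-05-01",
    "endDate": "2024-05-31",
    "budget": 10000,  # "budget of 10k" should arrive as a number
    "status": "ACTIVE",
}


def diff_arguments(actual: dict, expected: dict) -> dict:
    """Map each mismatched field to its (expected, actual) pair."""
    keys = expected.keys() | actual.keys()
    return {
        key: (expected.get(key), actual.get(key))
        for key in sorted(keys)
        if expected.get(key) != actual.get(key)
    }


actual = json.loads(arguments_json)
mismatches = diff_arguments(actual, expected)
print(mismatches or "All fields match")
```

A field that is missing on either side shows up with `None` in its pair, so dropped and unexpected fields are reported along with wrong values.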
Best Practices
Create tests that cover:
- All fields provided, required and optional (happy path).
- Only required fields provided (minimal input).
- Missing required fields (should fail validation).
- Partial optional fields (some extras given, others missing).
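The coverage list above can be sketched as one sample of extracted arguments per category, checked against the required fields. Everything here is illustrative: the choice of `campaignName` as the only required field is an assumption, the sample values are made up, and `missing_required` is a hypothetical helper rather than a platform function.

```python
# Assumption for illustration: only campaignName is required; the other
# fields (startDate, endDate, budget, status) are treated as optional.
REQUIRED = {"campaignName"}


def missing_required(arguments: dict) -> set:
    """Return the required fields absent from the extracted arguments."""
    return REQUIRED.difference(arguments)


# One sample of extracted arguments per coverage category (made-up values).
cases = {
    "happy path": {"campaignName": "Summer Sale", "startDate": "2024-07-01",
                   "endDate": "2024-07-31", "budget": 5000, "status": "ACTIVE"},
    "minimal input": {"campaignName": "Holiday Promo"},
    "missing required": {"startDate": "2024-08-01", "budget": 2000,
                         "status": "PAUSED"},
    "partial optional": {"campaignName": "Spring Sale", "budget": 10000},
}

for name, arguments in cases.items():
    missing = missing_required(arguments)
    verdict = "valid" if not missing else f"invalid, missing {sorted(missing)}"
    print(f"{name}: {verdict}")
```

Under these assumptions, only the "missing required" case should be reported as invalid.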
Update your schema descriptions or instructions if the AI is misinterpreting user prompts.
Always re-run evaluations after editing the schema or response mock.
Use Evaluations before publishing to ensure your action works reliably across different ways a user might phrase their request.