The reporting system prints evaluation results in a readable format. For each evaluation run, it shows the pass/fail status, the score, and any error messages or reasons. Results are displayed both individually and as a summary, so you can track how your model performed across different metrics.
    import { evaluate, RelevancyMetric, HallucinationMetric } from '@evalkit/core';

    const results = await evaluate({
      input: "What is the weather like today?",
      output: "The temperature is 72°F with partly cloudy skies.",
      context: "Current conditions: 72°F, partly cloudy"
    }, [RelevancyMetric, HallucinationMetric]);

    // outputs
    📊 EvalKit Report Summary
    ========================
    🕒 Duration: 0.81s
    📝 Total Evaluations: 2
    ✅ Passed: 2
    ❌ Failed: 0

    📈 Metrics Breakdown
    ------------------
    Relevancy Evaluation:
      Score: 0.90
      Passed: 1
      Failed: 0
    Hallucination Evaluation:
      Score: 1.00
      Passed: 1
      Failed: 0
In this run both metrics passed, with Relevancy scoring 0.90 and Hallucination scoring 1.00. Reading the totals first and then the per-metric breakdown makes it easy to identify which metrics need improvement.
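To make the structure of the summary concrete, here is a minimal sketch of how per-metric results could be aggregated into totals and a breakdown like the one in the report. It does not use EvalKit's internals: the `MetricResult` shape and the `summarize` and `formatReport` helpers are hypothetical names introduced for illustration, and the actual return type of `evaluate` may differ.

```typescript
// Hypothetical result shape, assumed for illustration only;
// the real objects returned by EvalKit may have different fields.
interface MetricResult {
  metric: string;
  score: number;
  passed: boolean;
  reason?: string;
}

interface MetricSummary {
  averageScore: number;
  passed: number;
  failed: number;
}

interface ReportSummary {
  total: number;
  passed: number;
  failed: number;
  byMetric: Map<string, MetricSummary>;
}

// Aggregate individual results into overall totals and a per-metric breakdown.
function summarize(results: MetricResult[]): ReportSummary {
  const byMetric = new Map<string, MetricSummary>();
  const scoreSums = new Map<string, number>();

  for (const r of results) {
    const entry = byMetric.get(r.metric) ?? { averageScore: 0, passed: 0, failed: 0 };
    if (r.passed) {
      entry.passed += 1;
    } else {
      entry.failed += 1;
    }
    // Keep a running sum so the average stays correct as results accumulate.
    const sum = (scoreSums.get(r.metric) ?? 0) + r.score;
    scoreSums.set(r.metric, sum);
    entry.averageScore = sum / (entry.passed + entry.failed);
    byMetric.set(r.metric, entry);
  }

  const passed = results.filter((r) => r.passed).length;
  return { total: results.length, passed, failed: results.length - passed, byMetric };
}

// Render the summary as plain text, one line per total and per metric.
function formatReport(summary: ReportSummary): string {
  const lines: string[] = [
    `Total Evaluations: ${summary.total}`,
    `Passed: ${summary.passed}`,
    `Failed: ${summary.failed}`,
  ];
  summary.byMetric.forEach((m, name) => {
    lines.push(
      `${name}: Score: ${m.averageScore.toFixed(2)} Passed: ${m.passed} Failed: ${m.failed}`
    );
  });
  return lines.join("\n");
}

// Sample data mirroring the scores in the report above.
const sample: MetricResult[] = [
  { metric: "Relevancy", score: 0.9, passed: true },
  { metric: "Hallucination", score: 1.0, passed: true },
];

console.log(formatReport(summarize(sample)));
```

Keeping aggregation (`summarize`) separate from rendering (`formatReport`) means the same totals could also be written to JSON or checked in a CI step without parsing the printed text.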