Dynamic (G-Eval)
Overview
The Dynamic metric assesses how well the actual output from a model aligns with an expected output based on a set of specified criteria. The evaluation is dynamic because the criteria, such as accuracy, relevance, or any other aspect, are tailored to the specific evaluation's needs. The metric provides nuanced insight into the model's performance by analyzing each criterion separately.
DynamicMetric utilizes the evaluateDynamic function to conduct this detailed analysis.
Methods
evaluateDynamic Function
This function evaluates the alignment of the actual output with the expected output based on dynamic criteria specified for the evaluation.
- input: The original input question or statement provided to the model.
- actualOutput: The actual output generated by the model in response to the input.
- expectedOutput: The output that ideally should be generated from the input.
- criteria: An array of criteria used to evaluate the output. Each criterion describes one aspect of the evaluation, such as accuracy or relevance.
The function generates a detailed prompt for OpenAI's model to analyze the texts, producing a numerical score and a reason for each criterion. It returns a promise that resolves to an array of evaluation results, one per criterion.
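The shape of evaluateDynamic can be sketched as below. The interfaces and the scoring stub are assumptions for illustration; in the real function, the score and reason for each criterion come from a prompt sent to OpenAI's model.

```typescript
// Hypothetical shapes, inferred from the parameter names in these docs.
interface Criterion {
  name: string;        // e.g. "accuracy" or "relevance"
  description: string; // what this criterion measures
}

interface CriterionResult {
  criterion: string;
  score: number;       // numerical score for this criterion
  reason: string;      // the model's reasoning for the score
}

// Sketch of evaluateDynamic. The real function prompts OpenAI's model;
// here the model call is replaced by a local stub so the example runs.
async function evaluateDynamic(
  input: string,
  actualOutput: string,
  expectedOutput: string,
  criteria: Criterion[],
): Promise<CriterionResult[]> {
  // One evaluation per criterion, resolved together.
  return Promise.all(
    criteria.map(async (c) => ({
      criterion: c.name,
      score: stubModelScore(actualOutput, expectedOutput),
      reason: `Stubbed reasoning for "${c.name}" (real version: OpenAI response)`,
    })),
  );
}

// Stand-in scorer: fraction of expected tokens present in the actual
// output. Purely illustrative, not the library's scoring logic.
function stubModelScore(actual: string, expected: string): number {
  const actualTokens = new Set(actual.toLowerCase().split(/\s+/));
  const expectedTokens = expected.toLowerCase().split(/\s+/);
  const hits = expectedTokens.filter((t) => actualTokens.has(t)).length;
  return expectedTokens.length ? hits / expectedTokens.length : 0;
}
```

A call then looks like `await evaluateDynamic(input, actualOutput, expectedOutput, criteria)`, yielding one result object per criterion.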
DynamicMetric Class
DynamicMetric leverages the evaluateDynamic function to provide a comprehensive evaluation of the model's output compared to the expected output.
- input: The original input provided to the model.
- actualOutput: The actual output from the model.
- expectedOutput: The output ideally generated from the input.
- criteria: The evaluation criteria.
The evaluateSteps method processes the evaluation and returns an overall score, the average of the scores across the specified criteria, along with detailed reasons for each criterion's score.
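The averaging step can be sketched as follows. The class shape and field names are assumptions based on the parameters listed above, and the per-criterion scorer is stubbed locally; in the real library those scores would come from the OpenAI-backed evaluateDynamic function.

```typescript
// Hypothetical result shape, one entry per criterion.
interface CriterionResult {
  criterion: string;
  score: number;
  reason: string;
}

// Sketch of the DynamicMetric class under the assumptions above.
class DynamicMetric {
  constructor(
    private input: string,
    private actualOutput: string,
    private expectedOutput: string,
    private criteria: string[],
  ) {}

  // evaluateSteps scores each criterion, then averages the scores
  // into one overall score while keeping the per-criterion reasons.
  async evaluateSteps(): Promise<{ score: number; reasons: CriterionResult[] }> {
    const results: CriterionResult[] = this.criteria.map((name) => ({
      criterion: name,
      score: this.stubScore(), // real code: OpenAI-backed evaluateDynamic
      reason: `Stubbed reason for "${name}"`,
    }));
    const overall =
      results.reduce((sum, r) => sum + r.score, 0) / (results.length || 1);
    return { score: overall, reasons: results };
  }

  // Stand-in scorer: exact match scores 1, anything else 0.
  // Purely illustrative, not the library's scoring logic.
  private stubScore(): number {
    return this.actualOutput.trim() === this.expectedOutput.trim() ? 1 : 0;
  }
}
```

The key design point is that the overall score is a plain mean over criteria, so every criterion contributes equally; the detailed reasons remain available per criterion for debugging individual scores.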