Faithfulness

Overview

The Faithfulness metric evaluates how accurately model-generated text reflects the provided context. It is the ratio of truthful statements to the total number of statements in the generated text. A high faithfulness score indicates that the generated content reflects the context without introducing inaccuracies or distortions.

FaithfulnessMetric leverages the evaluateFaithfulness function to compute this metric.

Methods

`evaluateFaithfulness` Function

This function analyzes the faithfulness of the generated text based on its alignment with the given context.

output: The text generated by the model.
context: The reference context against which the output is evaluated.

It splits the output into individual statements and uses a pre-trained AI model to assess whether each statement is truthful relative to the context. The function returns a promise that resolves to a numeric score representing the proportion of truthful statements; in the example on this page, 1 truthful statement out of 1 yields a score of 1.
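The scoring arithmetic described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `splitIntoStatements` and `faithfulnessScore` are hypothetical names, the sentence splitter is deliberately naive, and the per-statement verdicts are hard-coded where the real metric would ask a model. Scoring an empty output as 0 is also an assumption.

```typescript
// Hypothetical helpers for illustration only; not part of @evalkit/core.

// Naive splitter: treats each run of text ending in '.', '!' or '?' as one statement.
function splitIntoStatements(text: string): string[] {
  const parts = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Ratio of truthful statements to total statements, in the range [0, 1].
function faithfulnessScore(verdicts: boolean[]): number {
  if (verdicts.length === 0) return 0; // assumption: empty output scores 0
  const truthful = verdicts.filter((v) => v).length;
  return truthful / verdicts.length;
}

const statements = splitIntoStatements(
  "Solar power is renewable. Wind power is a fossil fuel."
);
// In the real metric a model judges each statement against the context;
// here the verdicts are hard-coded to keep the sketch self-contained.
const verdicts = [true, false];
console.log(statements.length, faithfulnessScore(verdicts)); // prints: 2 0.5
```

One truthful statement out of two gives 0.5, and a fully truthful output gives 1, which matches the score shown in the example on this page.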

`FaithfulnessMetric` Class

FaithfulnessMetric uses the evaluateFaithfulness function to compute the faithfulness score.

output: The text generated by the model.
context: The reference context used for evaluation.

The evaluateSteps method invokes evaluateFaithfulness and returns a detailed result that includes the faithfulness score and the reasons behind it, indicating how accurately the generated text reflects the provided context.
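For orientation, the result can be pictured as the object below. The field names are inferred from the example output on this page; the interface name `FaithfulnessResult`, the helper `toResult`, and the `threshold` parameter are all assumptions made for this sketch, since the page does not state how `passed` is derived from the score.

```typescript
// Shape inferred from the example output on this page; the library's actual types may differ.
interface FaithfulnessResult {
  passed: boolean;
  score: number;
  reasons: string[];
}

// Hypothetical helper: `threshold` is an assumed cut-off, not a documented option.
function toResult(
  score: number,
  reasons: string[],
  threshold = 0.5
): FaithfulnessResult {
  return { passed: score >= threshold, score, reasons };
}

const result = toResult(1, ["All statements in the generated text are truthful."]);
console.log(result.passed, result.score);
```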

Example

import { evaluate, FaithfulnessMetric } from '@evalkit/core';

evaluate({
    // The generated text from an LLM
    output: "Investing in renewable energy can lead to long-term economic benefits.",
    // The context against which to evaluate the text
    context: "Renewable energy includes sources like solar and wind power.",
}, [FaithfulnessMetric])

// outputs
{
  passed: true,
  // The number of truthful statements in the text is 1 out of 1, resulting in a faithfulness score of 1.
  score: 1,
  reasons: ['All statements in the generated text are truthful.']
}
