Hallucination
Overview
The Hallucination metric evaluates the presence of unsupported statements in the text generated by a model. Hallucinations in AI-generated text refer to instances where the model produces information that is not grounded in the provided context or established facts. This metric helps identify the extent to which the generated text deviates from factual accuracy.
HallucinationMetric uses the evaluateHallucination function to calculate this metric, focusing on distinguishing between faithful and hallucinated content.
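For example, suppose the model is given a single context statement and produces an output containing two claims. The snippet below uses illustrative values only (it is not the library's API) to show how such an output would be scored.

```typescript
// Illustrative example (not the library's API): one supported claim,
// one unsupported claim.
const context = ["The Eiffel Tower was completed in 1889."];
const output =
  "The Eiffel Tower was completed in 1889. It is 500 meters tall.";

// The first statement is grounded in the context; the second is not.
// One of the two statements is hallucinated, so the metric would report
// a hallucination score of roughly 50%.
```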
Methods
evaluateHallucination Function
This function assesses the generated text for hallucinations by checking for statements that are unsupported by the given context. It takes two parameters:
output: The text generated by the model.
context: The context used to assess the truthfulness of the output.
It splits the output into individual statements and checks each one against the provided context to determine whether it is supported. The function returns a promise that resolves to a percentage indicating the extent of hallucinated content in the text.
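The sketch below illustrates that flow under stated assumptions: statements are split at sentence boundaries, and a naive keyword-overlap check stands in for the library's actual support verdict (which would typically be model-based). The function name and parameters mirror the description above; the internals are illustrative only.

```typescript
// Minimal sketch of evaluateHallucination, assuming sentence-level splitting
// and a heuristic support check in place of the real verdict logic.
async function evaluateHallucination(
  output: string,
  context: string[],
): Promise<number> {
  // Split the output into individual statements (sentence-level here).
  const statements = output
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  if (statements.length === 0) return 0;

  const contextText = context.join(" ").toLowerCase();

  // Count statements with little or no support in the context. The real
  // metric would obtain a per-statement verdict rather than this heuristic.
  const hallucinated = statements.filter((statement) => {
    const words = statement.toLowerCase().match(/[a-z0-9]+/g) ?? [];
    const supported = words.filter((w) => contextText.includes(w)).length;
    return words.length > 0 && supported / words.length < 0.5;
  }).length;

  // Return the proportion of hallucinated statements as a percentage.
  return (hallucinated / statements.length) * 100;
}
```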
HallucinationMetric Class
HallucinationMetric evaluates hallucinations in generated text based on the context from which the text should draw its conclusions. It takes two inputs:
output: The text generated by the model.
context: The reference context used for evaluation.
The evaluateSteps method invokes evaluateHallucination and provides a detailed result including a hallucination score. This score quantifies the proportion of the text that contains hallucinated information relative to the total content.
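As a rough illustration, the class might be wired together as in the sketch below. The constructor signature and result shape are assumptions; the section only specifies that evaluateSteps invokes evaluateHallucination and returns a result containing the hallucination score.

```typescript
// Minimal sketch only: the constructor and result shape are assumptions.
// evaluateHallucination refers to the sketch in the previous section.
class HallucinationMetric {
  constructor(private readonly context: string[]) {}

  // Invokes evaluateHallucination and wraps the score in a detailed result.
  async evaluateSteps(
    output: string,
  ): Promise<{ score: number; reason: string }> {
    const score = await evaluateHallucination(output, this.context);
    return {
      score,
      reason: `${score.toFixed(1)}% of the statements in the output are not supported by the provided context.`,
    };
  }
}

// Hypothetical usage:
// const metric = new HallucinationMetric(["The Eiffel Tower was completed in 1889."]);
// const result = await metric.evaluateSteps("The Eiffel Tower is 500 meters tall.");
// result.score -> 100 (every statement is unsupported by the context)
```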