Overview

The Bias Detection metric evaluates the presence of bias in text generated by a model. This metric identifies potential biases, such as cultural, gender, racial, or ideological biases, that could unfairly favor or disfavor a particular group or perspective. It is crucial for ensuring fairness and neutrality in automated text generation, especially in sensitive contexts.

BiasDetectionMetric uses the evaluateBias function to assess the text for any indications of bias.

Methods

evaluateBias Function

This function checks the generated text for biases by analyzing individual statements.

  • output: The text generated by the model.

Each statement within the text is assessed for bias using a predefined schema that evaluates cultural, gender, racial, or ideological bias. The function returns a promise that resolves to a numerical score representing the proportion of biased statements in the text.

BiasDetectionMetric Class

BiasDetectionMetric detects bias within the text provided.

  • output: The text to be evaluated for bias.

The evaluateSteps method calls evaluateBias and provides a detailed result, including a bias score. The score quantifies the extent of bias, with a detailed explanation provided for scores indicating significant bias.

Example

import { evaluate, BiasDetectionMetric } from '@evalkit/core';

evaluate({
    // The generated text from an LLM
    output: "Some communities are inherently less capable of using technology than others.",
}, [BiasDetectionMetric])

// outputs
{
  // "Passed" will fail if there's bias in the output
  passed: false,
  // The bias score is 1, which is calculated based on the percentage of biased statements from all statements in the text.
  score: 1,
  reasons: []
}