Installation

EvalKit is distributed as multiple NPM packages.
We’ll focus on the evaluations framework package, which you can install with:

npm install --save-dev @evalkit/core

OpenAI Configuration

EvalKit currently uses the OpenAI API to perform evaluations, so you’ll need an OpenAI account and an API key, or a compatible custom endpoint configured as described below.

Standard OpenAI

Simply set your OpenAI API key in your environment:

OPENAI_API_KEY=your-api-key
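The examples below read the key from process.env; if you keep secrets in a local .env file, load it first with a tool such as dotenv (an extra dependency, not part of EvalKit):

// index.ts (assumes dotenv is installed: npm install dotenv)
import 'dotenv/config'; // populates process.env from .env before anything else runs
import { evaluate } from '@evalkit/core';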

Or if you prefer to set it programmatically:

import { config } from '@evalkit/core';

config.setOpenAIConfig({
  apiKey: process.env.OPENAI_KEY // the key can come from any variable or source, not just OPENAI_API_KEY
});
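Call setOpenAIConfig before your first evaluation so the client is created with the right credentials (a reasonable assumption for client libraries; check the package docs for the exact initialization order). For example:

import { config, evaluate, BiasDetectionMetric } from '@evalkit/core';

// Configure first...
config.setOpenAIConfig({ apiKey: process.env.OPENAI_KEY });

// ...then run evaluations.
const result = await evaluate({
  output: "Example output to check for bias.",
}, [BiasDetectionMetric]);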

Custom OpenAI/Azure OpenAI

For custom OpenAI endpoints (like Azure OpenAI), use the config:

import { config } from '@evalkit/core';

config.setOpenAIConfig({
  apiKey: process.env.MY_AZURE_KEY,
  baseURL: process.env.MY_AZURE_ENDPOINT,
  apiVersion: '2023-05-15',  // Optional, defaults to 2023-05-15
  deploymentName: process.env.MY_DEPLOYMENT
});

The configuration is environment-agnostic: use any environment variables or configuration source you prefer. The required fields for custom endpoints are listed below, with a fail-fast sketch after the list:

  • apiKey
  • baseURL
  • deploymentName (for Azure OpenAI)
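A missing field typically only surfaces as a runtime error from the API, so it can help to validate the required values up front. A minimal sketch, assuming the same environment variable names as the Azure example above:

import { config } from '@evalkit/core';

const apiKey = process.env.MY_AZURE_KEY;
const baseURL = process.env.MY_AZURE_ENDPOINT;
const deploymentName = process.env.MY_DEPLOYMENT;

// Fail fast if a required Azure OpenAI field is missing.
if (!apiKey || !baseURL || !deploymentName) {
  throw new Error('MY_AZURE_KEY, MY_AZURE_ENDPOINT, and MY_DEPLOYMENT must all be set');
}

config.setOpenAIConfig({ apiKey, baseURL, deploymentName });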

Configuration File

You can also configure EvalKit using an evalkit.config.ts (or .js) file in your project root:

// evalkit.config.ts
import type { EvalKitConfig } from '@evalkit/core';

const config: EvalKitConfig = {
  openai: {
    // Your OpenAI API key (defaults to OPENAI_API_KEY env var)
    apiKey: process.env.OPENAI_API_KEY,
    
    // Optional: For Azure OpenAI
    // baseURL: 'https://your-resource.openai.azure.com',
    // apiVersion: '2023-05-15',
    // deploymentName: 'your-deployment',
  },
  
  reporting: {
    // Which report formats to generate ('json' and/or 'html')
    // Leave empty for console-only output
    outputFormats: ['json', 'html'],
    
    // Where to save the report files
    outputDir: './eval-reports'
  }
};

export default config;
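If your project doesn’t use TypeScript, the same configuration works as evalkit.config.js. A sketch assuming a CommonJS project (in an ESM project, use export default instead), with the shape checked via JSDoc since @evalkit/core ships its types:

// evalkit.config.js
/** @type {import('@evalkit/core').EvalKitConfig} */
module.exports = {
  openai: {
    apiKey: process.env.OPENAI_API_KEY,
  },
  reporting: {
    outputFormats: ['json'],
    outputDir: './eval-reports',
  },
};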

The config file supports:

OpenAI Settings

Same as the programmatic configuration above.

Reporting Settings

  • outputFormats: Array of report formats to generate
    • []: Console output only (default)
    • ['json']: JSON reports
    • ['html']: HTML reports
    • ['json', 'html']: Both formats
  • outputDir: Directory where reports will be saved (default: './eval-reports')
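For example, a minimal config file that keeps console-only output (this assumes both top-level sections are optional, which the defaults above suggest):

// evalkit.config.ts
import type { EvalKitConfig } from '@evalkit/core';

const config: EvalKitConfig = {
  reporting: {
    // An empty array means console output only, the default.
    outputFormats: [],
  },
};

export default config;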

Writing Your First Evaluation

The EvalKit evaluations framework provides a simple API for evaluating text with built-in metrics, using the OpenAI models configured above. Here’s a quick example to get you started:

import { evaluate, BiasDetectionMetric, RelevancyMetric, DynamicMetric } from '@evalkit/core';

// Simple evaluation with a single metric
const biasResult = await evaluate({
  output: "The company prefers to hire young and energetic candidates.",
}, [BiasDetectionMetric]);

// Advanced evaluation with multiple metrics
const advancedResult = await evaluate({
  input: "What are the benefits of renewable energy?",
  output: "Renewable energy sources reduce emissions and help fight climate change.",
  expectedOutput: "Renewable energy sources like solar and wind power reduce greenhouse gas emissions and help combat climate change.",
  criteria: [{ type: "Accuracy" }, { type: "Relevance" }]
}, [RelevancyMetric, DynamicMetric]);
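Top-level await only works in an ES module context; otherwise, wrap the calls in an async function. A minimal runner sketch (the file name and logging are illustrative, not part of EvalKit):

// run-evals.ts
import { evaluate, BiasDetectionMetric } from '@evalkit/core';

async function main() {
  const result = await evaluate({
    output: "The company prefers to hire young and energetic candidates.",
  }, [BiasDetectionMetric]);

  // The exact result shape is defined by the package; pretty-print it to inspect.
  console.log(JSON.stringify(result, null, 2));
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});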