openai/evals
A framework for testing and benchmarking large language models with custom or built-in evaluations.
