
LLM Eval & Testing

9 companies in this category

Total: 9
Median ARR: $3M
Top ARR: $25M
Free tier: 78%

Arize AI


Arize AI provides an end-to-end ML observability platform that has expanded to include LLM evaluation and monitoring capabilities.

ARR: $25M · Subscription · USA · Founded 2019

Vellum AI


Vellum provides a platform for prompt engineering, LLM deployment, and evaluation with built-in analytics and monitoring.

ARR: $8M · Free tier · Usage-based · USA · Founded 2022

Humanloop


Humanloop offers tools for fine-tuning, evaluating, and deploying large language models with human feedback.

ARR: $5M · Free tier · Usage-based · UK · Founded 2020

Langfuse


Langfuse is an open-source observability and evaluation platform for LLM applications, capturing traces, metrics, and user feedback.

ARR: $4M · Free tier · Usage-based · Germany · Founded 2023

Promptfoo


Promptfoo is an open-source tool and platform for testing and evaluating LLM prompts and models.

ARR: $3M · Free tier · Subscription · USA · Founded 2023

PromptLayer


PromptLayer acts as a wrapper around LLM API calls, providing logging, analytics, and prompt management for developers.

ARR: $3M · Free tier · Usage-based · USA · Founded 2022

Patronus AI


Patronus AI offers an automated LLM evaluation platform to detect flaws such as hallucinations, toxicity, and bias before deployment.

ARR: $2M · Subscription · USA · Founded 2023

Helicone (by Braintrust)


Helicone (an open-source project supported by Braintrust) provides an observability platform for LLMs, including logging, caching, and analytics.

ARR: $2M · Free tier · Usage-based · USA · Founded 2022

Giskard


Giskard is an open-source platform for ML model testing, with a growing focus on evaluating and debugging LLMs for security and robustness.

ARR: $1M · Free tier · Usage-based · France · Founded 2021