RealPerformance: A Dataset of Language Model Business Compliance Issues
David Berenstein
Jean-Marie John-Mathews
RealPerformance is a dataset of functional issues of language models, that mirrors failure patterns identified through rigorous testing in real LLM agents