General Analysis recently launched!

Launch YC: General Analysis: Finding Failure Modes for AI Models

"Acquiring software products and replacing human teams with AI agents."

TL;DR: General Analysis provides safety and performance reports for enterprise AI models, offering businesses clear insights into model vulnerabilities. Using a growing repository of automated red-teaming, jailbreaking, and interpretability techniques, they uncover and address critical failure modes.

Founded by Rez Havaei

Challenge

As AI systems become increasingly capable, their deployment in high-stakes environments poses significant risks—financial, ethical, and otherwise—where errors can lead to substantial consequences. They predict that a large percentage of the world’s cognitive tasks will soon be performed by AI systems across industries. However, this shift brings critical challenges:

  • Safety and performance efforts are not keeping pace with AI capabilities: Research and tools to evaluate AI systems have not kept up with the complexity and impact of modern models.
  • The field is fragmented: Approaches to AI safety and evaluation are scattered, lacking a unified framework.
  • Methods lack scalability and automation: Many current techniques are labor-intensive and fail to provide consistent, repeatable insights at scale.

Their approach

To address these challenges, General Analysis offers access to a unified set of tools and methodologies designed to systematically find model failure modes and enhance model robustness.

  1. Providing Comprehensive Safety and Performance Reports: They deliver detailed reports to their customers, identifying novel failure modes in their models and providing actionable methods to mitigate them.
  2. A Living Knowledge Base: Their repository collects and refines evaluation techniques while keeping pace with emerging exploits and vulnerabilities. This ensures their tools remain effective and relevant across diverse industries and evolving AI applications.

An example of their work: Eliciting legal hallucinations in GPT-4o

In their recent work, they show how GPT-4o is susceptible to hallucinating when asked about certain legal cases or concepts. The report, data and code are publicly available.

Image Credits: General Analysis

They train an attacker model that causes GPT-4o to hallucinate on more than 35% of prompts on a diverse set of legal questions.

Learn More

🌐 Visit generalanalysis.com to learn more.

📖 Read the full report here

🤝 They are looking to connect with: Startups creating LLMs or AI Agents in different sectors (Customer Support, Legal Tech, Medicine, foundation models) for design partnerships & AI Safety, Interpretability, and Evaluation Researchers.
📨 If you are interested in working with them or just want to chat please email the founders here.
👣 Follow General Analysis on LinkedIn & X.

Posted 
January 29, 2025
 in 
Launch
 category
← Back to all posts  

Join Our Newsletter and Get the Latest
Posts to Your Inbox

No spam ever. Read our Privacy Policy
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.