"Sepal AI builds benchmarks, evals, & training datasets for enterprises"
TL;DR: Sepal provides frontier data and tooling for advancing responsible AI development.
Founded by Kat Hu, Robert Lin & Fedor Paretsky
👪 The team:
Meet Kat, Robi, and Fedor!
Robi and Kat previously built the technical LLM training business for Turing: Kat on the go-to-market & operations side, Robi on the product & fulfillment side. Fedor is a long-time close friend; he was an early engineer at Vercel & Newfront, where he built out foundational infrastructure.
Sepal AI is on a mission to advance human knowledge and capabilities through the responsible development of artificial intelligence.
🧐 Responsibly advance human knowledge with AI? What does that mean?
They believe in a world where AI advances scientific research and drives economic growth.
To achieve that future, AI product & model builders need:
- Golden Datasets and Frontier Benchmarking: To iteratively measure model performance on specific use cases.
- Training Data: To improve model capabilities using fine-tuning and RLHF.
- Safety / Red-teaming: To measure and forecast the safety of LLMs before putting them out in the wild.
⚠️ Okay, well why does it matter?
Frontier data for AI development is vital for safe deployment & scaling. However, developing this data is difficult.
Most frontier data requires domain knowledge that can be hard to source and curate (e.g., finance, medicine, physics, biology). Publicly available benchmarks (e.g., MMLU, GPQA, MATH) are contaminated and too general to be useful to actual product & model builders.
🌱 How do they do this?
They’ve built Sepal AI - the data development platform that enables you to curate useful datasets.
The Platform: They bring data generation tooling, human experts, synthetic data augmentation, and rigorous quality control into one platform so you can manage the production of high-quality datasets.
Their Expert Network: They’ve built a network of 20k+ experts across STEM and professional services (think academic PhDs, business analysts, medical professionals, marketing and finance consultants) to support campaign design & data development.
Sample engagements they’ve run:
- 🧬 Cell and Molecular Biology Benchmark: An original benchmark to evaluate complex reasoning across models. Produced by a team of PhD biologists from top institutions in the US.
- 💼 Finance Q&A + SQL Eval: A Golden Dataset to test the ability of an AI agent to query a database and produce human-expert-level answers to complex finance questions.
- 📏 Uplift Trials & Human Baselining: End-to-end support for conducting secure, in-person evaluations of model performance.
- …. [insert your custom use case next?]
🙏 Asks:
- If you are building an AI application and need to measure or improve your model, or
- If you are a researcher at an AI lab building or evaluating models for new capabilities / risk areas, or
- If you’re passionate about the development of AI, AI safety, or evals in general… they’d love to hear from you.
Learn More
🌐 Visit www.sepalai.com to learn more.
🤝 Interested? Book a free consultation here, or reach out and say hi to the founders here.