"End-to-end AI development platform: voice simulations, prompt tuning, evals, observability"
TLDR: Are you testing your voice agents by hand? Hamming is launching Voice Simulations to automatically test your voice agents and flag quality issues in development and production.
Founded by Sumanyu Sharma and Marius Buleandra
Meet the team
Sumanyu previously helped Citizen (safety app; backed by Founders Fund, Sequoia, 8VC) grow its users by 4X and grew an AI-powered sales program to $100s of millions in revenue/year at Tesla.
Marius previously ran data infrastructure @ Anduril, drove user growth at Citizen with Sumanyu and was a founding engineer @ Spell (MLOps startup acquired by Reddit).
🌟 Click here to try their free Voice Simulations Demo 🌟
Problem: Making voice agents reliable feels like whack-a-mole
Here's the workflow most teams follow:
1. Call your voice agent by hand and find bugs. Slow and ad hoc.
2. Tweak your voice agent by adding new tools and changing the prompts or models to fix the bugs.
3. Call again to see if the changes worked.
4. Detect regressions when users complain of things breaking in production.
5. Repeat steps 1 to 4 until you get tired.
Calling your voice agent and finding bugs is the slowest and most painful part of the feedback loop. This is the part Hamming automates.
Hamming's take: Character AI for voice testing
They create hundreds of characters that simulate how real users interact with your voice agents. For every call, they measure whether the character accomplishes its task (e.g., ordering a vegan burger or canceling next week’s appointment).
Hamming's approach is 100x faster, cheaper, and more thorough than manual testing.
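As a rough illustration of the simulate-and-score loop described above, here is a minimal Python sketch. It is not Hamming's API: `Character`, `run_call`, `task_succeeded`, `success_rate`, and `toy_agent` are hypothetical names, the agent is modeled as a plain text-in, text-out function rather than a real voice pipeline, and task success is approximated by a simple phrase match.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Character:
    """A simulated caller (hypothetical structure, for illustration only)."""
    name: str
    goal: str               # e.g. "order a vegan burger"
    utterances: List[str]   # what the simulated caller says, in order


# Assumption: the agent under test maps one caller utterance to one reply.
Agent = Callable[[str], str]


def run_call(agent: Agent, character: Character) -> List[str]:
    """Play one scripted call and return the agent's replies."""
    return [agent(utterance) for utterance in character.utterances]


def task_succeeded(replies: List[str], success_phrase: str) -> bool:
    """Crude success check: did any reply contain the expected phrase?"""
    return any(success_phrase.lower() in reply.lower() for reply in replies)


def success_rate(agent: Agent, cases: List[Tuple[Character, str]]) -> float:
    """Fraction of characters whose task was accomplished."""
    if not cases:
        return 0.0
    passed = sum(
        task_succeeded(run_call(agent, character), phrase)
        for character, phrase in cases
    )
    return passed / len(cases)


def toy_agent(utterance: str) -> str:
    """Stand-in agent that only knows how to take burger orders."""
    if "vegan burger" in utterance.lower():
        return "One vegan burger, order confirmed."
    return "Sorry, I did not catch that."


cases = [
    (Character("hurried caller", "order a vegan burger",
               ["Hi, one vegan burger please."]), "order confirmed"),
    (Character("rescheduler", "cancel next week's appointment",
               ["Please cancel my appointment next week."]), "appointment canceled"),
]

rate = success_rate(toy_agent, cases)
print(f"task success rate: {rate:.0%}")
```

Re-running `success_rate` after each prompt or model change gives a single number to compare across versions, which is the kind of signal manual calling makes slow to collect.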
Flag errors & tag calls in production
You can log all call transcripts and traces within Hamming. They tag your production calls in real time and flag the cases your team needs to look at more closely. This helps engineering teams quickly prioritize what to fix.
Example tags: human detects that the bot is an AI, a follow-up call is needed, the user requested an urgent appointment, etc.
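To make the tagging idea concrete, here is a small Python sketch that applies the example tags to a transcript. It is an assumption-heavy stand-in: the tag names, the keyword rules in `TAG_RULES`, and the `needs_review` helper are all hypothetical, and a production tagger would more likely use a model-based classifier than keyword matching.

```python
from typing import Callable, Dict, FrozenSet, List

# Each rule takes a lowercased transcript and decides whether the tag applies.
Rule = Callable[[str], bool]

# Hypothetical keyword rules mirroring the example tags above.
TAG_RULES: Dict[str, Rule] = {
    "detected_ai": lambda t: "are you a robot" in t or "is this an ai" in t,
    "follow_up_needed": lambda t: "call me back" in t,
    "urgent_appointment": lambda t: "urgent" in t and "appointment" in t,
}

# Assumption: only some tags are serious enough to send to a human.
REVIEW_TAGS: FrozenSet[str] = frozenset({"detected_ai", "urgent_appointment"})


def tag_call(transcript: str) -> List[str]:
    """Return every tag whose rule matches the transcript."""
    text = transcript.lower()
    return [tag for tag, rule in TAG_RULES.items() if rule(text)]


def needs_review(tags: List[str]) -> bool:
    """Flag the call if it carries at least one review-worthy tag."""
    return any(tag in REVIEW_TAGS for tag in tags)


example = "Caller: Wait, are you a robot? I need an urgent appointment."
tags = tag_call(example)
print(tags, needs_review(tags))
```

Keeping the rules in one dictionary means a new tag is a one-line addition, and the review set can be tuned separately from the tagging logic.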
Test new changes quickly
Simulation-driven development
Let’s imagine you’re building an agent called ‘YC Founder’. They can spin up hundreds of VC agents who will try to distract you. You can then edit the prompts or models and re-run the simulation to check whether you made progress.
Want to see how you would handle a persistent investor? Try their ‘VC trying to distract founders’ free demo here.
Easily create new characters from call transcripts
When customers complain about a bad call, you can locate the call transcript and create a new character in one click. Make a change to your prompt, and then run the simulations to ensure you addressed the bad call.
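One way to picture turning a bad call into a reusable character is the Python sketch below. It assumes a plain-text transcript with `Agent:` and `User:` speaker labels, which may not match the real transcript format, and `Character` and `character_from_transcript` are hypothetical names rather than Hamming's actual feature.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Character:
    """A replayable caller built from a real call (hypothetical structure)."""
    name: str
    utterances: List[str]


def character_from_transcript(
    name: str, transcript: str, caller_label: str = "User"
) -> Character:
    """Keep only the caller's lines so they can be replayed against a new agent version."""
    prefix = f"{caller_label}:"
    utterances = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if line.startswith(prefix):
            # Drop the speaker label and surrounding whitespace.
            utterances.append(line[len(prefix):].strip())
    return Character(name=name, utterances=utterances)


# Assumed transcript format: one "Speaker: text" turn per line.
bad_call = """Agent: Thanks for calling. How can I help?
User: I need to cancel next week's appointment.
Agent: Sure, I booked you for next week.
User: No, I said cancel it."""

character = character_from_transcript("cancel-misheard", bad_call)
print(character.utterances)
```

The resulting character can be added to the simulation set, so the same caller lines are replayed after every prompt change and the bad call becomes a regression test.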
Summary
The team previously launched Prompt Optimizer and AI Experimentation Tools to automate prompt engineering and make RAG pipelines more robust. In this launch, they show how you can test your voice agents quickly.
The offer
Personalized characters + 100 free calls. Struggling to make your voice agents reliable? They’ll create personalized characters, then call and stress-test your system.