Exla Launches: Run Datacenter Models on Edge Devices
"Data center models. Now on the edge."
TLDR: The Exla SDK optimizes models for edge devices (e.g., NVIDIA Jetsons), cutting memory usage by up to 80% and delivering 3-20x faster inference. The team is focused on optimizing and deploying LLMs, VLMs, VLAs, and other CV models on the edge.
Here’s Viraat showcasing their SDK
Founded by Viraat Das & Pranav Nair
They met on the first day of college under questionable circumstances and have built several projects together since. Exla is the latest in that series!
Viraat graduated in 2.5 years and joined Amazon as a machine learning engineer, where he built personalized search and model-optimization infrastructure. In a previous life, he ran marathons around the world. Now he marathons at home, coding.
Pranav previously worked at Apple as an OS engineer, hacking the iOS/macOS kernel to improve the sleep/wake experience on over a billion devices. In his non-existent free time, he tends to his 5-year-old baby – an operating system he’s built from scratch.
‼️ The Problem
Frontier models are unlocking new applications on constrained edge devices – Vision-Language Models in manufacturing defect detection, Vision-Language-Action Models to control robots via natural language, and LLMs to power in-car assistants are a few examples.
But these models are now just shy of a trillion parameters, and with the emergence of inference-time scaling, they are more computationally demanding than ever. This restricts them to edge devices with beefy GPUs and sufficiently large VRAM – and even then, a Jetson Orin Nano Super is completely saturated attempting to run a 13B model, leaving little room for other tasks.
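For a sense of scale, here is a rough back-of-the-envelope sketch (the figures assume FP16 weights at 2 bytes per parameter and the Orin Nano Super's 8 GB of memory, and ignore activations, KV cache, and runtime overhead):

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold model weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

params_13b = 13e9
fp16 = weight_memory_gb(params_13b, 16)  # 26.0 GB -- more than 3x the 8 GB on an Orin Nano Super
int4 = weight_memory_gb(params_13b, 4)   # 6.5 GB -- fits, with headroom left for other tasks

print(f"FP16: {fp16:.1f} GB, INT4: {int4:.1f} GB")
```

Even before accounting for activations and the KV cache, the weights alone rule out FP16 inference on such a device, which is why aggressive quantization matters here.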
🏎️ The Solution: The Exla SDK
The team is building mixed-precision, low-bit quantization software that dramatically cuts the compute footprint of these models, yielding up to 80% less memory usage, 3-20x faster inference, and reduced energy consumption.
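To give a flavor of how low-bit quantization shrinks a model, here is a minimal sketch of symmetric per-tensor quantization in plain Python. This is purely illustrative – Exla's actual mixed-precision scheme is not described in this post:

```python
def quantize(weights, bits=4):
    """Map float weights onto signed integers in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = max(abs(x) for x in weights) / qmax or 1.0
    # Per-tensor max scaling guarantees |w/scale| <= qmax, so no clamping is needed.
    q = [round(x / scale) for x in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from integers and the shared scale."""
    return [v * scale for v in q]

w = [0.42, -1.5, 0.03, 0.9]
q, s = quantize(w, bits=4)
w_hat = dequantize(q, s)
# 4-bit integers take ~4x less memory than FP16 (and ~8x less than FP32),
# at the cost of a small per-weight reconstruction error bounded by scale/2.
```

Real deployments layer on refinements – per-channel or per-group scales, and mixed precision that keeps sensitive layers at higher bit widths – but the memory arithmetic above is where the savings come from.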
They are starting with the Exla SDK which applies their optimizations to a catalog of transformer-based and CV models, with growing support for your custom models. They are primarily targeting deployment on NVIDIA Jetsons, followed by CPU-based platforms like Raspberry Pis and other embedded platforms.

Their roadmap includes building custom silicon that takes advantage of the quirks of low-bit compute – which they expect to deliver another order of magnitude of compute savings. They are bringing frontier models everywhere.
Learn More
🌐 Visit exla.ai to learn more.
🤝 Reach out to the founders here if you’re facing issues optimizing your models on Jetsons, other edge devices, or on-prem deployments! They are happy to onboard you to their private beta.
✨ They are particularly looking to solve model optimization at companies working on robotics, manufacturing & industrial automation, and camera-based systems – your help would make a world of difference <3
👣 Follow Exla on LinkedIn & X.