Staff Engineer, API Platform
Sarvam AI
Software Engineering
Bengaluru, Karnataka, India
Location: Bengaluru
Employment Type: Full-time
Location Type: On-site
Department: Engineering
About Sarvam
Sarvam is building the bedrock of Sovereign AI for India: a full-stack sovereign AI platform spanning research, models, infrastructure, and applications, with a singular focus on making AI genuinely work for India. Sarvam is backed by Lightspeed, Peak XV, and Khosla Ventures. It works with leading enterprises and public institutions, and its partners include leading Indian brands such as Tata Capital, SBI Life, CRED, IDFC, and LIC.
About the Role
Sarvam's model APIs (ASR, TTS, Vision, LLM, Translate) are how developers and enterprises ship products on our foundation models. The platform handles tens of millions of API calls every day, over HTTP and WebSockets, with metering, billing, prepaid wallets, and rate-limiting all part of the same stack. Today it's FastAPI and Python; we're in the middle of rewriting it in Go.
We're hiring a Staff Engineer to own this platform end-to-end. The architecture. The reliability. The performance. The standards. This is an IC role.
What You’ll Do
The end-to-end design and evolution of the platform, from the moment a request hits the edge to the response that goes back out
The Python→Go rewrite: leading the architecture, setting the patterns, and bringing the platform across without compromising reliability
Audio pipeline engineering: TTS chunking around model context limits, sample-rate adjustments and format encoding, VAD-based silence detection for ASR
Vision pipelines: orchestrating OCR, layout detection, and VLM harnesses for structured data extraction
Streaming infrastructure: WebSocket connections, queue-based batch processing, backpressure, low-latency model invocation
The commercial layer: metering, billing, prepaid wallet management, and rate-limiting, engineered to the same bar as the model APIs themselves
Performance and reliability for a multi-tenant platform operating at sub-second latency SLOs
Observability (logging, metrics, tracing), so the team can debug production confidently and quickly
Integration testing infrastructure: the harness that lets us ship fast without breaking customer-facing APIs
Close partnership with the Inference and MLOps teams, mentoring them on production engineering so model serving is fast, reliable, and operable
Design-first, documentation-first engineering culture within the team: RFCs before code, decisions written down, no tribal knowledge
What We’re Looking For
7+ years building production backend systems at scale
Strong Go in production: you've designed, built, and operated Go services under real load
Comfortable in Python: you'll work in both languages through the rewrite, and FastAPI is the current foundation
A track record of designing and operating high-scale, low-latency, multi-tenant distributed systems
Hands-on experience with real-time / streaming systems: WebSockets, long-lived connections, backpressure
Strong PostgreSQL and Redis fundamentals: schema design, query performance, caching strategy
Comfort running services on Kubernetes in production
The judgment of a senior IC: when to build vs. buy, when to optimize vs. ship, when to abstract vs. inline, and how to bring others up with you
Bonus Points
Experience serving LLMs, ASR, TTS, or vision models in production
Background in audio processing (PyAV, FFmpeg, codec / sample-rate work, VAD)
Experience building metering, billing, or wallet / payments infrastructure
Time at an early- or growth-stage startup: comfort with ambiguity, speed, and ownership
Why Sarvam?
Sarvam is a fast-moving, high talent-density team building full-stack AI for India, working on problems that push the frontiers of AI with real population-scale impact.
Work alongside researchers, engineers, builders, and business leaders who move fast and hold each other to a very high bar
High ownership and high impact, from day one
Everything we do is AI-first, from the way we build and ship to the way we think about problems
Work on problems that could change how an entire country learns, works, and communicates
If you want to work on problems at the frontier of AI in India, Sarvam is the place to be.