Staff Engineer, API Platform

Sarvam AI

Software Engineering

Bengaluru, Karnataka, India

Posted on May 21, 2026

Location

Bengaluru

Employment Type

Full time

Location Type

On-site

Department

Engineering

About Sarvam

Sarvam is building the bedrock of Sovereign AI for India: a full-stack sovereign AI platform spanning research, models, infrastructure, and applications, with a singular focus on making AI genuinely work for India. Sarvam is backed by Lightspeed, Peak XV, and Khosla Ventures. It works with leading enterprises and public institutions, and partners with India’s leading brands, including Tata Capital, SBI Life, CRED, IDFC, and LIC.

About the Role

Sarvam's model APIs (ASR, TTS, Vision, LLM, Translate) are how developers and enterprises ship products on our foundation models. The platform handles tens of millions of API calls every day, over HTTP and WebSockets, with metering, billing, prepaid wallets, and rate-limiting all part of the same stack. Today it's FastAPI and Python; we're in the middle of rewriting it in Go.

We're hiring a Staff Engineer to own this platform end-to-end. The architecture. The reliability. The performance. The standards. This is an IC role.

What You’ll Do

  • The end-to-end design and evolution of the platform, from the moment a request hits the edge to the response that goes back out

  • The Python→Go rewrite: leading the architecture, setting the patterns, and bringing the platform across without compromising reliability

  • Audio pipeline engineering: TTS chunking around model context limits, sample-rate adjustments and format encoding, VAD-based silence detection for ASR

  • Vision pipelines: orchestrating OCR, layout detection, and VLM harnesses for structured data extraction

  • Streaming infrastructure: WebSocket connections, queue-based batch processing, backpressure, low-latency model invocation

  • The commercial layer: metering, billing, prepaid wallet management, and rate-limiting, engineered to the same bar as the model APIs themselves

  • Performance and reliability for a multi-tenant platform operating at sub-second latency SLOs

  • Observability (logging, metrics, tracing), so the team can debug production confidently and quickly

  • Integration testing infrastructure: the harness that lets us ship fast without breaking customer-facing APIs

  • Close partnership with the Inference and MLOps teams, mentoring them on production engineering so model serving is fast, reliable, and operable

  • Design-first, documentation-first engineering culture within the team: RFCs before code, decisions written down, no tribal knowledge

What We’re Looking For

  • 7+ years building production backend systems at scale

  • Strong Go in production: you've designed, built, and operated Go services under real load

  • Comfortable in Python: you'll work in both languages through the rewrite, and FastAPI is the current foundation

  • A track record of designing and operating high-scale, low-latency, multi-tenant distributed systems

  • Hands-on experience with real-time / streaming systems: WebSockets, long-lived connections, backpressure

  • Strong PostgreSQL and Redis fundamentals: schema design, query performance, caching strategy

  • Comfort running services on Kubernetes in production

  • The judgment of a senior IC: when to build vs. buy, when to optimize vs. ship, when to abstract vs. inline, and how to bring others up with you

Bonus Points

  • Experience serving LLMs, ASR, TTS, or vision models in production

  • Background in audio processing (pyav, FFmpeg, codec / sample-rate work, VAD)

  • Experience building metering, billing, or wallet / payments infrastructure

  • Time at an early- or growth-stage startup: comfort with ambiguity, speed, and ownership

Why Sarvam?

Sarvam is a fast-moving, high talent-density team building full-stack AI for India, working on problems that push the frontiers of AI with real population-scale impact.

  • Work alongside researchers, engineers, builders, and business leaders who move fast and hold each other to a very high bar

  • High ownership and high impact, from day one

  • Everything we do is AI-first, from the way we build and ship to the way we think about problems

  • You can work on problems that could change how an entire country learns, works, and communicates

If you want to work on problems at the frontier of AI in India, Sarvam is the place to be.