Backend Engineer - Studio Media Platform
Sarvam AI
Marketing & Communications, Software Engineering
Bengaluru, Karnataka, India
Location
Bengaluru
Employment Type
Full-time
Location Type
On-site
Department
Engineering
About Sarvam
Sarvam is building the bedrock of Sovereign AI for India. The company is developing India’s full-stack sovereign AI platform, building across research, models, infrastructure and applications with a singular focus on making AI genuinely work for India. Sarvam works with leading enterprises and public institutions and is backed by Lightspeed, Peak XV, and Khosla Ventures. Sarvam partners with India’s leading brands, including Tata Capital, SBI Life, CRED, IDFC, and LIC.
About the Role
We are hiring a Backend Engineer to work across Sarvam’s Studio media platform — spanning AI dubbing, live translation, and the shared service foundation that powers all Studio products (voice cloning, stem separation, lip sync, music generation, and more). You will build and maintain production services, ML pipeline libraries, and platform SDKs that together enable multilingual media processing at scale for enterprise customers and Sarvam Studio users.
The work cuts across multiple codebases: a core ML pipeline library (ASR, translation, TTS, audio processing), production services for dubbing and live translation, and a shared platform SDK that provides common capabilities to every Studio service.
What You’ll Do
Service & Infrastructure
Design and optimize production FastAPI services for dubbing and live translation — multi-stage task orchestration, rate-limited scheduling, and backpressure controls for concurrent workloads
Build and maintain distributed worker architectures with independent scaling per pipeline stage and automatic recovery of stuck or failed tasks
Own the data layer — async ORM models, schema migrations, and query optimization on PostgreSQL
Implement real-time features — WebSocket-based job tracking for dubbing and streaming audio pipelines for live translation
Manage Kubernetes deployments — Helm charts, secrets management, ingress configuration, and multi-role container images
ML Pipeline & Library
Extend and maintain the core dubbing library across all pipeline stages: audio extraction, VAD, speech recognition, translation, QC, TTS, and final video stitching
Integrate and optimize ML model serving — remote inference server clients and local model inference for audio analysis and vocal separation
Build and improve QC orchestration — automated scoring, tempo analysis, guided normalization, and pronunciation verification
Design async-first pipelines with efficient concurrency patterns for CPU-bound audio processing
Maintain and evolve LLM integration layers for translation, QC, and pre-processing across multiple provider backends
Platform SDK & Shared Services
Build and maintain the shared Studio service SDK — reusable FastAPI middleware and routers for authentication, billing, workspace isolation, and input validation
Design media storage abstractions — upload, signed URL generation, retention policies, and cloud blob storage integration
Implement cross-cutting concerns: rate limiting, metering, audit trails, and request history across all Studio services
Build observability foundations — OpenTelemetry instrumentation, structured logging, and metrics collection shared across services
Quality & Operations
Maintain high test coverage with strong CI gates, parallel test execution, and thorough mocking of external services
Instrument services with custom metrics and structured error tracking for production observability
Manage CI/CD pipelines — automated testing, linting, container builds, artifact publishing, and version management
What We’re Looking For
4–6 years of experience in backend engineering, with a focus on building and operating production services at scale
Strong proficiency in Python with hands-on experience building production FastAPI or similar async web services (non-negotiable)
Deep understanding of async programming — asyncio, concurrent execution patterns, and designing for high-throughput workloads
Experience with distributed task systems: task queues (Celery or similar), message brokers, and designing fault-tolerant job orchestration
Hands-on experience with PostgreSQL and an async ORM (SQLAlchemy preferred), including query optimization, schema design, and migrations
Familiarity with audio/media processing: FFmpeg, common audio formats, and processing libraries like soundfile or librosa
Experience integrating ML models into production — API-based (REST/gRPC) or local inference (PyTorch, ONNX Runtime)
Experience building reusable libraries or SDKs — designing clean APIs, managing backward compatibility, and publishing packages for internal consumers
Proficiency with Docker, Kubernetes, Helm charts, and at least one major cloud platform (Azure/GCP/AWS)
Strong testing discipline — writing thorough tests, mocking external services, and maintaining CI/CD pipelines
Bonus Points
Prior experience with speech/NLP systems: ASR, TTS, or machine translation
Experience with ML model serving infrastructure (Triton, TorchServe, or similar)
Familiarity with LLM orchestration for structured output and multi-step agent workflows
Experience with real-time streaming — WebSocket, WebRTC, or Server-Sent Events in production
Familiarity with Indic languages and the nuances of multilingual content (code-mixing, transliteration, regional dialects)
Experience designing platform middleware — auth, billing, rate limiting, or multi-tenant isolation
Experience with observability tooling — Prometheus, Grafana, OpenTelemetry, or similar stacks
Familiarity with video processing pipelines and media localization workflows
Contributions to open-source backend, audio, or NLP projects
Why Sarvam?
Sarvam is a fast-moving, high talent-density team building full-stack AI for India, working on problems that push the frontiers of AI with real population-scale impact.
Work alongside researchers, engineers, builders, and business leaders who move fast and hold each other to a very high bar
High ownership and high impact, from day one
Everything we do is AI-first, from the way we build and ship to the way we think about problems
You can work on problems that could change how an entire country learns, works, and communicates
If you want to work on problems at the frontier of AI in India, Sarvam is the place to be.