ML Researcher, Foundational Models
Sarvam AI
Software Engineering, Data Science
Bengaluru, Karnataka, India
Location: Bengaluru
Employment Type: Full time
Location Type: On-site
Department: Models
About Sarvam
Sarvam is building the bedrock of Sovereign AI for India. The company is developing India’s full-stack sovereign AI platform, spanning research, models, infrastructure, and applications, with a singular focus on making AI genuinely work for India. Sarvam is backed by Lightspeed, Peak XV, and Khosla Ventures, and works with leading enterprises and public institutions, including Tata Capital, SBI Life, CRED, IDFC, and LIC.
About the Role
You'll work on the open-ended questions that decide what our next models look like. This is not a fine-tuning role. It's not an "apply known recipes" role. We're looking for a researcher who can take a vague hunch about architecture, optimization, data composition, or training dynamics, design the right ablations, run them at meaningful scale, and turn the result into a decision that ships in a production training run.
You will have direct access to large compute and a tight feedback loop with the engineers building our training stack and data pipelines. You will be expected to challenge the team when you disagree and to defend your reasoning with evidence.
What You’ll Do
Drive open-ended research on architecture, optimization, scaling behaviour, training stability, and post-training recipes for our next generation of foundational models.
Design and execute ablations at scales that actually inform large-run decisions — including running pre-training experiments end-to-end yourself.
Translate research findings into concrete proposals for the next training run, and own those proposals through to production.
Work shoulder-to-shoulder with the infrastructure and data teams; many of the most important research questions live at that boundary.
Read broadly, write internally, and publish externally when the work merits it.
What We’re Looking For
PhD in Machine Learning, Computer Science, or a closely related field (or in the final stages of completion).
3+ years of research experience post-PhD (or equivalent depth). Exceptional early-career candidates with a strong research record will be considered.
First-author publications at top-tier ML venues such as NeurIPS, ICML, ICLR, ACL, EMNLP, or COLM.
Hands-on experience pre-training transformer-based language models from scratch, ideally at 7B+ parameters. You should be able to describe a training run you owned end-to-end, including what went wrong and how you debugged it.
Meaningful contributions to the open-source LLM ecosystem — research code, model releases, datasets, or substantive contributions to widely-used projects.
Fluency in PyTorch and comfort with distributed training. You should be able to read a training loop and immediately see where it might be slow, unstable, or wrong.
Strong intuition for experimental design — knowing what to measure, what to ablate, and what scale a result needs to hold at before it can be trusted.
Bonus Points
Work on novel architectures (mixture-of-experts, state-space models, hybrid architectures) or non-trivial modifications to standard transformers.
Experience with multilingual or multimodal pre-training.
Research contributions in post-training (RLHF, RLVR, distillation, reasoning models).
A track record of taking a research idea from prototype to a capability that shipped.
Why This Role?
You will be one of a small number of people in the world who get to make architectural and training-recipe decisions on a frontier-scale model run, with the autonomy and compute to back it up.
Why Sarvam?
Sarvam is a fast-moving, high talent-density team building full-stack AI for India, working on problems that push the frontiers of AI with real population-scale impact.
Work alongside researchers, engineers, builders, and business leaders who move fast and hold each other to a very high bar.
High ownership and high impact, from day one.
Everything we do is AI-first, from the way we build and ship to the way we think about problems.
Work on problems that could change how an entire country learns, works, and communicates.
If you want to work on problems at the frontier of AI in India, Sarvam is the place to be.