ML Infrastructure Engineer
Spectral Labs
Software Engineering, Other Engineering, Data Science
USD 350k-600k / year + Equity
Posted on Jan 24, 2026
Who we are:
Spectral Labs is a spatial intelligence company building reasoning models for engineering physical systems. Our model SGS-1 is state-of-the-art for parametric geometry, and we are currently building the next generation of models to revolutionize how systems are designed and manufactured from the ground up. Our team is small and talent-dense. We have founded quantitative trading firms and built generative design at Autodesk. Our founding members have worked on the cutting edge of applied AI at Meta, Autodesk Research, and Samsung Research.
Role: In person in SF
Comp: 350-600k+ TC
What we're looking for
Spectral is seeking a team member who will develop ML pipelines to fine-tune and run RL on our CAD foundation models. This person will own the infrastructure for making our models better.
Responsibilities
Optimize distributed training & RL across our cluster of hundreds of H100 GPUs (FSDP, DeepSpeed, or custom parallelism strategies)
Identify and correct bottlenecks in a complex, bespoke multi-modal training + RL setup
Own our training + RL infrastructure
Work closely with researchers to unblock training experiments and reduce iteration time
Qualifications
Experience optimizing multi-node training at scale
Deep understanding of profiler traces: can tell whether a system is I/O-, network-, CPU-, or GPU-bound
Comfort with NCCL internals
Experience with high-performance networking stacks (e.g., GCP TCPXO) is a plus
Experience with a range of model types and their unique training challenges (diffusion, AR models, etc.)
Benefits
Compensation competitive with top opportunities, including meaningful ownership
Health insurance with 100% of premiums covered
Free lunch, dinner, and snacks