People Matter

Principal Site Reliability Engineer



Administration, Software Engineering
New York, NY, USA
Posted on Monday, February 26, 2024

Our Mission

Healthcare should work for patients, but it doesn’t. In their time of need, they call down outdated insurance directories. Then wait on hold. Then wait weeks for the privilege of a visit. Then wait in a room solely designed for waiting. Then wait for a surprise bill. In any other consumer industry, the companies delivering such a poor customer experience would not survive. But in healthcare, patients lack market power. Which means they are expected to accept the unacceptable.

Zocdoc’s mission is to give power to the patient. To do that, we’ve built the leading healthcare marketplace that makes it easy to find and book in-person or virtual care in all 50 states, across 200+ specialties and 12k+ insurance plans. By giving patients the ability to see and choose, we give them power. In doing so, we can make healthcare work like every other consumer sector, where businesses compete for customers, not the other way around. In time, this will drive quality up and prices down.

We’re 15 years old and the leader in our space, but we are still just getting started. If you like solving important, complex problems alongside deeply thoughtful, driven, and collaborative teammates, read on.

Your Impact on our Mission:

Zocdoc is looking for a Principal Site Reliability Engineer to help drive the reliability, resiliency, observability, availability, and scalability of our systems and services. You’ll be challenged to drive continuous improvements to uptime and performance for our patients and providers in a constantly evolving environment. You’ll work with and provide subject matter expertise for our AWS cloud-based environments, distributed systems, monolith, and microservices. We’re looking for someone who loves challenging the status quo and strives to make everything they touch easier, faster, and more robust.

You’ll enjoy this role if you are…

  • Passionate about ensuring complex systems never skip a beat
  • Pragmatic in your decision making day-to-day
  • Motivated to learn new technologies and design patterns, and to work in the cloud
  • Comfortable with failures and outages, and a believer in blameless post-mortems
  • Excited to work in a highly collaborative environment with diverse individuals
  • Autonomous, individually accountable, and comfortable working in a remote environment
  • A believer that diverse and inclusive teams and cultures are non-negotiable

Your day to day is…

  • Analyzing and decomposing complex distributed system challenges to drive sound design decisions targeted toward reliability, availability, scalability, and resiliency
  • Supporting our large product engineering org with their scaling, performance, and uptime needs as well as helping diagnose and debug production-related issues
  • Monitoring and maintaining complex cloud-based infrastructure, systems, and services, and ensuring their uptime in order to enable millions of patients to get the care they need
  • Automating and codifying our tooling, processes, and infrastructure to speed up development and make them repeatable and error-proof
  • Analyzing and performance-tuning systems, code, and networking for scaling and optimal operation
  • Mentoring your peers and colleagues by giving thoughtful feedback, with the notion that helping others means learning and growing yourself

You’ll be successful in this role if you have…

  • A Bachelor’s degree in Computer Science, Computer Engineering, or equivalent engineering experience
  • 8+ years of progressive engineering experience in Site Reliability or adjacent disciplines (DevOps, Platform, Backend Engineering, etc.)
  • 5+ years of experience in deploying, managing, and supporting modern cloud-based environments and infrastructure like AWS/Azure/GCP, Docker, Kubernetes, IaC, etc.
  • 4+ years of production on-call experience in a 24/7 cloud-based environment
  • Exceptional troubleshooting, debugging, and diagnostic skills for cloud and web-based technologies using industry standard observability tooling and frameworks
  • Experience with edge technologies such as load balancers, reverse proxies, web application firewalls, routing, etc.
  • Deep understanding of web applications and ability to troubleshoot HTTP/HTTPS, TLS, DNS, TCP/IP, and similar protocols


Benefits

  • Flexible, hybrid work environment
  • Unlimited PTO
  • 100% paid employee health benefit options
  • Employer-funded 401(k) match
  • L&D offerings + a free LinkedIn Learning account
  • Corporate wellness programs with Headspace and Peloton
  • Sabbatical leave (for employees with 5+ years of service)
  • Competitive parental leave
  • Cell phone reimbursement
  • In-office perks including:
    • Catered lunch every day along with snacks
    • Commuter Benefits
    • Convenient Soho location

Zocdoc is committed to fair and equitable compensation practices. Salary ranges are determined through alignment with market data. Base salary offered is determined by a number of factors including the candidate’s experience, qualifications, and skills. Certain positions are also eligible for variable pay and/or equity; your recruiter will discuss the full compensation package details.
NYC Base Salary Range
$230,500 - $326,000 USD

About us
Zocdoc is the country’s leading digital health marketplace that helps patients easily find and book the care they need. Each month, millions of patients use our free service to find nearby, in-network providers, compare choices based on verified patient reviews, and instantly book in-person or video visits online. Providers participate in Zocdoc’s Marketplace to reach new patients to grow their practice, fill their last-minute openings, and deliver a better healthcare experience. Founded in 2007 with a mission to give power to the patient, our work each day in pursuit of that mission is guided by our six core values. Zocdoc is a private company backed by some of the world’s leading investors, and we believe we’re still only scratching the surface of what we plan to accomplish.

Zocdoc is a mission-driven organization dedicated to building teams as diverse as the patients and providers we aim to serve. In the spirit of one of our core values (Together, Not Alone), we are a company that prides itself on being highly collaborative, and we believe that diverse perspectives, experiences, and contributors make our community and our platform better. We’re an equal opportunity employer committed to providing employees with a work environment free of discrimination and harassment. Applicants are considered for employment regardless of race, color, ethnicity, ancestry, religion, national origin, gender, sex, gender identity, gender expression, sexual orientation, age, citizenship, marital or parental status, disability, veteran status, or any other class protected by applicable laws.
