People Matter

SRE Engineer

RingCentral

Other Engineering
Valencia, Spain
Posted on Sep 15, 2025

Position Overview

As a Site Reliability Engineer for RingCentral Events, you're not just an infrastructure owner - you're a crucial part of our mission to deliver flawless, high-scale experiences for global audiences. Your role is central to keeping our platform reliable and performant. You will be a key contributor to our software delivery flow, ensuring that changes move from development to production with speed, safety, and consistency. You will also proactively close observability gaps and build self-healing infrastructure so that our systems perform under pressure.

Responsibilities:

  • Manage cloud infrastructure on AWS and EKS, leveraging IaC and GitOps to ensure scalability

  • Participate in service capacity planning, software performance analysis, and system tuning

  • Design, advise on, re-platform, and refactor the observability of our current cloud infrastructure

  • Participate in release management, working closely with engineering teams to bring GitOps principles to our release process and manage CI/CD pipelines using GitLab CI

  • Take part in 24/7 on-call responsibilities (~2 days/month based on rotation schedule) to ensure continuous availability and quick response to issues in production

  • Conduct blameless post-mortems to learn from incidents and prevent future ones

  • Develop and test disaster recovery plans and runbooks to ensure business continuity

  • Implement security best practices and controls within the infrastructure to meet compliance standards and prepare for audits

Requirements:

  • Familiarity with cloud-native services and architectures, and experience with cloud providers (our infrastructure is built on AWS)

  • Experience in running mission-critical services at scale without disruption

  • Hands-on experience with Kubernetes and infrastructure as code (IaC) using Terraform, focusing on scalability and efficient infrastructure management

  • Proficiency in designing and maintaining CI/CD pipelines, with a preference for GitLab CI

  • Experience with monitoring, APM, logging, and analytics tools

  • Strong problem-solving skills with the ability to analyze and debug complex distributed systems, tracing requests and data flows from the kernel to the web to identify root causes

  • Ability to spot, address, and optimize performance bottlenecks

  • Proactive approach, favoring iterative action over waiting for things to happen or to be perfect

  • Familiarity with incident, problem, and change management processes and best practices

Nice to have:

  • A reliability-oriented mindset with a focus on designing and building resilient architectures

  • Previous SRE experience or knowledge, giving you a heightened awareness of what data to collect, how to display it, and how users can benefit from it

  • Knowledge of scripting languages such as Python or Go

  • Familiarity with GitOps principles and tools like ArgoCD

  • Knowledge of caching mechanisms, such as Redis

  • Experience with message queues and streaming platforms such as Amazon MSK (Kafka), SQS, or RabbitMQ

  • Familiarity with database management systems such as Amazon Aurora and PostgreSQL

We offer:

  • A well-coordinated, professional team

  • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and strong opportunities for self-realization and for professional and career growth

  • Additional Health and Life Insurance Package

  • Employee Assistance Program

  • 25 vacation days

  • ReBenefit Platform Account

This role requires on-site presence at our office 4 days a week to support effective collaboration and teamwork.