Arista Networks pioneered software-driven, cognitive cloud networking for large-scale
datacenter and campus environments. Arista's award-winning platforms, ranging in
Ethernet speeds from 10 to 400 gigabits per second, redefine scalability, agility and
resilience. Arista has shipped more than 20 million cloud networking ports worldwide
with CloudVision and EOS, an advanced network operating system. Committed to open
standards, Arista is a founding member of the 25/50G consortium. Arista Networks
products are available worldwide directly and through partners.

Site Reliability Engineers at Arista are critical team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden, and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

The SRE should have an “automate everything” mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability.

The SRE constantly evaluates products and services before and after production releases to prevent, identify, and fix problems that impact service availability during deployment, configuration, release, monitoring, recovery, and scaling.

We are hiring for an initial one-year contract, with conversion to a full-time position based on performance.

Job Description

Responsibilities:

Ensure the scalability, performance, and resilience of our suite of products
Work with the development and product team to establish the right monitoring and alerting strategy
Develop build, test, and deployment automation that seamlessly targets multiple cloud regions
Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks
Optimize the telemetry platform to identify customer-impacting events while providing relevant data to drive debugging
Partner with the engineering team to optimize the performance of services for cloud architecture
Debug live-site events and conduct follow-up post-mortems and root cause analysis (RCA)

Qualifications

B.E/B.Tech in Computer Science or equivalent
7+ years of relevant experience
Experience with scripting languages such as Bash and Python
Exposure to, and operational knowledge of, managing applications in AWS/GCP
Experienced in automating software build, deployment, and server configuration management using tools such as Puppet, Chef, and Jenkins
Hands-on experience with Linux/Unix Administration
Good understanding of containerization concepts and platforms: Docker, ECS, EKS, Kubernetes
Experience with build tools such as Jenkins
Working experience with databases such as MongoDB (NoSQL) and PostgreSQL (relational)
Understanding of basic networking concepts

Additional Information

All your information will be kept confidential according to EEO guidelines.
