Site Reliability Engineer (SRE/DevOps) - Engineering Productivity
Arista Networks
This job is no longer accepting applications
Company Description
Arista Networks was founded to pioneer and deliver software-driven cloud networking solutions for large data center storage and computing environments. Arista’s award-winning platforms, ranging in Ethernet speeds from 10 to 400 gigabits per second, redefine scalability, agility and resilience. Arista has shipped more than 20 million cloud networking ports worldwide with CloudVision and EOS, an advanced network operating system. Committed to open standards, Arista is a founding member of the 25/50GbE consortium. Arista Networks products are available worldwide directly and through partners.
Additional information and resources can be found at:
www.arista.com
www.twitter.com/aristanetworks
www.facebook.com/AristaNW
www.youtube.com/user/AristaNetworks
Job Description
Working in Engineering Productivity (EngProd), you will collaborate with other engineers to design, build, scale, and operate the systems that the rest of Arista’s development teams use. The EngProd team uses industry-standard systems such as Ansible, Jenkins, Kubernetes, Grafana, Spinnaker, MySQL, Elasticsearch, Google Cloud, and Varnish, as well as internal systems that we’ve built from the ground up to automate CI/CD, testing, analysis, and visualization.
Responsibilities:
Keep production status green at all times
Proactively monitor, respond to, and enhance alerts
Build automated responses to the most common alerts or work with the rest of the EngProd team to build them
Create and maintain incident response runbooks in collaboration with the service development teams
Debug and resolve issues impacting developer user experience and infrastructure stability
Develop patterns to support system reliability and socialize them within the EngProd team
Review and contribute to the specifications and implementations written by other team members
Work with Arista’s software engineers to identify bottlenecks and limitations in our workflows, tooling, and infrastructure, and provide fixes for those problems
Provide support for our tools and infrastructure to Arista’s development team
Qualifications
At least a BS in Computer Science or Engineering plus 5 years’ experience, an MS in Computer Science or Engineering plus 4 years’ experience, a Ph.D. in Computer Science, or equivalent work experience.
Knowledge of one or more of Go, Python, JavaScript, or shell scripting.
Knowledge of Linux (or UNIX).
Experience operating software systems at scale.
Strong understanding of the fundamentals of storage and networking.
Comfortable with Ansible and GitOps.
Applied understanding of software engineering principles.
Strong problem solving and software troubleshooting skills.
Ability to design a solution and implement features independently. Ability to work in small teams.
Additional Information
All your information will be kept confidential according to EEO guidelines.