Site Reliability Engineers (SRE)
Ref No.: 22-00029
Location: Sunnyvale, California

The Productivity Engineering SRE Team is looking for a service engineer with experience working with large-scale systems and an operational mindset to help scale our internal service offerings. This is an opportunity to make every engineer at LinkedIn more productive and, through them, to have an impact on external customers. In return, you will work with a world-class team supporting multiple enterprise services.


Site Reliability Engineers (SRE) at LinkedIn fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will also believe that automation is a key component of operating large-scale systems.


Responsibilities:
- Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our services.
- Gain deep knowledge of both our complex internally developed applications and enterprise-class services.
- Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
- Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale Linux and Windows environment.
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly changing environment.


Basic Qualifications:
- BA/BS degree in Computer Science or a related technical discipline, or related practical experience
- 2+ years of UNIX/Linux experience
- 2+ years of programming experience using languages such as Python, Go, Java, or similar
- 2+ years of experience developing large-scale projects


Preferred Qualifications:
- 4+ years in a UNIX-based large-scale web operations role.
- Experience with web-based Java/J2EE architectures and JVM configuration.
- Python experience, specifically for systems automation.
- Previous experience working with geographically distributed coworkers.
- Strong interpersonal communication skills (including listening, speaking, and writing) and the ability to work well in a diverse, team-focused environment with other SREs, Engineers, Product Managers, etc.
- Knowledge of most of these: data structures, relational and non-relational databases, networking, Linux internals, filesystems, web architecture, and related topics.
- Experience developing, deploying, and managing Azure PaaS component-based services.
- Knowledge of InfoSec best practices and their application to service design.