Systems Reliability Engineer
Ref No.: 18-03056
Location: New York, New York
Position Type: Full Time
Pay Rate: $140,000.00/Year
Full-time role in SoHo, NYC

Pays up to $140k base + bonus + 401(k) match + equity
Excellent benefits and wellness plan

Position Overview
Our ad tech client is looking for an experienced Systems Reliability Engineer (SRE) to join a team of SREs dedicated to improving the reliability of our end-to-end platform. We work on some of the largest distributed systems, and our core infrastructure receives hundreds of millions of requests per day. Our systems serve billions of requests per day and process terabytes of log and interaction data daily. In this role you will dive deep into operational issues from the programming, systems, automation, and process perspectives. You will also need to understand the challenges of integrating disparate infrastructures into a new facility with new processes and procedures.

Responsibilities
  • Perform deep dives into both systemic and latent reliability issues; partner with software and systems engineers across the organization to produce and roll out fixes
  • Troubleshoot issues across the entire stack: hardware, software, application and network
  • Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization
  • Mentor SREs across the organization on best practices for everything from monitoring to troubleshooting complex code issues
  • Identify and drive opportunities to improve automation for the company; scope and create automation for deployment, management and visibility of our services
  • Participate in the design of systems running on both physical and virtualized platforms
  • Represent the SRE organization in design reviews and operational readiness exercises for new and existing services

Requirements
  • Solid understanding of systems and application design, including the operational trade-offs of various designs
  • Practical knowledge of various aspects of service design, including messaging protocols and their behavior, caching strategies, and software design practices
  • Practical, solid knowledge of shell scripting and at least one higher-level language
  • Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures
  • Expert-level understanding of Linux and Windows servers
  • Experience supporting container-based infrastructure
  • Comfortable configuring storage and LAN/WAN equipment
  • Minimum of 3 years managing services in an internet-scale *nix environment
  • Ability to prioritize tasks and work independently
  • Must be adaptable and able to focus on the simplest, most efficient, and most reliable solutions
  • Track record of practical problem solving, with excellent written communication, interpersonal, and documentation skills
  • Ability to lead technical teams through design and implementation across an organization
  • B.S./B.A. in computer science or a similar field