Site Reliability Engineer
Ref No.: 18-01671
Location: Orlando, Florida

Site Reliability Engineering (SRE) is an engineering discipline that combines software engineering and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. An SRE within the Engineering Excellence team will focus on expanding our tooling and automation and improving the availability of our systems.

• Build tools to quickly triage issues and Client failures across hardware, software, applications and network
• Analyze service trends in depth and implement adjustments to mitigate risk and prevent issue recurrence
• Maintain production systems by measuring and monitoring availability, latency and overall system health
• Provide guidance to software engineers related to design patterns that are resistant to failure
• Support 24x7 on-call response to critical operational issues

Basic Qualifications
• Strong technical knowledge of digital environment full stack including Mobile, Web, APIs, Messaging, Databases, Networks and their interactions
• Knowledge and understanding of SDLC principles and key controls
• Experience working with and contributing to open source code or frameworks using Git version control
• Strong knowledge of AWS Cloud solutions and product offerings
• Experience with container technologies (e.g. Docker, Kubernetes)
• Strong understanding of monitoring methodologies and proactive monitoring using APM solutions (e.g. AppDynamics, New Relic) or other monitoring and instrumentation technologies
• Knowledge and understanding of technical architecture, application systems design and integration in a large heterogeneous enterprise environment, with hands-on experience in SOA, Angular/Node, Java/J2EE, and Oracle or MySQL/MariaDB programming methodologies
• Experience working in an Agile environment (e.g. Scrum, Kanban)

Preferred Qualifications
• 3+ years programming in one or more of: Java, Node, Python, Perl or C
• 2+ years UNIX systems knowledge and/or systems administration background
• Interest in designing, analyzing and troubleshooting large-scale distributed systems
• Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
• Experience debugging and optimizing code and automating routine tasks