Site Reliability Engineer
Ref No.: 17-04279
Location: Portland, Maine
Title: Site Reliability Engineer
Location: Westbrook, Maine
Duration: Contract through 12/31/2017 with potential to convert to permanent

The Site Reliability Engineer is responsible for driving the reliability of applications and infrastructure to avoid or quickly resolve service disruptions. The role combines IT operational work, development, and automation. This individual will substitute software for human labor in system recoveries and will help ensure availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning for systems. The role emphasizes partnering with product owners to define quantitative SLOs (service level objectives) that establish a "baseline" for deciding when to increase velocity and take risks and when to slow or stop development to restore stability.
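As an illustration of how a quantitative SLO can act as that baseline, here is a minimal Python sketch of an error-budget check. The function names, the 99.9% target, and the freeze threshold are assumptions chosen for the example, not part of this role's actual tooling.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    slo_target is the agreed success ratio, e.g. 0.999 for "99.9% of requests succeed".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        # No traffic in the window, so nothing has been spent yet.
        return 1.0
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


def release_decision(remaining: float, freeze_threshold: float = 0.0) -> str:
    """Hypothetical policy: keep shipping while budget remains, otherwise stabilize."""
    return "ship" if remaining > freeze_threshold else "freeze"


if __name__ == "__main__":
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"budget remaining: {remaining:.2%}, decision: {release_decision(remaining)}")
```

Under these assumptions, a service that has used only part of its budget keeps shipping features, while one that has overspent it pauses feature work until stability is restored.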

Key technologies:
  • AppDynamics
  • Splunk
  • Java
  • Oracle

Responsibilities:
  1. Manage unplanned downtime – system in motion (production)
  • Performs advanced troubleshooting and triage to insulate build teams
  • Partners with product owners to define service level objectives and implements monitoring tools, alerting, and dashboards
  • Tunes alerting and logging to reduce false positives and false negatives
  • Collaborates with the Infrastructure team on service capacity planning, monitoring, and demand planning
  • Analyzes system health, errors, and run time statistics to provide input into development roadmap
  • Spends 50% of time on system configuration development and defect resolution to improve overall stability, performance, and scalability
  2. Manage unplanned downtime – system at rest (non-production)
  • Creates dashboards, metrics, and code analysis tooling for early detection and prevention of defects
  • Performs and directs peer reviews to ensure compliance with best practices and adoption of technical roadmap
  • Automates, creates, reviews, and executes deployment plans

Required Skills
  • Proficient in the following technology areas:
    • Java
    • Oracle OSB, WLI, SOA, BPEL, ODSI
    • Unix
    • Splunk for operational intelligence, log management, and application monitoring
    • Knowledge of scripting languages and the skills to build scripts and automation, including VBScript, Windows PowerShell, Perl, Windows Management Instrumentation, Windows Remote Management, and the Microsoft System Center suite of tools
  • Technical knowledge of operations hardware and applications.
  • Excellent communication skills and ability to manage vendor partners.
  • Strong reasoning, troubleshooting, problem solving and analytical skills.