System Reliability Engineer (SRE)
Ref No.: 18-05386
Location: Roseland, New Jersey
NTT DATA has an immediate need for a System Reliability Engineer to support our client in Roseland, NJ.

To apply for this position, please follow the link below or send your resume directly to shawn.mills@nttdata.com. For more information, please call Shawn Mills at 732-362-2596.


The System Reliability Engineer (SRE) is responsible for the availability, performance, and capacity of large-scale distributed systems. This role combines software and systems engineering to solve operational challenges in more efficient, reliable, and automated ways.

Principal Responsibilities:
  • Design and develop APIs, modules, frameworks, and systems that support scale through automation.
  • Design and develop system and software architectures to promote efficiency in a large distributed ecosystem.
  • Participate in incident responses, then design/develop remedial solutions.
  • Proactively engage IT partners to understand their needs, collaborate, and present alternative solutions that improve product reliability.
Minimum Qualifications:
  • 5+ years of software engineering experience working with large distributed systems.
  • Able to work independently on complex analysis, design, and implementation of large-scale distributed solutions.
  • Able to define enhancement specifications through collaboration sessions and architecture context diagrams.
  • Expertise in software engineering, including but not limited to object-oriented programming, design patterns, and API development in multiple languages, including Java, Python, and shell.
  • Strong understanding of SQL and table design.
  • Experience with continuous integration, continuous delivery, configuration management, and automated testing in a virtualized/containerized environment.
  • Strong understanding of infrastructure (virtualization, operating systems, load balancers, web stacks, databases, storage solutions, and networking).
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
Preferred Qualifications:
  • Fundamental knowledge of AWS and/or other Cloud platforms.
  • Experience designing, implementing, and supporting Docker and Kubernetes platforms.
  • Experience troubleshooting and tuning Linux, Windows, and JVMs.
  • Experience with writing SQL queries and stored procedures.
  • Experience with writing Splunk queries and dashboards.
  • Experience designing a continuous integration and delivery model with tools such as Jenkins, so that new content is delivered through fully automated pipelines that run integration and functional tests.
  • Strong understanding of application and system monitoring using black-box and white-box techniques.