Lead Hadoop Developer
Ref No.: 18-59495
Location: New York, NY
Start Date: 08/16/2018
Job description:
  • Total of 10+ years of experience in Business Intelligence (BI) and Data Warehousing (DW), including 4-6 years of experience in Big Data implementations
  • Understand business requirements and convert them into solution designs
  • Architect, design, and develop the Big Data / data lake platform.
  • Understand the solution's functional and non-functional requirements, and mentor the team on technology choices and decisions.
  • Hands-on experience working with Hadoop distribution platforms such as Hortonworks, Cloudera, and MapR.
  • Strong hands-on experience working with Hive and Spark (Java / Scala / Python); see the illustrative sketch after this list.
  • Experience in designing and building Hadoop data lakes
  • Produce a detailed functional design document to match customer requirements.
  • Responsible for preparing, reviewing, and owning technical design documentation for Big Data applications according to system standards.
  • Conduct peer reviews to ensure consistency, completeness, and accuracy of the delivery.
  • Detect, analyze, and remediate performance problems.
  • Evaluate and recommend software and hardware solutions to meet user needs.
  • Responsible for project support, mentoring, and training during the transition to the support team.
  • Share best practices and remain consultative to clients throughout the duration of the project.
  • Take end-to-end responsibility for the Hadoop life cycle in the organization
  • Be the bridge between data scientists, engineers, and the needs of the organization.
  • Perform in-depth requirements analysis and take sole responsibility for choosing the work platform.
  • Full knowledge of Hadoop architecture and HDFS is a must
  • Working knowledge of MapReduce, HBase, Pig, MongoDB, Cassandra, Impala, Oozie, Mahout, Flume, ZooKeeper, Sqoop, and Hive
  • In addition to the above technologies, understanding of major programming/scripting languages such as Java, Linux shell scripting, PHP, Ruby, Python, and/or R
  • Experience designing solutions for multiple large data warehouses, with a good understanding of cluster and parallel architectures as well as high-scale or distributed RDBMSs, and/or knowledge of NoSQL platforms
  • Must have at least 3 years of hands-on experience in one of the Big Data technologies (e.g., Apache Hadoop, HDP, Cloudera, MapR):
  • MapReduce, HDFS, Hive, HBase, Impala, Pig, Tez, Oozie, Sqoop
  • Hands-on experience in designing and developing BI applications
  • Excellent knowledge of relational, NoSQL, and document databases, data lakes, and cloud storage
  • Expertise in various connectors and pipelines for batch and real-time data collection/delivery
  • Experience in integrating with on-premises and public/private cloud platforms
  • Good knowledge of handling and implementing secure data collection, processing, and delivery
  • Desirable: knowledge of Hadoop-ecosystem components such as Kafka, Spark, Solr, and Atlas
  • Desirable: knowledge of one open-source data ingestion tool such as Talend, Pentaho, Apache NiFi, Spark, or Kafka
  • Desirable: knowledge of one open-source reporting tool such as BIRT, Pentaho, JasperReports, KNIME, Google Chart API, or D3
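
To illustrate the Hive/Spark skills called out in the list above, here is a minimal sketch of a Spark batch job with Hive support, in Scala. The database, table, and column names (sales.orders, region, amount) are hypothetical and purely illustrative; this is a sketch of the general pattern, not a prescribed implementation.

    import org.apache.spark.sql.SparkSession

    object HiveSparkSketch {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport() lets Spark resolve tables registered in the Hive metastore.
        val spark = SparkSession.builder()
          .appName("hive-spark-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical database, table, and columns, for illustration only.
        val orders = spark.sql("SELECT region, amount FROM sales.orders")

        // Simple batch aggregation, written back as a Hive-managed table.
        orders.groupBy("region")
          .sum("amount")
          .write
          .mode("overwrite")
          .saveAsTable("sales.revenue_by_region")

        spark.stop()
      }
    }

Submitted via spark-submit on a cluster where Hive is configured, this reads a Hive table, aggregates it, and persists the result back to the metastore.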