Big Data Engineer
Ref No.: 18-19303
Location: Cary, North Carolina
1-Year Contract / Temp to Perm

Client / GC

Job Description:


Key Responsibilities:
• Ingesting huge volumes of data from various platforms for analytics needs.
• Building and implementing ETL processes using Big Data tools such as Spark (Scala/Python), NiFi, etc.
• Monitoring performance and advising on any necessary infrastructure changes.
• Defining data security principles and policies using Ranger and Kerberos.
• Working with IT and business customers to develop and design requirements and formulate technical designs.
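The ETL responsibilities above follow the standard extract → transform → load pattern. As a purely illustrative sketch (plain stdlib Python, not Spark or NiFi; the field names and sample data are hypothetical):

```python
# Minimal stdlib-only sketch of the extract -> transform -> load pattern.
# A real pipeline at this scale would use Spark or NiFi; this only
# illustrates the shape of the work.
import csv
import io

def extract(raw_csv: str) -> list[dict]:
    """Read raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(records: list[dict]) -> list[dict]:
    """Normalize fields and drop rows with a missing amount."""
    out = []
    for r in records:
        if not r.get("amount"):
            continue
        out.append({"id": r["id"], "amount": float(r["amount"])})
    return out

def load(records: list[dict], sink: list) -> None:
    """Append cleaned records to the target store (a plain list here)."""
    sink.extend(records)

raw = "id,amount\n1,10.5\n2,\n3,7.25\n"
sink: list = []
load(transform(extract(raw)), sink)
```

In a Spark deployment, `extract` would become a DataFrame read, `transform` a chain of DataFrame operations, and `load` a write to HDFS or Hive.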

Supervisory Responsibilities: Leads and motivates project team members who are not direct reports, and provides work direction to lower-level staff members.


Essential Business Experience and Technical Skills:
• 10+ years of solutions development experience
• Proficient understanding of distributed computing principles
• Management of a Hadoop cluster, with all included services (preferably Hortonworks)
• Proficiency with Hadoop v2, MapReduce, and HDFS
• Proficiency in Unix scripting
• Experience building stream-processing systems using solutions such as Storm or Spark Streaming
• Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
• Extensive experience with Spark and Scala
• Experience with Java/MapReduce, Storm, Kafka, Flume, and data security using Ranger
• Experience with integration of data from multiple data sources using Sqoop
• Experience with NoSQL databases, such as HBase, Cassandra, and MongoDB
• Performance tuning and problem-solving skills are a must
• Experience analyzing images and videos using Big Data tools is preferred
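The stream-processing skill above centers on windowed aggregation over an unbounded event stream. A stdlib-only sketch of tumbling-window counting, the core idea behind tools like Storm and Spark Streaming (event timestamps, keys, and the 5-second window are hypothetical; real systems do this distributed and fault-tolerant):

```python
# Tumbling-window aggregation: bucket events into fixed, non-overlapping
# time windows and count occurrences per key within each window.
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, key) events into fixed windows; count per key."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        window_start = (ts // window_seconds) * window_seconds
        windows[window_start][key] += 1
    return {w: dict(counts) for w, counts in windows.items()}

# Hypothetical event stream: (timestamp_seconds, event_type)
events = [(0, "click"), (3, "click"), (5, "view"), (7, "click")]
result = tumbling_window_counts(events, 5)
# result -> {0: {"click": 2}, 5: {"view": 1, "click": 1}}
```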

Required:
• Proficiency and extensive experience with HDFS, Hive, Spark, Scala, Python, HBase, Pig, Flume, Kafka, etc.
• Unix and Python scripting
• SQL tuning and data warehousing (DW) concepts
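A representative SQL-tuning step is checking the query plan before and after adding an index. A self-contained sketch using Python's built-in sqlite3 (table and column names are hypothetical; the same idea applies to Hive or any warehouse engine):

```python
# Demonstrates basic SQL tuning: an index turns a full table scan
# into an index search, visible via EXPLAIN QUERY PLAN.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(i, "east" if i % 2 else "west", float(i)) for i in range(1000)],
)

query = "EXPLAIN QUERY PLAN SELECT * FROM sales WHERE region = 'east'"

# Without an index, the planner must scan the whole table.
plan_before = conn.execute(query).fetchone()[3]

conn.execute("CREATE INDEX idx_region ON sales(region)")

# With the index, the planner can search instead of scanning.
plan_after = conn.execute(query).fetchone()[3]
```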


Preferred:
• Experience with Big Data machine learning toolkits, such as Mahout, Spark ML, or H2O
• Experience building RESTful APIs
• Exposure to analytical tools such as SAS or SPSS is a plus
• Experience with Informatica PowerCenter 10, including pushdown processing into the Hadoop platform, is a huge plus