Previous Job
Ref No.: 18-47011
Location: McLean, Virginia
Position Type: Full Time/Contract
Start Date: 07/06/2018
Spark, PySpark, Kafka, Hive, NiFi, Sqoop, Oozie, Data Lakes

• Proficient experience working on the Hortonworks platform

• Good experience working with Hadoop platform components such as Spark, PySpark, Kafka, Hive, NiFi, Sqoop, and Oozie

• Proficiency with Hadoop, HDFS, and MapReduce using Python

• Experience building stream-processing systems using solutions such as Storm or Spark Streaming

• Good knowledge of Big Data querying tools such as Pig, Hive, and Impala

• Experience with Spark 
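The core mechanism behind stream-processing engines such as Spark Streaming, grouping an unbounded event stream into fixed time windows and aggregating within each window, can be sketched in plain Python (the event shapes, timestamps, and keys below are illustrative assumptions, not any specific framework's API):

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds):
    """Count events per fixed (tumbling) time window.

    `events` is an iterable of (timestamp_seconds, key) pairs;
    both values here are illustrative, not tied to a real source.
    """
    counts = Counter()
    for ts, key in events:
        # Assign each event to the window containing its timestamp.
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(1, "click"), (3, "click"), (12, "view"), (14, "click")]
print(tumbling_window_counts(events, 10))
# {(0, 'click'): 2, (10, 'view'): 1, (10, 'click'): 1}
```

Engines like Spark Structured Streaming apply the same windowing idea incrementally and fault-tolerantly over live data rather than over an in-memory list.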

• Experience in scripting languages such as Batch and Shell

• Experience with integration of data from multiple data sources such as DB2, Sybase, Oracle, and SQL Server

• Experience with NoSQL databases, such as HBase, Cassandra, MongoDB

• Knowledge of various ETL techniques and frameworks, such as Flume

• Experience with various messaging systems, such as Kafka

• Experience in Python is mandatory.

• DataRobot, Blockchain, Hyperledger, Ethereum, PDFxStream, PDFBox, MicroStrategy, Tableau, HBase

• Responsible for delivery in the areas of big data engineering, data science, and machine learning, including technology implementations and algorithm development

• Develop scalable and reliable data solutions to move data across systems from multiple sources in both real-time and batch modes.

• Construct data staging layers and fast real-time systems to feed BI applications and machine learning algorithms
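A staging layer of the kind described above normalizes records arriving from heterogeneous sources into one common schema before downstream consumers see them. A minimal plain-Python sketch (the source names and field mappings are hypothetical, not taken from any actual system):

```python
def stage_record(source: str, record: dict) -> dict:
    """Map a source-specific record onto a common staging schema.

    The source names and column names below are illustrative
    assumptions standing in for real DB2/Oracle/SQL Server feeds.
    """
    field_maps = {
        "oracle": {"CUST_ID": "customer_id", "AMT": "amount"},
        "sqlserver": {"CustomerId": "customer_id", "Amount": "amount"},
    }
    mapping = field_maps[source]
    # Keep only mapped fields, renamed to the staging schema.
    return {mapping[k]: v for k, v in record.items() if k in mapping}

staged = [
    stage_record("oracle", {"CUST_ID": 1, "AMT": 250.0}),
    stage_record("sqlserver", {"CustomerId": 2, "Amount": 99.5}),
]
print(staged)
# [{'customer_id': 1, 'amount': 250.0}, {'customer_id': 2, 'amount': 99.5}]
```

In practice the same mapping step would run inside a Spark or NiFi pipeline rather than a Python loop, but the normalization logic is the same.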

• Review and independently test the effectiveness and accuracy of Image Analytics, NLP and machine learning models

• Utilize expertise in models that leverage the newest data sources, technologies, and tools, such as machine learning, Python, Hadoop, Spark, and AWS, along with other cutting-edge tools and applications for Big Data.

• Communicate with the project manager on a frequent basis. Identify tasks and issues that may impact service levels or schedules. Provide realistic task and cost estimates.

• Investigate the impact of new technologies, applications, and data sources on the future secondary mortgage business

• Demonstrated ability to quickly learn new tools and paradigms to deploy cutting-edge solutions.

• Develop both deployment architecture and scripts for automated system deployment in AWS

• Create large-scale deployments using newly researched methodologies.

• Work in an Agile environment

• Experience mentoring junior engineers.
