Ref No.: 18-00042
Location: McLean, Virginia
Position Type: Contract
Scope of Work
**Candidates must be eligible to work in the United States without sponsorship and be direct W-2 employees of our vendors.**

Data Engineering:
•Cleanse, manipulate, and analyze large datasets (structured and unstructured data: XML, JSON, PDF) using the Hadoop platform.
•Develop Python, Spark, and HIVE scripts to filter, map, and aggregate data. Use Sqoop to transfer data to and from Hadoop.
•Manage and implement data processes including Data Quality scripts

Analysis and Modeling:
•Perform R&D and exploratory analysis using statistical techniques and machine learning clustering methods to understand data.
•Develop data profiling, deduplication, and matching logic for analysis.
•Big Data languages such a
•5+ years of experience processing large volumes and varieties of data (structured and unstructured: XML, JSON, PDF), including writing code for parallel processing.
•3+ years of programming experience in at least two of the following for data processing and analysis: Python, Spark, Java.
•Strong SQL experience
•2+ years of experience using the Hadoop platform and performing analysis.
•Familiarity with the Hadoop cluster environment and its resource management configurations for analysis work.
•Python, Spark, HIVE for analytics and developing dashboards.

Education Level
Bachelor's Degree

