Job Description:
Big Data/Java/ETL/Project at Pleasanton, CA:

This project is an exciting opportunity to:

Design, implement, and operate data systems to meet organizational needs by identifying the best method of presenting the data for business decisions.
In partnership with product owners, work with business partners on data anomalies and requests for information.
Develop, recommend and implement process and procedure changes to systematically improve data quality/integrity.
Monitor the quality of data and information, report on results, and identify and recommend system application changes required to improve the quality of data in all applications.
Manipulate and analyze complex data from varying sources and recommend ways to apply the data.
Develop innovative uses of data and data mining techniques to effectively extract meaningful information from data sets.
Investigate data quality problems, conduct root-cause analysis, correct errors, and develop process improvement plans across all programs.
Integrate data from multiple sources and design, develop, and generate ad hoc and operational reports in support of objectives.


Technical Skills: 5+ years of experience

In-depth experience with demonstrable ability to build full-stack Hadoop systems using a major distribution (Hortonworks / Cloudera)
Extensive ETL experience
Strong Java development experience
Experience designing, building and launching extremely efficient & reliable data pipelines to move structured / unstructured data (both large and small amounts) using modern data architecture/tools.
Data workflow: Kafka, Flume, Sqoop (incremental imports)
SQL/NoSQL: Hive, HBase
Data file formats: Avro, Parquet, CSV, TSV, and others
Experience in Spark, Spark SQL, Spark Streaming for ingestion and data processing.
Experience designing, developing, and tuning Spark clusters for performance optimization.
Experience in Java and Shell Scripting.
Experience with or knowledge of Azure HDInsight


Client: K

             
