Job Description:
Cloud Data Engineer

Columbus, OH 

6+ Month Contract

Phone + In-Person/Skype



Description:


Design and build data pipelines for handling both real-time data streams and batch-based integrations.
Leverage Apache Spark-based tooling available within Google Cloud Platform to develop data pipelines.
Support the development of machine learning models by making data available for analysis and exposing the output of the models to other systems.
Analyze large and complex datasets to identify data quality issues early in the SDLC.
Work with cross-functional stakeholders to define their data needs and propose best-of-breed solutions.




Qualifications:


Bachelor’s degree or higher in computer science, information systems, or a related field.
Proficiency in Java or Python; prior Apache Beam/Spark experience a plus.
7+ years working in a data engineering role.
3+ years of hands-on experience building and optimizing big data pipelines using Apache Beam/Spark.
2+ years of hands-on experience building and integrating data pipelines with machine learning models.
1+ years of hands-on experience with Google Cloud Platform (BigQuery, Dataflow/Dataprep, Cloud Functions).
Prior data quality analysis and remediation experience a plus.
Prior experience working with a variety of big data tools/environments a plus (Kafka, Cassandra, Hadoop, Storm and Spark).