Job Description:

Data Cloud Engineer (Spark, Databricks, Snowflake, Kafka)
Alpharetta, GA – 30004 (On-site Position from day 1)
6 Months

Job Details:
Role & Responsibilities
Partner with the Advanced Analytics, Machine Learning, and Platform teams across multiple project areas, collaborating with both offshore and onshore teams.
The individual will be responsible for end-to-end development and operationalization of cross-system data flows, data stores, and distributed applications for Analytics, AI/ML, and Visualization.
The person will also be part of the overall cloud adoption and engineering roadmap, ensuring scalable, agile, and robust architecture and implementation.
Additionally, the candidate should be able to work in a dynamic environment with limited or no supervision and share knowledge with other team members.
Should be comfortable managing time across multiple initiatives while working with a global team.

Job Responsibilities:
Design, implement, and operationalize distributed, scalable, and reliable data flows that ingest, process, store, and access data at scale in batch / real-time
Develop distributed applications, on-prem as well as in the cloud, that scale to serve ML models, analytics, rules, web applications, and visualizations for end users
Partner with Analytics and AI/ML teams to develop and analyze features at scale; provide an SME-level interface for team members to optimize their workflows, streamline operationalization, and reduce time-to-market
Contribute to metadata management, data modeling, and documentation
Contribute to the adoption of CI/CD, DataOps, and MLOps practices within the Data Analytics, AI/ML, and Visualization domains
Develop libraries that ease the development, monitoring, and control of data and models
Define and implement metrics to monitor data-flow performance, and optimize when necessary
Explore new data sources and data from new domains

Primary Skills / Must have
Experienced professional with 3+ years of experience in the design, architecture, development, and operationalization of data flows across the Hadoop ecosystem, Spark (Databricks or otherwise), Snowflake, and cloud platforms
Experience developing large-scale, distributed, data-driven applications leveraging big-data technologies, streaming technologies, data lakes, microservices, and containerization
Experience working on cloud platforms – AWS, Azure, and their respective offerings
Experience and understanding across key SQL and NoSQL datastores – HDFS, S3, Snowflake, MongoDB, and Splunk – as well as in-memory datastores
Experience with stream data processing (Kafka) and workflow scheduling tools
Programming languages – expertise in Scala / Python, SQL, and shell scripting
Expertise in data analytics and data wrangling through complex, optimized Python / Spark / SQL
Ability to work in a fast-paced and dynamic environment
Good written and verbal communication skills

Minimum years of experience: 8+ years

Certifications Needed: No

Interview Process (Is face-to-face required?): No
