· Should have thorough understanding of Hadoop concepts
· Candidate should have a mixture of Software Development, Data Engineering, and Data Science skills
· Strong Software Development and Engineering skills with some basic knowledge in Machine Learning and Deep Learning
· Ability to write efficient code; a good understanding of core data structures and algorithms is critical for engine development
· Good Python skills, following software engineering best practices
· Comfort and familiarity with SQL and the Hadoop ecosystem of tools, including Spark
· Understanding of foundational Machine Learning concepts and some Deep Learning basics
· 5+ years’ experience in any flavor of SQL, dealing with complex queries, analytics, and data models
· 3+ years’ experience in any modern programming language
· Must have 4+ years of experience in Spark
· Should have Python programming experience
· Should have experience with AWS services such as Glue, S3, Athena, and EMR
· Should have experience in Windows PowerShell scripting
· Good analytical skills and experience in a cross-functional team environment
· Healthcare domain knowledge is preferred
· The work will entail heads-down coding, testing, data analysis, and component packaging/deployment
· Strong SQL, Teradata, and Unix skills; strong standard Big Data/Hadoop skillset (Hive, Sqoop, HDFS, etc.); strong Spark/Python/Scala
· 3+ years’ solid experience with the above skillset
· 7+ years’ experience in ETL application development and implementation using Teradata and Informatica or any other ETL tool