Job Description:

Education: Minimum Bachelor's degree in Computer Science, Engineering, Business Information Systems, or a related field. A Master's degree in a computing discipline related to PySpark and distributed computing is a major plus.

 

Key Responsibilities:

·        Develop Big Data applications using PySpark on Hadoop, Hive, and/or Kafka, HBase, MongoDB

·        Build Machine Learning models (a minimal sketch follows this list)

·        Deploy applications on cloud platforms
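For context on the responsibilities above, here is a minimal, illustrative sketch of training a Machine Learning model with PySpark's MLlib. The dataset, column names, and model choice are hypothetical placeholders, not part of any actual codebase for this role:

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("ml-sketch").getOrCreate()

# Tiny toy dataset; real feature engineering would happen upstream in PySpark.
df = spark.createDataFrame(
    [(1.0, 0.5, 1.0), (0.0, 2.5, 0.0), (1.5, 0.2, 1.0), (0.1, 3.0, 0.0)],
    ["f1", "f2", "label"],  # hypothetical feature and label columns
)

# Assemble feature columns into a vector, then fit a classifier.
pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["f1", "f2"], outputCol="features"),
    LogisticRegression(labelCol="label", featuresCol="features"),
])
model = pipeline.fit(df)
model.transform(df).select("label", "prediction").show()

spark.stop()
```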

 

Experience & Skillset:

MUST-HAVE

·        Minimum of 4 years of total IT/development experience in Big Data

·        Experience developing Big Data applications in PySpark on Hadoop, Hive, and/or Kafka, HBase, MongoDB

·        Deep knowledge of PySpark and its libraries, with the ability to develop and debug solutions to complex data engineering challenges

·        Experience developing sustainable, data-driven solutions with current-generation data technologies to drive our business and technology strategies

·        Exposure to deploying on cloud platforms

 

·        Experience designing and developing data pipelines for data ingestion and transformation using PySpark (see the sketch after this list)

·        Development experience with the following Big Data components: file formats (Parquet, Avro, ORC), resource management, distributed processing, and RDBMS

·        Experience developing applications in an Agile environment with monitoring, build tools, version control, unit testing, TDD, CI/CD, and change management to support DevOps

·        Development experience with SQL and shell scripting
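As referenced in the pipeline item above, here is a minimal sketch of the kind of PySpark ingestion/transformation job this role involves. All paths, column names, and table names are hypothetical placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("ingest-transactions")  # hypothetical job name
    .enableHiveSupport()             # needed to write results to a Hive table
    .getOrCreate()
)

# Ingest: read raw Parquet files from a hypothetical landing zone.
raw = spark.read.parquet("/data/landing/transactions/")

# Transform: basic cleansing plus a daily per-account aggregate.
daily = (
    raw.filter(F.col("amount") > 0)
       .withColumn("txn_date", F.to_date("txn_ts"))
       .groupBy("txn_date", "account_id")
       .agg(F.sum("amount").alias("total_amount"),
            F.count("*").alias("txn_count"))
)

# Load: persist as ORC into a hypothetical Hive database/table.
daily.write.mode("overwrite").format("orc").saveAsTable("analytics.daily_txn")

spark.stop()
```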

 

 

GOOD-TO-HAVE

·       Banking domain knowledge

·       Hands-on experience with the SAS toolset / statistical modelling, migrating to Machine Learning models

·       Experience with Machine Learning models and use cases in Digital Marketing

·       ETL / Data Warehousing and Data Modelling experience prior to Big Data experience

·       Deep knowledge of the AWS stack for Big Data and Machine Learning (see the sketch below)
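To illustrate the AWS point, a minimal sketch of PySpark reading from and writing to S3. Bucket names, paths, and the event_type column are hypothetical; it assumes S3 credentials are already configured (e.g. via an EMR instance profile) and uses the s3a:// scheme typical of open-source Hadoop builds:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-sketch").getOrCreate()

# Read raw JSON events from a hypothetical S3 bucket.
events = spark.read.json("s3a://example-bucket/raw/events/")

# Keep only purchase events and persist them back to S3 as Parquet.
(events.filter(events.event_type == "purchase")
       .write.mode("append")
       .parquet("s3a://example-bucket/curated/purchases/"))

spark.stop()
```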
