Activities:
· Ability to work independently on an end-to-end development track, with an engineering mindset and a product-ownership attitude
· Requirement analysis and coding in accordance with coding standards
· Requirement and design playback with business
· Design pipelines end to end to meet customer requirements
· Coding in Scala using existing frameworks to meet business requirements
· Work in an agile methodology involving daily stand-ups and frequent sync-ups with peers
· Deliver high-quality code, and review and mentor peers’ code
Must-have skills:
Scala, Spark, SQL, Hadoop, AWS, Hive, Spark SQL, HiveQL, CI/CD (Continuous Integration/Continuous Delivery), VCS (GitHub)
· Coding in Scala (60% of time)
· Designing in the Hadoop ecosystem (20% of time)
· Hands-on experience with AWS services such as EMR and EC2 (10% of time)
· Hands-on experience with SQL in big data: SQL, Spark SQL, HiveQL (60% of time)
· Proficient in working with large data sets and pipelines (20% of time)
· Proficient with workflow scheduling / orchestration tools (20% of time)
· Well versed in the CI/CD process and VCS (20% of time)
· In Scala, hands-on experience with the following is essential:
· Functional programming and object-oriented programming
· Currying and partially applied functions
· Higher-order functions
· Tail recursion
· Futures
· Case classes
· PureConfig
· Implicit functions
· Practical knowledge of common collections such as List, Map, and Array
· Implementing logic such as sorting with explicit loops (e.g., foreach) rather than relying only on built-in functions
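The Scala concepts listed above can be illustrated with a minimal, self-contained sketch. All names here (ScalaConceptsDemo, insertionSort, etc.) are illustrative inventions; PureConfig is omitted because it is a third-party library, and an implicit class is shown as one common form of implicits.

```scala
import scala.annotation.tailrec
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object ScalaConceptsDemo {
  // Case class: immutable value type with equals/hashCode/copy for free
  case class Record(id: Int, value: Double)

  // Curried function; fixing the first parameter list yields a
  // partially applied function
  def scale(factor: Double)(x: Double): Double = factor * x
  val double: Double => Double = scale(2.0)

  // Higher-order function: takes another function as an argument
  def applyTwice(f: Double => Double, x: Double): Double = f(f(x))

  // Tail recursion: @tailrec makes the compiler verify the call
  // is in tail position, so it compiles to a loop
  @tailrec
  def sumTo(n: Int, acc: Long = 0L): Long =
    if (n <= 0) acc else sumTo(n - 1, acc + n)

  // Future: asynchronous computation (Await is used only for the demo)
  def asyncAnswer(): Int = Await.result(Future(21 * 2), 2.seconds)

  // Implicit class: adds a method to Int without modifying it
  implicit class RichInt(private val n: Int) {
    def squared: Int = n * n
  }

  // Manual insertion sort with explicit loops, instead of the
  // built-in sorted/sortBy; sorts the array in place
  def insertionSort(arr: Array[Int]): Array[Int] = {
    for (i <- 1 until arr.length) {
      val key = arr(i)
      var j = i - 1
      while (j >= 0 && arr(j) > key) {
        arr(j + 1) = arr(j)
        j -= 1
      }
      arr(j + 1) = key
    }
    arr
  }

  def main(args: Array[String]): Unit = {
    assert(double(3.0) == 6.0)            // partially applied currying
    assert(applyTwice(double, 1.0) == 4.0) // higher-order function
    assert(sumTo(100) == 5050L)            // tail recursion
    assert(asyncAnswer() == 42)            // Future
    assert(5.squared == 25)                // implicit class

    // Common collections: List and Map
    val byId: Map[Int, Record] =
      List(Record(2, 4.0), Record(1, 1.0)).map(r => r.id -> r).toMap
    assert(byId(2).value == 4.0)

    assert(insertionSort(Array(5, 2, 4, 1, 3)).sameElements(Array(1, 2, 3, 4, 5)))
    println("all checks passed")
  }
}
```

Each assertion in `main` exercises one of the listed concepts, so the sketch can be compiled and run as a quick self-check.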
Nice-to-have skills:
Python, Java, Iceberg, Kafka, Amazon EKS, Kubernetes, Amazon S3, Airflow, Oozie, Postgres
· Experience coding in Python and Java.
· Experience using Kafka to set up workflows.
· Experience with Iceberg tables.
· Experience with workflow scheduling/orchestration tools such as Airflow or Oozie.
· Experience with schema design and data modeling.
· Machine learning experience is good to have.
· Experience with the Cassandra (NoSQL) database is good to have.