Job Description:

Solution architecture of a vendor-agnostic data lake; should be comfortable with the MapR, Cloudera and Hortonworks flavors of Hadoop
Define data lake ingestion patterns and implement them in a highly performant way
Create highly scalable, highly performant data pipelines and implement ETL processing
Hands-on experience migrating Informatica ETL to Spark- and Talend-based transformations in enterprise data lakes
Expert in data modeling and ETL in Informatica, with the ability to migrate Informatica jobs to the Big Data ecosystem to improve their performance and run time
Expert in data modeling in a Big Data environment; comfortable mapping data from traditional relational databases to Big Data systems such as Hadoop, HBase and Kudu
Define data consumption patterns from the enterprise data lake and create downstream consumption APIs
Ability to articulate use cases and identify the proper tools from the Big Data ecosystem to address them
Ability to identify optimal data storage patterns and when to use which data format, e.g. Avro vs. Parquet vs. Protobuf vs. Thrift vs. standard CSV
Expert capability in both batch and real-time streaming processing modes
Hands-on experience with Spark (Scala), Talend, Impala and Hive
Assist the customer in defining the ingestion and processing strategies by building reusable frameworks that can be leveraged across multiple data initiatives
Lead the team in executing data provisioning, data ingestion, data processing, data lineage, data governance and data visualization

Job Qualification:
10+ years of intensive technical experience in the areas of data architecture, data modeling (defining entities and relationships in sync with business/product requirements), data management, data engineering and/or data governance
Excellent communication (both written and verbal) and interpersonal skills, and experience presenting to business and technical teams, including executive management
Strong analytical and reasoning skills that result in clear, robust, flexible architectures
Proven ability to drive complex design and development efforts in an agile environment
Skill at crystallizing requirements into clear priorities, estimates, tasks and deliverables
Strong hands-on experience with the Big Data platform stack: Spark, Scala, HBase, Impala and Hive
Strong hands-on experience performance-tuning Spark jobs and optimizing their overall run time
Expert hands-on experience with different Talend modules, including Metadata Manager, Master Data Management, Data Quality (data masking and shuffling), Big Data, Data Preparation and Data Stewardship, and the Data Integration and Data Services modules
Strong hands-on expertise with data ingestion technologies such as Informatica and Kafka
Strong experience in Java, Python and Scala
Ability to work with all levels of the organization
Ability to work within and lead a team
Ability to influence others; conflict-resolution skills
Expertise with data visualization tools like Tableau