Job Description:
At least 4 years of experience in the software development life cycle
Good knowledge of Big Data querying tools such as Pig, Hive, and Impala
Experience with Spark
Experience with integration of data from multiple data sources
Database/application development experience in a complex enterprise environment
Strong familiarity with working in a Linux environment, including shell scripting
Experience designing and implementing production data pipelines using tools from the Hadoop ecosystem such as MapReduce, Hive, HBase, Spark, Sqoop, Oozie, and Pig
Experience with, or a good understanding of, other data lake technologies
Prior experience with ETL applications such as DataStage
The ETL Developer will be responsible for migrating applications from DataStage to a Hadoop-based platform
Responsible for delivering projects end-to-end using the latest tools/technologies in an Agile environment, playing multiple roles as the requirements demand
This role requires solid DataStage Developer experience, as well as experience performing data analytics on Hadoop-based platforms and implementing complex ETL transformations
Transform existing ETL logic from the DataStage application onto the Hadoop platform
Design, develop, and implement microservices-based data lake solutions on the Hadoop platform
Acquire data from primary or secondary data sources
Identify, analyze, and interpret trends or patterns in complex data sets
Develop new ways of managing, transforming, and validating data
Establish and enforce guidelines to ensure the consistency, quality, and completeness of data assets
Apply quality assurance best practices to all work products
Analyze, design, and code business-related solutions, as well as core architectural changes, using an Agile programming approach, resulting in software delivered on time and on budget
Challenge the status quo and mentor development staff in efficient design and reusable development best practices, minimizing unfavorable work variances
Proactively communicate risks or issues stemming from project or ticket work to core teams as well as assigned Technical Delivery Managers
Qualifications
3+ years of development experience delivering DataStage-based ETL solutions
3+ years of Hadoop Big Data developer experience, including writing SQL and working with YARN, Sqoop, Spark SQL, Hive, Impala, and other ETL tools (Informatica, Talend, etc.)
2-5 years of experience in Python, Java, or Scala development
Experience developing and performance-tuning Spark processes
Experience performing data analytics on Hadoop-based platforms and implementing complex ETL transformations
Strong experience with UNIX shell scripting to automate file preparation and database loads
Experience in data quality testing; adept at writing test cases and scripts, and at presenting and resolving data issues
Familiarity with relational database environments (Oracle, DB2, etc.), leveraging databases, tables/views, stored procedures, agent jobs, etc., and integrating with a Salesforce backend
Experience analyzing and designing deliverables in an Agile environment is required
Experience working on development teams using object-oriented development and scripting languages preferred
Demonstrated independent problem-solving skills and the ability to develop solutions to complex analytical/data-driven problems
Comfortable learning cutting-edge technologies and applying them to greenfield projects