Job Description:
Job Title: Data Architect
Location: Bellevue, WA
Duration: 6-12 Months
Interview Mode: Skype with coding test

Job Description
- 18+ years of overall IT industry experience. Well-rounded, forward-thinking, highly motivated self-starter with excellent communication and presentation skills.
- Known for creative insights and solutions; able to adapt to changing environments and needs.
- Experience in large-scale data integration, data transformation, and data management work.
- Hands-on development experience in data integration using industry-standard tools such as Informatica, DataStage, or Teradata is highly desirable.
- Ability to define solution architectures involving data transformation, integration, API interfaces, and microservices is mandatory.
- Strong hands-on experience in Hadoop: HDFS architecture, Hive, Pig, Sqoop, HBase, MongoDB, Cassandra, Oozie, Spark RDDs, Spark DataFrames, and Spark Datasets.
- Experience in data and batch migration to Hadoop, and hands-on experience installing and configuring the Cloudera and Hortonworks distributions. Ability to drive solutions around storage requirements, security requirements, and cluster/node configuration for best performance. Experience deploying large multi-node Hadoop and Spark clusters is strongly desirable.
- Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, node configuration, YARN, Spark, Falcon, HBase, Hive, Pig, Sentry, and Ranger.
- Understanding of incremental imports, and of the partitioning and bucketing concepts in Hive and Spark SQL needed for optimization.
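To illustrate the bucketing concept named above: Hive and Spark SQL assign each row to a fixed number of buckets by hashing the bucket column, which enables bucket-pruned joins and sampling. A minimal sketch in plain Python (the column name `user_id` and the bucket count of 4 are illustrative assumptions, and Python's `hash` stands in for Hive's hash function):

```python
# Sketch of Hive/Spark-style bucketing: each row is routed to
# hash(bucket_column) % num_buckets. Names and counts are illustrative.

def bucket_for(key, num_buckets=4):
    """Assign a key to a bucket by hashing modulo the bucket count."""
    return hash(key) % num_buckets

rows = [{"user_id": i, "event": "click"} for i in range(10)]

buckets = {}
for row in rows:
    buckets.setdefault(bucket_for(row["user_id"]), []).append(row)

# Every row lands in exactly one bucket, so bucket sizes sum to the row count.
assert sum(len(v) for v in buckets.values()) == len(rows)
```

Because the same key always hashes to the same bucket, two tables bucketed identically on the join key can be joined bucket-by-bucket without a full shuffle.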
- Experience developing Oozie workflows for scheduling and orchestrating ETL processes.
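For context on the Oozie requirement above, a workflow is defined as an XML DAG of actions with explicit success/failure transitions. A minimal sketch of a single Hive ETL step (the workflow name `etl-wf` and script `etl.hql` are illustrative assumptions):

```xml
<workflow-app name="etl-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="hive-etl"/>
  <action name="hive-etl">
    <hive xmlns="uri:oozie:hive-action:0.5">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>etl.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>ETL failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
</workflow-app>
```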
- Experience working with large, complex data sets, real-time/near-real-time analytics, and distributed big data platforms.
- Good understanding of technical metadata management and the ability to use metadata for a service catalog.
- Good understanding of data governance, data profiling, data curation, and data quality improvement using business rules.
- Experience collecting log data from various sources and integrating it into HDFS using Flume, and staging data in HDFS for further analysis.
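The Flume pattern described above is typically expressed as an agent configuration wiring a source to an HDFS sink through a channel. A minimal sketch, assuming a spooling-directory source; the agent name `a1`, the local path, and the HDFS landing path are illustrative assumptions:

```properties
# Hedged sketch of a Flume agent: spooldir source -> memory channel -> HDFS sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/log/app
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/landing/logs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1
```

Date tokens in `hdfs.path` give daily staging directories in HDFS, ready for downstream Hive or Spark analysis.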
- Experience with disaster recovery for Hadoop clusters, and involvement in building multi-tenant clusters.
- Ability to troubleshoot and tune code in SQL, Python, Scala, Pig, Hive, and Spark (RDDs and DataFrames). Able to design elegant solutions from problem statements.
- Hands-on development ability with Hadoop, Kafka, and Spark is mandatory.