Job Description:
Role: Hadoop / Apache Spark / Database Admin / Databricks

Location: Thousand Oaks, CA

Duration: Long Term

Visa: Any Visa (Looking for 10+ Years)

Mandatory Skills (please detail as much as possible)


Designing and implementing highly performant data ingestion pipelines from multiple sources using Apache Spark and/or AWS Databricks

Strong experience in administering Apache Airflow jobs, Sqoop, Kafka, and AWS (SSH, EMR, S3, EC2, etc.)

Developing scalable and reusable frameworks for ingesting data sets

Integrating the end-to-end data pipeline to take data from source systems to target data repositories, ensuring that the quality and consistency of data are maintained at all times

Working with event-based / streaming technologies to ingest and process data

Working with other members of the project team to support delivery of additional project components (API interfaces, Search)

Evaluating the performance and applicability of multiple tools against customer requirements

Working within an Agile delivery / DevOps methodology

Coordinating with onsite and offshore teams


Strong knowledge of Data Management principles

Experience in building ETL / data warehouse transformation processes

Direct experience building data pipelines using Databricks is an added advantage

AWS certification

Experience with open-source non-relational / NoSQL data repositories

Experience working with structured and unstructured data

Experience working in a DevOps environment with tools such as Maven, Jenkins, GitHub, Sonar, and TFS

Priyanka | IT Recruiter
AVTECH Solutions Inc.

ext 513 (Direct)