Role: Software Engineer - Data
Location: Austin, TX
Apple experience is preferred.
We are seeking a highly experienced engineer to join our team and own the development of our tiered data architecture. In this role, you will be responsible for designing and building a comprehensive data architecture that enables seamless data integration and the delivery of high-quality insights to our leadership and business stakeholders.
Skills
Required: Apache Spark, SQL, Git, Programming Language (Python, Java, Scala)
Nice to have: Understanding of design patterns, ability to discuss trade-offs between RDBMS and distributed storage
Key Qualifications
- Proven experience in data engineering, data architecture, or a related field
- Experience building and deploying a tiered data architecture for analytics data is a plus
- Strong understanding of data modeling, data warehousing, and ETL concepts
- Proficiency in SQL and experience with at least one major data analytics platform, such as Hadoop or Spark
- Experience with data orchestration tools like Airflow is nice to have
- Excellent problem-solving and analytical skills, and the ability to work well under tight deadlines
- Excellent interpersonal skills and the ability to collaborate effectively with cross-functional teams
Description
- Design and implement a tiered data architecture that efficiently integrates analytics data from multiple sources.
- Develop data models and mapping rules to transform raw data into actionable insights and reports.
- Collaborate with the analytics and business teams to understand their requirements and deliver solutions that meet their needs.
- Ensure data quality and accuracy by developing data validation and reconciliation processes.
- Play an active role in the development and maintenance of user documentation, including data models, mapping rules, and data dictionaries.
- Collaborate with cross-functional teams to define and implement data governance policies and standards.
- Stay informed about the latest developments in data analytics and data management technologies, and recommend new tools and methodologies to improve the semantic layer.