Job Description:
Data Engineer
Location: Redwood City, CA
Duration: 6 – 12 Months


Candidates need 8+ years of data/software engineering experience.
That experience should include heavy use of Java and Python.
On the big data side, at least 2 years of experience is required.
Spark and AWS are preferred, and experience with other Apache tools is also valued.





Our Values

We believe we can help build a future for everyone.
We aim to be daring, but humble: We look for bold ideas — regardless of structure and stage — and help them scale by pairing engineers with subject matter experts to build tools that accelerate the pace of social progress.
We want to learn fast, but build for the long term: We want to iterate fast and help bring new solutions to the table, but we also realize that important breakthroughs often take decades, or even centuries.
Stay close to the real problems: We engage directly in the communities we serve because no one understands our society’s challenges like those who live them every day.
Our success depends on building teams that include people from different backgrounds and experiences who can challenge each other's assumptions with fresh perspectives.
To that end, we look for a diverse pool of applicants including those from historically marginalized groups — women, people with disabilities, people of color, formerly incarcerated people, people who are lesbian, gay, bisexual, transgender, and/or gender nonconforming, first and second generation immigrants, veterans, and people from different socioeconomic backgrounds.



The Opportunity

By pairing engineers with leaders in our education, science, and justice and opportunity teams, we can bring technology to the table in new ways to help drive solutions.
We are uniquely positioned to design, build, and scale software systems to help educators, scientists, and policy experts better address the myriad challenges they face.
Our technology team is already helping bring personalized learning tools to teachers and schools across the country and supporting scientists around the world as they develop a comprehensive reference atlas of all cells in the human body.
Meta's Data Platform Team (DPT) builds the infrastructure and data processing pipelines to assemble a comprehensive knowledge graph for scientific data that serves the meta.org product as well as researchers in Machine Learning, NLP and the life sciences.
Members of the team have a direct impact on all the data needs of Meta and its products to accelerate science and literature discovery.
A person in this role works closely with other DPT members, with our research science team, and with our analytics team to design, build and support technical solutions. This individual will also support us in our ongoing goal of cultivating a culture of shared best practices and knowledge around data engineering.


You will

Design, build, analyze and improve the efficiency, stability, and resiliency of data processing pipelines and Meta's knowledge graph for scientific literature data
Design and implement robust machine learning pipelines
Collaborate with the shared infrastructure team and evangelize best practices in the team



You have

6+ years relevant coding experience
Experience with Amazon Web Services (AWS) or similar cloud providers
Experience with an object-oriented systems language such as Java or Kotlin
Experience with scripting languages such as Python and Bash
Experience with Terraform, Docker, Travis CI, Jenkins, and other infrastructure/CI tools
Experience with setting up monitoring and metrics on live production systems is a plus