Job Description:

AWS Data Architect

Location: Torrance, CA (Onsite)

Duration: Full time

Visa: USC and GC

Key Responsibilities:

  • Data Architecture Design:
    1. Architect and implement a scalable data hub solution on AWS using best practices for data ingestion, transformation, storage, and access control.
    2. Define data models, data lineage, and data quality standards for the DataHub.
    3. Select appropriate AWS services (S3, Glue, Redshift, Athena, Lambda) based on data volume, access patterns, and performance requirements.
    4. Produce a design that accommodates AI/ML applications in the next phase.
  • Data Ingestion and Integration:
    1. Design and build data pipelines to extract, transform, and load data from various sources (databases, APIs, flat files) into the DataHub using AWS Glue, AWS Batch, or custom ETL processes.
    2. Implement data cleansing and normalization techniques to ensure data quality.
    3. Manage data ingestion schedules and error handling mechanisms.
  • Data Governance and Access Control:
    1. Establish data access controls and security policies to protect sensitive data within the DataHub using IAM roles and policies.
    2. Develop data governance frameworks including data quality checks, data lineage tracking, and data retention policies.
  • Data Analytics Enablement:
    1. Create data catalogs and metadata management systems to facilitate data discovery and understanding by business users and data analysts.
    2. Design and implement data views and dashboards using Power BI to enable data exploration and visualization.
    3. Create data warehouses and data marts to meet the needs of the business.
  • Monitoring and Optimization:
    1. Monitor data pipeline performance, data quality, and system health to identify and resolve issues proactively.
    2. Optimize data storage and processing costs by leveraging AWS cost optimization features.
  • Data Exchange:
    1. Develop the required governance, security, monitoring, and guardrails to enable efficient data exchange between internal applications and their external vendors, partners, and SaaS providers.
    2. Develop an intake process, SLAs, and usage rules for internal and external data set producers and consumers.

Required Skills and Experience:

  • AWS Expertise: Deep understanding of AWS data services including S3, Glue, Redshift, Athena, Lake Formation, Step Functions, CloudWatch, and EventBridge.
  • Data Modeling: Proficiency in designing dimensional and snowflake data models for data warehousing and data lakes.
  • Data Engineering Skills: Experience with ETL/ELT processes, data cleansing, data transformation, and data quality checks. Experience with Informatica IICS and ICDQ is a plus.
  • Programming Languages: Proficiency in Python, SQL, and potentially PySpark for data processing and manipulation.
  • Data Governance: Knowledge of data governance best practices including data classification, access control, and data lineage tracking.

Preferred Qualifications:

  • Experience with data lakehouse architectures and the ability to leverage both structured and unstructured data.
  • Familiarity with data visualization tools like Tableau or Power BI.
  • Strong communication and collaboration skills to work with stakeholders across business and technical teams.
  • AWS certifications related to data analytics and architecture.