Agentic Data Engineer to design, develop, and deploy data pipelines that leverage agentic AI to solve real-world problems.
 
 The Client is seeking a highly skilled Agentic Data Engineer to design, develop, and deploy data pipelines that leverage agentic AI to solve real-world problems. The ideal candidate will have experience in designing data processes that support agentic systems, ensuring data quality, and facilitating interaction between agents and data.
  
 Responsibilities: 
  - Guiding and mentoring AI engineers, helping them develop their skills and knowledge in the field. 
  - Leading and managing AI projects, ensuring they stay on track, meet deadlines, and produce actionable, relevant findings.
  - Contributing to the creation and implementation of AI strategies that align with the organization's goals and objectives. 
  - Designing and developing data pipelines for agentic systems, including robust data flows that handle complex interactions between AI agents and data sources.
  - Using advanced mathematical modeling, statistical analysis, and optimization techniques to gather and analyze data, identify problems, and develop solutions that improve the efficiency of prompts.
  - Training and fine-tuning large language models, and designing and building the data architecture, including databases, data warehouses, and data lakes, to support various data engineering tasks.
  - Developing and managing Extract, Load, Transform (ELT) processes to ensure data is accurately and efficiently moved from source systems to analytical platforms used in data science.
  - Implement data pipelines that facilitate feedback loops, allowing human input to improve system performance in human-in-the-loop systems. 
  - Work with vector databases to store and retrieve embeddings efficiently. 
  - Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications. 
  - Optimize data storage and retrieval for high performance.
  
  
 Qualifications:  
  - Strong data engineering fundamentals.
  - Experience with big data frameworks such as Spark and Databricks.
  - Experience training LLMs with structured and unstructured data sets.
  - Understanding of Graph DB 
  - Experience with Azure Blob Storage, Azure Data Lakes, Azure Databricks 
  - Experience implementing Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, Azure OpenAI models, Azure Media Services, Azure AI Search 
  - Ability to determine effective data partitioning criteria.
  - Ability to use Spark to implement partitioning schemes in data storage systems.
  - Understanding of core machine learning concepts and algorithms.
  - Familiarity with cloud computing.
  - Strong programming skills in Python and experience with AI/ML frameworks. 
  - Proficiency in vector databases and embedding models for retrieval tasks. 
  - Expertise in integrating with AI agent frameworks. 
  - Experience with cloud AI services (Azure AI). 
  - Experience with GIS spatial data 
  - Strong leadership and excellent problem-solving and communication skills.
  - Proven experience in leading projects and teams, including the mentorship of AI engineers 
  - The ability to engage in critical evaluation of information, hypothesis testing, and scenario analysis. 
  - Flexibility in learning and adopting new technologies, methodologies, and tools to stay at the forefront of AI trends. 
  - Experience with Department of Transportation data domains, including developing a composite agentic AI solution that identifies and analyzes data models, connects and correlates information to validate hypotheses, forecasts, predicts, and recommends potential strategies, and conducts what-if analysis.
  - Bachelor's or master's degree in computer science, AI, data science, or a related field.
  
  | Skill | Required / Desired | Amount of Experience |
  | Understanding of big data technologies | Required | 1 year |
  | Experience developing ETL and ELT pipelines | Required | 1 year |
  | Experience with Spark, GraphDB, Azure Databricks | Required | 1 year |
  | Expertise in data partitioning | Required | 1 year |
  | Experience with data conflation | Required | 3 years |
  | Experience developing Python scripts | Required | 3 years |
  | Experience training LLMs with structured and unstructured data sets | Required | 2 years |
  | Experience with GIS spatial data | Required | 3 years |