TOPIC 01: AI-powered Automation of Data Pipelines, Data Quality & Pipeline Code Quality (DWH), Final-Year Project (PFE)
Binit Nearshore Services • Tunisia
Data Engineering & Analytics · Machine Learning/AI · Data Quality & BI
Published 7 months ago
Internship
⏱️ 4-6 months
💼 Hybrid
💰 Paid
📅 Expired 6 months ago
Job description
Project Overview:
As part of a major migration project in the German banking sector, design and implement an intelligent Data Warehouse (DWH) environment that leverages AI-driven automation to improve pipeline orchestration, data quality assurance and ETL/ELT code optimization.
Objective: reduce manual intervention, increase data reliability, and provide intelligent recommendations for data correction and ETL/ELT code improvements; integrate a Data Quality Engine (profiling, validation, anomaly detection) using Great Expectations.
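For illustration only, the three checks named in the objective (profiling, validation, anomaly detection) can be sketched in plain Python. This is a minimal sketch and not the project's design: in the project these checks would be expressed with Great Expectations, and the function names, the sample `amounts` column, the value range and the 3.5 threshold are all assumptions made for the example. Anomaly detection here uses a modified z-score based on the median absolute deviation, one common robust choice among many.

```python
from statistics import median


def profile(values):
    """Profile a numeric column: row count, null count, min, max and median."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "median": median(present),
    }


def validate(values, min_value, max_value):
    """Return indexes of rows that are null or outside [min_value, max_value]."""
    return [
        i for i, v in enumerate(values)
        if v is None or not (min_value <= v <= max_value)
    ]


def anomalies(values, threshold=3.5):
    """Return indexes whose modified z-score (median/MAD based) exceeds threshold."""
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [
        i for i, v in enumerate(values)
        if v is not None and 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical sample column: one missing value and one implausible amount.
amounts = [100.0, 102.5, 98.0, None, 101.0, 99.5, 5000.0, 100.5]

print(profile(amounts))
print(validate(amounts, 0.0, 1000.0))
print(anomalies(amounts))
```

The median-based score is used instead of a mean and standard deviation because, on small samples, a single extreme value inflates the standard deviation enough to hide itself.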
Responsibilities & Expected Deliverables:
Design and develop AI modules for anomaly detection, automated data correction, and pipeline code review that can generate recommendations for performance improvements.
Implement automated ETL/ELT data pipelines using Airflow or Prefect together with dbt; deliver an intelligent DWH architecture (staging, conformed layer, data marts) and migration-ready artifacts.
Implement automated code quality checks for ETL/ELT workflows and produce actionable code optimization suggestions; deliver a monitoring and visualization dashboard for data quality and performance KPIs (Power BI).
Technical Stack & Tools:
Core: SQL Server, dbt, Airflow (or Prefect), ETL/ELT concepts, Data Modeling (Star/Snowflake) and Docker/Git for deployment and CI workflows.
AI/LLM integration: OpenAI or local models for anomaly detection and code review automation; integrate Great Expectations for profiling, validation and anomaly detection.
Visualization & infra: Power BI for dashboards, containerization with Docker, versioning with Git; emphasis on reproducible, production-ready pipelines.
Logistics & Application:
Pre-employment internship; duration: 4-6 months; number of interns: 1; paid. The work is expected to address real migration requirements for a banking DWH.
To apply, send your application referencing this project to stages@binitns.com (email subject: see below). Provide a CV, a brief cover letter describing relevant experience with dbt/Airflow/ETL and any AI/LLM projects, and links to code or notebooks if available.