CED Tunisia - 02 Intelligent Data Governance Framework for Automated Classification and Protection PFE

Design and implement an intelligent framework for automated document classification, labelling, and protection using AI and Data Loss Prevention (DLP) policies.
The project aims to enhance data governance, ensure compliance with regulatory standards (e.g., GDPR), and prevent unauthorized data exposure across Microsoft 365 and other platforms.
Number of interns: 01
Project Ref: CED-BI/SECURITY-002

Automate and scale document classification and sensitivity labelling based on content analysis.
Enforce real-time protection rules to prevent data leaks and unauthorized sharing.
Improve compliance posture and audit readiness through the consistent application of labels and DLP policies.
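The objectives above can be illustrated with a minimal rule-based sketch of content analysis feeding a label and a sharing decision. Everything here is an assumption made for illustration: the label names ("General", "Confidential", "Highly Confidential"), the two patterns, and the helper names are hypothetical and do not come from the project brief or from any Microsoft product configuration.

```python
import re

# Hypothetical patterns for illustration; a real deployment would rely on
# far richer detectors (built-in sensitive information types, ML models).
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in the candidate pass the Luhn checksum."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def detect(text: str) -> dict:
    """Count candidate sensitive items found in the text."""
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())]
    emails = EMAIL_RE.findall(text)
    return {"card_numbers": len(cards), "emails": len(emails)}


def suggest_label(text: str) -> str:
    """Map detections to a hypothetical sensitivity label."""
    found = detect(text)
    if found["card_numbers"] > 0:
        return "Highly Confidential"
    if found["emails"] > 0:
        return "Confidential"
    return "General"


def allow_external_share(label: str) -> bool:
    """Toy protection rule: only unrestricted content may leave the tenant."""
    return label == "General"
```

For example, `suggest_label("Card: 4111 1111 1111 1111 on file")` yields the strictest label because the number passes the Luhn check, while a digit sequence that fails the check is ignored, which keeps false positives down.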

AI models for content-based classification and sensitivity detection (AI/ML frameworks).
Data cataloging and labelling integrated with Microsoft Information Protection and Azure Information Protection.
DLP Policies enforced across Microsoft 365 using Microsoft 365 Compliance Center and Microsoft Graph API.
Implementation and automation using PowerShell and Python scripts.
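As a rough illustration of the Python and Microsoft Graph combination listed above, the sketch below only builds (and does not send) a request that would apply a sensitivity label to a file. It assumes the Graph `assignSensitivityLabel` action on a drive item and its `sensitivityLabelId`, `assignmentMethod`, and `justificationText` body fields; the exact path, fields, permissions, and licensing should be verified against the current Graph documentation. The function name and all identifiers passed to it are hypothetical placeholders.

```python
import json

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_label_request(drive_id, item_id, label_id, access_token,
                        justification="Applied by automated classifier"):
    """Build (without sending) a request to label one drive item.

    Assumption: the endpoint and body follow the Graph
    'driveItem: assignSensitivityLabel' action; verify before use.
    """
    url = f"{GRAPH_BASE}/drives/{drive_id}/items/{item_id}/assignSensitivityLabel"
    body = {
        "sensitivityLabelId": label_id,
        "assignmentMethod": "standard",
        "justificationText": justification,
    }
    return {
        "method": "POST",
        "url": url,
        "headers": {
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }


# Placeholder identifiers, not real tenant values.
request = build_label_request("DRIVE_ID", "ITEM_ID", "LABEL_GUID", "TOKEN")
```

Keeping request construction separate from sending makes the automation easy to unit test without a tenant; the resulting dictionary can then be handed to any HTTP client.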

Research and select or train AI/ML models for document/content sensitivity detection.
Integrate classification models with a data catalog and automated labelling workflows.
Define and implement DLP policies and rules in Microsoft 365 Compliance Center; automate policy deployment via Microsoft Graph API.
Provide scripts, documentation, and a demonstrable prototype showing automated classification, labelling, and enforcement.
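To make the "select or train" task more concrete, here is a minimal sketch of a trained text classifier: a multinomial Naive Bayes model with Laplace smoothing, written in plain Python so that it runs without extra dependencies. It is only one possible baseline, not the method prescribed by the project; the training sentences and the two label names are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy corpus; real training data would come from labelled documents.
TRAIN = [
    ("salary and bank account details for employee payroll", "Confidential"),
    ("passport number and medical record of the patient", "Confidential"),
    ("customer invoice with bank account and tax number", "Confidential"),
    ("team lunch menu for friday", "General"),
    ("office opening hours and parking information", "General"),
    ("newsletter about the company sports event", "General"),
]

model = NaiveBayesClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
```

The predicted label from `model.predict(...)` is what an automated labelling workflow would then map to a sensitivity label. A baseline like this is mainly useful as a reference point when evaluating stronger NLP models.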

Hands-on experience with Microsoft Information Protection, Azure Information Protection, and Microsoft 365 Compliance Center.
Familiarity with AI/ML frameworks for NLP/content classification and with data governance tools.
Scripting and automation experience in PowerShell and Python; experience using Microsoft Graph API.
Understanding of data protection regulations (e.g., GDPR) and DLP concepts.

Automated and scalable document classification pipeline with consistent sensitivity labelling.
DLP policies and enforcement mechanisms that provide real-time protection against data leaks.
Improved audit readiness, supported by measurable metrics (e.g., classification accuracy, number of prevented exposures, policy coverage).
Deliverables include a code repository, deployment/automation scripts, design documentation, test results, and a final project report/presentation.
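The metrics mentioned above could be computed along the following lines. This is a sketch under assumptions: "policy coverage" is interpreted here as the share of workloads that have at least one DLP policy applied, which is one possible definition rather than one given in the brief, and the function names are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of documents whose predicted label equals the true label."""
    if not true_labels:
        return 0.0
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one label treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def policy_coverage(all_locations, covered_locations):
    """Share of locations with at least one policy applied (assumed definition)."""
    if not all_locations:
        return 0.0
    covered = sum(1 for loc in all_locations if loc in covered_locations)
    return covered / len(all_locations)
```

For sensitivity detection, recall on the sensitive class usually matters most, since a missed sensitive document is an undetected exposure, while low precision mainly costs user friction through over-labelling.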
