CED Tunisia - 02 Intelligent Data Governance Framework for Automated Classification and Protection PFE | Hi Interns

Project overview

Design and implement an intelligent framework for automated document classification, labelling, and protection using AI and Data Loss Prevention (DLP) policies.
The goal is to enhance data governance, ensure compliance with regulatory standards (e.g., GDPR), and prevent unauthorized data exposure.
Number of interns: 01
Project Ref: CED-BI/SECURITY-002

AI models for content-based classification and sensitivity detection; integration with AI/ML frameworks.
Data cataloguing and labelling workflows tied to Microsoft Information Protection and Azure Information Protection.
DLP policies to enforce protection rules across Microsoft 365 and other platforms, leveraging Microsoft 365 Compliance Center and Microsoft Graph API.
Automation and scripting using PowerShell and Python to deploy and manage policies and labels.
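
As a rough illustration of content-based sensitivity detection, the sketch below maps pattern hits to a label. It is a rule-based baseline and not an AI model; the label names, the PATTERNS table, and the classify function are hypothetical and are not taken from Microsoft Information Protection. A trained classifier would replace or complement these rules in the actual project.

```python
import re

# Hypothetical label names, ordered from least to most sensitive.
LABELS = ["Public", "General", "Confidential", "Highly Confidential"]

# Illustrative detectors only (assumption): detector name -> (regex, label on hit).
PATTERNS = {
    "email": (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "Confidential"),
    "card": (re.compile(r"\b(?:\d[ -]?){13,19}\b"), "Highly Confidential"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> str:
    """Return the most sensitive label triggered by any detector."""
    label = "General"  # default when nothing sensitive is found
    for name, (pattern, hit_label) in PATTERNS.items():
        for match in pattern.finditer(text):
            # Filter out digit runs that are not plausible card numbers.
            if name == "card" and not luhn_valid(match.group()):
                continue
            if LABELS.index(hit_label) > LABELS.index(label):
                label = hit_label
    return label
```

The Luhn check is there to reduce false positives from order numbers or phone numbers that merely look like card numbers.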

Research, design and prototype AI-driven document classification models capable of sensitivity detection across document types.
Implement automated labelling and data-cataloguing pipelines to apply consistent sensitivity labels at scale.
Configure and test DLP rules in Microsoft 365 Compliance Center to enforce protection and prevent data leaks.
Integrate solutions with Azure Information Protection and use Microsoft Graph API for policy/reporting automation.
Develop PowerShell/Python scripts for deployment, monitoring and remediation workflows.
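
Before configuring rules in the tenant, the intended DLP behaviour could be prototyped locally. The sketch below is a simplified, hypothetical model of a rule that blocks external sharing above a given sensitivity label. ShareEvent, DlpRule, and evaluate are illustrative names and do not correspond to the Microsoft 365 policy engine or to any Microsoft Graph API object.

```python
from dataclasses import dataclass

# Hypothetical label ranking (assumption); higher means more sensitive.
LABEL_RANK = {"Public": 0, "General": 1, "Confidential": 2, "Highly Confidential": 3}


@dataclass(frozen=True)
class ShareEvent:
    document: str
    label: str
    recipient_domain: str


@dataclass(frozen=True)
class DlpRule:
    name: str
    min_label: str               # rule applies at or above this label
    internal_domains: frozenset  # domains treated as inside the organisation
    action: str                  # "block" or "audit"


def evaluate(event: ShareEvent, rules: list) -> str:
    """Return the action of the first matching rule, or "allow"."""
    for rule in rules:
        external = event.recipient_domain.lower() not in rule.internal_domains
        sensitive = LABEL_RANK[event.label] >= LABEL_RANK[rule.min_label]
        if external and sensitive:
            return rule.action
    return "allow"


rules = [
    DlpRule("Block external confidential", "Confidential",
            frozenset({"contoso.com"}), "block"),
]
leak = ShareEvent("payroll.xlsx", "Highly Confidential", "gmail.com")
internal = ShareEvent("payroll.xlsx", "Highly Confidential", "contoso.com")
public = ShareEvent("brochure.pdf", "Public", "gmail.com")
```

A local model like this can double as the expected-result oracle for the test cases that are later run against the real DLP configuration.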

Automated and scalable document classification solution with demonstrable accuracy metrics.
Consistent application of sensitivity labels across a sample dataset and documentation of labelling rules.
Real-time protection mechanisms and DLP configurations that show prevention of simulated data leaks.
Implementation report, deployment scripts, test cases, and recommendations for audit readiness and compliance.
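
The accuracy metrics mentioned above could be reported per sensitivity label, for example as precision, recall, and F1. The following is a minimal sketch in plain Python; the function names and the sample labels are illustrative assumptions, and a library such as scikit-learn would normally be used instead.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_label_metrics(y_true, y_pred):
    """Return {label: {"precision", "recall", "f1"}} for every label seen."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Made-up example labels, for illustration only.
y_true = ["General", "Confidential", "Confidential", "Public"]
y_pred = ["General", "Confidential", "General", "Public"]
report = per_label_metrics(y_true, y_pred)
```

Per-label recall matters most for the highest sensitivity labels, since a missed confidential document is a potential leak while a false alarm is only an inconvenience.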

Microsoft Information Protection, Microsoft 365 Compliance Center, Azure Information Protection.
AI/ML frameworks, with experience building or adapting models for text/content classification.
Familiarity with data governance tools and DLP concepts.
Scripting/programming: PowerShell and Python; experience with Microsoft Graph API for automation.