IOVISION
Tunisie

05 04 04 Project 1: Development of a Secure Local Large Language Model (LLM) for Confidential Data Processing PFE

Natural Language Processing · Machine Learning Engineering · Data Privacy & Security

Published about 8 hours ago

Internship
⏱️ 3-6 months
💼 Hybrid
📅 Expires in 13 days

Job Description

Project Overview

  • Design and implement a local Large Language Model (LLM) capable of processing confidential and sensitive data securely within a private infrastructure.
  • Adapt and fine-tune open-source LLMs (e.g., GPT-style, BERT, or LLaMA models) on domain-specific custom datasets to ensure high performance while preserving data privacy.
  • Deliver a fully functional local AI assistant optimized for enterprise use cases requiring private, high-accuracy natural language understanding.
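The fine-tuning goal above starts with data preparation. A minimal sketch of turning domain Q/A pairs into instruction-style JSONL, a common input format for LLM fine-tuning (the field names and example records are illustrative assumptions, not taken from the posting):

```python
import json

def build_training_records(qa_pairs):
    """Convert domain question/answer pairs into instruction-style
    JSONL records for LLM fine-tuning. The field names
    ("instruction", "response") are illustrative placeholders."""
    records = []
    for question, answer in qa_pairs:
        records.append({"instruction": question.strip(),
                        "response": answer.strip()})
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

# Two hypothetical domain Q/A pairs:
jsonl = build_training_records([
    ("What is the data retention policy?", "Data is retained for 30 days."),
    ("Who can access audit logs?", "Only administrators."),
])
print(jsonl.splitlines()[0])
```

A real pipeline would then feed these records to a fine-tuning framework; the record shape stays the same.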

Technical Components & Approach

  • Integrate modern AI agent architectures and apply Retrieval-Augmented Generation (RAG) techniques to improve contextual understanding and response accuracy.
  • Implement vector database integration (e.g., FAISS/Milvus-style approaches) for efficient knowledge retrieval, enabling dynamic interaction with structured and unstructured data sources.
  • Focus on on-premise/private deployment concerns: secure data handling, encryption at rest/in transit, access controls, model quantization or optimization for efficient inference.
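The retrieval step described above can be illustrated with a toy in-memory store. A production deployment would use FAISS or Milvus with an embedding model, but the core ranking logic is the cosine-similarity search sketched here (all vectors and documents are invented examples):

```python
import math

class TinyVectorStore:
    """Minimal in-memory vector store illustrating FAISS/Milvus-style
    nearest-neighbour retrieval for a RAG pipeline. In production an
    embedding model produces the vectors; here they are hand-picked."""
    def __init__(self):
        self.entries = []  # list of (vector, document) pairs

    def add(self, vector, document):
        self.entries.append((vector, document))

    def search(self, query, k=2):
        # Rank stored documents by cosine similarity to the query vector.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]),
                        reverse=True)
        return [doc for _, doc in ranked[:k]]

store = TinyVectorStore()
store.add([1.0, 0.0], "encryption policy")
store.add([0.0, 1.0], "holiday schedule")
store.add([0.9, 0.1], "key management guide")
print(store.search([1.0, 0.05], k=2))
# → ['encryption policy', 'key management guide']
```

The retrieved documents would then be concatenated into the LLM prompt as context, which is the "augmented generation" half of RAG.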

Tasks & Deliverables

  • Prepare and preprocess domain-specific datasets, design fine-tuning pipelines, and run experiments to benchmark accuracy, latency, and privacy trade-offs.
  • Produce an end-to-end solution: trained/fine-tuned model artifacts, retrieval pipeline, vector DB integration, containerized deployment (Docker/Kubernetes), and a demoable local AI assistant.
  • Provide documentation: setup and deployment guide, evaluation reports (metrics, tests), reproducible training/inference scripts, and a user guide for enterprise integration.
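The latency benchmarking mentioned in the deliverables might, assuming the local model is reachable as a plain Python callable, look like this sketch (the lambda below stands in for a real inference call):

```python
import statistics
import time

def benchmark_latency(infer_fn, prompts, runs=3):
    """Measure per-prompt inference latency over several runs and
    report median and worst-case timings. infer_fn is any callable
    standing in for the local model's inference entry point."""
    timings = []
    for _ in range(runs):
        for prompt in prompts:
            start = time.perf_counter()
            infer_fn(prompt)
            timings.append(time.perf_counter() - start)
    return {"median_s": statistics.median(timings),
            "max_s": max(timings),
            "samples": len(timings)}

# Stand-in for a local LLM call (a real harness would invoke the
# deployed model's generate/inference endpoint instead).
report = benchmark_latency(lambda p: p.upper(), ["hello", "world"], runs=2)
print(report["samples"])  # 4 timing samples collected
```

The same harness can be re-run after quantization or other optimizations to quantify the latency/accuracy trade-offs the posting asks to benchmark.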

Evaluation & Success Criteria

  • Quantitative evaluation on domain tasks (accuracy, F1, ROUGE/BLEU where applicable) and qualitative assessment of response relevance and contextual handling via RAG.
  • Performance targets for latency and resource usage for local inference; validation of privacy requirements (no exfiltration of sensitive training data during inference).
  • Demonstrable prototype deployed on private infrastructure with test cases showing secure handling of confidential inputs.
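One concrete form of the quantitative evaluation above is token-overlap F1, as used in extractive QA benchmarks. A minimal sketch (deliberately simplified relative to production evaluation suites, which also normalize punctuation and articles):

```python
from collections import Counter

def token_f1(prediction, reference):
    """Token-overlap F1: precision and recall over tokens shared
    between prediction and reference, harmonically combined."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token occurrence once.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(round(token_f1("data retained 30 days",
                     "data is retained for 30 days"), 2))  # → 0.8
```

ROUGE and BLEU extend the same overlap idea to n-grams, which is why they appear alongside F1 in the evaluation criteria.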

Candidate Profile & Required Skills

  • Strong experience with NLP and transformer models (Hugging Face Transformers, PyTorch/TF) and hands-on fine-tuning of open-source LLMs.
  • Experience with retrieval systems and vector databases (FAISS, Milvus, or similar), building RAG pipelines, and knowledge of secure deployment practices.
  • Software engineering skills: Python, Docker, Linux, CI/CD basics; familiarity with privacy-preserving ML techniques (differential privacy, federated learning) is a plus.

How to Apply

  • To apply, send your CV, cover letter, and relevant code or project examples to hr@iovision.io.
  • Use the email subject: "Application — 05 04 04 Project 1: Secure Local LLM for Confidential Data Processing" and include a brief summary of related experience and proposed technical approach.