Tun Up
Tunisia

Automated Testing & Quality Evaluation Framework for AI Chatbot & Mailbot (PFE / Internship)

SaaS / Software engineering · AI / LLMs · Natural language processing · Software Testing & Benchmarking · Cloud Architecture / DevOps

Published about 14 hours ago

Internship
⏱️ 4-6 months
💼 Hybrid
📅 Expires in 13 days

Job description

Overview

  • Project: design and implement an automated quality evaluation framework for production chatbot and mailbot systems to improve reliability and quality.
  • The project is part of the Care Technology team and targets production systems using real customer interactions.

Responsibilities / What you will do

  • Build a maintainable benchmark test dataset from real customer interactions and implement a multi-layer evaluation engine (rule-based, embedding-based, LLM-based).
  • Implement false-positive control strategies using intent-scoped rules, multi-reference answers, and weighted scoring; implement automated regression detection and integrate it into the CI/CD pipeline as a quality gate.
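To make the ideas above concrete, here is a minimal sketch of how multi-reference answers, weighted scoring, and regression detection could fit together. Everything in it is an illustrative assumption rather than the team's actual design: the function names, the weights, the 0.05 tolerance, and the use of `difflib.SequenceMatcher` as a stand-in for an embedding-based similarity layer are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical layer weights; the posting does not specify any values.
WEIGHTS = {"rule": 0.5, "similarity": 0.5}


def rule_score(answer: str, required_terms: list[str]) -> float:
    """Fraction of required terms present in the answer (an intent-scoped rule)."""
    if not required_terms:
        return 1.0
    text = answer.lower()
    return sum(term.lower() in text for term in required_terms) / len(required_terms)


def similarity_score(answer: str, references: list[str]) -> float:
    """Best match over several reference answers.

    Taking the maximum means a correct answer only has to resemble one
    accepted phrasing, which helps limit false positives.
    SequenceMatcher is a stand-in for an embedding model here.
    """
    return max(
        SequenceMatcher(None, answer.lower(), ref.lower()).ratio()
        for ref in references
    )


def weighted_score(answer: str, required_terms: list[str], references: list[str]) -> float:
    """Combine the layer scores using the assumed weights."""
    scores = {
        "rule": rule_score(answer, required_terms),
        "similarity": similarity_score(answer, references),
    }
    return sum(WEIGHTS[name] * value for name, value in scores.items())


def is_regression(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a regression when the score drops by more than the tolerance."""
    return current < baseline - tolerance
```

In a CI/CD quality gate, a job could call something like `is_regression` on the aggregated benchmark score and exit with a non-zero status when it returns `True`, which would fail the pipeline step.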

Technical environment

  • Programming Language: Python; AI Stack: LLM APIs, Retrieval-Augmented Generation (RAG), Embeddings; APIs: OpenAI or similar LLM services.
  • DevOps: Git, CI/CD (GitHub Actions); Cloud: AWS; Visualization: Streamlit / Grafana / Web dashboards; Cost control: token consumption tracking, tiered evaluation strategy.

Design & Maintenance

  • Design a scalable test maintenance system (config-driven tests, versioning, human-in-the-loop review) and implement cost-aware evaluation and optimization (tiered test execution, controlled LLM usage).
  • Deliver full technical documentation and ensure the framework is maintainable and integrable into existing pipelines.
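As an illustration of the tiered, cost-aware execution mentioned above, the sketch below runs a cheap check first and escalates to an LLM judge only when the cheap score is ambiguous and token budget remains. The function name, the thresholds, the budget, and the callable signatures are hypothetical assumptions for this example; the posting does not describe the actual strategy.

```python
from typing import Callable


def tiered_evaluate(
    answer: str,
    cheap_check: Callable[[str], float],
    llm_judge: Callable[[str], tuple[float, int]],
    low: float = 0.3,
    high: float = 0.8,
    token_budget: int = 1000,
) -> dict:
    """Evaluate an answer with the cheapest tier that gives a clear verdict.

    cheap_check returns a score in [0, 1] (for example rule-based or
    embedding-based). llm_judge returns (score, tokens_used) and is
    assumed to wrap a paid LLM API call.
    """
    score = cheap_check(answer)
    # Clear pass, clear fail, or no budget left: skip the LLM tier.
    if score >= high or score <= low or token_budget <= 0:
        return {"score": score, "tier": "cheap", "tokens": 0}
    # Ambiguous score: escalate and record the tokens consumed.
    judged, tokens = llm_judge(answer)
    return {"score": judged, "tier": "llm", "tokens": tokens}
```

Returning the token count with every result lets the caller sum consumption across a test run and subtract it from the remaining budget before the next call.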

Candidate profile / Qualifications

  • Final-year student in Software Engineering with strong foundations in software development, REST APIs, and web technologies (HTTP, JSON).
  • Good programming level in Python; experience or interest in software testing and automated testing of AI systems; interest in NLP and LLMs.
  • Comfortable with Git and basic cloud concepts; analytical mindset, strong problem-solving skills, good documentation and communication skills in English; ability to work autonomously on a complex end-to-end technical project.

How to apply

  • Apply via BambooHR or the link shared in the post (see application_link).