1 Image-Driven Conversational Agent Framework for Web PFE
Lanterns Studios • Tunisia
Computer Vision (CLIP/BLIP) • Mobile & Web Development • AI / Machine Learning
Published 9 days ago
Internship
⏱️ 3-6 months
💼 Hybrid
📅 Expires in 5 days
Job description
Overview
This project focuses on developing a lightweight framework that generates a conversational AI agent from a single 2D image, designed specifically for web environments.
The system must produce a responsive on-screen persona capable of real-time interaction through text and, optionally, speech, while prioritizing fast loading and minimal computational overhead.
Key Features / Objectives
Generate an interactive AI persona from a single static 2D image, with lightweight facial reactions or expression cues and no 3D rendering.
Provide real-time conversational capabilities (text and optional voice) and prompt-based configuration for personality, tone, and behavior.
Optimize for browser performance on low-spec devices and enable simple integration into existing web applications.
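As an illustration of what prompt-based configuration of personality, tone, and behavior could look like, here is a minimal JavaScript sketch. The field names (`name`, `personality`, `tone`, `behavior`) and the `buildSystemPrompt` helper are hypothetical and not specified by the posting; the actual configuration format is left to the intern's design.

```javascript
// Hypothetical persona defaults; every field name here is an assumption.
const defaultPersona = {
  name: "Assistant",
  personality: "friendly",
  tone: "concise",
  behavior: ["Stay on topic.", "Ask for clarification when unsure."],
};

// Merge user overrides onto the defaults and render a system prompt string
// that could be passed to whichever language model backs the agent.
function buildSystemPrompt(overrides = {}) {
  const persona = { ...defaultPersona, ...overrides };
  const rules = persona.behavior.map((rule) => `- ${rule}`).join("\n");
  return [
    `You are ${persona.name}.`,
    `Personality: ${persona.personality}.`,
    `Tone: ${persona.tone}.`,
    "Behavior rules:",
    rules,
  ].join("\n");
}
```

A host page could then call `buildSystemPrompt({ name: "Lumi", tone: "playful" })` and keep the remaining defaults, which fits the goal of simple integration into existing web applications.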
Technical Stack & Responsibilities
Implement using JavaScript and WebAssembly, leveraging ONNX Runtime Web or TensorFlow.js for model inference in-browser.
Integrate Speech-to-Text and Text-to-Speech APIs for optional voice interaction and use lightweight vision models for facial cue generation.
Responsibilities include designing the framework architecture, model selection/tuning for web inference, performance optimization, and creating integration examples/demos.
Deliverables expected: working web prototype, performance benchmarks on low-spec devices, integration guide, and documentation for prompt-based configuration.
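For the performance benchmarks listed among the deliverables, a small timing harness along the following lines may be a useful starting point. It is a sketch only: the `benchmark` helper, its `warmup` and `runs` options, and the reported statistics are assumptions, and `runInference` stands in for whatever model call is actually being measured.

```javascript
// Time an async function over several runs and summarize latency in milliseconds.
// `runInference` is a placeholder for the model call under test.
async function benchmark(runInference, { warmup = 3, runs = 20 } = {}) {
  // Warm-up runs let JIT compilation and caches settle before measuring.
  for (let i = 0; i < warmup; i++) await runInference();

  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await runInference();
    samples.push(performance.now() - start);
  }

  // Sort ascending so percentiles can be read off by index (approximate).
  samples.sort((a, b) => a - b);
  const percentile = (p) =>
    samples[Math.min(samples.length - 1, Math.floor(p * samples.length))];

  return {
    runs,
    median: percentile(0.5),
    p95: percentile(0.95),
    max: samples[samples.length - 1],
  };
}
```

In the browser this could wrap, for instance, an ONNX Runtime Web `session.run(feeds)` call, as in `await benchmark(() => session.run(feeds))`, and the resulting numbers could be collected on each low-spec target device.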