DrugIT
Tunisia

Software Engineer - Observability & Reliability

SaaS / Software Engineering · Backend / Database · Observability & Monitoring · Site Reliability / Systems Engineering · DevOps / CI-CD · Site Reliability Engineering (SRE)

Posted 5 days ago

Internship
⏱️ 4-6 months
💼 On-site
📅 Expires in 9 days
You build a pipeline, not a lucky break.

Job description

Mission

Make Phoenix™ & Mythik™ reliable, observable, and production-grade for long‑running AI workflows.

What you’ll do

  • Implement distributed tracing across APIs, agents, and tools
  • Design metrics, structured logs, dashboards & alerts
  • Apply reliability patterns (timeouts, retries, idempotency)
  • Support incident readiness & system debugging

What you’ll learn

  • How production systems fail—and how to fix them
  • Observability best practices used in real platforms
  • Designing for long‑running workflows
  • Engineering discipline beyond “it works on my machine”

Profile

  • Final‑year engineering student (Computer Science, Software, AI, or equivalent)
  • Strong Python backend foundations (FastAPI, async, typing)
  • Solid understanding of APIs, debugging, and system behavior
  • Curious about reliability, performance, and production systems
  • Comfortable reading documentation and learning fast