Asteroidea
Tunisia

AST-2026-003 Astro Cloud Monitoring & Automated Deployment Platform DevOps 08 PFE

DevOps / CI-CD · Cloud Monitoring · CI/CD Automation

Published 7 months ago

Internship
⏱️ 4-6 months
💼 Hybrid
📅 Expired 6 months ago

Job description

Overview

  • Design and implement a centralized cloud monitoring and alerting platform to supervise all Asteroidea SaaS products.
  • Build real-time dashboards, incident detection and alerting mechanisms, and automated deployment pipelines to ensure continuous availability and reliability.

Responsibilities

  • Deploy and maintain a container orchestration environment using Docker and Kubernetes for both the monitored services and the monitoring components.
  • Implement CI/CD pipelines (GitLab CI) and automated provisioning/configuration using Terraform and Ansible to enable repeatable, auditable deployments.

Technical Scope & Tools

  • Integrate ELK Stack (Elasticsearch, Logstash, Kibana) for centralized logging and dashboards; evaluate and integrate Checkmk or a similar system for host/service checks and metrics.
  • Implement alerting and incident routing (email/Slack/pager) with clear thresholds, escalation policies, and runbooks.

Deliverables

  • Production-ready monitoring platform with dashboards, alert rules, and documented runbooks for common incidents.
  • Automated IaC modules (Terraform) and configuration playbooks (Ansible) plus GitLab CI pipelines to deploy and update the platform.

Quality, Testing & Reliability

  • Define SLOs/SLIs and build monitoring to track them (latency, error rates, availability) and produce reports for performance tuning and capacity planning.
  • Implement end-to-end tests for deployment pipelines and continuous verification of monitoring components; include CI jobs for linting and security checks.
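
As an illustration of the SLO/SLI tracking mentioned above, the following is a minimal sketch of how an availability SLI and the remaining error budget could be computed from request counts. The names `SloReport` and `availability_report`, and the 99.9% target used in the note below, are hypothetical and not part of the Asteroidea platform; a real implementation would read these counts from the monitoring stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float                     # measured fraction of successful requests
    slo_target: float              # target fraction of successful requests
    error_budget_remaining: float  # share of the budget left (negative if overspent)


def availability_report(total_requests: int, failed_requests: int,
                        slo_target: float) -> SloReport:
    """Compute an availability SLI and the remaining error budget.

    Hypothetical helper for illustration only.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    # SLI: fraction of requests that succeeded in the measurement window.
    sli = (total_requests - failed_requests) / total_requests

    # Error budget: number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * total_requests

    # Remaining budget as a fraction; 1.0 means untouched, below 0 means overspent.
    remaining = 1.0 - failed_requests / allowed_failures
    return SloReport(sli=sli, slo_target=slo_target,
                     error_budget_remaining=remaining)
```

For example, `availability_report(1_000_000, 500, 0.999)` yields an SLI of 0.9995 with about half of the error budget still available. An alert rule could then fire when the remaining budget drops below a chosen threshold.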

Integration & Operational Handover

  • Ensure the platform supervises all existing Asteroidea SaaS products, integrating with service endpoints, metrics exporters and log shippers.
  • Produce operational documentation and a handover package for on-call teams, plus runbooks for alert investigation and resolution.

Required Skills & Expectations

  • Hands-on experience with Kubernetes, Docker, ELK Stack, GitLab CI, Ansible, Terraform and monitoring tools (Checkmk or similar).
  • Strong understanding of CI/CD best practices, observability principles, alerting design, and infrastructure-as-code workflows.

How to apply