ReDX Technologies
Tunisia

Automated Application-Level Testing for HPC Software Ecosystems

HPC, Quality Assurance / Software Testing, DevOps / CI/CD / Integration, Cloud infrastructure / DevOps, GPU Computing, Scientific Computing

Posted about 20 hours ago

Internship
⏱️ 2-3 months
💼 On-site
💰 Paid
📅 Expires in 13 days

Job description

Brief: Design representative application-level test suites for real HPC workloads (CUDA, OpenMPI, PyTorch, TensorFlow, VASP, Quantum ESPRESSO) and integrate them into CI/CD to improve reliability and efficiency of the HPC software ecosystem.

Goals and responsibilities:

  • Build application/library test cases validating correctness and basic performance on CPU and GPU.
  • Integrate the tests into CI using ReFrame and Jenkins for automated, periodic validation of the software stack.
  • Ensure scalability and portability across architectures, compilers, and configurations managed with EasyBuild/Spack and modules.

Required skills:

  • Linux command line and shell; solid development in C/C++ or Python.
  • Problem‑solving mindset; good English; strong organization; familiarity with project‑management tools.

Planned training:

  • Linux fundamentals (Udemy), intro to HPC and parallel programming, EasyBuild & Spack, 1:1 mentorship with ReDX engineers.

Other details:

  • Recommended period: 2–3 months.
  • Compensation: Monthly stipend with potential end‑of‑internship performance bonus.
  • Opportunity to work on real HPC systems and interact with end users.