As an intern, you join the core DL team and contribute at one or both layers of our training stack: Pretraining (learning generalizable representations on large, de-identified datasets) and Fine-tuning (adapting those representations, and VLMs, to clinical tasks and shipping the improvements). Work is prioritized by expected product and clinical impact; research is a means to that end.
Where you’ll contribute
Pretraining (images ± text). Design and scale representation learning for 2D/3D medical imaging:
- Objectives: masked image modeling, self-distillation/contrastive (MAE/DINO-style), and vision–language alignment (CLIP-style, with paired radiology reports); see the contrastive sketch after this list.
- Modalities/architectures: X-ray, CT, occasional MRI; 2D/3D ViTs and UNet-style decoders.
- Systems: high-throughput DICOM loaders, strong augmentations, mixed-precision, distributed training.
- Evaluation: transfer to target tasks, label-efficiency curves, robustness across sites/vendors.
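To make the vision–language objective concrete, here is a minimal sketch of a CLIP-style symmetric contrastive (InfoNCE) loss over paired image/report embeddings. The function name, dimensions, and fixed temperature are illustrative assumptions, not our production setup.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(img_emb: torch.Tensor,
                    txt_emb: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of paired image/report embeddings.

    img_emb, txt_emb: (B, D) projections from an image encoder and a
    report/text encoder; pairs are aligned along the batch axis.
    """
    # L2-normalize so the dot products are cosine similarities.
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)

    # (B, B) similarity matrix; diagonal entries are the true pairs.
    logits = img_emb @ txt_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image-to-text and text-to-image cross-entropy terms.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```

In practice the temperature is usually a learned parameter and the similarity matrix is gathered across GPUs, but the loss itself stays this simple.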
Fine-tuning (product models & VLMs). Adapt and optimize models that power our products and workflows:
- Core tasks: detection/segmentation/registration; follow-up (temporal matching, lesion tracking, measurements); calibration & uncertainty.
- VLMs: image encoder → decoder LLM for report generation/summarization/structured extraction. Techniques include supervised fine-tuning on paired image–report data, instruction tuning, alignment of visual tokens (e.g., resamplers/Q-Former-style adapters), and LoRA/PEFT on both vision and language components; a minimal LoRA sketch follows this list.
- Efficiency & deployment: distillation, pruning/quantization, KV caching and batching for throughput; ONNX/TensorRT inference.
- Evaluation: AUROC/FROC/Dice and calibration (ECE; sketched below); for VLMs: finding-level label agreement, RadGraph-style entity/relation metrics, factuality checks against imaging labels, and clinician review.
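As one concrete example of PEFT on the language side, the sketch below attaches LoRA adapters to a decoder LLM with Hugging Face's peft library. The checkpoint name and target modules are assumptions for illustration; the same pattern applies to projections in the vision tower.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical base decoder; substitute the LLM actually used in the VLM.
llm = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_cfg = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

llm = get_peft_model(llm, lora_cfg)
llm.print_trainable_parameters()  # typically well under 1% of base weights
```

Only the adapter weights train; the frozen base can be shared across tasks, and adapters can be merged back into the base for deployment.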
You’ll spend time where it moves metrics most. Some interns focus on pretraining, others on fine-tuning/VLMs; many touch both.
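For calibration, the expected calibration error referenced above reduces to a short function. This binned version is a standard sketch, with the bin count as a free parameter.

```python
import torch

def expected_calibration_error(confidences: torch.Tensor,
                               correct: torch.Tensor,
                               n_bins: int = 15) -> torch.Tensor:
    """Binned ECE: bin-weighted |accuracy - confidence|.

    confidences: (N,) predicted probability of the predicted class.
    correct:     (N,) 1.0 where the prediction matched the label, else 0.0.
    """
    bins = torch.linspace(0.0, 1.0, n_bins + 1)
    ece = torch.zeros(())
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].float().mean()   # empirical accuracy in bin
            conf = confidences[mask].mean()      # mean confidence in bin
            ece += mask.float().mean() * (acc - conf).abs()
    return ece
```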
How we work (engineering standards)
- Reproducibility: Hydra configs, seeded runs, a model registry, and tracked experiments; see the seeding sketch after this list.
- Production-ready code: typed Python, tests on data/metrics, documented PRs, code review.
- Measured progress: clear win criteria on accuracy, generalization, latency, and memory.
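A minimal sketch of what "Hydra configs, seeded runs" looks like in practice; the config path and `seed` key are hypothetical.

```python
import random

import hydra
import numpy as np
import torch
from omegaconf import DictConfig

def seed_everything(seed: int) -> None:
    """Seed the RNGs that affect a PyTorch training run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

# Assumed layout: conf/train.yaml containing a top-level `seed` key.
@hydra.main(config_path="conf", config_name="train", version_base=None)
def main(cfg: DictConfig) -> None:
    seed_everything(cfg.seed)
    ...  # build data pipeline and model from cfg, then launch training

if __name__ == "__main__":
    main()
```

Full determinism on GPU additionally needs `torch.use_deterministic_algorithms(True)` and cuDNN settings, usually at some throughput cost.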
Our stack
PyTorch (Lightning), MONAI, timm/Hugging Face; NumPy/scikit-image; DICOM tooling; Weights & Biases (W&B) / ClearML/DVC; multi-GPU training; ONNX/TensorRT for inference; containerized services.
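On the inference path, exporting a trained PyTorch model to ONNX is the first step before TensorRT compilation. Below is a sketch with a stand-in network; the shapes and tensor names are illustrative.

```python
import torch
import torch.nn as nn

# Stand-in network; in practice this is a trained detection/segmentation model.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=1),
).eval()

dummy = torch.randn(1, 1, 512, 512)  # (batch, channel, H, W), e.g. one CT slice

torch.onnx.export(
    model,
    dummy,
    "model.onnx",
    input_names=["image"],
    output_names=["logits"],
    dynamic_axes={"image": {0: "batch"}, "logits": {0: "batch"}},
    opset_version=17,
)
# The exported graph can then be built into a TensorRT engine (e.g. via trtexec).
```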