maintain the workload scheduler and architect quality-of-service policies. Administer Linux systems across infrastructure projects and the deployment of new GPUs for research and teaching. Troubleshoot complex
-
Massachusetts Institute of Technology (MIT) | Cambridge, Massachusetts | United States | about 1 month ago
performance of complex AI research workloads on state-of-the-art hardware. The role will have a heavy focus on optimizing existing NVIDIA GPU-based workloads for top-tier AMD GPUs, such as the MI355X and beyond, and will analyze
-
operation of Horizon, NSF’s next-generation leadership-class GPU system based on NVIDIA accelerator technologies. Horizon will significantly expand TACC’s capabilities in large-scale simulation, data
-
researchers from all disciplines and regions of Canada. Your opportunity: SciNet is in the process of installing a new AI capability, with a large number of high-performance GPUs, as part of the ISED Sovereign AI Compute initiative. The incumbent is expected to support the
-
, regional, and national professional meetings, workshops, and conferences. The Machine Learning Engineer will have the opportunity to work with leading-edge GPU and HPC technologies and engage with domain
-
PyTorch or TensorFlow/JAX frameworks. Tools: Experience with GPU training and large-scale data management. Soft Skills: Strong autonomy, intellectual curiosity, and ability to communicate effectively in
-
processor topology. On modern servers, Non-Uniform Memory Access (NUMA) architectures and GPU accelerators introduce asymmetric memory access costs that remain largely invisible to application-level code yet
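The NUMA point above can be made concrete. On Linux, the CPUs belonging to each NUMA node are listed in `/sys/devices/system/node/node<N>/cpulist`, and a process can pin itself to one node's CPUs so its allocations stay node-local. A minimal sketch, assuming Linux and CPython; `parse_cpulist` and `pin_to_numa_node` are illustrative names, not taken from any listing here:

```python
import os

def parse_cpulist(spec: str) -> set:
    """Parse a Linux cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def pin_to_numa_node(node: int) -> None:
    """Restrict the current process to the CPUs of one NUMA node (Linux only).

    Keeping the process on one node avoids the remote-memory accesses
    that make NUMA costs asymmetric in the first place.
    """
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        cpus = parse_cpulist(f.read())
    os.sched_setaffinity(0, cpus)  # pid 0 = the calling process
```

On GPU servers this is typically paired with device locality, e.g. checking `nvidia-smi topo -m` to see which NUMA node each GPU is attached to before choosing which node to pin to.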
-
managing GPU-enabled infrastructure (NVIDIA GPUs, CUDA, multi-GPU systems) in cloud and/or on-prem environments. Familiarity with GPU orchestration in Kubernetes (e.g., NVIDIA device plugin, GPU scheduling
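The Kubernetes GPU scheduling mentioned above works through extended resources: once the NVIDIA device plugin is running on a node, pods request GPUs via the `nvidia.com/gpu` resource and the scheduler places them on nodes with free devices. A minimal sketch of such a pod spec; the pod name and image tag are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test                           # illustrative name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # illustrative tag
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1                        # one GPU via the device plugin
```

Note that GPUs are requested under `limits` only; Kubernetes does not support overcommitting them the way it does CPU and memory.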