Employer
- Carnegie Mellon University
- Forschungszentrum Jülich
- Nature Careers
- Oak Ridge National Laboratory
- University of Texas at Dallas
- University of Toronto
- Fraunhofer-Gesellschaft
- Technical University of Denmark
- Technical University of Munich
- University of California Davis
- University of Dayton
- AIT Austrian Institute of Technology
- California Institute of Technology
- Cold Spring Harbor Laboratory
- ETH Zurich
- Free University of Berlin
- IMEC
- Johns Hopkins University
- King Abdullah University of Science and Technology
- London School of Economics and Political Science
- Medical Research Council
- Purdue University
- Rutgers University
- Temple University
- The Ohio State University
- University of Arkansas
- University of California Berkeley
- University of Delaware
- University of Florida
- University of Glasgow
- University of Maryland, Baltimore County
- University of North Carolina at Chapel Hill
- University of Oxford
- University of Pittsburgh
- University of Washington
- Washington University in St. Louis
-
…current practice). Demonstrated expertise in AI/ML. Proven track record in application performance optimization. Advanced experience with parallel programming models (e.g., OpenMP, MPI, CUDA). Extensive …
-
University of North Carolina at Chapel Hill | Chapel Hill, North Carolina | United States | about 7 hours ago
…). Experience and proficiency across the software development lifecycle (version control, documentation, and testing) are required. Experience with GPU acceleration frameworks (Nvidia CUDA, PyCUDA, CuPy …
-
Programming: Python, CUDA, Git
-
…researchers to integrate computing techniques into research activities using common HPC programming languages, tools, and techniques, including Fortran and/or C/C++, MPI, OpenMP, and CUDA. An equivalent combination …
-
…scripting methods (e.g., Python, MATLAB, C++, CUDA, Bash, and/or SQL) and machine learning / deep learning methods: active learning, exploration, optimal experiment design, Bayesian optimization, reinforcement …
-
…rendering into medical imaging workflows. A major focus will be on accelerating inference and training using GPU-optimised components, including custom CUDA kernels. This role offers a unique opportunity to …
-
…using shared memory and message passing techniques. Knowledge of OpenMP and MPI or similar programming directives and libraries. Knowledge of GPU programming with CUDA, HIP, oneAPI, or OpenMP for GPUs …
-
…efficiency for serving massive models. Research and implement cutting-edge optimization strategies at the kernel level (e.g., FlashAttention, custom CUDA/ROCm kernels). Build robust data pipelines …
-
…with job schedulers, e.g. SLURM, PBS, SGE, etc. ● Experience working at an academic institution ● Experience with parallel codes and libraries (e.g. MPI, OpenMP, CUDA) ● Experience with research and/or …
-
…, CUDA) and a good understanding of hardware used in large-scale HPC clusters, such as hybrid CPU+GPU systems, memory hierarchies, and file systems; experience with job schedulers (e.g., Slurm, FLUX) and …