-
, and parallel computing, with a proven ability to work within highly secure and regulated environments. This role involves close collaboration with security teams, scientists, and IT leadership to ensure
-
strategic management and strict adherence to security protocols. We are looking for candidates with extensive experience in either classified HPC data center operations, architecture, parallel computing
-
systems, high-speed parallel file systems, and archival solutions critical to advancing scientific discovery and innovation. As part of ORNL’s leadership-class computing ecosystem, you will play a vital
-
frameworks to maintain secure and compliant environments. Document system architectures, processes, and best practices, and contribute to internal knowledge sharing. Participate in on-call rotations and off
-
simulation codes require a mesh to discretize the spatial domain of interest. For some applications, generating this mesh has historically been a labor-intensive process because it is shown to be a leading
-
and clustered computing services to researchers who process large data sets and/or develop code as a part of their project. Ensure the availability, performance, scalability, and security of production
-
Lustre parallel file system. NCCS serves multiple agencies including DOE, NOAA, and the Air Force. The NCCS also supports the center’s Quantum Computing User Program (QCUP) which provides access to state
-
for Science @ Scale: Pretraining, instruction tuning, continued pretraining, Mixture-of-Experts; distributed training/inference (FSDP, DeepSpeed, Megatron-LM, tensor/sequence parallelism); scalable evaluation