-
, distillation, sparsity, and parallelism to improve model efficiency. Work with deployment tools such as ONNX, TensorRT, cuDNN, vLLM, and SGLang for fast and reliable inference. Design and scale infrastructure
-
, consider the usage of standards, and contribute to standardisation documents. Present findings at international events. The research activities will be hosted by the Parallel Computing and Optimisation Group