Efficient CUDA Debugging: Using NVIDIA Compute Sanitizer with NVIDIA Tools Extension and Creating Custom Tools
NVIDIA Developer » CUDA
by Paul Graham
2w ago
NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications.
Building High-Performance Applications in the Era of Accelerated Computing
NVIDIA Developer » CUDA
by Robert Jensen
3w ago
AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements of these new AI workloads, HPC is scaling up at a rapid pace. To enable applications to scale to multi-GPU and multi-node platforms, HPC tools and libraries must support that growth. NVIDIA provides a comprehensive ecosystem of…
Just Released: NVIDIA cuSPARSELt 0.6
NVIDIA Developer » CUDA
by Robert Jensen
1M ago
NVIDIA cuSPARSELt harnesses Sparse Tensor Cores to accelerate general matrix multiplications. Version 0.6 adds support for the NVIDIA Hopper architecture.
CUDA Toolkit 12.4 Enhances Support for NVIDIA Grace Hopper and Confidential Computing
NVIDIA Developer » CUDA
by Rob Armstrong
1M ago
The latest release of CUDA Toolkit, version 12.4, continues to push accelerated computing performance using the latest NVIDIA GPUs. This post explains the new features and enhancements included in this release: CUDA and the CUDA Toolkit software provide the foundation for all NVIDIA GPU-accelerated computing applications in data science and analytics, machine learning…
Optimizing OpenFold Training for Drug Discovery
NVIDIA Developer » CUDA
by Feiwen Zhu
1M ago
Predicting 3D protein structures from amino acid sequences has been an important long-standing question in bioinformatics. In recent years, deep learning–based computational methods have been emerging and have shown promising results. Among these lines of work, AlphaFold2 is the first method that has achieved results comparable to slower physics-based computational methods.
Just Released: cuBLASDx
NVIDIA Developer » CUDA
by Robert Jensen
2M ago
cuBLASDx allows you to perform BLAS calculations inside your CUDA kernel, improving the performance of your application. Available to download in Preview now.
Improving CUDA Initialization Times Using cgroups in Certain Scenarios
NVIDIA Developer » CUDA
by Rahul Ramasubramanian
2M ago
Many CUDA applications running on multi-GPU platforms usually use a single GPU for their compute needs. In such scenarios, a performance penalty is paid by applications because CUDA has to enumerate/initialize all the GPUs on the system. If a CUDA application does not require other GPUs to be visible and accessible, you can launch such applications by isolating the unwanted GPUs from the CUDA process and eliminating unnecessary initializ…
Just Released: cuBLASMp
NVIDIA Developer » CUDA
by Robert Jensen
2M ago
cuBLASMp is a high-performance, multi-process, GPU-accelerated library for distributed basic dense linear algebra. It is available to download in Preview now.
CUDA Quantum 0.5 Delivers New Features for Quantum-Classical Computing
NVIDIA Developer » CUDA
by Efrat Shabtai
2M ago
CUDA Quantum is a platform for building quantum-classical computing applications. It is an open-source programming model for heterogeneous computing such as quantum processor units (QPUs), GPUs, and CPUs. CUDA Quantum accelerates workflows such as quantum simulation, quantum machine learning, quantum chemistry, and more. It optimizes these workflows as part of its compiler toolchain and uses the power of GPUs to accelerate them. C…
Unlocking GPU Intrinsics in HLSL
NVIDIA Developer » CUDA
by Alexey Panteleev
2M ago
There are some useful intrinsic functions in the NVIDIA GPU instruction set that are not included in standard graphics APIs. Updated from the original 2016 post to add information about new intrinsics and cross-vendor APIs in DirectX and Vulkan. For example, a shader can use warp shuffle instructions to exchange data between threads in a warp without going through shared memory, which is especially valuable in pixel shaders where th…