Trying to install the CUDA toolkit on Fedora 40
Reddit » CUDA
by /u/OlaoluwaM
9h ago
The toolkit seems to have a repo only for F39. I was wondering if I could use the local RPM or the .run file as an alternative, but I'm not entirely sure, since they're probably both built for F39 as well. I would appreciate any insights. Thanks!
Need help in optimisation
Reddit » CUDA
by /u/Sad_Significance5903
9h ago
Hello! I am trying to implement an algorithm that requires finding the row sums of a 2D matrix, for example:
    0 13 21 22 = 56
    13 0 12 13 = 38
    21 12 0 13 = 46
    22 13 13 0 = 48
I am currently using atomicAdd, which takes a lot of time to compute:
    __global__ void rowsum(int *d_matrix, int *d_sums, int n) {
        long block_Idx = blockIdx.x + (gridDim.x) * blockIdx.y + (gridDim.y * gridDim.x) * blockIdx.z;
        long thread_Idx = threadIdx.x + (blockDim.x) * threadIdx.y + (blockDim.y * blockDim.x) * threadIdx.z;
        long block_Capacity = blockDim.x * blockDim.y * blockDim.z;
        long i = block_Idx * block_Capacity + t ...
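One common way to avoid atomicAdd for a row sum is to give each row its own thread block and reduce inside the block through shared memory. The sketch below is not the poster's code: the names `row_sum_kernel`, `row_sums`, and `kThreads` are made up for illustration, the matrix is assumed to be n x n and stored row-major, and error checking on the CUDA calls is left out for brevity.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Threads per block (hypothetical choice); must be a power of two
// for the tree reduction below.
constexpr int kThreads = 256;

// One block per row: each thread accumulates a strided slice of the row,
// then the block combines the partial sums in shared memory. No atomics.
__global__ void row_sum_kernel(const int *matrix, int *sums, int n) {
    __shared__ int partial[kThreads];
    const int row = blockIdx.x;

    int local = 0;
    for (int col = threadIdx.x; col < n; col += blockDim.x) {
        local += matrix[static_cast<size_t>(row) * n + col];
    }
    partial[threadIdx.x] = local;
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) {
        sums[row] = partial[0];
    }
}

// Host wrapper: copies an n x n row-major matrix to the device,
// launches one block per row, and returns the n row sums.
std::vector<int> row_sums(const std::vector<int> &h_matrix, int n) {
    std::vector<int> h_sums(n);
    if (n == 0) return h_sums;

    int *d_matrix = nullptr;
    int *d_sums = nullptr;
    const size_t bytes = static_cast<size_t>(n) * n * sizeof(int);
    cudaMalloc(&d_matrix, bytes);
    cudaMalloc(&d_sums, n * sizeof(int));
    cudaMemcpy(d_matrix, h_matrix.data(), bytes, cudaMemcpyHostToDevice);

    row_sum_kernel<<<n, kThreads>>>(d_matrix, d_sums, n);

    cudaMemcpy(h_sums.data(), d_sums, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_matrix);
    cudaFree(d_sums);
    return h_sums;
}
```

Calling `row_sums` on the 4 x 4 example from the post should give 56, 38, 46, 48. Adjacent threads read adjacent columns, so the global loads are coalesced; for very small n most of the 256 threads sit idle, and a warp-per-row variant may be a better fit.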
Need a recommendation for a low profile NVIDIA GPU
Reddit » CUDA
by /u/danulagod
2d ago
Hi all, I'm looking for recommendations for a low-profile GPU to be used for parallel computing applications with CUDA. The GPU is to be installed in a Dell R540, a 2U rack-mounted server with no support for external power supplies to the GPU. I have been using an old NVIDIA Quadro NVS 295 and am ready to upgrade to something new with more CUDA capabilities. I appreciate everyone's insight!
WSL + CUDA + Tensorflow + PyTorch in 10 minutes
Reddit » CUDA
by /u/Ttmx
2d ago
https://blog.tteles.dev/posts/gpu-tensorflow-pytorch-cuda-wsl/ I spent 2 days attempting to configure GPU acceleration for TF and PyTorch and condensed it into a 10-minute guide, where most of the time is spent on downloads. None of the guides I found online worked for me. I'd be very happy to receive feedback.
Non-VOLTA requirement version?
Reddit » CUDA
by /u/SpartonDawg
2d ago
I am currently using Dask and wanted to experiment with cudf. I successfully installed everything in Ubuntu, but when I ran <conda create -n rapids-24.04 -c rapidsai -c conda-forge -c nvidia rapids=24.04 python=3.11 cuda-version=12.2> I realized my GTX 1080 Ti does not meet the Compute Capability requirement. What is my best path forward? Give up and wait until I upgrade my GPU, or is it stable to work with an older version?
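For anyone wanting to confirm what their card reports before installing, a small check with the CUDA runtime can help. This is only a sketch: the helper names (`encode_cc`, `meets_requirement`, `query_cc`) are invented here, and the 7.0 threshold is an assumption based on the Volta floor the post title refers to. A GTX 1080 Ti is a Pascal card and reports compute capability 6.1.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Encode a compute capability as major * 10 + minor, e.g. 6.1 -> 61, 7.0 -> 70.
int encode_cc(int major, int minor) { return major * 10 + minor; }

// True if the device capability is at least the required one.
bool meets_requirement(int device_cc, int required_cc) {
    return device_cc >= required_cc;
}

// Query a device through the CUDA runtime and print what it reports.
// Returns -1 if the query fails (for example, no driver or no GPU).
int query_cc(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return -1;
    }
    std::printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    return encode_cc(prop.major, prop.minor);
}
```

A caller would compare `query_cc(0)` against `encode_cc(7, 0)` (the assumed Volta minimum) with `meets_requirement`; a result of 61 fails that check.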
I had my first CUDA related job interview and the interviewer confused CUDA with Quantum Computing
Reddit » CUDA
by /u/Routine-Winner2306
3d ago
The woman conducting the interview was talking about Quantum Computing. After saying that I had no idea about quantum computing at all, I pointed out that it was not in the job description, to which she replied that it was a requirement for the job. She got nervous instantly. She couldn't explain whether the job required OpenAI's Triton or NVIDIA's Triton Inference Server. Sorry, I just wanted to vent.
How to set up Nsight Compute Locally to profile Remote GPUs
Reddit » CUDA
by /u/droidarmy95
3d ago
How to see cuBLAS data layout?
Reddit » CUDA
by /u/foxNOTflower
4d ago
The NVIDIA docs say the cuBLAS library uses column-major storage, but I have a matrix:
    1 2 3 4 5
    6 7 8 9 10
    ...
    21 22 23 24 25
and this kernel function:
    //single thread print matrix
    __global__ void printMatrixWithIndex(int *a, int n) {
        for(auto r=0;r!=5;++r) {
            for(auto c=0;c!=5;++c) {
                printf("%d ", a[(r)*5+(c)]);
            }
            printf("\n");
        }
    }
It should print 1, 6, ... if the data is column-major, but it still prints 1 2 3 4 5 ... The complete code is here:
    #include <cuda_runtime.h>
    #include <cublas_v2.h>
    #include <iostream>
    #include <algorithm>
    #include <numeric>
    //single thread print m ...
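A likely explanation is that the buffer is just linear memory: indexing it as `a[r*5+c]` walks it in memory order no matter what cuBLAS does, and the column-major convention only governs how cuBLAS routines interpret that buffer. To view the same memory the way cuBLAS would, the index has to be `c * ld + r`, where `ld` is the leading dimension (the number of rows here). The sketch below uses hypothetical helper names (`idx_row_major`, `idx_col_major`, `print_as_col_major`) and assumes the 5 x 5 buffer filled with 1..25 from the post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Row-major: elements of one row are adjacent in memory.
__host__ __device__ inline int idx_row_major(int r, int c, int cols) {
    return r * cols + c;
}

// Column-major (the cuBLAS convention): elements of one column are adjacent.
// ld is the leading dimension, i.e. the number of rows for a dense matrix.
__host__ __device__ inline int idx_col_major(int r, int c, int ld) {
    return c * ld + r;
}

// Single-thread kernel that prints the buffer as cuBLAS would interpret it.
// Intended launch: print_as_col_major<<<1, 1>>>(d_a, 5, 5);
__global__ void print_as_col_major(const int *a, int rows, int cols) {
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            printf("%d ", a[idx_col_major(r, c, rows)]);
        }
        printf("\n");
    }
}
```

With the buffer holding 1..25 in memory order, the first printed row of the column-major view is 1 6 11 16 21, because element (0, 1) sits at offset 5.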
Ideas for parallel programming project
Reddit » CUDA
by /u/AnnualHold2890
4d ago
This semester I have a parallel computing course, and I have to propose a project with a deadline of one month. I am a backend engineer and have been working with servers since 2018, so I currently have no idea what to do or implement as my project. What are your ideas (ideally ones with the potential to become an academic paper)?
Tensorflow not detecting gpu
Reddit » CUDA
by /u/Vengeaence
4d ago
I have the proper Windows GPU-supported TensorFlow 2.10 version installed and verified with pip. I have CUDA 11.2 installed. The system PATH variable is set for "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin". cuDNN is installed, with the system PATH set to "C:\Program Files\NVIDIA\CUDNN\v8.1\bin". I get:
    C:\Users\Anonymous>python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
    2024-04-21 15:27:31.033958: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
The cud ...
