Reddit » CUDA on Feedspot

Trying to install the CUDA toolkit on Fedora 40

Reddit » CUDA

by /u/OlaoluwaM

9h ago

The CUDA toolkit seems to have a repo only for F39. I was wondering whether I could use the local RPM or the .run file as an alternative, but I'm not sure, since both are probably built for F39 as well. I'd appreciate any insights. Thanks!

Need help in optimisation

by /u/Sad_Significance5903

9h ago

Hello! I am trying to implement an algorithm that requires the row sums of a 2D matrix, for example:

0 13 21 22 = 56
13 0 12 13 = 38
21 12 0 13 = 46
22 13 13 0 = 48

I am currently using atomicAdd, which takes a lot of time to compute:

__global__ void rowsum(int *d_matrix, int *d_sums, int n) {
    long block_Idx = blockIdx.x + (gridDim.x) * blockIdx.y + (gridDim.y * gridDim.x) * blockIdx.z;
    long thread_Idx = threadIdx.x + (blockDim.x) * threadIdx.y + (blockDim.y * blockDim.x) * threadIdx.z;
    long block_Capacity = blockDim.x * blockDim.y * blockDim.z;
    long i = block_Idx * block_Capacity + t ..read more
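One common way to avoid contended atomicAdd calls is to give every row a single private accumulator and write the result once. The following is a host-side C++ sketch of that access pattern, not the poster's kernel (which is truncated above). The name rowSums and the row-major layout are assumptions made for illustration. In a CUDA kernel, the body of the outer loop would typically be executed by one thread (or one warp with a reduction) per row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the post): sums each row of an n x n
// matrix that is assumed to be stored row-major in a flat buffer.
std::vector<int> rowSums(const std::vector<int>& matrix, std::size_t n) {
    assert(matrix.size() == n * n);
    std::vector<int> sums(n, 0);
    // In a CUDA version, each iteration of this loop would map to one
    // thread (or one warp) that owns the row.
    for (std::size_t row = 0; row < n; ++row) {
        int acc = 0;  // private accumulator, so no atomics are needed
        for (std::size_t col = 0; col < n; ++col) {
            acc += matrix[row * n + col];
        }
        sums[row] = acc;  // exactly one write per row
    }
    return sums;
}
```

Calling rowSums on the 4 x 4 example from the post with n = 4 yields the four sums listed there.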

Need a recommendation for a low profile NVIDIA GPU

by /u/danulagod

2d ago

Hi all, I'm looking for recommendations for a low-profile GPU to be used for parallel computing applications with CUDA. The GPU is to be installed in a Dell R540 server, which is a 2U rack-mounted server with no support for external power supplies to the GPU. I have been using an old NVIDIA Quadro NVS 295 and am ready to upgrade to something new with more CUDA capabilities. I appreciate everyone's insight!

WSL + CUDA + Tensorflow + PyTorch in 10 minutes

by /u/Ttmx

2d ago

https://blog.tteles.dev/posts/gpu-tensorflow-pytorch-cuda-wsl/ I spent 2 days attempting to configure GPU acceleration for TF and PyTorch and condensed it into a 10-minute guide, where most of the time is spent on downloads. None of the guides I found online worked for me. I'd be very happy to receive feedback.

Non-VOLTA requirement version?

by /u/SpartonDawg

2d ago

I am currently using Dask and wanted to experiment with cuDF. I successfully installed everything in Ubuntu, but when I ran <conda create -n rapids-24.04 -c rapidsai -c conda-forge -c nvidia rapids=24.04 python=3.11 cuda-version=12.2> I realized my GTX 1080 Ti does not meet the Compute Capability requirement. What is my best path forward? Should I give up and wait until I upgrade my GPU, or is it stable to work with an older version?

I had my first CUDA related job interview and the interviewer confused CUDA with Quantum Computing

by /u/Routine-Winner2306

3d ago

The interviewer was talking about quantum computing. After saying that I had no idea about quantum computing at all, I pointed out that it was not in the job description, to which she replied that "it was a requirement for the job". She got nervous instantly. She couldn't explain whether the job required OpenAI's Triton or NVIDIA's Triton Inference Server. Sorry, I just wanted to vent.

How to set up Nsight Compute Locally to profile Remote GPUs

by /u/droidarmy95

3d ago

How to see cuBLAS data layout?

by /u/foxNOTflower

4d ago

The NVIDIA docs say the cuBLAS library uses column-major storage, but I have a matrix:

1 2 3 4 5
6 7 8 9 10
...
21 22 23 24 25

In this kernel function:

//single thread print matrix
__global__ void printMatrixWithIndex(int *a, int n) {
    for(auto r=0;r!=5;++r) {
        for(auto c=0;c!=5;++c) {
            printf("%d ", a[(r)*5+(c)]);
        }
        printf("\n");
    }
}

it should print 1, 6, ... if it is column-major, but it still prints 1 2 3 4 5 ...

The complete code is here:

#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <iostream>
#include <algorithm>
#include <numeric>
//single thread print m ..read more
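The kernel in the post indexes the buffer as a[r*5+c], which is a row-major read of raw memory. Column-major is a convention for how cuBLAS routines interpret a buffer, not a rearrangement of it, so a row-major read prints the values in the order they were written. The host-side C++ sketch below contrasts the two index formulas on the same flat buffer. The helper names rowMajorAt, colMajorAt, and makeBuffer are hypothetical, and the buffer is assumed to hold 1..25 in order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element (r, c) when the buffer is read row-major.
// This is the same formula the kernel in the post uses.
int rowMajorAt(const std::vector<int>& a, std::size_t cols,
               std::size_t r, std::size_t c) {
    assert(r * cols + c < a.size());
    return a[r * cols + c];
}

// Hypothetical helper: element (r, c) when the same buffer is read
// column-major with leading dimension ld (the number of rows).
int colMajorAt(const std::vector<int>& a, std::size_t ld,
               std::size_t r, std::size_t c) {
    assert(c * ld + r < a.size());
    return a[c * ld + r];
}

// Hypothetical helper: builds the flat buffer 1, 2, ..., count.
std::vector<int> makeBuffer(int count) {
    std::vector<int> a;
    for (int v = 1; v <= count; ++v) {
        a.push_back(v);
    }
    return a;
}
```

With a = makeBuffer(25), rowMajorAt(a, 5, 0, 1) reads the second stored value, while colMajorAt(a, 5, 0, 1) skips a full column of five values first. Swapping the index expression in the print loop to a[c*5+r] would therefore show the matrix as a column-major reader sees it.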

Ideas for parallel programming project

by /u/AnnualHold2890

4d ago

This semester I have a parallel computing course, and I have to propose a project with a one-month deadline. I am a backend engineer and have been working with servers since 2018, so I currently have no idea what to implement as my project. What are your ideas (ideally ones that also have the potential to become an academic paper)?

Tensorflow not detecting gpu

by /u/Vengeaence

4d ago

I have the proper Windows-supported GPU version of TensorFlow (2.10) installed and verified with pip. I have CUDA 11.2 installed. The system path variable is set for "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin". cuDNN is installed with the system path set to "C:\Program Files\NVIDIA\CUDNN\v8.1\bin". I get:

C:\Users\Anonymous>python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-04-21 15:27:31.033958: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found

The cud ..read more
