import cv2
import numpy as np

def test_cuda_functionality():
    try:
        # Create a test image
        img = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)

        # Try to upload to GPU
        gpu_img = cv2.cuda_GpuMat()
        gpu_img.upload(img)

        # Try a simple operation
        gpu_gray = cv2.cuda.cvtColor(gpu_img, cv2.COLOR_BGR2GRAY)

        # Download result
        result = gpu_gray.download()

        print("CUDA functionality test: PASSED")
        return True
    except Exception as e:
        print(f"CUDA functionality test: FAILED - {e}")
        return False

test_cuda_functionality()
Optimize upload/download memory transfers between CPU and GPU
cv2.cuda.HostMem
Using cv2.cuda.HostMem lets you allocate page-locked (pinned) memory in host RAM, which speeds up transfers between the CPU and the GPU.
- Normally, NumPy arrays are allocated in pageable memory → GPU transfers require an extra copy step.
- With HostMem, memory is allocated as page-locked (pinned) memory, which can be directly DMA-transferred to the GPU → faster upload/download.
import cv2
import numpy as np

# Check if CUDA is available
if not cv2.cuda.getCudaEnabledDeviceCount():
    raise SystemExit("CUDA is not enabled or no CUDA devices found.")

# Define image dimensions and type
rows, cols = 480, 640
image_type = cv2.CV_8UC1  # 8-bit, single channel (grayscale)

# 1. Allocate page-locked host memory
# You can specify the allocation type (PAGE_LOCKED or SHARED)
PAGE_LOCKED = 1
host_mem = cv2.cuda.HostMem(rows, cols, image_type, PAGE_LOCKED)
host_mem_download = cv2.cuda.HostMem(rows, cols, image_type, PAGE_LOCKED)

# 2. Access the allocated memory as a NumPy array (Mat header)
# The returned array is meant to view the same buffer as the HostMem object
host_mat = host_mem.createMatHeader()
host_mat_download = host_mem_download.createMatHeader()

# 3. Fill host_mat in place with a simple pattern: value = (row + col) % 255
host_mat[:] = (np.add.outer(np.arange(rows), np.arange(cols)) % 255).astype(np.uint8)

# 4. Upload the data from HostMem to a GpuMat
gpu_mat = cv2.cuda_GpuMat()
gpu_mat.upload(host_mat)

# 5. Perform a CUDA operation (e.g., Gaussian blur)
gaussian_filter = cv2.cuda.createGaussianFilter(image_type, image_type, (5, 5), 0)
gpu_blurred_mat = gaussian_filter.apply(gpu_mat)

# 6. Download the result from the GPU back to host memory
# (the destination could also be a regular NumPy array)
gpu_blurred_mat.download(host_mat_download)

# 7. Display the original and processed images (optional)
cv2.imshow("Original (from HostMem)", host_mat)
cv2.imshow("Blurred (on CPU after GPU processing)", host_mat_download)
cv2.waitKey(0)
cv2.destroyAllWindows()
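To see whether pinned memory actually helps on a given machine, the two upload paths can be timed side by side. The following is a rough sketch, not a rigorous benchmark: `cuda_available`, `time_uploads`, and `compare_upload_times` are hypothetical helper names, the 4K frame size and repeat count are arbitrary assumptions, and the allocation type is passed as the integer 1, as in the example above. It returns None instead of failing when OpenCV or a CUDA device is missing.

```python
import time

try:
    import numpy as np
    import cv2
except ImportError:  # keeps the sketch importable on machines without OpenCV
    np = cv2 = None

PAGE_LOCKED = 1  # same integer the example above passes as the allocation type


def cuda_available():
    """True only if cv2 is importable and reports at least one CUDA device."""
    if cv2 is None or not hasattr(cv2, "cuda"):
        return False
    try:
        return cv2.cuda.getCudaEnabledDeviceCount() > 0
    except cv2.error:
        return False


def time_uploads(host_array, repeats):
    """Average time in milliseconds for one blocking upload of host_array."""
    gpu = cv2.cuda_GpuMat()
    gpu.upload(host_array)  # warm-up: the first call also allocates GPU memory
    start = time.perf_counter()
    for _ in range(repeats):
        gpu.upload(host_array)
    return (time.perf_counter() - start) * 1000.0 / repeats


def compare_upload_times(rows=2160, cols=3840, repeats=50):
    """Time uploads from a pageable array and from a HostMem-backed array."""
    if not cuda_available():
        return None

    # Ordinary NumPy allocation: pageable host memory
    pageable = np.zeros((rows, cols), dtype=np.uint8)

    # HostMem allocation: page-locked host memory, exposed through a Mat header
    pinned_mem = cv2.cuda.HostMem(rows, cols, cv2.CV_8UC1, PAGE_LOCKED)
    pinned = pinned_mem.createMatHeader()
    pinned[:] = 0

    return {
        "pageable_ms": time_uploads(pageable, repeats),
        "pinned_ms": time_uploads(pinned, repeats),
    }
```

Calling `compare_upload_times()` returns a dict with the two averages in milliseconds. Treat the numbers as indicative only: they vary with the GPU, the PCIe link, and the image size, and if the two values come out nearly equal, the array returned by `createMatHeader()` may not be backed by pinned memory in that particular OpenCV build.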
createMatHeader
A HostMem buffer by itself isn't directly usable as a NumPy array; it is a raw buffer in host RAM.
Call .createMatHeader(), which returns a cv::Mat header that views the same memory (it wraps the buffer rather than copying it).
In Python, this header is returned as an ndarray.
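Whether the ndarray you get in Python is a true zero-copy view can depend on the OpenCV version and on how the bindings convert cv::Mat, so it is worth checking on your own build. A minimal sketch of such a check follows; `header_views_pinned_buffer` is a hypothetical helper, not part of OpenCV. It writes a known pattern through one header, requests a second header from the same HostMem, and tests whether the second one sees the pattern.

```python
try:
    import numpy as np
    import cv2
except ImportError:  # keeps the sketch importable on machines without OpenCV
    np = cv2 = None

PAGE_LOCKED = 1  # same integer used as the allocation type earlier in these notes


def header_views_pinned_buffer(rows=4, cols=8):
    """Return True if createMatHeader() yields a view of the HostMem buffer,
    False if it looks like a copy, or None when CUDA is unavailable."""
    if cv2 is None or not hasattr(cv2, "cuda"):
        return None
    try:
        if cv2.cuda.getCudaEnabledDeviceCount() == 0:
            return None
    except cv2.error:
        return None

    host_mem = cv2.cuda.HostMem(rows, cols, cv2.CV_8UC1, PAGE_LOCKED)

    # Write a recognisable pattern through the first header
    pattern = (np.arange(rows * cols, dtype=np.uint8) + 1).reshape(rows, cols)
    first = host_mem.createMatHeader()
    first[:] = pattern

    # A second header over the same HostMem sees the pattern only if the
    # first array really wrote into the pinned buffer
    second = host_mem.createMatHeader()
    return bool(np.array_equal(second, pattern))
```

If this returns False, writes made through the ndarray are not reaching the pinned buffer, and uploads from that array would behave like uploads from ordinary pageable memory.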