CUDA C++ pointer typecasting

Posted by vi4fp9gy on 2022-12-15 in Other


I am reading the CUDA C++ documentation, but there is something I don't understand about pointer typecasting. Below is the host and device code.

// Host code
int width = 64, height = 64;
float* devPtr;
size_t pitch;
cudaMallocPitch(&devPtr, &pitch,
                width * sizeof(float), height);
MyKernel<<<100, 512>>>(devPtr, pitch, width, height);

// Device code
__global__ void MyKernel(float* devPtr,
                         size_t pitch, int width, int height)
{
    for (int r = 0; r < height; ++r) {
        float* row = (float*)((char*)devPtr + r * pitch);
        for (int c = 0; c < width; ++c) {
            float element = row[c];
        }
    }
}

As you can see, devPtr is cast to char*, but I don't understand why it is cast to char* rather than being incremented as a float*.

c++

Source: https://stackoverflow.com/questions/74735993/cuda-c-pointer-typecasting

1 Answer


yfjy0ee71#

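This is done to handle a pitched allocation (the kind created by cudaMallocPitch()).
A pitched allocation "rounds up" the requested row width to a particular pitch, and that pitch is specified in bytes: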

cudaMallocPitch(&devPtr, &pitch,
                          ^
                          | 
               this value is indicated by the function as a row width or "pitch" in bytes
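
To make the "rounding up" concrete, here is a minimal host-only sketch. The helper names and the 512-byte alignment used in the usage note are assumptions for illustration only; the real pitch is whatever cudaMallocPitch() writes into its second argument, and the sketch does not claim to reproduce how the runtime picks it.

```cpp
#include <cstddef>

// Hypothetical helper (not a CUDA API): round a requested row width in bytes
// up to the next multiple of `alignment`. The alignment value is an assumption;
// the actual pitch is decided by the CUDA runtime.
inline std::size_t round_up_to_pitch(std::size_t width_bytes, std::size_t alignment) {
    return ((width_bytes + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: number of unused padding bytes at the end of each row.
inline std::size_t row_padding(std::size_t width_bytes, std::size_t pitch) {
    return pitch - width_bytes;
}
```

Under an assumed 512-byte alignment, a requested row of 256 bytes would get a pitch of 512 bytes, leaving 256 bytes of padding per row. Either way, the pitch is a byte count, not a count of elements.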

Since the pitch is specified in bytes, to get correct pointer arithmetic:

((char*)devPtr + r * pitch);
               ^
               |
           pointer arithmetic

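the pointer type must also be a byte type. The goal of that code snippet is to advance devPtr by the number of rows given by r, where each row consists of pitch bytes.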
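
The difference between the two kinds of pointer arithmetic can be sketched on the host, without a GPU. The function names below are hypothetical and the pitch values are made up; only the arithmetic is the point. Adding n to a char* moves n bytes, while adding n to a float* moves n * sizeof(float) bytes.

```cpp
#include <cstddef>

// Step through the buffer in bytes, then reinterpret the result as a float*.
// This mirrors the cast used in the kernel from the question.
inline float* row_ptr(float* base, std::size_t pitch_bytes, int r) {
    return reinterpret_cast<float*>(reinterpret_cast<char*>(base) + r * pitch_bytes);
}

// Bytes skipped by (char*)base + r * pitch_bytes.
inline std::size_t byte_offset_char(std::size_t pitch_bytes, int r) {
    return r * pitch_bytes;
}

// Bytes skipped by base + r * pitch_bytes when base is a float*:
// every step is scaled by sizeof(float), so the pointer lands far too late.
inline std::size_t byte_offset_float(std::size_t pitch_bytes, int r) {
    return r * pitch_bytes * sizeof(float);
}
```

With a made-up pitch of 32 bytes, row 2 starts 64 bytes into the buffer. Doing the same addition on a float* would instead skip 64 * sizeof(float) bytes.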
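AFAIK, nothing in CUDA guarantees any particular granularity for the pitch returned by cudaMallocPitch. In theory it could, for example, be an odd number of bytes, or a prime number of bytes. Playing tricks to pre-convert the pitch value into an equivalent (pointer-arithmetic) offset in some other type width is therefore inadvisable.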
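
A small host-only sketch shows why pre-converting the pitch is fragile. The function names and the pitch of 31 bytes are hypothetical, chosen only to represent a pitch that is not a multiple of sizeof(float). The code computes offsets and never dereferences anything.

```cpp
#include <cstddef>

// Exact byte offset of row r, as produced by char* arithmetic.
inline std::size_t exact_row_offset(std::size_t pitch_bytes, std::size_t r) {
    return r * pitch_bytes;
}

// Byte offset obtained by first converting the pitch into a count of floats.
// Integer division truncates, so any remainder bytes are lost on every row.
inline std::size_t prescaled_row_offset(std::size_t pitch_bytes, std::size_t r) {
    std::size_t pitch_in_floats = pitch_bytes / sizeof(float);
    return r * pitch_in_floats * sizeof(float);
}
```

With 4-byte floats and a pitch of 31 bytes, row 3 really starts at byte 93, while the pre-scaled computation yields byte 84. The two agree only when the pitch happens to be a multiple of sizeof(float), which is exactly the guarantee the answer says should not be relied on.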

Posted 2022-12-15
