Ch5-1 Various Buffers

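Be sure to read Ch5-0 VKBase+.h before this section.
Also see Ch3-2 Images and Buffers first: that section explains what device memory and buffers are (buffer views can be ignored for now) and wraps them in simple classes. Read it, complete those wrappers, and include VKFormat.h in VKBase+.h.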

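Memory can have the following properties:

  • VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT means the memory is best suited for access by the physical device.

  • VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT means the memory is visible to the CPU side (the CPU can read and write its data).

A physical device may or may not be able to allocate memory that has both properties at once. The general way to transfer data from the CPU side to the physical device is therefore: first store the data in a staging buffer that has VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, then have the physical device copy the data from the staging buffer to another buffer that has VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and is more efficient to access.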

Staging Buffer

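As described above, once device memory is mapped, data can be copied directly into a buffer with VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT using memcpy(...) or similar. Buffers that the CPU side can write directly like this are wrapped here as a staging buffer, dedicated to relaying data.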

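My design for the staging buffer is as follows:
1. Data can be written and read back easily; if the allocated buffer is too small for a write, it is reallocated.
2. The device memory can be released manually.
3. A 2D aliased image can be created for it (a single 2D image only, no 2D image arrays; its purpose is described in "Binding Device Memory to a Buffer or Image").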

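Why only consider aliasing with 2D images?
Most hardware and drivers support linear images only as single 2D images, not as 2D image arrays or 1D/3D images.
In practice this is more than enough, since most texture maps in 3D rendering are single 2D images rather than arrays.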

To satisfy point 1, the size of each memory mapping has to be recorded, so add the following code to the vulkan namespace in VKBase+.h:

class stagingBuffer {
protected:
    bufferMemory bufferMemory;
    VkDeviceSize memoryUsage = 0;//Size of the memory range mapped by the latest MapMemory(...) call
    image aliasedImage;
public:
    stagingBuffer() = default;
    stagingBuffer(VkDeviceSize size) {
        Expand(size);
    }
    //Getter
    operator VkBuffer() const { return bufferMemory.Buffer(); }
    const VkBuffer* Address() const { return bufferMemory.AddressOfBuffer(); }
    VkDeviceSize AllocationSize() const { return bufferMemory.AllocationSize(); }
    VkImage AliasedImage() const { return aliasedImage; }
    //Const Function
    //Retrieves data from the buffer
    void RetrieveData(void* pData_dst, VkDeviceSize size) const {
        bufferMemory.RetrieveData(pData_dst, size);
    }
    //Non-const Function
    //Reallocates memory when the allocated device memory is too small
    void Expand(VkDeviceSize size) {
        if (size <= AllocationSize())
            return;
        Release();
        VkBufferCreateInfo bufferCreateInfo = {
            .size = size,
            .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT
        };
        bufferMemory.Create(bufferCreateInfo, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT);
    }
    //Manually releases all memory and destroys the device memory and buffer handles
    void Release() {
        bufferMemory.~bufferMemory();
    }
    void* MapMemory(VkDeviceSize size) {
        Expand(size);
        void* pData_dst = nullptr;
        bufferMemory.MapMemory(pData_dst, size);
        memoryUsage = size;
        return pData_dst;
    }
    void UnmapMemory() {
        bufferMemory.UnmapMemory(memoryUsage);
        memoryUsage = 0;
    }
    //Writes data to the buffer
    void BufferData(const void* pData_src, VkDeviceSize size) {
        Expand(size);
        bufferMemory.BufferData(pData_src, size);
    }
    //Creates an aliased 2D image with linear tiling
    [[nodiscard]]
    VkImage AliasedImage2d(VkFormat format, VkExtent2D extent) {
        /*To be filled in later*/
    }
};

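Next, create the aliased 2D image with linear tiling. Since this image is meant to be the source for blitting data to another image, first check whether, for the given image format, the hardware supports linearly tiled image data as the source of a blit command: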

VkImage AliasedImage2d(VkFormat format, VkExtent2D extent) {
    if (!(FormatProperties(format).linearTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT))
        return VK_NULL_HANDLE;
    /*To be filled in later*/
}

Compute the size of the image data; if the current allocation is too small, return VK_NULL_HANDLE:

VkDeviceSize imageDataSize = VkDeviceSize(FormatInfo(format).sizePerPixel) * extent.width * extent.height;
if (imageDataSize > AllocationSize())
    return VK_NULL_HANDLE;

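Then use vkGetPhysicalDeviceImageFormatProperties(...) to query VkImageFormatProperties, i.e. the limits (permitted image extent and so on) for an image created with a particular combination of parameters: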

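Parameters of VkResult VKAPI_CALL vkGetPhysicalDeviceImageFormatProperties(...)

VkPhysicalDevice physicalDevice

Handle of the physical device

VkFormat format

Intended format of the image

VkImageType type

Intended type of the image

VkImageTiling tiling

Intended tiling (arrangement) of the image data

VkImageUsageFlags usage

Intended usage of the image

VkImageCreateFlags flags

Flags intended to be used when creating the image

VkImageFormatProperties* pImageFormatProperties

If the function succeeds, the permitted image extent, mip level count, array layer count, sample counts, and resource size are written to *pImageFormatProperties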

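Members of struct VkImageFormatProperties

VkExtent3D maxExtent

Maximum permitted extent

uint32_t maxMipLevels

Maximum permitted number of mip levels

uint32_t maxArrayLayers

Maximum permitted number of array layers

VkSampleCountFlags sampleCounts

All permitted sample counts, one per bit; for example, if 4x multisampling is available, the bitwise AND with VK_SAMPLE_COUNT_4_BIT yields VK_SAMPLE_COUNT_4_BIT

VkDeviceSize maxResourceSize

Maximum permitted data size, in bytes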
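
To make the bit test on sampleCounts concrete, here is a minimal sketch. So that it compiles without the Vulkan headers, it uses a hypothetical enum SampleCountBit whose values mirror VK_SAMPLE_COUNT_*_BIT; SupportsSampleCount and HighestSupportedSampleCount are illustrative helper names, not part of the tutorial's code.

```cpp
#include <cstdint>

// Hypothetical mirror of VK_SAMPLE_COUNT_*_BIT (same numeric values as the Vulkan enum).
enum SampleCountBit : std::uint32_t {
    sampleCount1 = 0x1,
    sampleCount2 = 0x2,
    sampleCount4 = 0x4,
    sampleCount8 = 0x8
};

// True if the given sample count bit is set in sampleCounts.
constexpr bool SupportsSampleCount(std::uint32_t sampleCounts, SampleCountBit bit) {
    return (sampleCounts & bit) == bit;
}

// Walks down from `desired` (assumed to be a single bit) to the highest supported count.
constexpr std::uint32_t HighestSupportedSampleCount(std::uint32_t sampleCounts, std::uint32_t desired) {
    for (std::uint32_t bit = desired; bit > 1; bit >>= 1)
        if (sampleCounts & bit)
            return bit;
    return sampleCount1; // Falling back to no multisampling
}
```

With real Vulkan types, sampleCounts would be imageFormatProperties.sampleCounts and the bits would be the VK_SAMPLE_COUNT_*_BIT constants.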

VkImage AliasedImage2d(VkFormat format, VkExtent2D extent) {
    if (!(FormatProperties(format).linearTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT))
        return VK_NULL_HANDLE;
    VkDeviceSize imageDataSize = VkDeviceSize(FormatInfo(format).sizePerPixel) * extent.width * extent.height;
    if (imageDataSize > AllocationSize())
        return VK_NULL_HANDLE;
    VkImageFormatProperties imageFormatProperties = {};
    vkGetPhysicalDeviceImageFormatProperties(graphicsBase::Base().PhysicalDevice(),
        format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR, VK_IMAGE_USAGE_TRANSFER_SRC_BIT, 0, &imageFormatProperties);
    /*To be filled in later*/
}

Then verify that the arguments passed to the function and the resulting image data size are within the permitted limits:

VkImage AliasedImage2d(VkFormat format, VkExtent2D extent) {
    if (!(FormatProperties(format).linearTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT))
        return VK_NULL_HANDLE;
    VkDeviceSize imageDataSize = VkDeviceSize(FormatInfo(format).sizePerPixel) * extent.width * extent.height;
    if (imageDataSize > AllocationSize())
        return VK_NULL_HANDLE;
    VkImageFormatProperties imageFormatProperties = {};
    vkGetPhysicalDeviceImageFormatProperties(graphicsBase::Base().PhysicalDevice(),
        format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR, VK_IMAGE_USAGE_TRANSFER_SRC_BIT, 0, &imageFormatProperties);
    //Check that each parameter is within the permitted range
    if (extent.width > imageFormatProperties.maxExtent.width ||
        extent.height > imageFormatProperties.maxExtent.height ||
        imageDataSize > imageFormatProperties.maxResourceSize)
        return VK_NULL_HANDLE;//Return VK_NULL_HANDLE if the requirements are not met
    /*To be filled in later*/
}

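Then fill in the image create info. In this use case the data is written by the CPU side, so the initial layout is VK_IMAGE_LAYOUT_PREINITIALIZED; a blit command is a transfer command, so the image usage is VK_IMAGE_USAGE_TRANSFER_SRC_BIT: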

VkImageCreateInfo imageCreateInfo = {
    .imageType = VK_IMAGE_TYPE_2D,
    .format = format,
    .extent = { extent.width, extent.height, 1 },
    .mipLevels = 1,
    .arrayLayers = 1,
    .samples = VK_SAMPLE_COUNT_1_BIT,
    .tiling = VK_IMAGE_TILING_LINEAR,
    .usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT,
    .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED
};

If an aliased image already exists, destroy it first, then create the new image:

aliasedImage.~image();
aliasedImage.Create(imageCreateInfo);

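One more thing before binding the device memory: check whether the image needs padding bytes.
Even in a linear image the data is not necessarily tightly packed. Very small images and images with odd widths may need padding bytes at the end of each row or of the whole image to satisfy alignment requirements.
Handling padding bytes can be tedious depending on the case: you have to compute where each row starts and copy the data into the buffer row by row. If the image is narrow and tall, that may well be slower than doing one extra copy on the GPU. Here it is handled the crude way: if there are padding bytes, simply return VK_NULL_HANDLE.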

So first, use vkGetImageSubresourceLayout(...) to obtain the concrete parameters of the image's memory layout:

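Parameters of void VKAPI_CALL vkGetImageSubresourceLayout(...)

VkDevice device

Handle of the logical device

VkImage image

Handle of the image

const VkImageSubresource* pSubresource

*pSubresource specifies the image subresource, i.e. the aspect (color/depth/stencil, etc.), mip level, and array layer index

VkSubresourceLayout* pLayout

The concrete memory layout parameters of the data belonging to the specified subresource are written to *pLayout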

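Members of struct VkSubresourceLayout

VkDeviceSize offset

Starting position of the current subresource's image data within the data of the whole image, in bytes

VkDeviceSize size

Size of the current subresource's image data, in bytes

VkDeviceSize rowPitch

Number of bytes between the starting positions of consecutive rows

VkDeviceSize arrayPitch

(For 2D image arrays) Number of bytes between the starting positions of consecutive array layers

VkDeviceSize depthPitch

(For 3D images) Number of bytes between the starting positions of consecutive depth slices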

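  • For a 2D image or 2D image array, if a texel in a given mip level has coordinates (x, y) and lies in array layer layer, its data is located at the following byte address relative to the start of the buffer:
    layer * arrayPitch + y * rowPitch + x * elementSize + offset
    where elementSize is the number of bytes per texel.

  • For a 3D image, if a texel in a given mip level has coordinates (x, y, z), its data is located at the following byte address relative to the start of the buffer:
    z * depthPitch + y * rowPitch + x * elementSize + offset

  • For the VK_IMAGE_ASPECT_COLOR_BIT aspect with mip level and array layer index both 0, offset is of course 0, since there is no need to put padding bytes in front.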
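
The two address formulas can be written down as a small sketch. To keep it compilable without the Vulkan headers, SubresourceLayout is a hypothetical stand-in that mirrors the members of VkSubresourceLayout, and TexelOffset2d / TexelOffset3d are illustrative names only.

```cpp
#include <cstdint>

// Hypothetical mirror of VkSubresourceLayout (all members are byte counts).
struct SubresourceLayout {
    std::uint64_t offset;
    std::uint64_t size;
    std::uint64_t rowPitch;
    std::uint64_t arrayPitch;
    std::uint64_t depthPitch;
};

// Byte offset of texel (x, y) in array layer `layer` of a 2D image or 2D image array.
constexpr std::uint64_t TexelOffset2d(const SubresourceLayout& layout, std::uint64_t elementSize,
                                      std::uint64_t x, std::uint64_t y, std::uint64_t layer = 0) {
    return layer * layout.arrayPitch + y * layout.rowPitch + x * elementSize + layout.offset;
}

// Byte offset of texel (x, y, z) in a 3D image.
constexpr std::uint64_t TexelOffset3d(const SubresourceLayout& layout, std::uint64_t elementSize,
                                      std::uint64_t x, std::uint64_t y, std::uint64_t z) {
    return z * layout.depthPitch + y * layout.rowPitch + x * elementSize + layout.offset;
}
```

For example, with 4-byte texels, a rowPitch of 256 and an arrayPitch of 4096, texel (3, 2) in layer 1 sits at 4096 + 512 + 12 = 4620 bytes.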

VkImageSubresource subResource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0 };
VkSubresourceLayout subresourceLayout = {};
vkGetImageSubresourceLayout(graphicsBase::Base().Device(), aliasedImage, &subResource, &subresourceLayout);
if (subresourceLayout.size != imageDataSize)
    return VK_NULL_HANDLE;
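  • If the image data size we computed, imageDataSize, differs from subresourceLayout.size (the size that satisfies the alignment requirements), there are padding bytes: return VK_NULL_HANDLE. The image that was already created is left to be destroyed by the next call to this function.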
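
For reference, the row-by-row copy that this tutorial deliberately avoids could look roughly like the sketch below. CopyRowsWithPitch is a hypothetical helper, not part of the tutorial's code; it assumes the source rows are tightly packed and that rowPitch (which would come from VkSubresourceLayout::rowPitch) is at least rowBytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `height` tightly packed rows of `rowBytes` bytes each into a destination
// whose rows start `rowPitch` bytes apart. Padding bytes in the destination are left untouched.
inline void CopyRowsWithPitch(const void* pData_src, void* pData_dst,
                              std::size_t rowBytes, std::size_t rowPitch, std::size_t height) {
    const auto* src = static_cast<const std::uint8_t*>(pData_src);
    auto* dst = static_cast<std::uint8_t*>(pData_dst);
    if (rowPitch == rowBytes) {
        // No per-row padding: a single copy is enough
        std::memcpy(dst, src, rowBytes * height);
        return;
    }
    for (std::size_t y = 0; y < height; y++)
        std::memcpy(dst + y * rowPitch, src + y * rowBytes, rowBytes);
}
```

The loop issues one memcpy per row, which is why a narrow, tall image is the unfavorable case mentioned above.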

Finally, bind the device memory and return the handle of the created image. The complete AliasedImage2d(...) function is as follows:

[[nodiscard]]
VkImage AliasedImage2d(VkFormat format, VkExtent2D extent) {
    if (!(FormatProperties(format).linearTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT))
        return VK_NULL_HANDLE;
    VkDeviceSize imageDataSize = VkDeviceSize(FormatInfo(format).sizePerPixel) * extent.width * extent.height;
    if (imageDataSize > AllocationSize())
        return VK_NULL_HANDLE;
    VkImageFormatProperties imageFormatProperties = {};
    vkGetPhysicalDeviceImageFormatProperties(graphicsBase::Base().PhysicalDevice(),
        format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR, VK_IMAGE_USAGE_TRANSFER_SRC_BIT, 0, &imageFormatProperties);
    if (extent.width > imageFormatProperties.maxExtent.width ||
        extent.height > imageFormatProperties.maxExtent.height ||
        imageDataSize > imageFormatProperties.maxResourceSize)
        return VK_NULL_HANDLE;
    VkImageCreateInfo imageCreateInfo = {
        .imageType = VK_IMAGE_TYPE_2D,
        .format = format,
        .extent = { extent.width, extent.height, 1 },
        .mipLevels = 1,
        .arrayLayers = 1,
        .samples = VK_SAMPLE_COUNT_1_BIT,
        .tiling = VK_IMAGE_TILING_LINEAR,
        .usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT,
        .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED
    };
    aliasedImage.~image();
    aliasedImage.Create(imageCreateInfo);
    VkImageSubresource subResource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0 };
    VkSubresourceLayout subresourceLayout = {};
    vkGetImageSubresourceLayout(graphicsBase::Base().Device(), aliasedImage, &subResource, &subresourceLayout);
    if (subresourceLayout.size != imageDataSize)
        return VK_NULL_HANDLE;
    aliasedImage.BindMemory(bufferMemory.Memory());
    return aliasedImage;
}

A Staging Buffer Dedicated to the Main Thread

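Since the wrappers for the various buffers and images that follow make heavy use of a staging buffer, add a static member stagingBuffer_mainThread to stagingBuffer, dedicated to the main thread.
You could simply write: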

class stagingBuffer {
    static stagingBuffer stagingBuffer_mainThread;
    /*Other members omitted*/
};
inline stagingBuffer stagingBuffer::stagingBuffer_mainThread;

If you need to recreate the logical device, a callback has to be registered in the initializer of stagingBuffer_mainThread. To avoid calling that initializer by hand, a bit of C++ trickery is used here:

class stagingBuffer {
    static inline class {
        stagingBuffer* pointer = Create();
        stagingBuffer* Create() {
            static stagingBuffer stagingBuffer;
            graphicsBase::Base().AddCallback_DestroyDevice([] { stagingBuffer.~stagingBuffer(); });
            return &stagingBuffer;
        }
        public:
        stagingBuffer& Get() const { return *pointer; }
    } stagingBuffer_mainThread;
    /*Other members omitted*/
};
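  • The type of stagingBuffer_mainThread is a class nested in stagingBuffer. A variable defined in a nested class (static or not) cannot have the enclosing class as its type, because the enclosing class is still incomplete at that point. So pointer is defined instead, and a function call is used as its initializer to make it point to the function-local static variable stagingBuffer.

  • inline is used here to avoid an out-of-class definition of the static member variable.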
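
If a compiler objects to using the nested class's default member initializer while the enclosing class is still being defined, a plain function-local static gives the same "created once, callback registered once" behavior. The sketch below shows that swapped-in alternative, not the tutorial's own technique. It uses hypothetical stand-ins: resource plays the role of stagingBuffer, and DestroyDeviceCallbacks() plays the role of graphicsBase::Base().AddCallback_DestroyDevice(...).

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for the callback registry kept by graphicsBase.
inline std::vector<std::function<void()>>& DestroyDeviceCallbacks() {
    static std::vector<std::function<void()>> callbacks;
    return callbacks;
}

// Hypothetical stand-in for stagingBuffer; only the lifetime logic is modeled.
class resource {
    int allocationSize = 0;
    // The instance is created on first use, and the callback is registered exactly once.
    static resource& MainThreadInstance() {
        static resource instance;
        static const bool registered = [] {
            DestroyDeviceCallbacks().push_back([] { instance.Release(); });
            return true;
        }();
        (void)registered;
        return instance;
    }
public:
    void Expand(int size) { if (size > allocationSize) allocationSize = size; }
    void Release() { allocationSize = 0; }
    int AllocationSize() const { return allocationSize; }
    //Static Function
    static void Expand_MainThread(int size) { MainThreadInstance().Expand(size); }
    static int AllocationSize_MainThread() { return MainThreadInstance().AllocationSize(); }
};
```

One difference from the nested-class version: here the callback is registered on first use rather than during static initialization.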

Define static member functions related to stagingBuffer_mainThread in stagingBuffer. The complete stagingBuffer class is then as follows:

class stagingBuffer {
    static inline class {
        stagingBuffer* pointer = Create();
        stagingBuffer* Create() {
            static stagingBuffer stagingBuffer;
            graphicsBase::Base().AddCallback_DestroyDevice([] { stagingBuffer.~stagingBuffer(); });
            return &stagingBuffer;
        }
        public:
        stagingBuffer& Get() const { return *pointer; }
    } stagingBuffer_mainThread;
protected:
    bufferMemory bufferMemory;
    VkDeviceSize memoryUsage = 0;
    image aliasedImage;
public:
    stagingBuffer() = default;
    stagingBuffer(VkDeviceSize size) {
        Expand(size);
    }
    //Getter
    operator VkBuffer() const { return bufferMemory.Buffer(); }
    const VkBuffer* Address() const { return bufferMemory.AddressOfBuffer(); }
    VkDeviceSize AllocationSize() const { return bufferMemory.AllocationSize(); }
    VkImage AliasedImage() const { return aliasedImage; }
    //Const Function
    void RetrieveData(void* pData_dst, VkDeviceSize size) const {
        bufferMemory.RetrieveData(pData_dst, size);
    }
    //Non-const Function
    void Expand(VkDeviceSize size) {
        if (size <= AllocationSize())
            return;
        Release();
        VkBufferCreateInfo bufferCreateInfo = {
            .size = size,
            .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT
        };
        bufferMemory.Create(bufferCreateInfo, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT);
    }
    void Release() {
        bufferMemory.~bufferMemory();
    }
    void* MapMemory(VkDeviceSize size) {
        Expand(size);
        void* pData_dst = nullptr;
        bufferMemory.MapMemory(pData_dst, size);
        memoryUsage = size;
        return pData_dst;
    }
    void UnmapMemory() {
        bufferMemory.UnmapMemory(memoryUsage);
        memoryUsage = 0;
    }
    void BufferData(const void* pData_src, VkDeviceSize size) {
        Expand(size);
        bufferMemory.BufferData(pData_src, size);
    }
    [[nodiscard]]
    VkImage AliasedImage2d(VkFormat format, VkExtent2D extent) {
        if (!(FormatProperties(format).linearTilingFeatures & VK_FORMAT_FEATURE_BLIT_SRC_BIT))
            return VK_NULL_HANDLE;
        VkDeviceSize imageDataSize = VkDeviceSize(FormatInfo(format).sizePerPixel) * extent.width * extent.height;
        if (imageDataSize > AllocationSize())
            return VK_NULL_HANDLE;
        VkImageFormatProperties imageFormatProperties = {};
        vkGetPhysicalDeviceImageFormatProperties(graphicsBase::Base().PhysicalDevice(),
            format, VK_IMAGE_TYPE_2D, VK_IMAGE_TILING_LINEAR, VK_IMAGE_USAGE_TRANSFER_SRC_BIT, 0, &imageFormatProperties);
        if (extent.width > imageFormatProperties.maxExtent.width ||
            extent.height > imageFormatProperties.maxExtent.height ||
            imageDataSize > imageFormatProperties.maxResourceSize)
            return VK_NULL_HANDLE;
        VkImageCreateInfo imageCreateInfo = {
            .imageType = VK_IMAGE_TYPE_2D,
            .format = format,
            .extent = { extent.width, extent.height, 1 },
            .mipLevels = 1,
            .arrayLayers = 1,
            .samples = VK_SAMPLE_COUNT_1_BIT,
            .tiling = VK_IMAGE_TILING_LINEAR,
            .usage = VK_IMAGE_USAGE_TRANSFER_SRC_BIT,
            .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED
        };
        aliasedImage.~image();
        aliasedImage.Create(imageCreateInfo);
        VkImageSubresource subResource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0 };
        VkSubresourceLayout subresourceLayout = {};
        vkGetImageSubresourceLayout(graphicsBase::Base().Device(), aliasedImage, &subResource, &subresourceLayout);
        if (subresourceLayout.size != imageDataSize)
            return VK_NULL_HANDLE;
        aliasedImage.BindMemory(bufferMemory.Memory());
        return aliasedImage;
    }
    //Static Function
    static VkBuffer Buffer_MainThread() {
        return stagingBuffer_mainThread.Get();
    }
    static void Expand_MainThread(VkDeviceSize size) {
        stagingBuffer_mainThread.Get().Expand(size);
    }
    static void Release_MainThread() {
        stagingBuffer_mainThread.Get().Release();
    }
    static void* MapMemory_MainThread(VkDeviceSize size) {
        return stagingBuffer_mainThread.Get().MapMemory(size);
    }
    static void UnmapMemory_MainThread() {
        stagingBuffer_mainThread.Get().UnmapMemory();
    }
    static void BufferData_MainThread(const void* pData_src, VkDeviceSize size) {
        stagingBuffer_mainThread.Get().BufferData(pData_src, size);
    }
    static void RetrieveData_MainThread(void* pData_dst, VkDeviceSize size) {
        stagingBuffer_mainThread.Get().RetrieveData(pData_dst, size);
    }
    [[nodiscard]]
    static VkImage AliasedImage2d_MainThread(VkFormat format, VkExtent2D extent) {
        return stagingBuffer_mainThread.Get().AliasedImage2d(format, extent);
    }
};

Device Local Buffer

Wrap a buffer with the VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT memory property as deviceLocalBuffer. Start by defining the basic getters and the create/recreate functions:

class deviceLocalBuffer {
protected:
    bufferMemory bufferMemory;
public:
    deviceLocalBuffer() = default;
    deviceLocalBuffer(VkDeviceSize size, VkBufferUsageFlags desiredUsages_Without_transfer_dst) {
        Create(size, desiredUsages_Without_transfer_dst);
    }
    //Getter
    operator VkBuffer() const { return bufferMemory.Buffer(); }
    const VkBuffer* Address() const { return bufferMemory.AddressOfBuffer(); }
    VkDeviceSize AllocationSize() const { return bufferMemory.AllocationSize(); }
    //Non-const Function
    void Create(VkDeviceSize size, VkBufferUsageFlags desiredUsages_Without_transfer_dst) {
        VkBufferCreateInfo bufferCreateInfo = {
            .size = size,
            .usage = desiredUsages_Without_transfer_dst | VK_BUFFER_USAGE_TRANSFER_DST_BIT
        };
        //Short-circuit evaluation; the false|| on the first line is only there for alignment
        false ||
            bufferMemory.CreateBuffer(bufferCreateInfo) ||
            bufferMemory.AllocateMemory(VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) && //&& has higher precedence than ||
            bufferMemory.AllocateMemory(VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT) ||
            bufferMemory.BindMemory();
    }
    void Recreate(VkDeviceSize size, VkBufferUsageFlags desiredUsages_Without_transfer_dst) {
        graphicsBase::Base().WaitIdle(); //A buffer wrapped by deviceLocalBuffer may be used frequently in every frame; make sure the physical device is not using it before recreating it
        bufferMemory.~bufferMemory();
        Create(size, desiredUsages_Without_transfer_dst);
    }
};
  • Allocation of device memory that has both the VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT memory properties is attempted first.
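
The evaluation order of the short-circuit chain in Create(...) can be traced with a small stand-alone model. It assumes, as the chain itself implies, that each wrapped call converts to true on failure; Trace, Step and CreateWithFallback are hypothetical names used only for this illustration, and the explicit parentheses spell out what operator precedence already does.

```cpp
#include <string>
#include <vector>

// Records which steps were actually evaluated.
struct Trace {
    std::vector<std::string> calls;
};

// Models one wrapped Vulkan call: logs its name and returns true on FAILURE (assumed convention).
inline bool Step(Trace& trace, const char* name, bool fails) {
    trace.calls.push_back(name);
    return fails;
}

// Returns true if creation failed as a whole.
inline bool CreateWithFallback(Trace& trace, bool combinedMemoryTypeUnavailable) {
    return
        Step(trace, "CreateBuffer", false) ||
        (Step(trace, "Allocate(DEVICE_LOCAL|HOST_VISIBLE)", combinedMemoryTypeUnavailable) &&
         Step(trace, "Allocate(DEVICE_LOCAL)", false)) ||
        Step(trace, "BindMemory", false);
}
```

When the combined memory type is available, the second allocation is skipped; when it is not, the device-local-only allocation runs as the fallback before binding.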

There are two ways to write data to device memory that has VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT but not VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT:
1. Use vkCmdCopyBuffer(...) to copy from a staging buffer to the destination buffer.

Parameters of void VKAPI_CALL vkCmdCopyBuffer(...)

VkCommandBuffer commandBuffer

Handle of the command buffer

VkBuffer srcBuffer

Source buffer

VkBuffer dstBuffer

Destination buffer

uint32_t regionCount

Number of data regions to copy

const VkBufferCopy* pRegions

Pointer to an array of VkBufferCopy that specifies the data regions to copy

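Members of struct VkBufferCopy

VkDeviceSize srcOffset

Starting position of the data to copy within the source buffer, in bytes

VkDeviceSize dstOffset

Position in the destination buffer to copy the data to, in bytes

VkDeviceSize size

Size of the data to copy, in bytes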

2. If the data is no larger than 65536 bytes, use the vkCmdUpdateBuffer(...) command to update the buffer directly (this also works for device memory with VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT).

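Parameters of void VKAPI_CALL vkCmdUpdateBuffer(...)

VkCommandBuffer commandBuffer

Handle of the command buffer

VkBuffer dstBuffer

Destination buffer

VkDeviceSize dstOffset

Position in the destination buffer to copy the data to, in bytes; must be a multiple of 4

VkDeviceSize dataSize

Size of the data, in bytes; must be a multiple of 4 and no more than 65536 bytes

const void* pData

Address of the source data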

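  • vkCmdUpdateBuffer(...) records the update data directly in the command buffer, i.e. it affects the size of the command buffer, which is why the amount of data it updates must not be too large.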
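
The constraints listed above can be checked before choosing between the two approaches. The following is only a sketch: CanUseCmdUpdateBuffer and maxUpdateBufferSize are hypothetical names, and plain 64-bit integers stand in for VkDeviceSize so the snippet compiles without the Vulkan headers.

```cpp
#include <cstdint>

// Upper bound on the data size of a single vkCmdUpdateBuffer(...) call, in bytes.
constexpr std::uint64_t maxUpdateBufferSize = 65536;

// True if an update with this destination offset and data size satisfies the
// constraints described above. Empty updates are treated as not applicable.
constexpr bool CanUseCmdUpdateBuffer(std::uint64_t dstOffset, std::uint64_t dataSize) {
    return dataSize > 0 &&
        dataSize <= maxUpdateBufferSize &&
        dstOffset % 4 == 0 &&
        dataSize % 4 == 0;
}
```

A caller could fall back to the staging-buffer copy whenever this check returns false.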

So first, a very simple wrapper around vkCmdUpdateBuffer(...):

class deviceLocalBuffer {
public:
    /*Other members omitted*/
    //Const Function
    void CmdUpdateBuffer(VkCommandBuffer commandBuffer, const void* pData_src, VkDeviceSize size_Limited_to_65536, VkDeviceSize offset = 0) const {
        vkCmdUpdateBuffer(commandBuffer, bufferMemory.Buffer(), offset, size_Limited_to_65536, pData_src);
    }
    //For updating a contiguous block starting at the beginning of the buffer; the data size is deduced automatically
    void CmdUpdateBuffer(VkCommandBuffer commandBuffer, const auto& data_src) const {
        vkCmdUpdateBuffer(commandBuffer, bufferMemory.Buffer(), 0, sizeof data_src, &data_src);
    }
};

Define the TransferData(...) functions: if the memory has VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, call bufferMemory.BufferData(...) directly; otherwise record vkCmdCopyBuffer(...) in a command buffer:

class deviceLocalBuffer {
public:
    /*Other members omitted*/
    //Const Function
    //For updating a contiguous block of data
    void TransferData(const void* pData_src, VkDeviceSize size, VkDeviceSize offset = 0) const {
        if (bufferMemory.MemoryProperties() & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
            bufferMemory.BufferData(pData_src, size, offset);
            return;
        }
        stagingBuffer::BufferData_MainThread(pData_src, size);
        auto& commandBuffer = graphicsBase::Plus().CommandBuffer_Transfer();
        commandBuffer.Begin(VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);
        VkBufferCopy region = { 0, offset, size };
        vkCmdCopyBuffer(commandBuffer, stagingBuffer::Buffer_MainThread(), bufferMemory.Buffer(), 1, &region);
        commandBuffer.End();
        graphicsBase::Plus().ExecuteCommandBuffer_Graphics(commandBuffer);
    }
    //For updating multiple non-contiguous blocks; stride is the step between consecutive blocks, and offset is the offset in the destination buffer
    void TransferData(const void* pData_src, uint32_t elementCount, VkDeviceSize elementSize, VkDeviceSize stride_src, VkDeviceSize stride_dst, VkDeviceSize offset = 0) const {
        if (bufferMemory.MemoryProperties() & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
            void* pData_dst = nullptr;
            bufferMemory.MapMemory(pData_dst, stride_dst * elementCount, offset);
            for (size_t i = 0; i < elementCount; i++)
                memcpy(stride_dst * i + static_cast<uint8_t*>(pData_dst), stride_src * i + static_cast<const uint8_t*>(pData_src), size_t(elementSize));
            bufferMemory.UnmapMemory(elementCount * stride_dst, offset);
            return;
        }
        stagingBuffer::BufferData_MainThread(pData_src, stride_src * elementCount);
        auto& commandBuffer = graphicsBase::Plus().CommandBuffer_Transfer();
        commandBuffer.Begin(VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);
        std::unique_ptr<VkBufferCopy[]> regions = std::make_unique<VkBufferCopy[]>(elementCount);
        for (size_t i = 0; i < elementCount; i++)
            regions[i] = { stride_src * i, stride_dst * i + offset, elementSize };
        vkCmdCopyBuffer(commandBuffer, stagingBuffer::Buffer_MainThread(), bufferMemory.Buffer(), elementCount, regions.get());
        commandBuffer.End();
        graphicsBase::Plus().ExecuteCommandBuffer_Graphics(commandBuffer);
    }
    //For updating a contiguous block starting at the beginning of the buffer; the data size is deduced automatically
    void TransferData(const auto& data_src) const {
        TransferData(&data_src, sizeof data_src);
    }
};
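
To illustrate how the strided overload lays out its copy regions, the sketch below builds the same list of regions outside of Vulkan. BufferCopy is a hypothetical mirror of VkBufferCopy and MakeStridedRegions is an illustrative name; std::vector replaces the std::unique_ptr array purely for brevity.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of VkBufferCopy (all members in bytes).
struct BufferCopy {
    std::uint64_t srcOffset;
    std::uint64_t dstOffset;
    std::uint64_t size;
};

// Builds one copy region per element: element i is read at stride_src * i in the
// staging buffer and written at stride_dst * i + offset in the destination buffer.
inline std::vector<BufferCopy> MakeStridedRegions(std::uint32_t elementCount, std::uint64_t elementSize,
                                                  std::uint64_t stride_src, std::uint64_t stride_dst,
                                                  std::uint64_t offset = 0) {
    std::vector<BufferCopy> regions(elementCount);
    for (std::uint32_t i = 0; i < elementCount; i++)
        regions[i] = { stride_src * i, stride_dst * i + offset, elementSize };
    return regions;
}
```

A typical use is spreading tightly packed 12-byte elements into 16-byte slots: MakeStridedRegions(3, 12, 12, 16, 64) places the third element at source offset 24 and destination offset 96.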

The complete deviceLocalBuffer is as follows:

class deviceLocalBuffer {
protected:
    bufferMemory bufferMemory;
public:
    deviceLocalBuffer() = default;
    deviceLocalBuffer(VkDeviceSize size, VkBufferUsageFlags desiredUsages_Without_transfer_dst) {
        Create(size, desiredUsages_Without_transfer_dst);
    }
    //Getter
    operator VkBuffer() const { return bufferMemory.Buffer(); }
    const VkBuffer* Address() const { return bufferMemory.AddressOfBuffer(); }
    VkDeviceSize AllocationSize() const { return bufferMemory.AllocationSize(); }
    //Const Function
    void TransferData(const void* pData_src, VkDeviceSize size, VkDeviceSize offset = 0) const {
        if (bufferMemory.MemoryProperties() & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
            bufferMemory.BufferData(pData_src, size, offset);
            return;
        }
        stagingBuffer::BufferData_MainThread(pData_src, size);
        auto& commandBuffer = graphicsBase::Plus().CommandBuffer_Transfer();
        commandBuffer.Begin(VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);
        VkBufferCopy region = { 0, offset, size };
        vkCmdCopyBuffer(commandBuffer, stagingBuffer::Buffer_MainThread(), bufferMemory.Buffer(), 1, &region);
        commandBuffer.End();
        graphicsBase::Plus().ExecuteCommandBuffer_Graphics(commandBuffer);
    }
    void TransferData(const void* pData_src, uint32_t elementCount, VkDeviceSize elementSize, VkDeviceSize stride_src, VkDeviceSize stride_dst, VkDeviceSize offset = 0) const {
        if (bufferMemory.MemoryProperties() & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
            void* pData_dst = nullptr;
            bufferMemory.MapMemory(pData_dst, stride_dst * elementCount, offset);
            for (size_t i = 0; i < elementCount; i++)
                memcpy(stride_dst * i + static_cast<uint8_t*>(pData_dst), stride_src * i + static_cast<const uint8_t*>(pData_src), size_t(elementSize));
            bufferMemory.UnmapMemory(elementCount * stride_dst, offset);
            return;
        }
        stagingBuffer::BufferData_MainThread(pData_src, stride_src * elementCount);
        auto& commandBuffer = graphicsBase::Plus().CommandBuffer_Transfer();
        commandBuffer.Begin(VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);
        std::unique_ptr<VkBufferCopy[]> regions = std::make_unique<VkBufferCopy[]>(elementCount);
        for (size_t i = 0; i < elementCount; i++)
            regions[i] = { stride_src * i, stride_dst * i + offset, elementSize };
        vkCmdCopyBuffer(commandBuffer, stagingBuffer::Buffer_MainThread(), bufferMemory.Buffer(), elementCount, regions.get());
        commandBuffer.End();
        graphicsBase::Plus().ExecuteCommandBuffer_Graphics(commandBuffer);
    }
    void TransferData(const auto& data_src) const {
        TransferData(&data_src, sizeof data_src);
    }
    void CmdUpdateBuffer(VkCommandBuffer commandBuffer, const void* pData_src, VkDeviceSize size_Limited_to_65536, VkDeviceSize offset = 0) const {
        vkCmdUpdateBuffer(commandBuffer, bufferMemory.Buffer(), offset, size_Limited_to_65536, pData_src);
    }
    void CmdUpdateBuffer(VkCommandBuffer commandBuffer, const auto& data_src) const {
        vkCmdUpdateBuffer(commandBuffer, bufferMemory.Buffer(), 0, sizeof data_src, &data_src);
    }
    //Non-const Function
    void Create(VkDeviceSize size, VkBufferUsageFlags desiredUsages_Without_transfer_dst) {
        VkBufferCreateInfo bufferCreateInfo = {
            .size = size,
            .usage = desiredUsages_Without_transfer_dst | VK_BUFFER_USAGE_TRANSFER_DST_BIT
        };
        false ||
            bufferMemory.CreateBuffer(bufferCreateInfo) ||
            bufferMemory.AllocateMemory(VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT | VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) &&
            bufferMemory.AllocateMemory(VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT) ||
            bufferMemory.BindMemory();
    }
    void Recreate(VkDeviceSize size, VkBufferUsageFlags desiredUsages_Without_transfer_dst) {
        graphicsBase::Base().WaitIdle();
        bufferMemory.~bufferMemory();
        Create(size, desiredUsages_Without_transfer_dst);
    }
};

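The following subsections create dedicated classes for vertex buffers, index buffers, uniform buffers, and storage buffers.
(Note that despite the wrappers in this section, a single buffer may serve several purposes, as long as the usage declared at creation includes the corresponding bits.)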

Vertex Buffer

Create a dedicated type for vertex buffers: vertexBuffer inherits from deviceLocalBuffer and specifies VK_BUFFER_USAGE_VERTEX_BUFFER_BIT by default when the buffer is created:

class vertexBuffer :public deviceLocalBuffer {
public:
    vertexBuffer() = default;
    vertexBuffer(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) :deviceLocalBuffer(size, VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | otherUsages) {}
    //Non-const Function
    void Create(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Create(size, VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | otherUsages);
    }
    void Recreate(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Recreate(size, VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | otherUsages);
    }
};

Index Buffer

Create a dedicated type for index buffers: indexBuffer inherits from deviceLocalBuffer and specifies VK_BUFFER_USAGE_INDEX_BUFFER_BIT by default when the buffer is created:

class indexBuffer :public deviceLocalBuffer {
public:
    indexBuffer() = default;
    indexBuffer(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) :deviceLocalBuffer(size, VK_BUFFER_USAGE_INDEX_BUFFER_BIT | otherUsages) {}
    //Non-const Function
    void Create(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Create(size, VK_BUFFER_USAGE_INDEX_BUFFER_BIT | otherUsages);
    }
    void Recreate(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Recreate(size, VK_BUFFER_USAGE_INDEX_BUFFER_BIT | otherUsages);
    }
};

Uniform Buffer

Create a dedicated type for uniform buffers: uniformBuffer inherits from deviceLocalBuffer and specifies VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT by default when the buffer is created:

class uniformBuffer :public deviceLocalBuffer {
public:
    uniformBuffer() = default;
    uniformBuffer(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) :deviceLocalBuffer(size, VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | otherUsages) {}
    //Non-const Function
    void Create(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Create(size, VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | otherUsages);
    }
    void Recreate(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Recreate(size, VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | otherUsages);
    }
    //Static Function
    static VkDeviceSize CalculateAlignedSize(VkDeviceSize dataSize) {
        const VkDeviceSize& alignment = graphicsBase::Base().PhysicalDeviceProperties().limits.minUniformBufferOffsetAlignment;
        return dataSize + alignment - 1 & ~(alignment - 1);
    }
};

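Placing multiple uniform buffers inside a single buffer gives what is called a dynamic uniform buffer. When the buffer is later bound to a uniform block in a shader, changing the offset binds different parts of the dynamic uniform buffer.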

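CalculateAlignedSize(...) computes how much device memory has to be allocated for data of size dataSize, based on the minimum spacing between the individual uniform buffers in a dynamic uniform buffer (VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment; the space occupied by each individual uniform buffer in a dynamic uniform buffer must be an integer multiple of this value, which is always a power of two).
Because alignment is always a power of two, dataSize + alignment - 1 & ~(alignment - 1) (where + and - bind more tightly than &) evaluates to the same result as (dataSize + alignment - 1) / alignment * alignment.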
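
The rounding can be checked in isolation with a minimal sketch. AlignUp and DynamicOffset are hypothetical names, and the alignment is passed in as a parameter instead of being read from the physical device limits, so the snippet runs without Vulkan.

```cpp
#include <cassert>
#include <cstdint>

// Rounds dataSize up to the next multiple of alignment, which must be a power of two.
inline std::uint64_t AlignUp(std::uint64_t dataSize, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (dataSize + alignment - 1) & ~(alignment - 1);
}

// Byte offset of the index-th block when blocks of dataSize bytes are packed
// into one dynamic uniform buffer with the given alignment.
inline std::uint64_t DynamicOffset(std::uint64_t index, std::uint64_t dataSize, std::uint64_t alignment) {
    return index * AlignUp(dataSize, alignment);
}
```

For instance, with an alignment of 256, an 80-byte block occupies 256 bytes, so the fourth block (index 3) starts at byte 768.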

Storage Buffer

Create a dedicated type for storage buffers: storageBuffer inherits from deviceLocalBuffer and specifies VK_BUFFER_USAGE_STORAGE_BUFFER_BIT by default when the buffer is created:

class storageBuffer :public deviceLocalBuffer {
public:
    storageBuffer() = default;
    storageBuffer(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) :deviceLocalBuffer(size, VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | otherUsages) {}
    //Non-const Function
    void Create(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Create(size, VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | otherUsages);
    }
    void Recreate(VkDeviceSize size, VkBufferUsageFlags otherUsages = 0) {
        deviceLocalBuffer::Recreate(size, VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | otherUsages);
    }
    //Static Function
    static VkDeviceSize CalculateAlignedSize(VkDeviceSize dataSize) {
        const VkDeviceSize& alignment = graphicsBase::Base().PhysicalDeviceProperties().limits.minStorageBufferOffsetAlignment;
        return dataSize + alignment - 1 & ~(alignment - 1);
    }
};
  • Similar to dynamic uniform buffers, there are also dynamic storage buffers.