vulkan-3.26.1: Bindings to the Vulkan graphics API.
Safe Haskell: Safe-Inferred
Language: Haskell2010

Vulkan.Extensions.VK_HUAWEI_cluster_culling_shader

Description

Name

VK_HUAWEI_cluster_culling_shader - device extension

VK_HUAWEI_cluster_culling_shader

Name String
VK_HUAWEI_cluster_culling_shader
Extension Type
Device extension
Registered Extension Number
405
Revision
2
Ratification Status
Not ratified
Extension and Version Dependencies
VK_KHR_get_physical_device_properties2
Contact
Extension Proposal
VK_HUAWEI_cluster_culling_shader

Other Extension Metadata

Last Modified Date
2022-11-17
Interactions and External Dependencies
Contributors
  • Yuchang Wang, Huawei
  • Juntao Li, Huawei
  • Pan Gao, Huawei
  • Jie Cao, Huawei
  • Yunjin Zhang, Huawei
  • Shujie Zhou, Huawei
  • Chaojun Wang, Huawei
  • Jiajun Hu, Huawei
  • Cong Zhang, Huawei

Description

Cluster Culling Shaders (CCS) are similar to the existing compute shaders. Their main purpose is to provide an execution environment in order to perform coarse-level geometry culling and LOD selection more efficiently on the GPU.

The traditional 2-pass GPU culling solution using a compute shader sometimes needs a pipeline barrier between the compute and graphics pipelines to optimize performance. An additional compaction process may also be required. This extension addresses these shortcomings, allowing compute shaders to directly emit visible clusters to the following graphics pipeline.

A set of new built-in output variables is used to express a visible cluster. In addition, a new built-in function is used to emit these variables from CCS to the input assembly (IA) stage. The IA stage can use these variables to fetch the vertices of a visible cluster and drive vertex shaders to shade these vertices.

Note that CCS do not work with geometry or tessellation shaders, but both IA and vertex shaders are preserved. Vertex shaders are still used for vertex position shading, instead of directly outputting transformed vertices from the compute shader. This makes CCS more suitable for mobile GPUs.

New Commands

New Structures

New Enum Constants

New Built-In Variables

New SPIR-V Capability

Sample Code

Example of cluster culling in a GLSL shader

#extension GL_HUAWEI_cluster_culling_shader: enable

#define GPU_WARP_SIZE                   32
#define GPU_GROUP_SIZE                  GPU_WARP_SIZE

#define GPU_CLUSTER_PER_INVOCATION      1
#define GPU_CLUSTER_PER_WORKGROUP       (GPU_GROUP_SIZE * GPU_CLUSTER_PER_INVOCATION)

// Number of threads per workgroup
// - 1D only
// - warpsize = 32
layout(local_size_x=GPU_GROUP_SIZE, local_size_y=1, local_size_z=1) in;


#define GPU_CLUSTER_DESCRIPTOR_BINDING      0
#define GPU_DRAW_BUFFER_BINDING             1
#define GPU_INSTANCE_DESCRIPTOR_BINDING     2

const float pi_half = 1.5707963;
uint instance_id;

struct BoundingSphere
{
  vec3 center;
  float radius;
};

struct BoundingCone
{
  vec3 normal;
  float angle;
};

struct ClusterDescriptor
{
  BoundingSphere sphere;
  BoundingCone cone;
  uint instance_idx;
};

struct InstanceData
{
  mat4 mvp_matrix;                      // mvp matrix.
  vec4 frustum_planes[6];               // six frustum planes
  mat4 model_matrix_transpose_inverse;  // inverse transpose of model matrix.
  vec3 view_origin;                     // view origin
};

struct InstanceDescriptor
{
  uint begin;
  uint end;
  uint cluster_count;
  uint debug;
  BoundingSphere sphere;
  InstanceData instance_data;
};

struct DrawElementsCommand{
  uint indexcount;
  uint instanceCount;
  uint firstIndex;
  int  vertexoffset;
  uint firstInstance;
  uint cluster_id;
};

// indexed mode
out gl_PerClusterHUAWEI{
  uint gl_IndexCountHUAWEI;
  uint gl_InstanceCountHUAWEI;
  uint gl_FirstIndexHUAWEI;
  int  gl_VertexOffsetHUAWEI;
  uint gl_FirstInstanceHUAWEI;
  uint gl_ClusterIDHUAWEI;
};


layout(binding = GPU_CLUSTER_DESCRIPTOR_BINDING, std430) readonly buffer cluster_descriptor_ssbo
{
        ClusterDescriptor cluster_descriptors[];
};


layout(binding = GPU_DRAW_BUFFER_BINDING, std430) buffer draw_indirect_ssbo
{
        DrawElementsCommand draw_commands[];
};

layout(binding = GPU_INSTANCE_DESCRIPTOR_BINDING, std430) buffer instance_descriptor_ssbo
{
        InstanceDescriptor instance_descriptors[];
};


bool isFrontFaceVisible( vec3 sphere_center, float sphere_radius, vec3 cone_normal, float cone_angle )
{
  vec3 sphere_center_dir = normalize(sphere_center -
                           instance_descriptors[instance_id].instance_data.view_origin);

  float sin_cone_angle = sin(min(cone_angle, pi_half));
  return dot(cone_normal, sphere_center_dir) < sin_cone_angle;
}

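bool isSphereOutsideFrustum( vec3 sphere_center, float sphere_radius )
{
  // Assumes each plane's normal (xyz) is unit length and points into the frustum,
  // so the sphere is fully outside once its center lies more than one radius
  // behind any plane.
  bool isOutside = false;

  for(int i = 0; i < 6; i++)
  {
      isOutside = isOutside ||
      (dot(instance_descriptors[instance_id].instance_data.frustum_planes[i].xyz,
      sphere_center) + instance_descriptors[instance_id].instance_data.frustum_planes[i].w <
      -sphere_radius);
  }
  return isOutside;
}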


void main()
{
    uint cluster_id = gl_GlobalInvocationID.x;
    ClusterDescriptor desc = cluster_descriptors[cluster_id];

    // get instance description
    instance_id = desc.instance_idx;
    InstanceDescriptor inst_desc = instance_descriptors[instance_id];

    //instance based culling
    bool instance_render = !isSphereOutsideFrustum(inst_desc.sphere.center, inst_desc.sphere.radius);

    if( instance_render)
    {
        // cluster based culling
        bool render = (!isSphereOutsideFrustum(desc.sphere.center, desc.sphere.radius) &&
                       isFrontFaceVisible(desc.sphere.center, desc.sphere.radius,
                                          desc.cone.normal, desc.cone.angle));

        if (render)
        {
            // this cluster passed coarse-level culling, update built-in output variable.
            // in case of indexed mode:
            gl_IndexCountHUAWEI     = draw_commands[cluster_id].indexcount;
            gl_InstanceCountHUAWEI  = draw_commands[cluster_id].instanceCount;
            gl_FirstIndexHUAWEI     = draw_commands[cluster_id].firstIndex;
            gl_VertexOffsetHUAWEI   = draw_commands[cluster_id].vertexoffset;
            gl_FirstInstanceHUAWEI  = draw_commands[cluster_id].firstInstance;
            gl_ClusterIDHUAWEI      = draw_commands[cluster_id].cluster_id;

            // emit built-in output variables as a drawing command to subsequent
            // rendering pipeline.
            dispatchClusterHUAWEI();
        }
    }
}
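
The sphere-versus-frustum test used by the shader can be checked on the host before uploading cluster data. The following is a minimal C++ sketch, not part of the extension: the `Vec3` and `Plane` types and the `unitBoxPlanes` fixture are hypothetical, and it assumes unit-length plane normals that point into the frustum.

```cpp
#include <array>

// Hypothetical host-side types; they are not part of the Vulkan API.
struct Vec3 { float x, y, z; };
struct Plane { Vec3 n; float d; };  // points p with dot(n, p) + d >= 0 are inside

// Signed distance from the plane to a point (positive on the inner side).
inline float signedDistance(const Plane& p, const Vec3& c) {
    return p.n.x * c.x + p.n.y * c.y + p.n.z * c.z + p.d;
}

// True when the sphere lies entirely behind at least one plane.
// Assumes unit-length normals pointing into the frustum.
inline bool sphereOutsideFrustum(const std::array<Plane, 6>& planes,
                                 const Vec3& center, float radius) {
    for (const Plane& p : planes) {
        if (signedDistance(p, center) < -radius) return true;
    }
    return false;
}

// The box [-1, 1]^3 written as six inward-facing planes, used as a simple fixture.
inline std::array<Plane, 6> unitBoxPlanes() {
    return std::array<Plane, 6>{{
        Plane{Vec3{ 1.0f,  0.0f,  0.0f}, 1.0f},
        Plane{Vec3{-1.0f,  0.0f,  0.0f}, 1.0f},
        Plane{Vec3{ 0.0f,  1.0f,  0.0f}, 1.0f},
        Plane{Vec3{ 0.0f, -1.0f,  0.0f}, 1.0f},
        Plane{Vec3{ 0.0f,  0.0f,  1.0f}, 1.0f},
        Plane{Vec3{ 0.0f,  0.0f, -1.0f}, 1.0f},
    }};
}
```

A sphere that straddles a plane is reported as not outside, which keeps the test conservative: a cluster is culled only when it cannot contribute to the image.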

Example of graphics pipeline creation with cluster culling shader

// create a cluster culling shader stage info structure.
VkPipelineShaderStageCreateInfo ccsStageInfo{};
ccsStageInfo.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
ccsStageInfo.stage = VK_SHADER_STAGE_CLUSTER_CULLING_BIT_HUAWEI;
ccsStageInfo.module = clustercullingshaderModule;
ccsStageInfo.pName = "main";

// pipeline shader stage creation
VkPipelineShaderStageCreateInfo shaderStages[] = { ccsStageInfo, vertexShaderStageInfo, fragmentShaderStageInfo };

// create graphics pipeline
VkGraphicsPipelineCreateInfo pipelineInfo{};
pipelineInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
pipelineInfo.stageCount = 3;
pipelineInfo.pStages = shaderStages;
pipelineInfo.pVertexInputState = &vertexInputInfo;
// ...
VkPipeline graphicsPipeline;
vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &pipelineInfo, nullptr, &graphicsPipeline);

Example of launching the execution of cluster culling shader

vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, graphicsPipeline);
vkCmdDrawClusterHUAWEI(commandBuffer, groupCountX, 1, 1);
vkCmdEndRenderPass(commandBuffer);
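
The launch example leaves the choice of groupCountX open. Because the sample shader handles one cluster per invocation and 32 invocations per workgroup, one plausible way to derive it is a ceiling division of the cluster count. The sketch below is illustrative only; `kClustersPerWorkgroup` and `clusterGroupCountX` are hypothetical names.

```cpp
#include <cstdint>

// Clusters handled per workgroup; mirrors GPU_CLUSTER_PER_WORKGROUP (32 * 1)
// in the sample shader. Hypothetical constant, not defined by the extension.
constexpr std::uint32_t kClustersPerWorkgroup = 32;

// Smallest number of workgroups that covers every cluster (ceiling division
// written so that it cannot overflow a 32-bit value).
constexpr std::uint32_t clusterGroupCountX(std::uint32_t clusterCount) {
    return clusterCount / kClustersPerWorkgroup +
           (clusterCount % kClustersPerWorkgroup != 0 ? 1u : 0u);
}
```

When the cluster count is not a multiple of 32, the last workgroup contains invocations whose index is past the final cluster. The sample shader reads cluster_descriptors[cluster_id] without a bounds check, so an application using this approach would need to pad the buffer or add a guard in the shader.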

Version History

  • Revision 1, 2022-11-18 (YuChang Wang)

    • Internal revisions
  • Revision 2, 2023-04-02 (Jon Leech)

    • Grammar edits.

See Also

PhysicalDeviceClusterCullingShaderFeaturesHUAWEI, PhysicalDeviceClusterCullingShaderPropertiesHUAWEI, cmdDrawClusterHUAWEI, cmdDrawClusterIndirectHUAWEI

Document Notes

For more information, see the Vulkan Specification.

This page is a generated document. Fixes and changes should be made to the generator scripts, not directly.

Documentation

cmdDrawClusterHUAWEI

Arguments

:: forall io. MonadIO io 
=> CommandBuffer

commandBuffer is the command buffer into which the command will be recorded.

-> ("groupCountX" ::: Word32)

groupCountX is the number of local workgroups to dispatch in the X dimension.

-> ("groupCountY" ::: Word32)

groupCountY is the number of local workgroups to dispatch in the Y dimension.

-> ("groupCountZ" ::: Word32)

groupCountZ is the number of local workgroups to dispatch in the Z dimension.

-> io () 

vkCmdDrawClusterHUAWEI - Draw cluster culling work items

Description

When the command is executed, a global workgroup consisting of groupCountX × groupCountY × groupCountZ local workgroups is assembled. Note that the cluster culling shader pipeline only accepts cmdDrawClusterHUAWEI and cmdDrawClusterIndirectHUAWEI as drawing commands.

Valid Usage

Valid Usage (Implicit)

  • commandBuffer must be in the recording state
  • The CommandPool that commandBuffer was allocated from must support graphics operations
  • This command must only be called inside of a render pass instance
  • This command must only be called outside of a video coding scope

Host Synchronization

  • Host access to commandBuffer must be externally synchronized
  • Host access to the CommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties

Command Buffer Levels: Primary, Secondary
Render Pass Scope: Inside
Video Coding Scope: Outside
Supported Queue Types: Graphics
Command Type: Action

See Also

VK_HUAWEI_cluster_culling_shader, CommandBuffer

cmdDrawClusterIndirectHUAWEI

Arguments

:: forall io. MonadIO io 
=> CommandBuffer

commandBuffer is the command buffer into which the command is recorded.

-> Buffer

buffer is the buffer containing draw parameters.

-> ("offset" ::: DeviceSize)

offset is the byte offset into buffer where parameters begin.

-> io () 

vkCmdDrawClusterIndirectHUAWEI - Issue an indirect cluster culling draw into a command buffer

Description

cmdDrawClusterIndirectHUAWEI behaves similarly to cmdDrawClusterHUAWEI except that the parameters are read by the device from a buffer during execution. The parameters of the dispatch are encoded in a DispatchIndirectCommand structure taken from buffer starting at offset. Note that the cluster culling shader pipeline only accepts cmdDrawClusterHUAWEI and cmdDrawClusterIndirectHUAWEI as drawing commands.
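
To make the buffer layout concrete, the sketch below packs one command into a byte array at a given offset and reads it back. It is a host-only illustration under stated assumptions: the local `DispatchIndirectCommand` struct is a stand-in with the same three 32-bit fields (x, y, z) as the Vulkan structure, a `std::vector` stands in for mapped buffer memory, and both helper functions are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in with the same three 32-bit fields as VkDispatchIndirectCommand.
struct DispatchIndirectCommand {
    std::uint32_t x, y, z;
};
static_assert(sizeof(DispatchIndirectCommand) == 12,
              "three tightly packed 32-bit values");

// Copies one command into `bytes` at `offset`, growing the vector if needed.
// The vector stands in for mapped buffer memory (an assumption of this sketch).
inline void writeIndirectCommand(std::vector<std::uint8_t>& bytes,
                                 std::size_t offset,
                                 const DispatchIndirectCommand& cmd) {
    if (bytes.size() < offset + sizeof(cmd)) {
        bytes.resize(offset + sizeof(cmd));
    }
    std::memcpy(bytes.data() + offset, &cmd, sizeof(cmd));
}

// Reads the command back from the same offset, mimicking what the device
// consumes when the indirect draw executes. The caller must ensure the
// range [offset, offset + 12) lies inside `bytes`.
inline DispatchIndirectCommand readIndirectCommand(
        const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    DispatchIndirectCommand cmd{};
    std::memcpy(&cmd, bytes.data() + offset, sizeof(cmd));
    return cmd;
}
```

In a real application the offset passed to the draw call would also have to respect the implementation's alignment requirement for indirect parameters.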

Valid Usage

Valid Usage (Implicit)

  • buffer must be a valid Buffer handle
  • commandBuffer must be in the recording state
  • The CommandPool that commandBuffer was allocated from must support graphics operations
  • This command must only be called inside of a render pass instance
  • This command must only be called outside of a video coding scope
  • Both of buffer, and commandBuffer must have been created, allocated, or retrieved from the same Device

Host Synchronization

  • Host access to commandBuffer must be externally synchronized
  • Host access to the CommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties

Command Buffer Levels: Primary, Secondary
Render Pass Scope: Inside
Video Coding Scope: Outside
Supported Queue Types: Graphics
Command Type: Action

See Also

VK_HUAWEI_cluster_culling_shader, Buffer, CommandBuffer, DeviceSize

data PhysicalDeviceClusterCullingShaderPropertiesHUAWEI

VkPhysicalDeviceClusterCullingShaderPropertiesHUAWEI - Structure describing cluster culling shader properties supported by an implementation

Description

If the PhysicalDeviceClusterCullingShaderPropertiesHUAWEI structure is included in the pNext chain of the PhysicalDeviceProperties2 structure passed to getPhysicalDeviceProperties2, it is filled in with each corresponding implementation-dependent property.
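
The pNext mechanism described here can be modeled without the Vulkan headers. The sketch below is a simplified illustration, not the real API: `SType`, `BaseOut`, `Properties2`, `ClusterCullingProperties`, `findInChain` and `fillProperties` are all hypothetical stand-ins, and the enum values are placeholders rather than real structure type values. It shows only the idea that the implementation fills a structure when, and only when, the caller has chained it in.

```cpp
#include <cstdint>

// Placeholder structure-type tags (not real VkStructureType values).
enum class SType : std::uint32_t { Properties2 = 1, ClusterCullingProperties = 2 };

// Stand-in for the common header of every extensible structure: a type tag
// followed by a pointer to the next structure in the chain.
struct BaseOut {
    SType sType;
    BaseOut* pNext;
};

// Hypothetical root structure passed to the query.
struct Properties2 : BaseOut {
    Properties2() : BaseOut{SType::Properties2, nullptr} {}
};

// Hypothetical extension structure, trimmed to a single limit.
struct ClusterCullingProperties : BaseOut {
    ClusterCullingProperties() : BaseOut{SType::ClusterCullingProperties, nullptr} {}
    std::uint32_t maxOutputClusterCount = 0;
};

// Walks the chain and returns the first structure with the requested tag,
// or nullptr when the caller did not chain one in.
inline BaseOut* findInChain(BaseOut* head, SType wanted) {
    for (BaseOut* s = head; s != nullptr; s = s->pNext) {
        if (s->sType == wanted) return s;
    }
    return nullptr;
}

// Models the implementation side of the query: only chained structures are filled.
inline void fillProperties(Properties2& props, std::uint32_t driverMaxOutputClusters) {
    if (BaseOut* s = findInChain(props.pNext, SType::ClusterCullingProperties)) {
        static_cast<ClusterCullingProperties*>(s)->maxOutputClusterCount =
            driverMaxOutputClusters;
    }
}
```

Chaining is a plain pointer assignment (`props.pNext = &clusterProps;`) made before the query. If the extension structure is absent, the query still succeeds and nothing is written for it.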

Valid Usage (Implicit)

See Also

VK_HUAWEI_cluster_culling_shader, DeviceSize, StructureType

Constructors

PhysicalDeviceClusterCullingShaderPropertiesHUAWEI 

Fields

  • maxWorkGroupCount :: (Word32, Word32, Word32)

    maxWorkGroupCount[3] is the maximum number of local workgroups that can be launched by a single command. These three values represent the maximum local workgroup count in the X, Y and Z dimensions, respectively. In the current implementation, the values of Y and Z are both implicitly set to one. The groupCountX argument of a DrawCluster command must be less than or equal to maxWorkGroupCount[0].

  • maxWorkGroupSize :: (Word32, Word32, Word32)

    maxWorkGroupSize[3] is the maximum size of a local workgroup. These three values represent the maximum local workgroup size in the X, Y and Z dimensions, respectively. The x, y and z sizes, as specified by the LocalSize or LocalSizeId execution mode or by the object decorated by the WorkgroupSize decoration in shader modules, must be less than or equal to the corresponding limit. In the current implementation, the maximum workgroup size of the X dimension is 32, the others are 1.

  • maxOutputClusterCount :: Word32

    maxOutputClusterCount is the maximum number of output clusters a single cluster culling shader workgroup can emit.

  • indirectBufferOffsetAlignment :: DeviceSize

    indirectBufferOffsetAlignment indicates the alignment for cluster drawing command buffer stride. cmdDrawClusterIndirectHUAWEI::offset must be a multiple of this value.
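
The limits above lend themselves to simple host-side checks before a draw is recorded. The following is a minimal sketch under stated assumptions: `ClusterCullingLimits` is a hypothetical plain-data mirror of the four fields, and the helper functions are illustrative. The text only requires groupCountX to respect maxWorkGroupCount[0]; checking all three dimensions is a conservative choice made here.

```cpp
#include <array>
#include <cstdint>

// Hypothetical plain-data mirror of the four limits described above.
struct ClusterCullingLimits {
    std::array<std::uint32_t, 3> maxWorkGroupCount;
    std::array<std::uint32_t, 3> maxWorkGroupSize;
    std::uint32_t maxOutputClusterCount;
    std::uint64_t indirectBufferOffsetAlignment;  // DeviceSize is a 64-bit value
};

// Component-wise comparison of an (x, y, z) triple against its limit.
inline bool allWithin(const std::array<std::uint32_t, 3>& v,
                      const std::array<std::uint32_t, 3>& limit) {
    return v[0] <= limit[0] && v[1] <= limit[1] && v[2] <= limit[2];
}

// Workgroup counts passed to a direct draw must not exceed maxWorkGroupCount.
inline bool groupCountsValid(const ClusterCullingLimits& l,
                             const std::array<std::uint32_t, 3>& groups) {
    return allWithin(groups, l.maxWorkGroupCount);
}

// The shader's local size must not exceed maxWorkGroupSize.
inline bool localSizeValid(const ClusterCullingLimits& l,
                           const std::array<std::uint32_t, 3>& localSize) {
    return allWithin(localSize, l.maxWorkGroupSize);
}

// The indirect draw offset must be a multiple of indirectBufferOffsetAlignment.
// A zero alignment is treated as invalid to avoid a division by zero.
inline bool indirectOffsetValid(const ClusterCullingLimits& l, std::uint64_t offset) {
    return l.indirectBufferOffsetAlignment != 0 &&
           offset % l.indirectBufferOffsetAlignment == 0;
}
```

Checks like these duplicate what the validation layers may report, but they are cheap and can run in release builds where the layers are typically disabled.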

Instances

  • Storable PhysicalDeviceClusterCullingShaderPropertiesHUAWEI
  • Show PhysicalDeviceClusterCullingShaderPropertiesHUAWEI
  • Eq PhysicalDeviceClusterCullingShaderPropertiesHUAWEI
  • FromCStruct PhysicalDeviceClusterCullingShaderPropertiesHUAWEI
  • ToCStruct PhysicalDeviceClusterCullingShaderPropertiesHUAWEI
  • Zero PhysicalDeviceClusterCullingShaderPropertiesHUAWEI

All instances are defined in Vulkan.Extensions.VK_HUAWEI_cluster_culling_shader.

data PhysicalDeviceClusterCullingShaderFeaturesHUAWEI

VkPhysicalDeviceClusterCullingShaderFeaturesHUAWEI - Structure describing whether cluster culling shader is enabled

Description

If the PhysicalDeviceClusterCullingShaderFeaturesHUAWEI structure is included in the pNext chain of the PhysicalDeviceFeatures2 structure passed to getPhysicalDeviceFeatures2, it is filled in to indicate whether each corresponding feature is supported. PhysicalDeviceClusterCullingShaderFeaturesHUAWEI can also be used in the pNext chain of DeviceCreateInfo to selectively enable these features.

Valid Usage (Implicit)

See Also

VK_HUAWEI_cluster_culling_shader, Bool32, StructureType

Constructors

PhysicalDeviceClusterCullingShaderFeaturesHUAWEI 

Fields

Instances

  • Storable PhysicalDeviceClusterCullingShaderFeaturesHUAWEI
  • Show PhysicalDeviceClusterCullingShaderFeaturesHUAWEI
  • Eq PhysicalDeviceClusterCullingShaderFeaturesHUAWEI
  • FromCStruct PhysicalDeviceClusterCullingShaderFeaturesHUAWEI
  • ToCStruct PhysicalDeviceClusterCullingShaderFeaturesHUAWEI
  • Zero PhysicalDeviceClusterCullingShaderFeaturesHUAWEI

All instances are defined in Vulkan.Extensions.VK_HUAWEI_cluster_culling_shader.

type HUAWEI_CLUSTER_CULLING_SHADER_EXTENSION_NAME = "VK_HUAWEI_cluster_culling_shader"