April 27th, 2026

Announcing Shader Model 6.10 Preview and AgilitySDK 720 Preview

Engineer

Overview


Today, we are pleased to announce that Shader Model 6.10 and other features have been officially released with Agility SDK 1.720-preview and the complementary DXC 1.10.2605.2. Agility SDK 1.720-preview exposes the features listed below. More detail follows further down, including download and driver links.

  • Shader Model 6.10 (via DXC 1.10.2605.2):
    • linalg::Matrix
    • Group Wave Index
    • Variable Group Shared Memory
    • Raytracing intrinsics
      • TriangleObjectPositions
      • ClusterID
  • Batched Asynchronous Command List APIs

Downloads

Hardware Support

IHV | Driver Link(s)
AMD | AMD Software: AgilitySDK Developer Preview Edition 25.30.41.02
Intel | Intel® Arc™ Graphics – Windows*
NVIDIA | Contact your developer relations representative for in-development driver access.

See Appendix > Feature Support for the full table of each feature’s supported hardware.

Features


HLSL Features (Shader Model 6.10):

linalg::Matrix

Shader Model 6.10 introduces a set of Matrix APIs covering a broad swath of use cases. Collectively the feature is called LinAlg (short for Linear Algebra).

We’ve written a dedicated blog post covering the feature in depth here.

Also see the GDC 2026 blog putting this feature in the context of the overall ML story for DirectX here.

HLSL Spec: hlsl-specs/proposals/0035-linalg-matrix.md at main · microsoft/hlsl-specs

Group Wave Index

Shader Model 6.10 introduces two new intrinsics, GetGroupWaveIndex() and GetGroupWaveCount(), that give compute, mesh, amplification, and node shaders direct knowledge of wave-level structure within a thread group. GetGroupWaveIndex() returns the current wave’s index (0 to N-1) and GetGroupWaveCount() returns the total number of waves executing the group. These enable wave-level work specialization and cooperation without relying on unsafe workarounds like dividing SV_GroupIndex by WaveGetLaneCount(), which is not guaranteed to be correct across all hardware. A single code path now works portably across all wave sizes.

HLSL Spec: hlsl-specs/proposals/0048-group-wave-index.md at main · microsoft/hlsl-specs
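
To make the idea of wave-level work specialization concrete, here is a minimal CPU-side sketch in C++ of one way a thread group's work could be divided among its waves. It is not shader code: `SliceForWave` and `WaveRange` are hypothetical names invented for illustration, and the `waveIndex` and `waveCount` parameters stand in for the values a shader would read from GetGroupWaveIndex() and GetGroupWaveCount().

```cpp
#include <cassert>
#include <cstdint>

// Half-open range [begin, end) of work items owned by one wave.
struct WaveRange {
    uint32_t begin;
    uint32_t end;
};

// Hypothetical helper (not part of any API): split itemCount work items into
// waveCount contiguous slices, giving any remainder to the lowest-indexed
// waves. In a shader, waveIndex and waveCount would come from
// GetGroupWaveIndex() and GetGroupWaveCount().
WaveRange SliceForWave(uint32_t itemCount, uint32_t waveIndex, uint32_t waveCount) {
    assert(waveCount > 0 && waveIndex < waveCount);
    const uint32_t base = itemCount / waveCount;
    const uint32_t extra = itemCount % waveCount;  // first `extra` waves take one more item
    const uint32_t begin = waveIndex * base + (waveIndex < extra ? waveIndex : extra);
    const uint32_t size = base + (waveIndex < extra ? 1u : 0u);
    return {begin, begin + size};
}
```

Because the slice depends only on the wave index and the wave count, the same partitioning logic applies whatever wave size the hardware uses.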

Variable Group Shared Memory

Shader Model 6.10 lifts the longstanding 32 KB (28 KB for mesh shaders) cap on groupshared memory by exposing the actual hardware limit through a new runtime query, MaxGroupSharedMemoryPerGroup. Shader authors can use a new [GroupSharedLimit(<bytes>)] entry-point attribute to declare the maximum shared memory their shader requires, giving the compiler a compile-time portability check while still allowing access to the full capacity of modern GPUs. Shaders that omit the attribute continue to be validated against the legacy limits, so existing code is unaffected. This unlocks algorithms like large tile culling, software rasterization bins, and big matrix workloads that were previously constrained by the spec rather than the hardware.

HLSL Spec: hlsl-specs/proposals/0049-variable-groupshared-memory.md at main · microsoft/hlsl-specs
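
As a rough illustration of how an application might act on the reported limit, the C++ sketch below picks the largest power-of-two square tile whose groupshared footprint fits a given byte budget. `LargestTileDim` is a hypothetical helper, not a D3D12 API; `limitBytes` is assumed to hold the value reported by the MaxGroupSharedMemoryPerGroup query, and how that value is fetched is deliberately not shown.

```cpp
#include <cassert>
#include <cstdint>

// Legacy compute-shader groupshared cap mentioned in the post (32 KB).
constexpr uint32_t kLegacyGroupSharedBytes = 32u * 1024u;

// Hypothetical helper: return the largest power-of-two tile dimension whose
// footprint (tileDim * tileDim * bytesPerElement) fits in limitBytes, or 0 if
// even a 1x1 tile does not fit. The search stops at 1024 as an arbitrary
// upper bound chosen for this sketch.
uint32_t LargestTileDim(uint32_t limitBytes, uint32_t bytesPerElement) {
    assert(bytesPerElement > 0);
    uint32_t dim = 0;
    for (uint32_t candidate = 1; candidate <= 1024; candidate *= 2) {
        const uint64_t bytes = uint64_t(candidate) * candidate * bytesPerElement;
        if (bytes > limitBytes) break;
        dim = candidate;
    }
    return dim;
}
```

With 4-byte elements, the legacy 32 KB budget allows a 64x64 tile, while a 64 KB budget would allow 128x128. Since [GroupSharedLimit(<bytes>)] is a compile-time attribute, one plausible use of a result like this is to select among precompiled shader variants.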

Raytracing intrinsics

TriangleObjectPositions() is an intrinsic that can be called from an Any hit or Closest hit shader or RayQuery to obtain the positions of the vertices for the triangle that has been hit.

Spec: https://github.com/microsoft/hlsl-specs/blob/main/proposals/0041-triangle-object-positions.md
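
One typical use of the three vertex positions is computing a geometric (face) normal. The C++ sketch below shows that arithmetic on the CPU; `Float3`, `Sub`, `Cross`, and `FaceNormal` are illustrative names rather than HLSL or D3D12 types, and the positions are assumed to have already been obtained, for example from TriangleObjectPositions() in a shader.

```cpp
#include <cassert>
#include <cmath>

struct Float3 {
    float x, y, z;
};

Float3 Sub(Float3 a, Float3 b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

Float3 Cross(Float3 a, Float3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Unit-length face normal of the triangle (p0, p1, p2). Its direction follows
// the vertex winding order, so flip the sign if the opposite side is needed.
Float3 FaceNormal(Float3 p0, Float3 p1, Float3 p2) {
    const Float3 n = Cross(Sub(p1, p0), Sub(p2, p0));
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    assert(len > 0.0f);  // a degenerate (zero-area) triangle has no normal
    return {n.x / len, n.y / len, n.z / len};
}
```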

ClusterID() is an intrinsic that can be called from an Any hit or Closest hit shader or RayQuery to return the user-defined ID of a cluster. This isn’t currently useful since clustered geometry support for DXR isn’t ready yet.

HLSL Spec: https://github.com/microsoft/hlsl-specs/blob/main/proposals/0045-clustered-geometry.md

D3D12 Raytracing spec with work-in-progress clustered geometry design (not shipped yet): https://github.com/microsoft/DirectX-Specs/blob/master/d3d/Raytracing2.md

Once the features in this spec ship (tentatively starting with a preview in fall 2026), the ClusterID() intrinsic will become useful.

 

D3D12 Features:

Batched Asynchronous Command List APIs

D3D12’s legacy CopyBufferRegion, ClearUnorderedAccessViewFloat/Uint, ResolveSubresource, and similar commands all execute strictly in series because the old ResourceBarrier model has no way to express a dependency between two operations of the same type (e.g. copy-dest to copy-dest). This means the GPU stalls between every sequential copy or clear, even when the operations touch completely independent memory. The Batched Async Commands feature addresses this by introducing new command list methods that remove the implicit serialization contract, allowing the driver and hardware to overlap independent work within a single batch call. Developers opt into explicit synchronization using enhanced barriers only where true data hazards exist – such as when two copies write to overlapping regions of the same buffer – and everything else runs concurrently.
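
The notion of a "true data hazard" between two copies can be sketched as a simple range-overlap test. The C++ below is only a CPU-side model under stated assumptions: `CopyDest` and `WritesOverlap` are hypothetical names, `bufferId` is a stand-in for a destination resource, and none of this is the D3D12 API itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one copy's destination; not a D3D12 type.
struct CopyDest {
    int bufferId;     // stand-in for the destination resource
    uint64_t offset;  // first byte written
    uint64_t size;    // number of bytes written
};

// Two writes conflict only if they target the same buffer and their byte
// ranges [offset, offset + size) intersect. Only such pairs would need
// explicit synchronization; all other pairs are independent.
bool WritesOverlap(const CopyDest& a, const CopyDest& b) {
    // Precondition: neither range wraps around the 64-bit address space.
    assert(a.offset + a.size >= a.offset && b.offset + b.size >= b.offset);
    if (a.bufferId != b.bufferId) return false;
    if (a.size == 0 || b.size == 0) return false;
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}
```

Under this model, two copies into adjacent but disjoint regions of one buffer, or into different buffers, are free to overlap in time, while copies whose destination ranges intersect would need a barrier between them.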

The feature also modernizes clears with ClearTextureSubresources, which clears textures directly by resource pointer and format – no RTV, UAV, descriptor heaps, or special resource flags required. This is notably the first D3D12 clear that works on block-compressed formats. Correspondingly, FillBuffers adds batched, format-aware or raw-pattern buffer fills with configurable repeat counts, replacing the descriptor gymnastics of UAV clears. In addition, new ClearBoundRenderTargetViews and ClearBoundDepthStencilView commands further improve ergonomics by operating on currently bound targets, enabling mid-render-pass clears and batch clearing multiple RTVs in a single call.

PIX


PIX supports all features released here. See the PIX release blog: https://devblogs.microsoft.com/pix/pix-2604-27004-preview/

Appendix


Feature Support

Using the latest drivers linked in Overview > Hardware Support:

Feature | AMD | Intel | NVIDIA
linalg::Matrix | Supported on AMD Radeon™ RX 9000 series graphics products. | Planned for an upcoming release. | Supported on all RTX hardware.
Group Wave Index | Supported on AMD Radeon™ RX 7000 and 9000 series graphics products. | Supported on Intel® Arc™ B-Series Graphics. | Planned for an upcoming release.
Variable Group Shared Memory | Supported on AMD Radeon™ RX 7000 and 9000 series graphics products. Supports default memory limit size only. Higher size limits are planned for future driver releases. | Supported on Intel® Arc™ B-Series Graphics. | Supported on all RTX hardware. Values differ across hardware.
Raytracing intrinsics: TriangleObjectPositions/ClusterID | Supported on AMD Radeon™ RX 7000 and 9000 series graphics products. | Supported on Intel® Arc™ B-Series Graphics. | Supported on all RTX hardware.
Batched Asynchronous Command List APIs | Supported on AMD Radeon™ RX 7000 and 9000 series graphics products. | Supported on Intel® Arc™ B-Series Graphics. | Supported on all RTX hardware.

 


Author

Amar Patel
Engineer
