User:MaxDZ8/Shader model Proto

From Wikipedia, the free encyclopedia

Shader models are feature sets introduced with successive Direct3D releases. Because the term is widely used and is often quoted to end users, it is now commonly used to identify a specific class of hardware.

This page currently covers only how the different shader models affect shader programming. A shader model usually introduces other functionality as well, such as better precision or instancing; describing those features is left to a future revision of this document.

Shader versions and resource limits[1][2]
Version Instruction slots Constant count
vs 1.1 128 (4)
vs 2.0 256 (4)
vs 2.a 256 (4)
vs 3.0 (1) (4)
ps 1.1 - ps 1.3 8 - 12 8
ps 1.4 28 (two phases) 8
ps 2.0 96 32
ps 2.a (2) 32
ps 2.b (2) 32
ps 3.0 (3) 224

(1): D3DCAPS9.MaxVertexShader30InstructionSlots
(2): D3DCAPS9.PS20Caps.NumInstructionSlots (PS20Caps is a D3DPSHADERCAPS2_0 structure)
(3): D3DCAPS9.MaxPixelShader30InstructionSlots
(4): D3DCAPS9.MaxVertexShaderConst
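
The fixed instruction-slot limits in the table can be captured as a small lookup, sketched here in Python. This is only an illustration: the names `SHADER_LIMITS` and `fits_in_profile` are hypothetical and are not part of Direct3D. Entries that the table defers to a D3DCAPS9 field are stored as `None`, on the assumption that the caller reads the real value from the device capabilities at run time.

```python
# Instruction-slot limits copied from the table above (fixed entries only).
# None marks a limit that the table defers to a D3DCAPS9 field, so the
# caller has to supply it after querying the device capabilities.
SHADER_LIMITS = {
    "vs 2.0": 256,
    "vs 2.a": 256,
    "vs 3.0": None,  # D3DCAPS9.MaxVertexShader30InstructionSlots
    "ps 1.4": 28,
    "ps 2.0": 96,
    "ps 3.0": None,  # D3DCAPS9.MaxPixelShader30InstructionSlots
}


def fits_in_profile(profile, used_slots, caps_limit=None):
    """Return True if a shader using `used_slots` slots fits in `profile`.

    `caps_limit` is the value read from the device capabilities; it is
    only needed for profiles whose table entry is caps-dependent.
    """
    limit = SHADER_LIMITS[profile]
    if limit is None:
        if caps_limit is None:
            raise ValueError(f"{profile} needs a limit from the device caps")
        limit = caps_limit
    return used_slots <= limit
```

For example, `fits_in_profile("ps 2.0", 97)` is false because the table gives ps 2.0 only 96 slots, while a vs 3.0 query has to pass the limit reported by the driver.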

Hardware support for shader models

Card (chip) Vertex shader Pixel shader
GeForce 3 (NV20) 1.1 1.1
Radeon 8500 up to 9200 (R200) 1.1 1.4
GeForce 4 Ti (NV25) 1.1 1.3
Parhelia 2.0 1.3
Radeon 9500 and later (R3xx) 2.0 2.0
Wildcat VP 10 1.1 1.2
GeForce FX (NV30) 2.a 2.a
GeForce 6xxx (NV4x) 3.0 3.0
Radeon X800 (R4xx) 2.0 2.b (1)

(1): The 2.b pixel shader model is actually a subset of 2.a and therefore provides less functionality.
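
As a rough illustration of how the table might be used, the Python sketch below stores the rows as a dictionary and lists the cards that reach a given pixel shader version. The names `HARDWARE_SUPPORT`, `PS_ORDER` and `cards_supporting_ps` are hypothetical. The linear ordering in `PS_ORDER` is an assumption of this sketch: it ranks 2.b below 2.a, following note (1), and treats every version as including all earlier ones.

```python
# (vertex shader, pixel shader) versions copied from the table above.
HARDWARE_SUPPORT = {
    "GeForce 3 (NV20)": ("1.1", "1.1"),
    "Radeon 8500 up to 9200 (R200)": ("1.1", "1.4"),
    "GeForce 4 Ti (NV25)": ("1.1", "1.3"),
    "Parhelia": ("2.0", "1.3"),
    "Radeon 9500 and later (R3xx)": ("2.0", "2.0"),
    "Wildcat VP 10": ("1.1", "1.2"),
    "GeForce FX (NV30)": ("2.a", "2.a"),
    "GeForce 6xxx (NV4x)": ("3.0", "3.0"),
    "Radeon X800 (R4xx)": ("2.0", "2.b"),
}

# Assumed ordering, lowest to highest; 2.b sits below 2.a per note (1).
PS_ORDER = ["1.1", "1.2", "1.3", "1.4", "2.0", "2.b", "2.a", "3.0"]


def cards_supporting_ps(version):
    """Return the cards whose pixel shader version is at least `version`."""
    needed = PS_ORDER.index(version)
    return sorted(
        card
        for card, (_vs, ps) in HARDWARE_SUPPORT.items()
        if PS_ORDER.index(ps) >= needed
    )
```

Under these assumptions, asking for "2.a" returns only the GeForce FX and GeForce 6xxx rows, while asking for "2.0" also returns the two Radeon rows that list 2.0 and 2.b.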

Arithmetic instructions

How to read this table: dark gray means not available in this profile.

Instruction mnemonic Description Used slots
Vertex shader Pixel shader
1.1-1.3 1.4 2.0 2.x 3.0 1.1-1.3 1.4 2.0 3.0
add Add two vector registers 1
abs Absolute value 1
crs Cross product 2
dp3, dp4 Dot product of 3D or 4D vectors 1
dst Compute distance vector (1, d, d^2, 1/d), typically used for attenuation. 1
exp/expp 2^x, full and partial precision. 10/1 1/1
frc Fractional part 3 1
lrp Linear interpolation of values. 2
lit Compute lighting coefficients (diffuse and specular) from two dot products and an exponent. 1 3???
log/logp log2(x), full and partial precision. 10/1 1/1
m3x2/m3x3/m3x4 Multiply a 3-component vector by a 3xn matrix. 2/3/4
m4x3/m4x4 Multiply a 4-component vector by a 4xn matrix. 3/4
mad Multiply first two vectors, then add third one. 1
min/max Return component-wise min/max vector. 1
mov Move (copy) source register to destination register. 1
mova Move float value to address register. 1
mul Component-wise multiply 1
nop No operation. 1
nrm Normalize vector 3
pow x^y 3
rcp Reciprocal. 1
rsq Reciprocal SQuare root. 1
sincos Optimized computation of both sine and cosine. 8
sge Set if greater than or equal to. 1
sgn Sign 3
slt Set if less than. 1
sub Subtract. 1
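
To make the one-line descriptions more concrete, the following Python sketch gives reference semantics for a few of the instructions on plain lists of floats. It is not shader code and ignores write masks, swizzles and precision; the function names simply mirror the mnemonics. `m4x4` is modelled as four `dp4` operations, one per matrix row, which is consistent with its cost of 4 slots.

```python
import math


def mad(a, b, c):
    """mad: multiply a and b component-wise, then add c."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def lrp(f, a, b):
    """lrp: linear interpolation, f * (a - b) + b per component."""
    return [t * (x - y) + y for t, x, y in zip(f, a, b)]


def dp3(a, b):
    """dp3: dot product of the first three components (scalar here)."""
    return sum(x * y for x, y in zip(a[:3], b[:3]))


def dp4(a, b):
    """dp4: dot product of all four components (scalar here)."""
    return sum(x * y for x, y in zip(a[:4], b[:4]))


def sge(a, b):
    """sge: 1.0 where a >= b, otherwise 0.0."""
    return [1.0 if x >= y else 0.0 for x, y in zip(a, b)]


def slt(a, b):
    """slt: 1.0 where a < b, otherwise 0.0."""
    return [1.0 if x < y else 0.0 for x, y in zip(a, b)]


def frc(a):
    """frc: fractional part, x - floor(x), always in [0, 1)."""
    return [x - math.floor(x) for x in a]


def crs(a, b):
    """crs: cross product of the xyz components."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def m4x4(v, rows):
    """m4x4: one dp4 of v against each of the four matrix rows."""
    return [dp4(v, row) for row in rows]
```

Note that `sge` and `slt` return 1.0 or 0.0 rather than a boolean, which is what allows their result to be fed into `mul` or `mad` as a branch-free selector.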


Flow instructions

Instruction mnemonic Description Used slots
Vertex shader Pixel shader
2.0 2.x 3.0 1.1-1.3 1.4 2.0 3.0
call "Push IP and jump" to subroutine. 2
callnz bool Conditional jump to subroutine if boolean register is not zero. 3
ret Return from subroutine. 1
if bool Execute block if condition is met. 3
else Execute block if condition is not met. 1
endif End of if/else instructions. 1
loop Begin loop instruction block. 3
endloop End of loop instruction block. 2
rep Begin rep instruction block 2
endrep End of rep instruction block. 2
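
The "Used slots" columns can be used to estimate how much of a profile's instruction budget a shader consumes. The Python sketch below is a minimal illustration: `SLOT_COST` and `count_slots` are hypothetical names, the costs are copied from the tables on this page for rows that list a single value, and the parser assumes one instruction per line with the mnemonic as the first token. The registers in the sample program are made up for the example.

```python
# Slot costs copied from the arithmetic and flow tables on this page.
# Rows with several per-profile values (exp, log, frc, lit) are left out.
SLOT_COST = {
    "add": 1, "mul": 1, "mad": 1, "mov": 1, "dp3": 1, "dp4": 1,
    "crs": 2, "nrm": 3, "pow": 3, "sincos": 8,
    "m3x2": 2, "m3x3": 3, "m3x4": 4, "m4x3": 3, "m4x4": 4,
    "call": 2, "callnz": 3, "ret": 1,
    "if": 3, "else": 1, "endif": 1,
    "loop": 3, "endloop": 2, "rep": 2, "endrep": 2,
}


def count_slots(program):
    """Sum the slot cost of every instruction line in `program`.

    Blank lines and lines starting with ';' or '//' are skipped.
    An unknown mnemonic raises KeyError rather than being guessed.
    """
    total = 0
    for line in program:
        text = line.strip()
        if not text or text.startswith(";") or text.startswith("//"):
            continue
        mnemonic = text.split()[0]
        total += SLOT_COST[mnemonic]
    return total


sample = [
    "; transform position, then iterate",
    "m4x4 oPos, v0, c0",
    "rep i0",
    "mad r0, r0, c4, c5",
    "endrep",
    "nrm r1, r0",
]
sample_slots = count_slots(sample)  # 4 + 2 + 1 + 2 + 3
```

The count is static: the body of a `rep` or `loop` block is charged once, however many times it executes, because slots measure program size rather than executed instructions.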

Setup instructions

All setup instructions take 0 slots.

Instruction mnemonic Description Notes
dcl_usage_input Bind vertex stream to input register.
def Define constant.
defb Define boolean constant (for static flow control). Introduced in vs2.0
defi Define integer constant (for static flow control). Introduced in vs2.0
label Mark the start of a subroutine, the target of call and callnz. Introduced in vs2.0
vs Declare profile, must be first instruction. VS only


Vertex shader registers

All registers are four components wide, unless otherwise noted. TO DO: REVIEW THIS TABLE

Register mnemonic Description Count Read/Write Relative addressing Notes
1.1 In a0 Address register 14 RW No Only .x write mask allowed in vs1.x, all 4 comps available on vs2
aL Loop register 1 Ro No Only .x write mask allowed in vs1.x, all 4 comps available on vs2
cn Float constant register >96 (1) Ro Use a0.x Default 0,0,0,0. Unlimited reads per instruction in vs1.x, 2 reads per instruction in vs2
vn Input register from vertex stream 16 Ro No Default 0,0,0,1.
rn Temporary register 12 RW No Undefined, will cause error if read before initialization.
Out ??? ??? ??? ??? ??? ???


Pixel shader registers

All registers are four components wide, unless otherwise noted.

Register mnemonic Description Count Read/Write Dimensionality Notes
1.1 In ??? ??? ??? ??? ??? ???
Out oPos Position register 1 Wo 4D
oFog Fog density register 1 Wo 1D
oDn Color register 2 Wo 4D oD0 is diffuse color, oD1 is specular.
oTn Texcoord register 8 Wo 4D Indexable as oT[a0.x+n].


http://msdn.microsoft.com/library/default.asp?url?/library/en-us/directx9_c/dx9_graphics_reference_asm.asp

1.1: first release.

2.0: static flow control, four-component address register, new instructions, new registers.

2.x: dynamic flow control, nesting, more temporary registers, predication, new instructions, new registers.

3.0: texture lookups (samplers), indexable registers, 32 temporary registers, new instructions.


  1. ^ "Programming vertex and pixel shaders" by Wolfgang Engel, ISBN 1-58450-349-1.
  2. ^ Microsoft MSDN, Direct3D reference, the D3DCAPS9 and D3DPSHADERCAPS2_0 structures.