User:MaxDZ8/Shader model Proto
Shader models are feature sets introduced by successive Direct3D releases. Because the term is widely used and is often quoted to end users, it is now commonly used to identify a specific class of hardware.
This page currently covers only the effects of the different shader models on shader programming. A shader model usually introduces other functionality as well, such as better precision or instancing. Describing those features is left to a future revision of this document.
Version | Instruction slots | Constant count |
---|---|---|
vs 1.1 | 96 | (4) |
vs 2.0 | 256 | (4) |
vs 2.a | 256 | (4) |
vs 3.0 | (1) | (4) |
ps 1.1 - ps 1.3 | 8 - 12 | 8 |
ps 1.4 | 28 (two phases) | 8 |
ps 2.0 | 96 | 32 |
ps 2.a | (2) | 32 |
ps 2.b | (2) | 32 |
ps 3.0 | (3) | 224 |
(1): Reported by the device in D3DCAPS9.MaxVertexShader30InstructionSlots. |
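The limits in the table above can be treated as a lookup. The following is a minimal sketch, not part of any Direct3D API: the names `SLOT_LIMITS` and `fits_slot_limit` are hypothetical, and only rows with a single fixed value are filled in. Profiles whose limit the table defers to a footnote are stored as `None`, and the caller is assumed to supply the value queried from the device (for vs 3.0, `D3DCAPS9.MaxVertexShader30InstructionSlots`).

```python
# Fixed instruction-slot limits copied from the table above.
# None marks profiles whose limit the table defers to a footnote;
# for those the caller must pass the value reported by the device.
# The "ps 1.1 - ps 1.3" row gives a range (8 - 12), so it is left out.
SLOT_LIMITS = {
    "vs 1.1": 96,
    "vs 2.0": 256,
    "vs 2.a": 256,
    "vs 3.0": None,  # D3DCAPS9.MaxVertexShader30InstructionSlots
    "ps 1.4": 28,
    "ps 2.0": 96,
    "ps 2.a": None,
    "ps 2.b": None,
    "ps 3.0": None,
}


def fits_slot_limit(profile, used_slots, runtime_limit=None):
    """Return True if a shader using `used_slots` fits in `profile`.

    `runtime_limit` is a hypothetical parameter: the value the
    application read from the device caps, needed only when the
    table has no fixed number for the profile.
    """
    limit = SLOT_LIMITS[profile]
    if limit is None:
        if runtime_limit is None:
            raise ValueError("limit for %s must be queried at runtime" % profile)
        limit = runtime_limit
    return used_slots <= limit
```

For example, `fits_slot_limit("vs 1.1", 97)` is false because vs 1.1 allows 96 slots, while `fits_slot_limit("vs 3.0", 400, runtime_limit=512)` is true for a device that reports 512 slots.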
Hardware support for shading models
Card (chip) | Vertex shader | Pixel shader |
---|---|---|
GeForce 3 (NV20) | 1.1 | 1.1 |
Radeon 8500 up to 9200 (R200) | 1.1 | 1.4 |
GeForce 4 Ti (NV25) | 1.1 | 1.3 |
Parhelia | 2.0 | 1.3 |
Radeon 9500 and later (R3xx) | 2.0 | 2.0 |
Wildcat VP 10 | 1.1 | 1.2 |
GeForce FX (NV30) | 2.a | 2.a |
GeForce 6xxx (NV4x) | 3.0 | 3.0 |
Radeon X800 (R4xx) | 2.0 | 2.b (1) |
(1): The 2.b pixel shader model is actually a subset of 2.a and therefore provides less functionality. |
Arithmetic instructions
How to read this table: a dark gray cell means the instruction is not available in that profile.
Instruction mnemonic | Description | Used slots | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Vertex shader | Pixel shader | |||||||||
1.1-1.3 | 1.4 | 2.0 | 2.x | 3.0 | 1.1-1.3 | 1.4 | 2.0 | 3.0 | ||
add | Add two vector registers | 1 | ||||||||
abs | Absolute value | 1 | ||||||||
crs | Cross product | 2 | ||||||||
dp3, dp4 | Dot product of 3D or 4D vectors | 1 | ||||||||
dst | Compute distance vector (1, d, d^2, 1/d) | 1 | ||||||||
exp/expp | 2^x, full and partial precision. | 10/1 | 1/1 | |||||||
frc | Fractional part | 3 | 1 | |||||||
lrp | Linear interpolation of values. | 2 | ||||||||
lit | Compute lighting coefficients from two dot products and a specular exponent. | 1 | 3??? | |||||||
log/logp | log2(x), full and partial precision. | 10/1 | 1/1 | |||||||
m3x2/m3x3/m3x4 | Multiply a 3-component vector by a 3xn matrix. | 2/3/4 | ||||||||
m4x3/m4x4 | Multiply a 4-component vector by a 4xn matrix. | 3/4 | ||||||||
mad | Multiply first two vectors, then add third one. | 1 | ||||||||
min/max | Return component-wise min/max vector. | 1 | ||||||||
mov | Copy the source register (second operand) to the destination register (first operand). | 1 | ||||||||
mova | Move float value to address register. | 1 | ||||||||
mul | Component-wise multiply | 1 | ||||||||
nop | No operation. | 1 | ||||||||
nrm | Normalize vector | 3 | ||||||||
pow | x^y | 3 | ||||||||
rcp | Reciprocal. | 1 | ||||||||
rsq | Reciprocal SQuare root. | 1 | ||||||||
sincos | Optimized computation of both sine and cosine. | 8 | ||||||||
sge | Set if greater than or equal to. | 1 | ||||||||
sgn | Sign | 3 | ||||||||
slt | Set if less than. | 1 | ||||||||
sub | Subtract. | 1 |
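To make the one-line descriptions above concrete, here is a small sketch that emulates a handful of these instructions on 4-component tuples. It is an illustration under stated assumptions, not a Direct3D implementation: the Python function names simply mirror the mnemonics, `vmin` and `vmax` are renamed to avoid shadowing Python built-ins, dot products return a plain scalar, and write masks, swizzles and modifiers are ignored. The operand order of `lrp` follows the usual definition src0 * (src1 - src2) + src2.

```python
# Emulation of a few arithmetic instructions on 4-component tuples.
# Write masks, swizzles and instruction modifiers are not modelled.

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def mul(a, b):
    return tuple(x * y for x, y in zip(a, b))

def mad(a, b, c):
    # Multiply the first two vectors, then add the third one.
    return tuple(x * y + z for x, y, z in zip(a, b, c))

def dp3(a, b):
    # Dot product of the first three components only.
    return sum(x * y for x, y in zip(a[:3], b[:3]))

def dp4(a, b):
    return sum(x * y for x, y in zip(a, b))

def lrp(t, a, b):
    # Linear interpolation: t selects how much of a is blended over b.
    return tuple(w * (x - y) + y for w, x, y in zip(t, a, b))

def vmin(a, b):
    return tuple(min(x, y) for x, y in zip(a, b))

def vmax(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))

def sge(a, b):
    # Per component: 1.0 where a >= b, otherwise 0.0.
    return tuple(1.0 if x >= y else 0.0 for x, y in zip(a, b))

def slt(a, b):
    # Per component: 1.0 where a < b, otherwise 0.0.
    return tuple(1.0 if x < y else 0.0 for x, y in zip(a, b))
```

For instance, `mad((1, 2, 3, 4), (2, 2, 2, 2), (1, 1, 1, 1))` yields `(3, 5, 7, 9)`, and `sge` and `slt` are exact complements for ordinary numeric inputs, which is how they were commonly combined to build comparisons before flow control was available.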
Flow instructions
Instruction mnemonic | Description | Used slots | |||||||
---|---|---|---|---|---|---|---|---|---|
Vertex shader | Pixel shader | ||||||||
2.0 | 2.x | 3.0 | 1.1-1.3 | 1.4 | 2.0 | 3.0 | |||
call | "Push IP and jump" to subroutine. | 2 | |||||||
callnz bool | Conditional jump to subroutine if boolean register is not zero. | 3 | |||||||
ret | Return from subroutine. | 1 | |||||||
if bool | Execute block if condition is met. | 3 | |||||||
else | Execute block if condition is not met. | 1 | |||||||
endif | End of if/else instructions. | 1 | |||||||
loop | Begin loop instruction block. | 3 | |||||||
endloop | End of loop instruction block. | 2 | |||||||
rep | Begin rep instruction block | 2 | |||||||
endrep | End of rep instruction block. | 2 |
Setup instructions
All setup instructions take 0 slots.
Instruction mnemonic | Description | Notes | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
dcl_usage_input | Bind vertex stream to input register. | |||||||||
def | Define constant. | |||||||||
defb | Define boolean constant (for static flow control). | Introduced in vs2.0 | ||||||||
defi | Define integer constant (for static flow control). | Introduced in vs2.0 | ||||||||
label | Mark the start of a subroutine, the target of call and callnz. | Introduced in vs2.0 | ||||||||
vs | Declare profile, must be first instruction. | VS only |
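The slot counts in the arithmetic, flow and setup tables above can be added up to estimate how much of a profile's instruction budget a shader uses. The sketch below is an illustration only: `SLOTS`, `is_setup` and `count_slots` are hypothetical names, the parsing just takes the first word of each line, and instructions whose cost varies between profiles in the tables (exp, frc, lit, log and their partial-precision forms) are deliberately left out rather than guessed.

```python
# Slot costs taken from the tables above, single-valued rows only.
SLOTS = {
    # arithmetic
    "add": 1, "abs": 1, "crs": 2, "dp3": 1, "dp4": 1, "dst": 1,
    "lrp": 2, "m3x2": 2, "m3x3": 3, "m3x4": 4, "m4x3": 3, "m4x4": 4,
    "mad": 1, "min": 1, "max": 1, "mov": 1, "mova": 1, "mul": 1,
    "nop": 1, "nrm": 3, "pow": 3, "rcp": 1, "rsq": 1, "sincos": 8,
    "sge": 1, "sgn": 3, "slt": 1, "sub": 1,
    # flow control
    "call": 2, "callnz": 3, "ret": 1, "if": 3, "else": 1, "endif": 1,
    "loop": 3, "endloop": 2, "rep": 2, "endrep": 2,
}


def is_setup(mnemonic):
    # Setup instructions (dcl_*, def, defb, defi, label, vs) take 0 slots.
    return (mnemonic.startswith("dcl")
            or mnemonic.startswith("vs")
            or mnemonic in ("def", "defb", "defi", "label"))


def count_slots(program):
    """Sum the slots used by a list of assembly lines."""
    total = 0
    for line in program:
        mnemonic = line.split()[0]
        if is_setup(mnemonic):
            continue
        if mnemonic not in SLOTS:
            raise ValueError("no fixed slot count for " + mnemonic)
        total += SLOTS[mnemonic]
    return total


# Illustrative listing only; it has not been run through an assembler.
program = [
    "vs_2_0",
    "dcl_position v0",
    "def c4, 0, 0, 0, 1",
    "m4x4 r0, v0, c0",
    "nrm r1, r0",
    "mad r2, r1, c4, r0",
    "mov oPos, r2",
]
```

With this listing `count_slots(program)` gives 9: the three setup lines cost nothing, `m4x4` takes 4 slots, `nrm` takes 3, and `mad` and `mov` take 1 each.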
Vertex shader registers
All registers are four components wide, unless otherwise noted. TODO: review this table.
Register mnemonic | Description | Count | Read/Write | Relative addressing | Notes | ||
---|---|---|---|---|---|---|---|
1.1 | In | a0 | Address register | 14 | RW | No | Only .x write mask allowed in vs1.x, all 4 comps available on vs2 |
aL | Loop register | 1 | Ro | No | Only .x write mask allowed in vs1.x, all 4 comps available on vs2 | ||
cn | Float constant register | >96 (1) | Ro | Use a0.x | Default 0,0,0,0. Unlimited reads per instruction in vs1.x, 2 reads per instruction in vs2 | ||
vn | Input register from vertex stream | 16 | Ro | No | Default 0,0,0,1. | ||
rn | Temporary register | 12 | RW | No | Undefined, will cause error if read before initialization. | ||
Out | ??? | ??? | ??? | ??? | ??? | ??? |
Pixel shader registers
All registers are four components wide, unless otherwise noted.
Register mnemonic | Description | Count | Read/Write | Dimensionality | Notes | ||
---|---|---|---|---|---|---|---|
1.1 | In | ??? | ??? | ??? | ??? | ??? | ??? |
Out | oPos | Position register | 1 | Wo | 4D | ||
oFog | Fog density register | 1 | Wo | 1D | |||
oDn | Color register | 2 | Wo | 4D | oD0 is diffuse color, oD1 is specular. | ||
oTn | Texcoord register | 8 | Wo | 4D | Indexable as oT[a0.x+n]. |
1.1: first release.
2.0: static flow control, four-component address register, new instructions, new registers.
2.x: dynamic flow control, nesting, more temporary registers, predication, new instructions, new registers.
3.0: texture lookups (samplers), indexable registers, 32 temporary registers, new instructions.
- "Programming vertex and pixel shaders" by Wolfgang Engel, ISBN 1-58450-349-1.
- Microsoft MSDN, Direct3D reference, the D3DCAPS9 and D3DPSHADERCAPS2_0 structures.