User:MaxDZ8/Shader model Proto

Shader models are feature sets introduced with successive Direct3D releases. Because the term is widely used and often quoted to end users, it is now commonly used to identify a specific class of hardware.

This page covers only how the different shader models affect shader programming. A shader model usually introduces other features as well, such as better precision or instancing; describing those features is left to a future revision of this document.

Shader versions and resource limits^[1]^[2]
Version	Instruction slots	Constant count
vs 1.1	128	(4)
vs 2.0	256	(4)
vs 2.a	256	(4)
vs 3.0	(1)	(4)
ps 1.1 - ps 1.3	8 - 12	8
ps 1.4	28 (two phases)	8
ps 2.0	96	32
ps 2.a	(2)	32
ps 2.b	(2)	32
ps 3.0	(3)	224
(1): D3DCAPS9.MaxVertexShader30InstructionSlots
(2): D3DCAPS9.PS20Caps.NumInstructionSlots (PS20Caps is a D3DPSHADERCAPS2_0 structure)
(3): D3DCAPS9.MaxPixelShader30InstructionSlots
(4): D3DCAPS9.MaxVertexShaderConst
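
As a rough illustration of how the fixed limits and the caps-dependent limits in the table combine, here is a minimal C++ sketch. It does not use the Direct3D headers: `CapsSketch`, `instructionSlots` and `constantCount` are hypothetical names invented for this example, and the struct only stands in for the D3DCAPS9 members named in footnotes (1) to (4). Only rows with a single unambiguous value are covered.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the D3DCAPS9 members named in footnotes (1)-(4).
// A real program would read these values from the Direct3D 9 device caps.
struct CapsSketch {
    std::uint32_t vs30InstructionSlots;  // footnote (1)
    std::uint32_t ps2xInstructionSlots;  // footnote (2)
    std::uint32_t ps30InstructionSlots;  // footnote (3)
    std::uint32_t vsConstants;           // footnote (4)
};

// Instruction slot limit for some rows of the table.
// Returns 0 for profiles this sketch does not cover.
inline std::uint32_t instructionSlots(const std::string& profile,
                                      const CapsSketch& caps) {
    if (profile == "vs 2.0" || profile == "vs 2.a") return 256;
    if (profile == "vs 3.0") return caps.vs30InstructionSlots;
    if (profile == "ps 1.4") return 28;  // two phases
    if (profile == "ps 2.0") return 96;
    if (profile == "ps 2.a" || profile == "ps 2.b") return caps.ps2xInstructionSlots;
    if (profile == "ps 3.0") return caps.ps30InstructionSlots;
    return 0;
}

// Constant count for a profile, following the third column of the table.
// Returns 0 for profiles this sketch does not cover.
inline std::uint32_t constantCount(const std::string& profile,
                                   const CapsSketch& caps) {
    // rfind(prefix, 0) == 0 is a "starts with" test.
    if (profile.rfind("vs ", 0) == 0) return caps.vsConstants;
    if (profile == "ps 1.4") return 8;
    if (profile == "ps 2.0" || profile == "ps 2.a" || profile == "ps 2.b") return 32;
    if (profile == "ps 3.0") return 224;
    return 0;
}
```

Calling `instructionSlots("ps 2.b", caps)` returns whatever the caps report, while `instructionSlots("ps 2.0", caps)` always returns the fixed value from the table.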

Hardware support for shader models

Card (chip)	Vertex shader	Pixel shader
GeForce 3 (NV20)	1.1	1.1
Radeon 8500 up to 9200 (R200)	1.1	1.4
GeForce 4 Ti (NV25)	1.1	1.3
Parhelia	2.0	1.3
Radeon 9500 and later (R3xx)	2.0	2.0
Wildcat VP 10	1.1	1.2
GeForce FX (NV30)	2.a	2.a
GeForce 6xxx (NV4x)	3.0	3.0
Radeon X800 (R4xx)	2.0	2.b (1)
(1): The 2.b pixel shader model is actually a subset of 2.a and thus provides less functionality.
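
The table can be used to decide whether a given card is able to run a shader pair. The following C++ sketch shows one way to do that comparison. All names (`ShaderVersion`, `Card`, `atLeast`, `canRun`) are hypothetical, and the sketch assumes plain major.minor ordering, so the 2.a and 2.b profiles, which do not fit that ordering, are deliberately left out.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical major.minor pair; 2.a and 2.b are not representable here.
struct ShaderVersion {
    int majorVer;
    int minorVer;
};

// True when 'have' is the same version as 'need' or a later one.
inline bool atLeast(ShaderVersion have, ShaderVersion need) {
    return std::tie(have.majorVer, have.minorVer) >=
           std::tie(need.majorVer, need.minorVer);
}

// One row of the hardware table.
struct Card {
    const char* name;
    ShaderVersion vs;  // vertex shader column
    ShaderVersion ps;  // pixel shader column
};

// True when the card meets both the vertex and the pixel shader requirement.
inline bool canRun(const Card& card, ShaderVersion needVs, ShaderVersion needPs) {
    return atLeast(card.vs, needVs) && atLeast(card.ps, needPs);
}
```

With rows taken from the table, a shader pair that needs vs 1.1 and ps 1.4 passes on the Radeon 8500 (R200) row but fails on the GeForce 4 Ti (NV25) row, which stops at ps 1.3.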

Arithmetic instructions

How to read this table: dark gray means not available in this profile.

Instruction mnemonic	Description	Used slots
		Vertex shader					Pixel shader
		1.1-1.3	1.4	2.0	2.x	3.0	1.1-1.3	1.4	2.0	3.0
add	Add two vector registers	1
abs	Absolute value			1
crs	Cross product			2
dp3, dp4	Dot product of 3D or 4D vectors	1
dst	Compute distance vector (1, d, d², 1/d).	1
exp/expp	2^x, full and partial precision.	10/1		1/1
frc	Fractional part	3		1
lrp	Linear interpolation of values.			2
lit	Compute diffuse and specular lighting coefficients from N·L, N·H and a specular power.	1		3???
log/logp	log₂(x), full and partial precision.	10/1		1/1
m3x2/m3x3/m3x4	Multiply a 3-component vector by a 3xn matrix.	2/3/4
m4x3/m4x4	Multiply a 4-component vector by a 4xn matrix.	3/4
mad	Multiply first two vectors, then add third one.	1
min/max	Return component-wise min/max vector.	1
mov	Move source register to destination register.	1
mova	Move float value to address register.			1
mul	Component-wise multiply	1
nop	No operation.	1
nrm	Normalize vector			3
pow	x^y			3
rcp	Reciprocal.	1
rsq	Reciprocal SQuare root.	1
sincos	Optimized computation of both sine and cosine.			8
sge	Set if greater than or equal to.	1
sgn	Sign			3
slt	Set if less than.	1
sub	Subtract.	1
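
To make the one-line descriptions more concrete, the following C++ sketch emulates a handful of the instructions on the CPU. It is only an illustration, assuming the usual per-component semantics of the Direct3D 9 shader assembly; `Vec4` and `perComponent` are hypothetical helpers, write masks and swizzles are ignored, and dp3/dp4 return a scalar where the hardware would replicate it to the destination components.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// A four-component register.
using Vec4 = std::array<float, 4>;

// Hypothetical helper: builds a register by evaluating f for each component index.
template <typename F>
inline Vec4 perComponent(F f) {
    Vec4 out{};
    for (std::size_t i = 0; i < 4; ++i) out[i] = f(i);
    return out;
}

// mad: multiply the first two vectors, then add the third one.
inline Vec4 mad(const Vec4& a, const Vec4& b, const Vec4& c) {
    return perComponent([&](std::size_t i) { return a[i] * b[i] + c[i]; });
}

// lrp: linear interpolation between a and b, weighted by s.
inline Vec4 lrp(const Vec4& s, const Vec4& a, const Vec4& b) {
    return perComponent([&](std::size_t i) { return s[i] * (a[i] - b[i]) + b[i]; });
}

// frc: fractional part, x - floor(x).
inline Vec4 frc(const Vec4& a) {
    return perComponent([&](std::size_t i) { return a[i] - std::floor(a[i]); });
}

// sge: 1.0 where a >= b, 0.0 elsewhere.
inline Vec4 sge(const Vec4& a, const Vec4& b) {
    return perComponent([&](std::size_t i) { return a[i] >= b[i] ? 1.0f : 0.0f; });
}

// slt: 1.0 where a < b, 0.0 elsewhere.
inline Vec4 slt(const Vec4& a, const Vec4& b) {
    return perComponent([&](std::size_t i) { return a[i] < b[i] ? 1.0f : 0.0f; });
}

// dp3 / dp4: dot product over the first three or all four components.
inline float dp3(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
inline float dp4(const Vec4& a, const Vec4& b) {
    return dp3(a, b) + a[3] * b[3];
}
```

Note that sge and slt produce 1.0 or 0.0 rather than a boolean, which is why they can be combined with mul and mad to select values without flow control.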

Flow instructions

Instruction mnemonic	Description	Used slots
		Vertex shader			Pixel shader
		2.0	2.x	3.0	1.1-1.3	1.4	2.0	3.0
call	"Push IP and jump" to subroutine.	2
callnz bool	Conditional jump to subroutine if boolean register is not zero.	3
ret	Return from subroutine.	1
if bool	Execute block if condition is met.	3
else	Execute block if condition is not met.	1
endif	End of if/else instructions.	1
loop	Begin loop instruction block.	3
endloop	End of loop instruction block.	2
rep	Begin rep instruction block	2
endrep	End of rep instruction block.	2
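
Since flow instructions cost more than one slot each, it can help to add up what a control structure costs before writing the shader. The C++ sketch below does this with the slot counts copied from the table above; `flowSlotCosts` and `countFlowSlots` are hypothetical names, and operands such as the boolean register of `if` and `callnz` are ignored.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Slot costs copied from the flow instruction table above.
inline const std::map<std::string, int>& flowSlotCosts() {
    static const std::map<std::string, int> costs = {
        {"call", 2}, {"callnz", 3}, {"ret", 1},
        {"if", 3},   {"else", 1},   {"endif", 1},
        {"loop", 3}, {"endloop", 2},
        {"rep", 2},  {"endrep", 2},
    };
    return costs;
}

// Sums the slots used by a sequence of flow instruction mnemonics.
// Returns -1 if any mnemonic is not listed in the table.
inline int countFlowSlots(const std::vector<std::string>& mnemonics) {
    int total = 0;
    for (const std::string& m : mnemonics) {
        auto it = flowSlotCosts().find(m);
        if (it == flowSlotCosts().end()) return -1;
        total += it->second;
    }
    return total;
}
```

For example, an if/else/endif skeleton costs 3 + 1 + 1 = 5 slots before any instruction inside the blocks is counted.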

Setup instructions

All setup instructions take 0 slots.

Instruction mnemonic	Description	Notes
dcl_usage_input	Bind vertex stream to input register.
def	Define constant.
defb	Define boolean constant (for static flow control).	Introduced in vs2.0
defi	Define integer constant (for static flow control).	Introduced in vs2.0
label	Mark the start of a subroutine (target of call and callnz).	Introduced in vs2.0
vs	Declare profile, must be first instruction.	VS only

Vertex shader registers

All registers are four components wide, unless otherwise noted. TODO: review this table.

		Register mnemonic	Description	Count	Read/Write	Relative addressing	Notes
1.1	In	a0	Address register	14	RW	No	Only .x write mask allowed in vs1.x, all 4 components available in vs2
		aL	Loop register	1	Ro	No	Only .x write mask allowed in vs1.x, all 4 components available in vs2
		cn	Float constant register	>96 (1)	Ro	Use a0.x	Default 0,0,0,0. Unlimited reads per instruction in vs1.x, 2 reads per instruction in vs2
		vn	Input register from vertex stream	16	Ro	No	Default 0,0,0,1.
	rn	Temporary register	12	RW	No	Undefined, will cause error if read before initialization.
	Out	???	???	???	???	???	???

Pixel shader registers

All registers are four components wide, unless otherwise noted.

		Register mnemonic	Description	Count	Read/Write	Dimensionality	Notes
1.1	In	???	???	???	???	???	???
	Out	oPos	Position register	1	Wo	4D
		oFog	Fog density register	1	Wo	1D
		oDn	Color register	2	Wo	4D	oD0 is diffuse color, oD1 is specular.
		oTn	Texcoord register	8	Wo	4D	Indexable as oT[a0.x+n].

http://msdn.microsoft.com/library/default.asp?url?/library/en-us/directx9_c/dx9_graphics_reference_asm.asp

1.1: first release.

2.0: static flow control, four-component address register, new instructions, new registers.

2.x: dynamic flow control, nesting, more temporary registers, predication, new instructions, new registers.

3.0: texture lookups (samplers), indexable registers, 32 temporary registers, new instructions.

^ "Programming vertex and pixel shaders" by Wolfgang Engel, ISBN 1-58450-349-1.
^ Microsoft MSDN, Direct3D reference, the D3DCAPS9 and D3DPSHADERCAPS2_0 structures.