Direct3D 9 - Pixel Shader - Version Differences

Last edited 2026-02-28


Each version supports a different maximum number of instruction slots.


The meaning of the colors in the Type, Version and Working? columns:

| Color | Description |
|---|---|
| Green | The version works in Fusion. |
| Yellow | The version has not been confirmed to work in Fusion. |
| Red | The version does not work in Fusion. |

Pixel Shader Version Comparison (Wikipedia)

| Type | Version | Working? | Dependent texture limit | Texture instruction limit | Arithmetic instruction limit | Position register | Instruction slots | Executed instructions | Texture indirections | Interpolated registers | Instruction predication | Index input registers | Temp registers | Constant registers | Arbitrary swizzling | Gradient instructions | Loop count register | Face register | Dynamic flow control | Bitwise Operators | Native Integers | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hlsl | ps_1_0 | No | 4 | 4 | 8 | No | 8 | 8 | 4 | 2 + 4 | No | No | 2 | 8 | No | No | No | No | No | No | No | This pixel shader version is not supported and the compiler automatically changes the version to ps_1_1. |
| hlsl | ps_1_1 | Yes | 4 | 4 | 8 | No | 8 + 4 | 8 + 4 | 4 | 2 + 4 | No | No | 2 + 4 | 8 | No | No | No | No | No | No | No | |
| hlsl | ps_1_2 | Yes | 4 | 4 | 8 | No | 8 + 4 | 8 + 4 | 4 | 2 + 4 | No | No | 3 + 4 | 8 | No | No | No | No | No | No | No | |
| hlsl | ps_1_3 | Yes | 4 | 4 | 8 | No | 8 + 4 | 8 + 4 | 4 | 2 + 4 | No | No | 3 + 4 | 8 | No | No | No | No | No | No | No | |
| hlsl | ps_1_4 | Yes | 6 | 6 * 2 | 8 * 2 | No | (8 + 6) * 2 | (8 + 6) * 2 | 4 | 2 + 8 | No | No | 6 | 8 | No | No | No | No | No | No | No | |
| hlsl | ps_2_0 | Yes | 8 | 32 | 64 | No | 64 + 32 | 64 + 32 | 4 | 2 + 8 | No | No | 12 to 32 | 32 | No | No | No | No | No | No | No | |
| hlsl | ps_2_a | Yes | Unlimited | Unlimited | Unlimited | No | 512 | 512 | Unlimited | 2 + 8 | Yes | No | 22 | 32 | Yes | Yes | No | No | No | No | No | Currently the best version for writing advanced effects for Direct3D 9. |
| hlsl | ps_2_b | Yes | 8 | Unlimited | Unlimited | No | 512 | 512 | 4 | 2 + 8 | No | No | 32 | 32 | No | No | No | Yes | No | No | No | |
| hlsl | ps_2_sw | Unknown | 8? | 32? | 64? | No? | 64 + 32? | 64 + 32? | 4? | 2 + 8? | No? | No? | 12 to 32? | 32? | No? | No? | No? | No? | No? | No? | No? | Executed in software (CPU). Primarily used for debugging, it allows you to verify whether a graphics driver is generating rendering errors by comparing hardware results with a reference software implementation. This is very slow compared to execution on the GPU due to the serial nature of the CPU architecture. |
| hlsl | ps_3_0 | No (Broken) | Unlimited | Unlimited | Unlimited | Yes | >= 512 | 65536 | Unlimited | 10 | Yes | Yes | 32 | 224 | Yes | Yes | Yes | Yes | Yes (24) | No | No | Broken texCoord, i.e. only one pixel is displayed stretched over the whole; it is possible that this will be fixed. |
| hlsl | ps_3_sw | Unknown | Unlimited? | Unlimited? | Unlimited? | Yes? | >= 512? | 65536? | Unlimited? | 10? | Yes? | Yes? | 32? | 224? | Yes? | Yes? | Yes? | Yes? | Yes (24)? | No? | No? | Executed in software (CPU). Primarily used for debugging, it allows you to verify whether a graphics driver is generating rendering errors by comparing hardware results with a reference software implementation. This is very slow compared to execution on the GPU due to the serial nature of the CPU architecture. |
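
The limits in the table can be used to pick the lowest version that a given effect fits into. The following is only a rough sketch in Python, not part of Fusion or the HLSL compiler: the `PROFILES` list and the `lowest_profile` function are hypothetical names, the data covers only a few of the versions marked as working, and the ps_1_4 per-phase split is ignored. A real compiler counts slots per instruction, so the true requirement can be higher.

```python
import math

# (version, texture instruction limit, arithmetic instruction limit, instruction slots)
# Values copied from the comparison table; "Unlimited" is modelled as infinity.
PROFILES = [
    ("ps_1_1", 4, 8, 8 + 4),
    ("ps_1_4", 6 * 2, 8 * 2, (8 + 6) * 2),
    ("ps_2_0", 32, 64, 64 + 32),
    ("ps_2_a", math.inf, math.inf, 512),
]


def lowest_profile(texture_instructions, arithmetic_instructions):
    """Return the first listed version whose limits cover the shader, or None."""
    total = texture_instructions + arithmetic_instructions
    for version, texture_limit, arithmetic_limit, slots in PROFILES:
        if (texture_instructions <= texture_limit
                and arithmetic_instructions <= arithmetic_limit
                and total <= slots):
            return version
    return None


print(lowest_profile(3, 6))      # ps_1_1
print(lowest_profile(20, 40))    # ps_2_0
print(lowest_profile(300, 300))  # None
```

A result of `None` means the effect is too large for every version listed in the sketch.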


"(8 + 6) * 2" for Instruction slots and Executed instructions means:

    • 8 arithmetic instructions and 6 texture instructions in each of 2 phases.
    • i.e. a total of 16 arithmetic instructions and 12 texture instructions.
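
The same arithmetic can be written out as a short Python sketch. It takes the per-phase limits from the ps_1_4 row of the table (texture instruction limit 6 * 2, arithmetic instruction limit 8 * 2); the function name `phase_totals` is made up for this illustration.

```python
def phase_totals(texture_per_phase, arithmetic_per_phase, phases):
    """Multiply the per-phase limits by the number of phases."""
    texture_total = texture_per_phase * phases
    arithmetic_total = arithmetic_per_phase * phases
    return texture_total, arithmetic_total, texture_total + arithmetic_total


# ps_1_4: 6 texture and 8 arithmetic instructions per phase, 2 phases.
texture, arithmetic, total = phase_totals(6, 8, 2)
print(texture, arithmetic, total)  # 12 16 28
```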


">= 512" for Instruction slots:

    • The specification guarantees at least 512 instruction slots.
    • The GPU can have more, but not less.
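
In practice this means a shader that must run everywhere should be planned against 512 slots, and against a larger number only when the actual device limit is known. A minimal sketch in Python, assuming the device limit has already been queried by some other means; `reported_slots`, `effective_slot_limit` and `fits` are hypothetical names used only here.

```python
GUARANTEED_PS_3_0_SLOTS = 512  # minimum promised by the specification


def effective_slot_limit(reported_slots=None):
    """Return the slot limit to plan against.

    reported_slots is a hypothetical value obtained from the device;
    None means it is unknown, so only the guaranteed minimum is safe.
    """
    if reported_slots is None:
        return GUARANTEED_PS_3_0_SLOTS
    if reported_slots < GUARANTEED_PS_3_0_SLOTS:
        raise ValueError("a ps_3_0 device cannot have fewer than 512 slots")
    return reported_slots


def fits(shader_slots, reported_slots=None):
    """True if a shader needing shader_slots slots fits the limit."""
    return shader_slots <= effective_slot_limit(reported_slots)


print(fits(512))        # True
print(fits(600))        # False
print(fits(600, 1024))  # True
```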


  • Dependent texture limit - Determines how deeply texture reads can be nested. A dependent read means that the coordinates used to sample one texture (B) are calculated from a color previously sampled from another texture (A).
  • Texture instruction limit - The maximum number of texture sampling instructions that can be placed in a single shader.
  • Arithmetic instruction limit - The maximum number of math (arithmetic) instructions that can be placed in the shader code.
  • Position register - Access to the vPos register (or equivalent), which contains the (X,Y) screen coordinates of the pixel currently being rendered.
  • Instruction slots - The maximum number of assembler instructions that a shader program stored in memory can contain.
  • Executed instructions - The maximum number of instructions the card can execute for one pixel at run time. With loops and branches this can be higher than the number of instruction slots.
  • Texture indirections - A parameter closely related to the Dependent texture limit. It specifies the number of "levels" of dependency. If the texture coordinates depend on the result of another texturing operation, this is indirection.
  • Interpolated registers - The number of registers transferring data from the Vertex Shader to the Pixel Shader (e.g. colors, UV coordinates, normal vectors) that are interpolated (averaged) for each pixel of the triangle.
  • Instruction predication - A technique that allows for conditional execution without the use of expensive branching. The card executes both branches of code (true and false), but stores the result of only the correct one, based on a logical condition.
  • Index input registers - The ability to address input registers by index (for example with a loop counter) instead of by a fixed number, necessary for more complex loops and arrays.
  • Temp registers - The number of temporary registers (local variables) available to the shader to store the results of intermediate calculations.
  • Constant registers - The number of registers for constant values sent from the application (CPU) to the graphics card (e.g. matrices, light positions, material colors).
  • Arbitrary swizzling - The ability to freely reorder or replicate vector components (e.g. using the red channel as alpha: .r to .a) in any instruction.
  • Gradient instructions - Availability of partial derivative instructions (e.g., ddx, ddy) that allow you to determine how quickly a value changes between adjacent pixels.
  • Loop count register - A special register (often aL) that handles counters for loops.
  • Face register - vFace register, which tells the shader whether the front or back side of a triangle is being rendered (useful for two-sided material rendering).
  • Dynamic flow control - Support for true conditional statements (if, else) and loops (for, while) where control depends on data computed in real time, not just on constants known at compile time.
  • Bitwise Operators - Support for bitwise logical operations (AND, OR, XOR, bit shifts <<, >>).
  • Native Integers - Support for integer types (int). Without it, every value is a floating-point number (float), which makes exact integer arithmetic difficult.
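
Three of the terms above (dependent texture reads, instruction predication and arbitrary swizzling) can be imitated on the CPU to show what they mean. This is only an illustrative Python simulation under simplifying assumptions, not shader code: the tiny 2x2 textures and the helper names `sample`, `dependent_read`, `predicated_select` and `swizzle` are invented for the example.

```python
# Two tiny 2x2 "textures"; every texel is an (r, g, b, a) tuple in 0..1.
TEXTURE_A = [
    [(1.0, 1.0, 0.0, 1.0), (0.0, 0.0, 0.0, 1.0)],
    [(0.0, 1.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0)],
]
TEXTURE_B = [
    [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)],
    [(0.0, 0.0, 1.0, 1.0), (1.0, 1.0, 0.0, 0.5)],
]


def sample(texture, u, v):
    """Nearest-texel lookup with u and v clamped to the texture."""
    height, width = len(texture), len(texture[0])
    x = max(0, min(int(u * width), width - 1))
    y = max(0, min(int(v * height), height - 1))
    return texture[y][x]


def dependent_read(u, v):
    """Sample A, then use its red/green channels as coordinates into B."""
    coords = sample(TEXTURE_A, u, v)               # first read
    return sample(TEXTURE_B, coords[0], coords[1])  # read that depends on it


def predicated_select(condition, value_if_true, value_if_false):
    """Both values are already computed; the condition only picks one."""
    return value_if_true if condition else value_if_false


def swizzle(color, pattern):
    """Reorder or replicate channels, e.g. 'abgr' or 'rrrr'."""
    index = {"r": 0, "g": 1, "b": 2, "a": 3}
    return tuple(color[index[channel]] for channel in pattern)


print(dependent_read(0.0, 0.0))                  # (1.0, 1.0, 0.0, 0.5)
print(predicated_select(False, "then", "else"))  # else
print(swizzle((0.2, 0.4, 0.6, 0.8), "abgr"))     # (0.8, 0.6, 0.4, 0.2)
```

In `dependent_read` the second lookup cannot start before the first has finished, which is the dependency that the Dependent texture limit and Texture indirections columns restrict.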
