We always left them enabled, which turned off HiZ in some cases.
This should improve performance with Hyper-Z.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This should be as fast as no HTILE for stencil. I think we can still get full
performance with depth-only rendering even if stencil is present in the buffer
but not used, but I'm not 100% sure. This may be revisited when HiS and fast
stencil clear are implemented.
This fixes a hang in Brutal Legend.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This is a golden setting on RV740, but because of a hw bug it is recommended
to set it on all R7xx chipsets.
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
It's almost the same.
This enables tiling for HTILE. It also enables Hyper-Z for other texture
targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT).
2D array depth textures are tested by Unigine Sanctuary and my new piglit
test.
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
This fixes rendering to a non-zero layer/face/slice with HTILE.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685
v2: added the assertion
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This should make a machine which is running piglit more responsive at times.
For example, streaming-texture-leak can easily eat 600 MB because of how fast
it creates new textures.
We were only using it to get at its type, which we already know because
it's a builtin variable.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were only using it to get at its type, which we already know because
it's a builtin variable.
v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations.
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The replicated data clear shader needs to be SIMD16, or else the GPU
will hang. So, compile it even if INTEL_DEBUG=no16 is set.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The current method of generating distribution tarballs involves manually
invoking make + target name in the appropriate places. This temporary
solution is used until we get 'make dist' working.
Currently it does not work: for the target (which is also a filename) to be
available in the final Makefile, we need to add a PHONY target and use the
correct target name.
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
v3: Since the fs backend can emit saturate as a separate instruction, there is
no need to detect min/max instructions and rewrite the instruction tree
accordingly. On the other hand, we don't need to emit a separate saturated
mov either when the expression generating src can do saturate directly.
v4: Add can_do_saturate() check before enabling saturate modifier (Ken)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
When the sel condition is bounded between 0 and 1.0. This allows code such as:
mov.sat a b
sel.ge dst a 0.25F
To be propagated as:
sel.ge.sat dst b 0.25F
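The reason the bound matters can be sketched in scalar C. This is only an
illustration, not the backend code: sel_ge() and saturate() are hypothetical
helpers that model sel.ge as "pick the larger operand" and .sat as a clamp to
[0, 1]. Under that model both forms give the same result as long as the
constant (0.25 here) lies in [0, 1].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the .sat modifier: clamp to [0, 1]. */
static float saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Hypothetical scalar model of sel.ge: keep a if a >= b, else b. */
static float sel_ge(float a, float b)
{
   return a >= b ? a : b;
}

/* mov.sat a b ; sel.ge dst a 0.25F */
static float before_propagation(float b)
{
   float a = saturate(b);
   return sel_ge(a, 0.25f);
}

/* sel.ge.sat dst b 0.25F */
static float after_propagation(float b)
{
   return saturate(sel_ge(b, 0.25f));
}

/* Returns 1 if both forms agree on a handful of sample inputs. */
static int forms_agree(void)
{
   const float samples[] = { -1.0f, 0.0f, 0.1f, 0.25f, 0.5f, 1.0f, 2.0f };
   for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
      if (before_propagation(samples[i]) != after_propagation(samples[i]))
         return 0;
   }
   return 1;
}

/* Convenience self-check a caller can run from main(). */
static void run_checks(void)
{
   assert(forms_agree());
}
```

Only selection and clamping are involved, so the comparison is exact and no
floating-point tolerance is needed.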
v3: - Syntax clarifications in inst->saturate assignment
- Remove extra parenthesis when assigning src_reg value
from copy_entry (Matt Turner)
v4: - Take channels into consideration when propagating saturated instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
When the sel condition is bounded between 0 and 1.0. This allows code such as:
mov.sat a b
sel.ge dst a 0.25F
To be propagated as:
sel.ge.sat dst b 0.25F
v3: Syntax clarifications in inst->saturate assignment (Matt Turner)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
v2: - Output max(saturate(x),b) instead of saturate(max(x,b))
- Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is > 0.0 and
inner constant is 1.0.
- Fix comments to show that the optimization is a commutative operation
(Matt Turner)
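A minimal scalar sketch of why max(saturate(x),b) is a valid replacement,
assuming the pattern being rewritten is min(max(x, b), 1.0) or its commuted
form max(min(x, 1.0), b) with a constant 0 < b <= 1. The helper names are
hypothetical and this is not the optimization pass itself.

```c
#include <assert.h>
#include <stddef.h>

static float min_f(float a, float b) { return a < b ? a : b; }
static float max_f(float a, float b) { return a > b ? a : b; }
static float saturate(float x) { return min_f(max_f(x, 0.0f), 1.0f); }

/* The two operand orders of the assumed original pattern. */
static float inner_max(float x, float b) { return min_f(max_f(x, b), 1.0f); }
static float inner_min(float x, float b) { return max_f(min_f(x, 1.0f), b); }

/* The rewritten expression. */
static float rewritten(float x, float b) { return max_f(saturate(x), b); }

/* Returns 1 if the rewrite matches both original forms for this b. */
static int rewrite_matches(float b)
{
   const float xs[] = { -2.0f, -0.5f, 0.0f, 0.1f, 0.25f,
                        0.5f, 0.75f, 1.0f, 1.5f, 3.0f };
   for (size_t i = 0; i < sizeof(xs) / sizeof(xs[0]); i++) {
      float r = rewritten(xs[i], b);
      if (r != inner_max(xs[i], b) || r != inner_min(xs[i], b))
         return 0;
   }
   return 1;
}

/* Convenience self-check a caller can run from main(). */
static void run_checks(void)
{
   assert(rewrite_matches(0.25f));
   assert(rewrite_matches(1.0f));
}
```

With b outside the assumed range (for example 1.5) the forms no longer agree,
which is why the constants have to be checked before rewriting.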
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) (suggested by Ilia Mirkin)
- Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is zero and
inner constant is < 1
- Fix comments to reflect we are doing a commutative operation (Matt Turner)
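The mirrored case can be checked the same way. This sketch assumes the
original pattern is max(min(x, b), 0.0) or min(max(x, 0.0), b) with a constant
0 <= b < 1; the lower bound on b is an assumption of the sketch, and the
helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static float min_f(float a, float b) { return a < b ? a : b; }
static float max_f(float a, float b) { return a > b ? a : b; }
static float saturate(float x) { return min_f(max_f(x, 0.0f), 1.0f); }

/* The two operand orders of the assumed original pattern. */
static float outer_max(float x, float b) { return max_f(min_f(x, b), 0.0f); }
static float outer_min(float x, float b) { return min_f(max_f(x, 0.0f), b); }

/* The rewritten expression. */
static float rewritten(float x, float b) { return min_f(saturate(x), b); }

/* Returns 1 if the rewrite matches both original forms for this b. */
static int rewrite_matches(float b)
{
   const float xs[] = { -2.0f, -0.5f, 0.0f, 0.1f, 0.25f,
                        0.5f, 0.75f, 1.0f, 1.5f, 3.0f };
   for (size_t i = 0; i < sizeof(xs) / sizeof(xs[0]); i++) {
      float r = rewritten(xs[i], b);
      if (r != outer_max(xs[i], b) || r != outer_min(xs[i], b))
         return 0;
   }
   return 1;
}

/* Convenience self-check a caller can run from main(). */
static void run_checks(void)
{
   assert(rewrite_matches(0.75f));
   assert(rewrite_matches(0.0f));
}
```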
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
v2: - Check that the base type is float (Ian Romanick)
v3: - Make sure comments reflect that we are doing a commutative operation
- Add missing condition where the inner constant is 1.0 and outer constant is 0.0
- Make indexing of operands easier to read (Matt Turner)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Now that we have the ir_unop_saturate implemented as a single
instruction, generate the correct simplified expression.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Suggested by Matt. This patch combines and moves back the code-generation
functions from generate_vec4_instruction() into generate_code(). Makes
generate_code() a bit larger, but helps us to count loops in a
straightforward manner.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
According to the cited documentation section (but in the newer docs),
x_scaledown is the same for 2x and 4x MSAA.
+47 piglits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83081
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.3" <mesa-stable@lists.freedesktop.org>
brw_meta_fast_clear.c:211:17: warning: 'x_scaledown' may be used
uninitialized in this function [-Wmaybe-uninitialized]
unsigned int x_scaledown, y_scaledown;
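For illustration only, this kind of warning typically shows up when a
variable is assigned in a switch that has no default branch. The sketch below
is a hypothetical helper, not the code in brw_meta_fast_clear.c, and the
factors are placeholders rather than hardware values. Giving the switch a
default branch that never falls through leaves no path on which the variable
is read without being written.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper; the numeric factors are placeholders. */
static unsigned pick_x_scaledown(unsigned num_samples)
{
   unsigned x_scaledown;

   switch (num_samples) {
   case 2:
   case 4:
      x_scaledown = 8;   /* placeholder: 2x and 4x share one value */
      break;
   case 8:
      x_scaledown = 2;   /* placeholder */
      break;
   default:
      /* Without this branch the compiler can see a path where
       * x_scaledown is returned without ever being assigned. */
      assert(!"unsupported sample count");
      abort();
   }

   return x_scaledown;
}
```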
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>