mirror of https://gitlab.freedesktop.org/mesa/mesa.git
synced 2026-04-19 08:00:36 +02:00
i965: Update comments about Z16 being slow.
We've learned a few things since we originally disabled Z16; this attempts to summarize the issue. I am no expert on this subject, though, so the comment may not be totally accurate.

I did some benchmarking on GM45 and Ironlake, and discovered that for GLBenchmark 2.7 EgyptHD, using Z16 was 3% slower on GM45 (n=15), and 4.5% slower on Ironlake (n=95). So, we can drop the "on Ivybridge" aspect of the comment - it's always slower.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com>
parent 313104e8d5
commit be000b4d19
1 changed file with 10 additions and 7 deletions
@@ -620,13 +620,16 @@ brw_init_surface_formats(struct brw_context *brw)
    ctx->TextureFormatSupported[MESA_FORMAT_Z_FLOAT32] = true;
    ctx->TextureFormatSupported[MESA_FORMAT_Z32_FLOAT_S8X24_UINT] = true;
 
-   /* It appears that Z16 is slower than Z24 (on Intel Ivybridge and newer
-    * hardware at least), so there's no real reason to prefer it unless you're
-    * under memory (not memory bandwidth) pressure.  Our speculation is that
-    * this is due to either increased fragment shader execution from
-    * GL_LEQUAL/GL_EQUAL depth tests at the reduced precision, or due to
-    * increased depth stalls from a cacheline-based heuristic for detecting
-    * depth stalls.
+   /* Benchmarking shows that Z16 is slower than Z24, so there's no reason to
+    * use it unless you're under memory (not memory bandwidth) pressure.
+    *
+    * Apparently, the GPU's depth scoreboarding works on a 32-bit granularity,
+    * which corresponds to one pixel in the depth buffer for Z24 or Z32 formats.
+    * However, it corresponds to two pixels with Z16, which means both need to
+    * hit the early depth case in order for it to happen.
+    *
+    * Other speculation is that we may be hitting increased fragment shader
+    * execution from GL_LEQUAL/GL_EQUAL depth tests at reduced precision.
     *
     * However, desktop GL 3.0+ require that you get exactly 16 bits when
     * asking for DEPTH_COMPONENT16, so we have to respect that.
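The trade-off described in the new comment can be illustrated with a little arithmetic. The sketch below is not part of the commit: the helper names are hypothetical, the 32-bit scoreboard granularity is taken from the comment itself (which hedges it with "apparently"), and it assumes Z24 is stored in a 32-bit word per pixel, as the comment's "one pixel ... for Z24 or Z32" implies. Buffer sizes ignore tiling and alignment padding.

```c
#include <assert.h>

/* Granularity the commit comment attributes to the GPU's depth
 * scoreboard.  This value comes from the comment, not from hardware
 * documentation, so treat it as an assumption. */
#define SCOREBOARD_BITS 32u

/* Hypothetical helper: number of depth pixels that share a single
 * scoreboard entry, given the storage size of one depth pixel. */
static unsigned
pixels_per_scoreboard_entry(unsigned storage_bits_per_pixel)
{
   assert(storage_bits_per_pixel > 0 &&
          storage_bits_per_pixel <= SCOREBOARD_BITS);
   return SCOREBOARD_BITS / storage_bits_per_pixel;
}

/* Hypothetical helper: bytes needed for a tightly packed depth buffer.
 * Real buffers are tiled and padded, so actual sizes will be larger. */
static unsigned long
depth_buffer_bytes(unsigned width, unsigned height,
                   unsigned storage_bits_per_pixel)
{
   return (unsigned long)width * height * (storage_bits_per_pixel / 8);
}
```

Under these assumptions, `pixels_per_scoreboard_entry(16)` is 2 while `pixels_per_scoreboard_entry(32)` is 1, which is the pairing effect the comment blames for lost early-depth opportunities. Likewise, `depth_buffer_bytes(1920, 1080, 16)` is half of `depth_buffer_bytes(1920, 1080, 32)`, which is the memory saving that remains the only reason to pick Z16.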