These include functions for adding and removing various bits of IR and
helpers for iterating over all the sources and destinations of an
instruction. This is similar to ir.cpp.
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
whitespace and automake fixes
This includes all the instructions, ifs, loops, functions, etc. This is
similar to the information in ir.h.
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
Include ralloc and hash_table from the util directory
whitespace fixes
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: glenn.kennard <glenn.kennard@gmail.com>
It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z. The result was
that you'd get big chunks of your rendering missing.
This reverts commit 0543630d0b.
It caused flickering artifacts in Steam games such as Team Fortress 2 or
Left 4 Dead 2.
We could probably only enable this optimization by also making sure the
shader code only uses either SI_PARAM_LINEAR_CENTROID or
SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader
variant.
Sorry I didn't remember this when reviewing the reverted change.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
You would not believe the mess GCC 4.8.3 generated for the old
switch-statement.
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:
32-bit: Difference at 95.0% confidence -0.37374% +/- 0.184057% (n=40)
64-bit: Difference at 95.0% confidence 0.966722% +/- 0.338442% (n=40)
The regression on 32-bit is odd. Callgrind says the caller,
_mesa_is_valid_prim_mode, is faster. Before, it says 2,293,760
cycles, and after, it says 917,504.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Multithread:
32-bit: Difference at 95.0% confidence 0.416027% +/- 0.163529% (n=40)
64-bit: Difference at 95.0% confidence 0.494771% +/- 0.259985% (n=40)
Gl32Batch7 had no difference proven at 95.0% confidence (n=120) on
32-bit or 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The previous check was insufficient (as it did not take 'indices' into
consideration), and DX10 hardware does not need this check anyway.
Since index_bytes is no longer used, remove it.
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:
32-bit: Difference at 95.0% confidence 1.66929% +/- 0.230107% (n=40)
64-bit: Difference at 95.0% confidence -1.40848% +/- 0.288038% (n=40)
The regression on 64-bit is odd. Callgrind says the caller,
validate_DrawElements_common, is faster. Before, it says 10,321,920
cycles, and after, it says 8,945,664.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This doesn't affect performance, but it feels more correct.
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:
32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: No difference proven at 95.0% confidence (n=120)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Instead of having an extra pointer indirection in one of the hottest
loops in the driver.
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:
32-bit: Difference at 95.0% confidence 1.98515% +/- 0.20814% (n=40)
64-bit: Difference at 95.0% confidence 1.5163% +/- 0.811016% (n=60)
v2 (Ken): Cut size of array from 64 to 57 to save memory.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
With the switch-statement, GCC 4.8.3 produces a small pile of code with
a branch.
00000000 <brw_get_index_type>:
000000: 8b 54 24 04 mov 0x4(%esp),%edx
000004: b8 01 00 00 00 mov $0x1,%eax
000009: 81 fa 03 14 00 00 cmp $0x1403,%edx
00000f: 74 0d je 00001e <brw_get_index_type+0x1e>
000011: 31 c0 xor %eax,%eax
000013: 81 fa 05 14 00 00 cmp $0x1405,%edx
000019: 0f 94 c0 sete %al
00001c: 01 c0 add %eax,%eax
00001e: c3 ret
However, this could be two instructions.
00000000 <brw_get_index_type>:
000000: 2d 01 14 00 00 sub $0x1401,%eax
000005: d1 e8 shr %eax
000007: 90 nop
000008: 90 nop
000009: 90 nop
00000a: 90 nop
00000b: c3 ret
The function was also moved to the header so that it could be inlined at
the two call sites. Without this, 32-bit also needs to pull the
parameter from the stack. This means there is a push, a call, a move,
and a ret added to a two-instruction function. The above code shows the
function with __attribute__((regparm(1))), but even this adds several
extra instructions. There is also an extra instruction on 64-bit to
move the parameter to %eax for the subtract.
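
As a rough illustration of the two forms discussed above, here is a minimal
sketch. It assumes the function returns 0, 1, or 2, as the disassembly
suggests. The names index_type_switch and index_type_arith are hypothetical
and are not the driver's actual code. The trick relies on the three valid GL
enum values being spaced two apart starting at 0x1401.

```c
#include <assert.h>

/* GL enum values, repeated here so the sketch is self-contained. */
#define GL_UNSIGNED_BYTE  0x1401
#define GL_UNSIGNED_SHORT 0x1403
#define GL_UNSIGNED_INT   0x1405

/* Hypothetical switch form: compares against each enum and branches. */
static inline unsigned
index_type_switch(unsigned type)
{
   switch (type) {
   case GL_UNSIGNED_SHORT: return 1;
   case GL_UNSIGNED_INT:   return 2;
   default:                return 0;
   }
}

/* Hypothetical arithmetic form: one subtract and one shift.
 * 0x1401 -> 0, 0x1403 -> 1, 0x1405 -> 2.
 */
static inline unsigned
index_type_arith(unsigned type)
{
   /* Only the three index types are valid inputs for this mapping. */
   assert(type == GL_UNSIGNED_BYTE ||
          type == GL_UNSIGNED_SHORT ||
          type == GL_UNSIGNED_INT);
   return (type - GL_UNSIGNED_BYTE) >> 1;
}
```

Note that the two forms only agree for the three valid index types. The
arithmetic form gives up the switch's "anything else maps to 0" behavior,
which is why the sketch asserts on its input.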
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:
32-bit: Difference at 95.0% confidence 0.818589% +/- 0.234661% (n=40)
64-bit: Difference at 95.0% confidence 0.54554% +/- 0.354092% (n=40)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
...so that it can be inlined in the two places that call it.
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:
32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: Difference at 95.0% confidence 1.24042% +/- 0.382277% (n=40)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
We were happily printing "Native code for unnamed vertex shader" and
"VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output,
as well as the KHR_debug output used by shader-db.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
A lot of messages hardcoded the string "FS", which is confusing on
Broadwell, where we use this code for VS support as well.
shader-db particularly got confused, as it reported two "FS SIMD8"
shaders, and no vertex shaders at all. Craziness ensued.
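
One way to avoid a hardcoded stage string is to derive the label from the
stage being compiled. The following is only a sketch of that idea. The enum,
stage_abbrev, and format_program_label are hypothetical names made up for
illustration, not the driver's real interfaces.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stage enum; the real driver has its own stage type. */
enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT };

/* Map a stage to the short tag used in debug messages. */
static const char *
stage_abbrev(enum shader_stage stage)
{
   switch (stage) {
   case STAGE_VERTEX:   return "VS";
   case STAGE_GEOMETRY: return "GS";
   case STAGE_FRAGMENT: return "FS";
   }
   return "??";
}

/* Build a label such as "VS SIMD8" instead of always printing "FS".
 * Returns the snprintf() result (length the full label needs).
 */
static int
format_program_label(char *buf, size_t size,
                     enum shader_stage stage, unsigned simd_width)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "%s SIMD%u", stage_abbrev(stage), simd_width);
}
```

With a helper along these lines, a tool that parses the debug output can
tell a SIMD8 vertex shader apart from a SIMD8 fragment shader.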
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Only GNU indent is supported when indenting the autogenerated
format_pack.c and format_unpack.c files. Some non-GNU indent
implementations (Mac OS X and FreeBSD) add extra whitespace that breaks
the build of those files.
Fall back to 'cat' if a non-GNU indent is found.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=88335
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The 8888 suggests 8-bit components which is not correct, so
replace that with the actual size of the components in each
format.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>