Now that all drivers are using brw_cs_get_dispatch_info() we can
remove one function (which is now unused) and reduce the scope of the
other.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504>
And since right_mask is already provided as part of dispatch_info,
just use that instead of storing it.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504>
We have this small calculations repeated in each Intel driver, so move
them to a single place to be reused. Also includes "right_mask" since
is always used in the same context and depends on the dispatch info
values.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504>
There's a recently discovered HW bug affecting hardware at least as far
back as Skylake where, if the LOD is out-of-bounds for any SIMD lane,
then garbage may be returned in all SIMD lanes. The easy solution is to
set lower_txs_lod so that we always have a constant LOD of 0 which we
know a priori is always in-bounds. Fortunately, not many shaders
actually use textureSize() with LOD.
Shader-db results on Ice Lake:
total instructions in shared programs: 19948537 -> 19948564 (<.01%)
instructions in affected programs: 3859 -> 3886 (0.70%)
helped: 0
HURT: 7
One of the shaders is in Civilization: Beyond Earth, and the rest are
all in Civilization VI.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10538>
The source_depth_to_render_target flag can get set on old gen4-5 HW in a
few cases which are independent of the app writing gl_FragDepth. It
should be safe to just use fetch_payload_reg in that case instead of
depending in interpolation setup. This fixes a bug with certain very
simple shaders where we might end up not including the depth when we
should have.
While we're here, rework the logic around setting src_depth and add a
comment so it's more clear what's going on.
Fixes: 6d4070f3dd "intel/compiler: add support for fragment coordinate..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10596>
There were two bugs which crep in here as part of 64551610d1:
forgetting that exec sizes in HW are in log2 space and having the
exec_size condition for the subtype backwards.
Fixes: 64551610d1 "intel/compiler: rework message descriptors..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10588>
Available on Gen11+.
v2: Order shading rate in correct order (Samuel)
v3: Move CPS_STATE emission to genX_state.c
v4: Don't override various output structures (Jason)
v5: Rebase on top master (Lionel)
v6: Fix invalid VkPhysicalDeviceFragmentShadingRatePropertiesKHR
(min|max)FragmentShadingRateAttachmentTexelSize values (Ken)
Drop #endif comment
v7: Limit extension to Gfx11+ (Lionel)
Support conservative raster (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
Those helpers exist primarily to sort out some of the weirdness around
Gen4-6 dataport access. On Gen5 and earlier, everything was called
"dataport" and, instead of the SFID we have today there was a "target
cache" parameter in the descriptor. There are also some bits that moved
around on various gens depending on read vs. write. Starting with Gen6,
most things which target one of the data cache SFIDs should use
brw_dp_desc() instead.
v2: Drop backward comment (Ken)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
It's a Gen6 XFB thing. It's never used for anything else so there's no
point in having a target cache switch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
v2: Drop new internal opcodes (Jason)
Simplify code (Jason)
v3: Add Z computation for coarse pixels
v4: Document things a little
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
v2: Use the new inst->ex_desc field (Jason)
v3: Drop CPS LoD compensation from sampler messages (Lionel)
v4: Drop useless uses_rate_shading (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
Render target message descriptors are slightly different from the
dataport ones. In particular the msg_type field is on bits 14:17 for
RT while bits 14:18 for DP.
v2: Drop unused send_commit_msg field in brw_fb_write_desc() (Ken)
v3: Rebase on top renaming (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
The prototype uses a pointer and the actual function definition had an
array. For some reason, GCC never complained about this until GCC 11.
This fixes a compile warning when building with GCC 11.
Fixes: 09ced65420 "intel/isl: Add format conversion code"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10537>
With the missing else, this prints the compacted hex followed by hex
for an uncompacted version of the compacted instruction. It also
doesn't print hex for instructions that are not compacted.
Fixes: bc4a127d6e ("intel/disasm: Label support in shader disassembly for UIP/JIP")
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10535>
This makes the vertex order of TRISTRIP and TRISTRIP_ADJ primitves
consistent between XFB output and GS input. Technically, the Vulkan
spec allows us to XFB out in whatever order we want but being consistent
with GS inputs is probably nicer to apps.
Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10460>
Normally, we never see NULL in a source. However, starting with
eab1c55590, we can with a SHADER_OPCODE_SEND if it only has the first
payload. We were inserting barriers which adds unnecessary scheduling
dependencies and takes a lot of compile time because inserting a single
barrier is an O(n) operation.
All the extra O(n) can have a surprisingly large effect. This cuts the
runtime of dEQP-VK.binding_model.buffer_device_address.set3.depth3.
basessbo.convertcheckuv2.store.single.std140.frag by a factor of 20x for
a debug build.
Shader-db results on ICL:
total instructions in shared programs: 19918983 -> 19921610 (0.01%)
instructions in affected programs: 884074 -> 886701 (0.30%)
helped: 1688
HURT: 817
helped stats (abs) min: 1 max: 163 x̄: 4.23 x̃: 1
helped stats (rel) min: 0.02% max: 12.50% x̄: 1.08% x̃: 0.61%
HURT stats (abs) min: 1 max: 2674 x̄: 11.95 x̃: 2
HURT stats (rel) min: 0.11% max: 70.22% x̄: 1.71% x̃: 1.03%
95% mean confidence interval for instructions value: -1.97 4.06
95% mean confidence interval for instructions %-change: -0.28% -0.06%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 976503324 -> 975884809 (-0.06%)
cycles in affected programs: 82581703 -> 81963188 (-0.75%)
helped: 4144
HURT: 5010
helped stats (abs) min: 1 max: 79294 x̄: 311.31 x̃: 8
helped stats (rel) min: <.01% max: 53.69% x̄: 2.00% x̃: 0.51%
HURT stats (abs) min: 1 max: 92266 x̄: 134.04 x̃: 8
HURT stats (rel) min: <.01% max: 218.09% x̄: 3.25% x̃: 0.53%
95% mean confidence interval for cycles value: -119.85 -15.29
95% mean confidence interval for cycles %-change: 0.68% 1.07%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).
total spills in shared programs: 10659 -> 12014 (12.71%)
spills in affected programs: 441 -> 1796 (307.26%)
helped: 7
HURT: 12
total fills in shared programs: 11551 -> 14429 (24.92%)
fills in affected programs: 993 -> 3871 (289.83%)
helped: 8
HURT: 11
total sends in shared programs: 1025832 -> 1025353 (-0.05%)
sends in affected programs: 2241 -> 1762 (-21.37%)
helped: 105
HURT: 1
helped stats (abs) min: 1 max: 87 x̄: 4.57 x̃: 2
helped stats (rel) min: 5.56% max: 54.72% x̄: 11.37% x̃: 10.00%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for sends value: -7.39 -1.65
95% mean confidence interval for sends %-change: -12.95% -7.70%
Sends are helped.
LOST: 93
GAINED: 109
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4648
Fixes: eab1c55590 "intel/fs: Support SENDS in SHADER_OPCODE_SEND"
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10412>
Namely we want to be able to emit the following dynamically :
* On Gfx 7/7.5 : 3DSTATE_VM, 3DSTATE_BLEND_STATE_POINTERS
* On Gfx 8+ : 3DSTATE_VM, 3DSTATE_BLEND_STATE_POINTERS,
3DSTATE_PS_BLEND
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10206>
Following 505d176a8e ("anv: disable baked in pipeline bits from
dynamic emission path") dynamic bits of extensions that are not
enabled should not be there.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10206>
It's impossible to see the names of instructions if the terminal's
color scheme uses black as foreground. Just set it to white - it
will look good on any color scheme.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10341>
This changes behavior. Now Baytrail will be decoded with the
gen7.xml instead of the gen75.xml. Haswell is the only graphics
hardware generation 75 and Baytrail is closer to Ivybridge
in most ways. (Kenneth Graunke)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>