Minor adjustments to formatting of the copyright line, but keep
dates and holders. "Authors" entries that could be
obtained via Git logs were also removed.
The license in brw_disasm.c and elk_disasm.c don't match directly
any SPDX pattern I could find, so kept as is.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39503>
The previous Gfx12+ implementation using bit masking is failing for FP8
types, so replacing with explicit lookup tables.
For float types, the encoding now aligns with brw_data_type_float, ensuring
correct behavior for DPAS and other 3-source instructions.
Fixes: d1d4e3d530 ("brw: Add EU assembler support for float8")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39448>
Code kept track of blocks both in a linked list and
in an array. Change the client code of the list to
just use the array so we just maintain one.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>
The code currently don't remove blocks, when a block is about to become
empty, the code will replace the last instruction with a NOP.
If we want to have actual block removals again, there are other
strategies than removing them as we iterate (e.g. allow empty blocks
and then collect them in a pass or right after iteration).
So remove those macros.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39246>
Change the few other cases to an inline function that
does the same job. This macro will change in ways that
are not compatible with the non-assembler usages.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39363>
This optimization doesn't work when the ray query index isn't uniform across
the subgroup, which is something the spec allows. While there are some smart
ways to fix this and still avoid unnecessary spilling, its not worth investing
the time until we find a realtime raytracing workload that actually needs to
use multiple live ray queries for something.
Fixes: 1f1de7eb ("anv,brw: Allow multiple ray queries without spilling to a shadow stack")
Acked-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39445>
Disable RHWO by default for singlesample draws and for MSAA
draws if a drirc key is set (avoid perf hit if not needed).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>
This commit change the BVH layout a little so that we can load the BVH
offset as constant rather than reading from memory.
We have to force the instance leaves pointer at the end which gets used
in copy.comp shader.
Totals:
Instrs: 54798 -> 54728 (-0.13%)
Send messages: 3854 -> 3847 (-0.18%)
Cycle count: 1915106 -> 1913954 (-0.06%); split: -0.07%, +0.01%
Non SSA regs after NIR: 18594 -> 18575 (-0.10%)
Totals from 7 (7.37% of 95) affected shaders:
Instrs: 5532 -> 5462 (-1.27%)
Send messages: 367 -> 360 (-1.91%)
Cycle count: 132592 -> 131440 (-0.87%); split: -1.01%, +0.14%
Non SSA regs after NIR: 1989 -> 1970 (-0.96%)
PERCENTAGE DELTAS Shaders Instrs Send messages Cycle count Non SSA regs after NIR
q2rtx-rt-pipeline 95 -0.13% -0.18% -0.06% -0.10%
--------------------------------------------------------------------------------------
All affected 7 -1.27% -1.91% -0.87% -0.96%
--------------------------------------------------------------------------------------
Total 95 -0.13% -0.18% -0.06% -0.10%
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39106>
Copying the last byte was pointless, since the next line overwrites it,
and resulted in a compiler warning:
../src/intel/common/intel_measure.c: In function 'intel_measure_init':
../src/intel/common/intel_measure.c:68:7: warning: 'strncpy' specified bound 1024 equals destination size [-Wstringop-truncation]
68 | strncpy(env_copy, env, 1024);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
This allows dropping -Wno-error=stringop-truncation from the
debian-x86_64-asan & debian-{arm64,x86_64}-ubsan CI jobs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39429>
In software scoreboard (Gfx12+) use information from previous
instructions to trim out-of-order dependencies. For example, in
send g1, g2 ($1)
mov g3, g1 ($1.dst) // Depends on g1 (destination of $1)
mov g4, g2 ($1.src) // Depends on g2 (source of $1)
mov g5, g1 ($1.dst) // Depends on g1 (destination of $1)
only the first `mov` needs to be annotated, because the execution will
stall until that dependency is fulfilled, which in this case means the
`send` is done and `g1` was already written.
Note that while `$x.dst` implies `$x.src`, the reverse is not true, so
if the first `mov` did not exist, both second and third `mov` in the
example would have to keep their annotations.
This patch add resolution of implicit out-of-order dependencies that are
visible inside a block.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>
There's agreement now these are helpful and widely supported. We can
always fallback to a custom vector class later if necessary.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3526>
So that we can put the coarse_pixel_dispatch value available to NIR
lowering.
LNL internal fossildb changes:
Totals from 40 (0.01% of 490838) affected shaders:
Instrs: 33321 -> 33311 (-0.03%); split: -0.04%, +0.01%
Cycle count: 780136 -> 779936 (-0.03%); split: -0.03%, +0.00%
Max live registers: 5292 -> 5298 (+0.11%)
Non SSA regs after NIR: 26638 -> 26464 (-0.65%)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38996>
Each node has their own opacity bits, so we don't need to track these
opacity flags at header level.
This commit also fixes the instance flag. Instance flag is 8bit wide,
but we were always using 4 lower bits.
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39053>
Setting this bit always might hurt performance. It might forces
traversal to treat all leafs always valid.
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39053>
Makes a bunch of copy propagation and other passes work much better.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39382>
Gfx 12.5 struct has only one major difference with gfx9, that is OaCntr lenght,
while on gfx 9 it is 36 uint64_t long on gfx 12.5 it is 38 uint64_t long.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
We are missing handling for gfx12.5 so to add it we will need a switch case over
verx.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
Looking at the reference code, there is no new struct for Xe3 so it should
use the same struct as Xe2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
With no more users of intel_perf_load_configuration() it can be
removed with other i915 functions around it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
We have no usage of the information returned by
intel_perf_load_configuration(). It is only used to add a copy of the
configuration so we have the metric id but we could instead get the
metric id from sysfs, that is added by mdapi.
Xe KMD don't have a uAPI to query the metrics configuration, so
using sysfs also fixes the integration of mdapi with Xe KMD.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
There is no usage for register_config outside of
anv_AcquirePerformanceConfigurationINTEL(), so we don't need to store
it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Lukasz Stalmirski <lukasz.stalmirski@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32842>
Not only is it questionable for code quality to not call nir_opt_algebraic_late
after nir_opt_algebraic, it also breaks correctness for late lowerings.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180>
We can have only one of those calls before the 'if GFX_VERx10 >= 125' block.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39362>
Fixes:
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.whole_copy_before_resolving_transfer.2_bit
Otherwise we attempt to use blorp and hit various asserts later in:
- blorp_copy_supports_blitter
- blorp_xy_block_copy_blt
Fixes: 61287b00f3 ("anv: Stop using RCS companion for MSAA copy/clear on Xe3+")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39346>