This replaces brw_fs_get_dispatch_enables(), which was added in
b9403b1c47 ("intel: factor out dispatch PS enabling logic"), but this
function will not work well for future changes to 3DSTATE_PS.
So, instead, this moves the related code into a "genX" file which can
directly update 3DSTATE_PS for the given platform.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20329>
i965 used these, but Gallium drivers do this lowering via a separate
nir_lower_tex call from st/mesa. Vulkan drivers don't use these at all.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
None of the drivers have used this since we dropped i965, and BLORP
no longer uses it as of the previous commit. We can also drop the
former compressed_multisample_tex_mask (now padding) field so that
things remain 64-bit aligned.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
The compiler looks at this key field to determine whether to perform
an MCS fetch for a txf_ms or samples_identical texture message, if a
nir_tex_src_ms_mcs_intel source wasn't provided. If it isn't set,
it instead uses constant 0 (nothing is compressed).
All of the drivers (iris, crocus, anv, hasvk) unconditionally set this
to ~0 because we don't want to pay for costly shader recompiles (which
can cause nasty stuttering). Most textures are compressed anyway, and
the hardware ignores the l2dms MCS parameter if MCS is disabled.
The only user was BLORP, which sets the key field based on whether the
texture's aux usage has MCS. But if it has MCS, it also does the MCS
fetch itself and supplies it directly. Otherwise, it relies on the
compiler to fill in the 0 value. But it could easily just provide the
0 value itself in that case and not rely on the compiler at all.
With that fixed, we can just drop the key fields entirely. We leave
them as padding for now to avoid repacking structures; we won't need
to after the next commits anyway.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
VK_PIPELINE_CREATE_RAY_TRACING_SKIP_AABBS_BIT_KHR and
VK_PIPELINE_CREATE_RAY_TRACING_SKIP_TRIANGLES_BIT_KHR, when specified,
make TraceRay behave as if the corresponding shader flags were set, but
without affecting the value of IncomingRayFlags in shaders.
v2 (Lionel): Improve comments
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19152>
Make it clearer we are dealing with multiple patches,
works better in constrast with SINGLE_PATCH.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
This backend lowering code has been dead since the removal of i965 -
nothing in the current source tree ever sets the flag.
This is handled by iris_setup_uniforms() and crocus_setup_uniforms().
Variable group size does not appear to be a feature in anv.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18055>
That way we can look at the SBT entries for debug purposes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17908>
When the compiler pads a data structure, the padded bytes will not be
initialized. Shader keys are compared with memcmp and unitialized
bytes within the structure breaks this mechanism.
Explicitly pad the structures with members, so the compiler is forced
to initialize them. Add a warning to indicate if a change to
alignment in any of the data structures requires additional padding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17749>
[anholt: changed to make all drivers do the right thing by moving the
payload barycentric check into the compiler]
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17381>
This structure will contain the opcode mapping tables in the next
commit. For now, this is the mechanical change to plumb it into all
the necessary places, and it continues simply holding devinfo.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
src/mesa/main includes are for Mesa's OpenGL implementation, and the
compiler is used in Vulkan drivers and other tools. We really only
needed one #define, which is that we offer 32 samplers. It probably
makes more sense to have our own defined limit for that rather than
importing a project-wide value which theoretically could be adjusted,
so swap MAX_SAMPLERS for a new BRW_MAX_SAMPLERS and call it a day.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
For cases with lots of very small primitives, this may improve
performance because we're not executing those dead channels all the
time.
Shader-db reports no instruction or cycle-count changes. However, by
hacking up the driver to report when this optimization triggers, it
appears to affect about 10% of shader-db.
v2 (Kenneth Graunke): Always enable VMask prior to XeHP for now,
because using VMask on those platforms allows us to perform the
eliminate_find_live_channel() optimization. However, XeHP doesn't
seem to have packed fragment shader dispatch, so we lose that
optimization regardless, and there's no reason not to avoid vmask.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1054>
With this option enabled range of input values for fsin and fcos is
limited to [-2*pi : 2*pi] by calculating the reminder after 2*pi modulo
division. This helps to improve calculation precision for large input
arguments on Intel.
-v2: Add limit_trig_input_range option to prog_key to update shader
cache (Lionel)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16388>
Instead of reusing the in/out slot mechanism, use a separated NIR
variable mode. This will make easier later to implement staging the
output in shared memory (and storing all at the end to the URB).
Note to get 64-bit type support we currently rely on the
brw_nir_lower_mem_access_bit_sizes() pass.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>
This index will be used for accessing ray query data in memory.
v2: Drop a MOV (Caio)
v3: Rework back code emission (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
This will allow to reuse the same intrinsic for various topology based
ID.
v2: fix intrinsic comment (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
Replace load_mesh_global_arg_addr_intel with a more general intrinsic
load_mesh_inline_data_intel, since inline data now hold both
a pointer descriptor information and the first few push constants.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>
To avoid dragging gl.h into places it has no business being,
defined tessellation primitive mode to an enum.
This has a lot of fallout all over the place.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
This will be used by crocus and iris to clamp pointsizes only
on the last stage of the shader compile.
Fixes: 3077d96856 ("crocus: Clamp VS point sizes to the HW limits as required.")
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14359>
Because 0 is no longer a recognizable value (it's NEVER, which isn't a
good default), we add an emit_alpha_test bool to tell the back-end when
to bother alpha testing. This lets us only touch crocus with the
change.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14157>
The current packed dispatch assumptions for fragment shaders seem to
be the reason that the fs-readFirstInvocation-uint-loop Piglit
test-case for the ARB_shader_ballot extension fails on DG2 in
combination with the patches in this series that enable pixel pipe
hashing (thanks Jordan for reporting the regression). I've confirmed
that the brw_fs_test_dispatch_packing() test fails on DG2 hardware for
fragment shaders, while it succeeds for other shader stages,
indicating that the PSD hardware no longer guarantees packed dispatch.
Disable it.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13569>