Fixes:
//exec is empty here
loop {
%1:s[16-17] = ...
if () {
break
}
%2:s[16-17] = ...
continue_or_break
}
%3 = phi %1, undef
//because of the undef, %2 can use s[16-17] and overwrite the address
load(%3:s[16-17])
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bbe4652430 ("aco: create lcssa phis for continue_or_break loops when necessary")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11333
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29838>
The ISL_AUX_USAGE_STC_CCS is the last defined usage. We could
get a random value from isl_encode_aux_mode[] once it is passed
as index if its element is not initialized.
Explicit initialization of ISL_AUX_USAGE_HIZ_CCS_WT is added too.
Suggested by Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
There should be some deeper causes to dig out. The bo-caching
system shouldn't affect the compression by design.
Fixes:
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear
dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear_mipmap_linear
The two cases can pass if we run them respectively. But once they
are fed to glcts in a test case list file (test.list) to run together,
the second test case hangs for a while and eventually fails, regardless
which of them is the second.
./glcts --deqp-caselist-file=test.list
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
Also move the declaration of a local variable to where
it is going to use.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
The texturing and rendering preparation functions restrict
fast clear support in some cases to account for limitations
on prior platforms. Instead of updating those checks to avoid
resolves on Xe2, we can bypass them by representing the aux
state of a fast-cleared surface as compressed-no-clear. This
is valid because there is no longer a bit pattern which
references a clear value stored outside of the aux surface.
Suggested by Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
From Bspec 56423 (r58507), the legacy full resovling and
partial resolving options are gone since Xe2. They also
cause hang on Xe2 if not disabled.
Some suggested code from Nanley Chery <nanley.g.chery@intel.com> is
included.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
The render surface state doesn't have these features any
more since Xe2.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29906>
indirect store lowering will use if/else which changes
the control flow of the shader.
Fixes: 110887de2b ("nir: Add a new pass to lower array dereferences on vectors")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29894>
indirect store lowering will change control flow,
so we should not preserve control flow metadate
when it's present.
Fixes: 35b8f6f40b ("nir: Add a new pass to lower array dereferences on vectors")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29894>
Update mapping between render target surface formats and
compression formats.
Some preexisting correct mappings are also re-ordered to
the order of types in the spec for an easier verification
(top to bottom and left to right).
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29905>
The number of components for the second source is -1 to avoid validation of
its value. Some supported configurations will have the component count of
that matrix different than the others.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>
It turns out the problem I was trying to catch in be4fa59a72
("intel/brw: Clear write_accumulator flag when changing the
destination") also came from the DPAS lowering pass itself. Checking for
invalid uses of the feature in fs_validate helped detect the problem.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>
The original goal was to get rid of a bunch of the magic constants
sprinkled through the function. Once I did that, I realized that there
was a lot my symmertry between the row-major and column-major paths
possible.
It's +6 lines of code, but about 15 of those lines are comments
explaining things that were not obvious in the original code.
v2: Save duplicated condition in a variable with a meaningful
name. Suggested by Caio.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>
Even though the hardware does not naively support these configurations,
there are many potential benefits to advertising them. These
configurations can theoretically use half the memory bandwidth for loads
and stores. For large matrices, that can be the limiting in performance.
The current implementation, however, has a number of significant
problems.
The conversion from float16 to float32 is performed in the driver during
conversion from NIR. As a result, many common usage patterns end up
doing back-to-back conversions to and from float16 between matrix
multiplications (when the result of one multiplication is used as the
accumulator for the next).
The float16 version of the matrix waste half the possible register
space. Each float16 value sits alone in a dword. This is done so that
the per-invocation slice of an 8x8 float16 result matrix and an 8x8
float32 result matrix will have the same number of elements. This makes
it possible to do straightforward implementations of all the unary_op
type conversions in NIR.
It would be possible to perform N:M element type conversions in the
backend using specialized NIR intrinsics. However, per #10961, this
would be very, very painful. My hope is that, once a suitable resolution
for that issue can be found, support for these configs can be restored.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28834>
This fixes some cacheline flush artifacts
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29882>
The check we had for BGR vs. RGB formats was testing completely the
wrong thing. Fix it so we can restore the previous set of configs we
expose to the frontend, which also fixes surfaceless platform on s390x.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: ad0edea53a ("st/dri: Check format properties from format helpers")
Closes: mesa/mesa#11360
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29837>
The rounding mode was only set for a subset of floating point
conversions. This patch sets it for all of them.
Fixes all the dEQP-VK.*.float_controls.* CTS tests.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29843>
BSpec 62720 states that the previous and current offsets remains the
same as previous gfx versions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
Also added pec_offset to struct intel_perf_query_info and two new
hw variables needed by this XML, those changes are required to at
least compile with this new XML.
pec_offset will be set in the next patches.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
Next patch will need to frequently get the count of supported engine
for compute and copy engines, so to reduce the overhead of doing
KMD queries at every call here caching this information into
intel_device_info struct.
With that ANV and Iris would need to set this information as intel/dev
can't depend on intel/common, so here adding a single function
to update intel_device_info with all fields filled by intel/common
functions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
Xe2 platforms will have way more query fields and allocation of that
will need to be increased but first lets add a function to return the
max_fields and assert if tried to access more query fields then
allocated.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
Most places we follow the newest GFX version first, so doing that
here.
No changes in behavior exepected.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>