This is purely a file-move + #include fixup + build system changes.
Other cleanups will follow in subsequent commits.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The hardware registers store the half-size/width in 12.4 fixed point
format, so 8192 is the maximum.
Fixes dEQP-GLES3.functional.rasterization.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Together with the previous patches, this fixes
dEQP-GLES31.functional.primitive_bounding_box.wide_points.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is a bit conservative, but a more precise solution requires access
to the rasterizer state. This is something to tackle after the fork between
r600 and radeonsi.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We'll use it in the scissors / clip / guardband state.
v2: avoid a performance regression on r600 when applied to
(pre-fork) stable branches
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes an assert in fd_acc_query_register_provider() about query provider
not already registered.
Fixes: 3f6b3d9d ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Rob Clark <robdclark@gmail.com>
A recent commit fixed the case of 8888 integer cube maps, which need the
workaround of replacing the data format with USCALED/SSCALED. However,
this broke the case of non-8888 integer cube maps; those still need the
fix of shifting the texture coordinates.
Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar.
Fixes: 6fb0c1013b ("radeonsi: workaround for gather4 on integer cube maps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The operation performed is all the same as LODQ, but with the usual
differences between dx10 and GL texture opcodes, that is separate resource
and sampler indices (plus result swizzling, and setting z/w channels
to zero).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
For 1080p video transcode, the height will be scaled to 1088 when deint
to progressive buffer. Set dst rect to make sure no scale.
Fixes: 3ad8687 "st/va: use new vl_compositor_yuv_deint_full() to deint"
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andy Furniss <adf.lists@gmail.com>
Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The status quo is quite the mess:
1. tgsi_exec will do a per-channel computation, and store the dst[0]
result (significand) correctly for each channel. The dst[1] result
(exponent) will be written to the first bit set in the writemask.
So per-component calculation only works partially.
2. r600 will only do a single computation. It will replicate the
exponent but not the significand.
3. The docs pretend that there's per-component calculation, but even
get dst[0] and dst[1] confused.
4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
and kind-of assumes that everything is replicated, generating this for
the dvec4 case:
DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw
Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Sourcing the exponent for the zw destination pair from Z is consistent
with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always
generates per-channel instructions anyway.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
This fixes a warning caused by the fork (note the change in the function
signature):
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’:
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state;
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This fixes the extremely unlikely case that an application uses
0x80000000 or 0x3f800000 as border color for an integer texture and
helps in the also, but perhaps slightly less, unlikely case that 1 is
used as a border color.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The hardware does this automatically for unorm formats, but we need to
do it manually for unorm depth formats that have been upgraded to
Z32_FLOAT.
Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
and others.
Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.
The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.
Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*
Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Not that those are super common or useful, but hey! Fun corner cases
of the API...
Fixes dEQP-GLES31.functional.geometry_shading.emit.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
It has to happen after descriptor uploads since otherwise we'll print out
the wrong GPU list / incorrectly claim descriptor corruption.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Redundant with the recently added ac_llvm_context::chip_class.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Since our driver support arb_provoking_vertex, we can start
advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests
Tested piglit, glretrace on Hw 11 and Hw 13
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Currently we are blitting the whole resource when the RS is used to
de-/tile a resource. This can be very inefficient for large resources
where the transfer is only changing a small part of the resource
(happens a lot with glTexSubImage2D).
Optimize this by only blitting the tile aligned subregion of the
resource, which the transfer is going to change.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
This is useful if we only need to copy part of a larger resource, mostly
when using the RS engine to de-/tile on pipe transfers.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
The RS can blit abitrary tile aligned subregions of a resource by
adjusting the buffer offset.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
The function pipe_buffer_map() is only for linear pipe buffer,
with height as 0, and it's not for any 2D textures.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
It adds reference links for arguments usage and bind of resource_create().
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Previous get_paramf links same as get_param. It changes the reference link to
PIPE_CAPF_*
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
I remember thinking "gosh, it would be nice if I could do a kernel-style
'if (!IS_ENABLED(DEBUG))' instead of using an #ifdef, so the code was
compiled on both builds", and then forgot to test a release build anyway.
Fixes: a8fd58eae5 ("vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.")
Reported-by: Derek Foreman <derekf@osg.samsung.com>
This has proven to be incredibly useful for debugging CMA allocation
failures and driving memory management improvements. However, we don't
want to burden entry and exit from the BO cache with the labeling ioctl's
overhead on release builds.
This unbreaks waffle/gbm (piglit/gbm) which fails initialization.
v2: also don't set queryDmaBufFormats
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
That's unnecessary to double-check that dcc_offset is not 0
because all callers already check that.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>