Unlike textures, this doesn't get lowered for us. (Would be nice
if they were.. at least until we are ready to deal w/ indirect
indexing..)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Run this pass late (after opt loop) to move load_const instructions back
into the basic blocks which use the result, in cases where a load_const
is only consumed in a single block.
This helps reduce register usage in cases where the backend driver
cannot lower the load_const to a uniform.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We can have arrays of images or samplers. But I forgot to handle that
case long ago. Suprised no one complained yet.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Save the next person from digging through the code to figure out what
the indirect_mask parameter actually does.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
AMDVLK uses 64 (distributed) and 16 (non-distributed).
radeonsi will use 63 and 16.
* This might improve tessellation performance on Hawaii, Bonaire, Tahiti,
Pitcairn. (they will use 16)
* I'm not sure if this matters for 1 SE configs.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
This is the case for the simulator environment, and broke many blitter
tests by trying to texture from linear while the HW can only actually do
UIF/UBLINEAR/LT. Just make a temporary and copy into it with the CPU,
then blit from that.
This is the kind of path that should use the TFU, but I haven't exposed
that hardware yet.
Fixes dEQP-GLES3.functional.fbo.blit.default_framebuffer.*
Some fprintfs were probably left unintentionally a few years ago and are
a bit of a nuisance.
Fixes: 2d3301e4d5 ("virgl: fix reference counting of prime handles")
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The prog->Shaders[i]->IsES check was accidentally removed causing
ES linking rules to be applied to desktop GLSL.
Fixes: 725b1a406d ("mesa/util: add allow_glsl_relaxed_es driconfig override")
This relaxes a number of ES shader restrictions allowing shaders
to follow more desktop GLSL like rules.
This initial implementation relaxes the following:
- allows linking ES shaders with desktop shaders
- allows mismatching precision qualifiers
- always enables standard derivative builtins
These relaxations allow Google Earth VR shaders to compile.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Google Earth VR shaders uses builtins in constant expressions with
GLSL 1.10. That feature wasn't allowed until GLSL 1.20.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Glibc has the same code to get program_invocation_short_name. However
for some reason the short name gets mangled for some wine apps.
For example with Google Earth VR I get:
program_invocation_name:
"/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe"
program_invocation_short_name:
"e"
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
The Vulkan API has only one patch version shared among all of the
major.minor versions. We should also advertise the same patch version
regardless of major.minor.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106941
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This should fix TF across a glFlush() or TF pause/restart. Fixes
dEQP-GLES3.functional.transform_feedback.array.interleaved.lines.highp_float
and many, many others.
If we are on gen8+ and have context isolation support, just make that
constant buffer address be absolute, so we can use it for push UBOs too.
v2: Do not duplicate constant_buffer_0_is_relative flag (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It's a bit late to round up after an integer division.
Fixes: de88979413 "radv: Implement VK_AMD_shader_info"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Fixes: 67f40dadaa "mesa: add support for ARB_sample_locations"
Cc: Rhys Perry <pendingchaos02@gmail.com>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Drops the number of time we set the scissor by 4x for F1 2017,
which results in a consistent performance improvement of about 4%.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 922cd38172 "radv: implement out-of-order rasterization when it's safe on VI+"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Shouldn't make any functional difference, just that `liblibanv_gen90.a`
will now be called `libanv_gen90.a`.
Fixes: 3218056e0e "meson: Build i965 and dri stack"
Fixes: d1992255bb "meson: Add build Intel "anv" vulkan driver"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
ARB_texture_float references US Patent #6,650,327 [1] which has a filing date
of June 16 1998.
According to [2], patents filed after 1995 expire 20 years from the filing
date, giving an expiration of June 17 2018.
[1] https://www.google.com/patents/US6650327
[2] https://en.wikipedia.org/wiki/Term_of_patent_in_the_United_States
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
As the previous use of shuffle_32bit_load_result_to_64bit_data
had a source/destination overlap for 64-bit. Now a temporary destination
is used for 64-bit cases to use shuffle_from_32bit_read that doesn't
handle src/dst overlaps.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Previously, the shuffle function had a source/destination overlap that
needs to be avoided to use shuffle_from_32bit_read. As we can use for
the shuffle destination the destination of removed MOVs.
This change also avoids the internal MOVs done by the previous shuffle
to deal with possible overlaps.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
shuffle_from_32bit_read manages 32-bit reads to 32-bit destination
in the same way that the previous loop so now we just call the new
function for all bitsizes, simplifying also the 64-bit load_input.
v2: Add comment about future 16-bit support (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This implementation avoids two unneeded MOVs for each 64-bit
component. One was done in the old shuffle, to avoid cases of
src/dst overlap but this is not the case. And the removed MOV
was already being being done in the shuffle.
Copy propagation wasn't able to remove them because shuffle
destination values are defined with partial writes because they
have stride == 2.
v2: Reword commit log summary (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
do_untyped_vector_read is used at load_ssbo and load_shared.
The previous MOVs are removed because shuffle_from_32bit_read
can handle storing the shuffle results in the expected destination
just using the proper offset.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>