Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
LLVM can't shrink loads.
Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)
Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)
This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The SPIR-V parser splits in/out struct variables and creates
a separate variable for each first-level member of the struct.
When the struct variable has an initializer this means that we also
need to split the initializer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
According with OpenGL GLSL 3.20 spec, section 4.3.9:
"It is a link-time error if any particular shader interface
contains:
- two different blocks, each having no instance name, and each
having a member of the same name, or
- a variable outside a block, and a block with no instance name,
where the variable has the same name as a member in the block."
This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but
did not take in account that precision qualifiers are ignored when
comparing blocks with no instance name.
With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression is fixed, which was broken by previous commit.
v2: use helper varibles (Matteo Bruni)
Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes <mark.a.janes@intel.com>
CC: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
The materials are now moved to the end of the
generic attributes block to the range 4-15.
Before, the way the position and generic 0 attribute
is handled was dependent on the presence and kind of
the currently attached vertex program. With this
change the way the position attribute and the generic 0
attribute is treated only depends on the enabled
flag of those two arrays.
This will later help to untangle the update dependencies
between enabled arrays and shader inputs.
v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Instead of just assuming that the material attributes
just overlap with the generic attributes 0-12, give
them symbolic defines so that we can easier move them
to an other range.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0).
No shader-db changes, but it does prevent 6 to 12 instruction
regressions in the next patch on all measured Intel platforms.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
v2: Rebase on almost 2 years. Require that one of the arguments to fmin
or fmax be used only once. This prevents some regressions.
shader-db results:
Skylake and Broadwell had similar results. Skylake shown.
total instructions in shared programs: 14526021 -> 14525913 (<.01%)
instructions in affected programs: 4613 -> 4505 (-2.34%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4
helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42%
total cycles in shared programs: 533118710 -> 533118403 (<.01%)
cycles in affected programs: 34334 -> 34027 (-0.89%)
helped: 24
HURT: 0
helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14
helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03%
No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
We need this because we will always copy fs outputs to temps and
split the arrays, but do not want to do either of these with fs
inputs as it is unnessisary and makes handling interpolateAt
builtins difficult.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
c2acf97fcc changed the use of double_inputs_read to be
inconsitent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc now uses the double_inputs member rather than
double_inputs_read.
This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.
We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Allows nir drivers to either use a single or dual locations for
vs double inputs.
i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.
The following patch will also make use of this option when
calling nir_shader_gather_info().
Reviewed-by: Karol Herbst <kherbst@redhat.com>
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.
We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This change cleans following scary warnings in valgrind output
when disk cache is being written:
==6532== Uninitialised byte(s) found during client check request
==6532== at 0x14423FAD: blob_write_bytes (blob.c:152)
==6532== by 0x144240FB: blob_write_uint32 (blob.c:194)
==6532== by 0x144001A5: write_tex (nir_serialize.c:613)
and later (loads of):
==6532== Use of uninitialised value of size 8
==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127)
==6532== by 0x13F5DABA: cache_put (disk_cache.c:947)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Namely extend the EXTRA_DIST list, instead of re-assigning it and bring
back a file dropped by mistake.
Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
This adds the meson.build, meson_options.txt, and a few scripts that are
used exclusively by the meson build.
v2: - Remove accidentally included changes needed to test make dist with
LLVM > 3.9
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
According with OpenGL GLSL 4.20 spec, section 4.3.9, page 57:
"It is a link-time error if any particular shader interface
contains:
- two different blocks, each having no instance name, and each
having a member of the same name, or
- a variable outside a block, and a block with no instance name,
where the variable has the same name as a member in the block."
This means that it is a link error if for example we have a vertex
shader with the following definition.
"layout(location=0) uniform Data { float a; float b; };"
and a fragment shader with:
"uniform float a;"
As in both cases we refer to both uniforms as "a", and thus using
glGetUniformLocation() wouldn't know which one we mean.
This fixes KHR-GL*.shaders.uniform_block.common.name_matching.
v2: add fixed tests (Tapani)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
To avoid compilation warnings and because this helper
shouldn't update anything.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This creates two new internal dependencies, idep_nir_headers and
idep_nir. The former encapsulates the generation of nir_opcodes.h and
nir_builder_opcodes.h and adding src/compiler/nir as an include path.
This ensures that any target that needs nir headers will have the
includes and that the generated headers will be generated before the
target is build. The second, idep_nir, includes the first and
additionally links to libnir.
This is intended to make it easier to avoid race conditions in the build
when using nir, since the number of consumers for libnir and it's
headers are quite high.
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Don't use intermediate variables, use consistent whitespace.
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Currently the meosn build has a mix of two styles:
arg : [foo, ...
bar],
and
arg : [
foo, ...,
bar,
]
For consistency let's pick one. I've picked the later style, which I
think is more readable, and is more common in the mesa code base.
v2: - fix commit message
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
If MaxAttribs were ever raised to 32, undefined behavior would occur.
We had already gone to the effort (albeit incorrectly) handle this in
one case, so fix them all.
CID: 1369628
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
If max_index were ever 32, the linker would have marked all 32
locations as invalid instead of marking none of them as invalid. It's
a good thing the maximum value actually set by any driver for
MaxAttribs is 16.
Found by inspection while investigating CID 1369628.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
All cases where the result could be non-visit_continue would have
already returned.
CID: 401351, 1224465, 1224466
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
None of these are necessary because result->type is the only thing used
outside the giant switch-statement.
CID: 1230983, 1230984
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Intel was the only user and now NIR can do the lowering.
v2: do not try to handle it as a system value directly for the SPIR-V
path. In GL we rather handle it as a uniform like we do for the
GLSL path (Jason).
v3: drop LowerTESPatchVerticesIn as well (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
New generated files from:
bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
65fc16c974 ("autotools: set XA versions in configure.ac and configure header file")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Technically, the GLSLang bug related to this can also affect SSBO writes
where the bool -> uint conversion is missing. However, the only known
shipping application with an old enough version of GLSLang to cause
issues with this is the new DOOM game so we keep the workaround as small
as possible.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Previously, we were storing a pointer to the vtn_value because we use it
to look up decorations when we create input/output variables. This
works, but it also may be useful to have the id itself so we may as well
store that instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that higher levels are enforcing decoration sanity, we don't need
the vtn_asserts here. This function *should* be safe but we still want
a few well-placed regular asserts in case something goes awry.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>