The HW apparently has some issues (or at least a much more complicated VCM
calculation) with non-combined segments, and the closed source driver also
uses combined I/O. Until I get the last CTS failure resolved (which does
look plausibly like some VPM stomping), let's use combined I/O too.
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one if 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Meson test has a concepts of suites, which allow tests to be grouped
together. This allows for a subtest of tests to be run only (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.
To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). it can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has addition
options that are pretty useful).
Tested-By: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
The way nir_lower_clip_vs() works with store_output intrinsics makes a
ton of assumptions about the driver_location field.
In i965 and iris, I'd rather do this lowering early and work with
variables. v3d may want to switch to that as well, and ir3 could too,
but I'm not sure exactly what would need updating. For now, handle
both methods.
Reviewed-by: Eric Anholt <eric@anholt.net>
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.
Note:
- python3.4 is used as it's the earliest supported version
- python2 chosen prior to python3
v2: use python2 by default
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
We were doing this late after nir_lower_io, but we can just reuse the core
code. By doing it at this stage, we won't even set up the VS attributes
as inputs, reducing our VPM size.
We always emit 4 slots per slot because things like color output and
position processing in the epilogue will potentially look up more values
than the variable declaration had. However, when we get a .location_frac
!= 0, we don't want to overwrite components of the following
.driver_location.
For supporting scalar VPM i/o at the NIR level, we need to do a pass over
the vars to figure out how big each attribute is after DCE. Once we've
done that, we can just walk over c->vattr_sizes[] instead of bothering
with vars.
Fixes the following building error in vc4 build:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10:
fatal error: 'cle/v3d_packet_helpers.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Fixes the following building error:
In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
external/mesa/src/gallium/auxiliary/util/u_math.h:42:10:
fatal error: 'pipe/p_compiler.h' file not found
^~~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Fixes the following building error, happening when building both intel and broadcom:
Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
/bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
> gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
Traceback (most recent call last):
File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
p = Parser(sys.argv[2])
IndexError: list index out of range
header-gen macro is already defined by Intel genxml building rules
and the existing header-gen does not have the $(PRIVATE_VER) argument,
infact the bash command line logged in the building error is missing
exactly $(PRIVATE_VER) argument
Renaming the macro as pack-header-gen in src/broadcom/Android.genxml.mk
solves the building error, another possible way is to keep the gen rules
commands expanded and not use the macros.
Fixes: 7f80a9ff13 ("vc4: Introduce XML-based packet header generation like Intel's.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Currently we have two sets of functions for bit counts, one in gallium
and one in core mesa. The ones in core mesa are header only in many
cases, since they reduce to "#define _mesa_bitcount popcount", but they
provide a fallback implementation. This is important because 32bit msvc
doesn't have popcountll, just popcount; so when nir (for example)
includes the core mesa header it doesn't (and shouldn't) link with core
mesa. To fix this we'll promote the version out of gallium util, then
replace the core mesa uses with the util version, since nir (and other
non-core mesa users) can and do link with mesautils.
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
There were two bugs working together to make things mostly work: I wasn't
dividing the VPM output size available by the size of a batch (vertex),
but I also had the size of the VPM reduced by a factor of 8.
Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it
seems also my intermittent varying failures.
Fixes: 1561e4984e ("v3d: Emit the VCM_CACHE_SIZE packet.")
This reverts commit ae7898dfdb.
Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.
Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)
https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.
Note:
- python3.4 is used as it's the earliest supported version
- python3 chosen prior to python2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Now that all the build scripts are compatible with both Python 2 and 3,
we can flip the switch and tell Meson to use the latter.
Since Meson already depends on Python 3 anyway, this means we don't need
two different Python stacks to build Mesa.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Found when debugging register spilling -- we would try to spill the dest
of a STVPMV, inserting spill code after entering the last segment. In
fact, we were likely to to choose to do this, given that the STVPMV "dest"
temp was never read from, making it cheap to spill.
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
The simulator complained that we had write responses outstanding at shader
end. It seems that a TMU read does not guarantee that previous TMU writes
by the thread have completed, which surprised me.
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
This is useful for periodically testing out register spilling to see how
it goes on simple shaders, rather than only failing on insanely
complicated ones.
I missed an important part when porting the change over, fixing my
compiler warning but breaking -Werror=format-security.
Fixes: e6ff5ac446 ("v3d: use snprintf(..., "%s", ...) instead of strncpy")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107443
This instruction is used to ensure that TMU stores have been processed
before moving on. In particular, you need any TMU ops to be done by the
time the shader ends.
A V3D_DEBUG=clif file from a non-texturing .shader_test can now be
successfully run through the CLIF runner in the simulator. Now I need to
build an open source CLIF runner against the v3d DRM module.
We need to dump each buffer's contents in order for a CLIF file, so we
need to collect all of the relocs into a buffer (such as the indirect CL
full of both uniforms and GL shader states) before we start dumping.
These will match the names that the CLIF parser expects to see. I may in
the future decide to change more of the other names so that I match the
names the HW/closed SW team uses for their packets, rather than the names
in the spec (which only they and I can read anyway).
This matches what CLIF parsing expects, and makes
TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more
legible TILE_BINNING_MODE_CFG_COMMON.
The CLIF format expects american english spelling, and the rest of Mesa is
too. I was previously adhering to the spec's spelling, which is
counterproductive.
A few of the upcoming changes would make the V3D_DEBUG=cl output less
readable, so let's make proper CLIF file production be under a separate
V3D_DEBUG=clif flag.
V3D only has one of these (the top 16 bits of a float32) left in its CLs,
but VC4 had many more. This gets us proper pretty-printing of the values
instead of a large uint.
By default after saying you are emitting a buffer, it'll expect a buffer
size. Once you set a format, it'll keep parsing that format until you
announce something else.
Previously, we emitted in XML order, which I happen to type in the
decreasing offset order of the specifications. However, the CLIF parser
wants increasing offsets.
With CLIFs, the parser will choose an address for the buffer being
created, so we need to use effectively relocations to buffers instead of
the addresses that the driver uses. This is also a whole lot more
intelligible for console output than raw addresses!