Memtrace aubs are similar to classic aubs, with the major
difference being how command submission is serialized (as register
writes instead of a high-level submit message). Some internal
tools generate or consume only memtrace aubs.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
New generated files from:
bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
65fc16c974 ("autotools: set XA versions in configure.ac and configure header file")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This has only been compile tested.
v2: - Have a single option for opencl (Eric E)
- fix typo "tgis" -> "tgsi" (Curro)
- Don't add "lib" to pipe loader libraries, which matches the
autotools behavior
v3: - Remove trailing whitespace
- Make PIPE_SEARCH_DIR an absolute path
v4: - add trailing / to LIBCLC defines
Acked-by: Curro Jerez <currojerez@riseup.net>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
cc: Aaron Watry <awatry@gmail.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Currently that's dri, libgl-xlib, and osmesa.
v2: - put drivers on a separate line from normal dependencies (Eric E)
cc: George Kyriazis <george.kyriazis@intel.com>
cc: Tim Rowley <timothy.o.rowley@intel.com>
cc: Bruce Cherniak <bruce.cherniak@intel.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This enables the SWR driver, but doesn't actually hook it up to any of
the targets yet. I felt like this patch was big and complicated enough
without adding that.
v2: - Fix typo 'delemeited' -> 'delimited' (Eric E)
- Fix type 'errror' -> 'error' (Eric E)
- Use variables to hold files instead of looking above the current
meson build (Eric E)
- Use foreach loops to reduce the number of unique generators
- Add comment about why some generators have names and some are just
added to a list
v3: - Remove trailing whitespace
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
nir_to_llvm_context will always be NULL for radeonsi so we need
work around this.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes the following piglit tests in radeonsi:
vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tcs-tes-tessinner-tessouter-inputs-tris.shader_test
vs-tes-tessinner-tessouter-inputs-quads.shader_test
vs-tes-tessinner-tessouter-inputs-tris.shader_test
v2: make use of si_shader_io_get_unique_index_patch()
via the helper in the previous patch rather than
shader_io_get_unique_index()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This will be shared by the tgsi and nir backends.
v2: move si_shader_io_get_unique_index_patch() call inside
the helper.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Technically, the GLSLang bug related to this can also affect SSBO writes
where the bool -> uint conversion is missing. However, the only known
shipping application with an old enough version of GLSLang to cause
issues with this is the new DOOM game so we keep the workaround as small
as possible.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Previously, we were storing a pointer to the vtn_value because we use it
to look up decorations when we create input/output variables. This
works, but it also may be useful to have the id itself so we may as well
store that instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that higher levels are enforcing decoration sanity, we don't need
the vtn_asserts here. This function *should* be safe but we still want
a few well-placed regular asserts in case something goes awry.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This reworks the error checking on our generic handling of decorations.
The objective is to validate all of the SPIR-V assumptions we make
up-front and convert redundant checks to compiled-out asserts. The most
important part of this is to ensure that member decorations only occur
on OpTypeStruct and that the member is never out-of-bounds. This way
later code can assume that the member is sane and not have to worry
about OOB array access due to a misplaced OpMemberDecorate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is a bit simpler since we have fewer enum values in the case. It's
also a bit more efficient because we're making fewer glsl_get_* calls.
While we're at it, add better type validation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that vtn_base_type is a real and full base type, we can switch on
that instead of the GLSL base type which is a lot fewer cases in our
switch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For Vega10 and Raven that need a special workaround for the
scissor bug.
This seems to give a minor boost for Talos and Dota 2, at least.
To reduce the cost of memcmp, the driver checks if it's
really useful to do the comparison.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
VGPR1 is only needed for topology that needs 3 offsets like
triangles or quads.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_setname':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:66: undefined reference to `pthread_setname_np'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_join':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:336: undefined reference to `pthread_join'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:48: undefined reference to `pthread_sigmask'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:296: undefined reference to `pthread_create'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask'
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `call_once':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once'
../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
/builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:84: undefined reference to `pthread_getcpuclockid'
collect2: error: ld returned 1 exit status
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Igor Gnatenko <ignatenko@redhat.com>
This was never enabled in secondary buffers because hiz_enabled was
never set to true for those.
If the app provides a framebuffer in the inheritance info when beginning
a secondary buffer, we can determine if HiZ is enabled and therefore
allow the PMA optimization to be enabled within the command buffer.
This improves performance by ~13% on an internal benchmark on Skylake.
v2: Use anv_cmd_buffer_get_depth_stencil_view().
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Respect the std430 rules for determining offset and size of struct
members when using a std430 buffer. std140 rules lead to wrong buffer
offsets in that case.
Fixes my test case attached in Bugzilla. No piglit changes.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
A part of the driver constbuf area is allocated for bindless images. Any
update requires uploading to all driver constbufs. This also extends the
driver constbuf to 64KB, up from 2KB.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>