With an earlier commit we've spit the flags parsing to a separate
function, but forgot to update all the dri modules to use it.
Noticed when we've enabled KHR_debug for every dri module - fdo#93048
Fixes: 38366c0c6e "dri_util: Don't assume __DRIcontext->driverPrivate
is a gl_context"
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
The commit b4e198f47f changed the offset and bits parameters of the
bitfield insert operation from scalars to vectors. However, the lowering
of ldexp on doubles operates on each vector component and emits scalar
code (since it has to deal with the lower and upper 32-bit chunks of
each double component), so it needs its bits and offset parameters to
be scalars.
Fixes fp64 regression (crash) in:
spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Relocations are 64 bits on Gen8+. Most CTS tests that send
non-trivial work to the GPU would fail when run from a single deqp-vk
invocation because they were effectively relying on reloc presumed
offsets to be wrong so the kernel would come and apply relocations
correctly.
Before we were asuming that a deref would either be something in a block or
something that we could pass off to NIR directly. However, it is possible
that someone would choose to load/store/copy a split structure all in one
go. We need to be able to handle that.
Previously, we were creating nir_deref's immediately. Now, instead, we
have an intermediate vtn_access_chain structure. While a little more
awkward initially, this will allow us to more easily do structure splitting
on-the-fly.
Since the GREMEDY extensions are normally only exposed by the gremedy
debugger (and could possibly trigger debug paths in the app), we don't
expose the extension by default, but instead only with
ST_DEBUG=gremedy.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The extension spec[1] describes DEBUG_TYPE_MARKER as "Annotation of the
command stream". So for DEBUG_TYPE_MARKER, also pass the buf to the
driver's EmitStringMarker() to be inserted in the command stream.
[1] https://www.opengl.org/registry/specs/KHR/debug.txt
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
If we're continuing a render pass, make sure we don't emit the depth and
stencil buffer addresses before we set the state base addresses.
Fixes crucible func.cmd-buffer.small-secondaries
The texture mipmap completeness checking code was checking whether all
of the faces have the same size. However this is pointless because the
code just above it checks whether the face has the expected size
calculated for the mipmap level anyway so the error condition could
never be reached. This patch just removes it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
According to the GL 1.4 spec section 3.8.10, a cubemap texture is only
complete if:
• The level base arrays of each of the six texture images making up
the cube map have identical, positive, and square dimensions.
• The level base arrays were each specified with the same internal
format.
• The level base arrays each have the same border width.
Previously the texture completeness code was only checking the first
point. This patch makes it additionally check the other two.
This fixes the following two dEQP tests:
deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgba_rgb_level_0_neg_z
deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgb_rgba_level_0_pos_z
And also this Piglit test:
spec/!opengl 2.0/incomplete-cubemap-format
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93792
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The buffers are referenced from r600_update_driver_const_buffers()
-> r600_set_constant_buffer() -> u_upload_data(), but nothing
ever releases the reference. Similar case with driver_consts.
Found using valgrind.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This currently sets the base and size of all push constants to the
entire push constant block. The idea is that we'll use the base and size
to eventually optimize the amount we actually push, but for now we don't
do that.
main/shaderapi.c:1318:51: warning: format specifies type 'unsigned int' but the argument has type 'GLhandleARB' (aka 'unsigned long') [-Wformat]
_mesa_debug(ctx, "glDeleteObjectARB(%u)\n", obj);
~~ ^~~
%lu
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Removes the public symbol _glapi_create_table_from_handle from
libGL.so.1.2.0 on all platforms except Darwin.
Since the symbol is not used on other platforms it makes sense to
build glapi_gentable.c only on Darwin.
As a side effect it accelerates the build a bit and reduces the size
of libGL.so.1.2.0 as follows:
size lib/libGL.so.1.2.0 on my system shows
text data bss dec hex filename
469211 21848 2720 493779 788d3 lib/libGL.so.1.2.0 before
420988 11240 2720 434948 6a304 lib/libGL.so.1.2.0 after
A little bit of history:
_glapi_create_table_from_handle was introduced in
commit 85937f4c0d
Author: Jeremy Huddleston <jeremyhu@apple.com>
Date: Thu Jun 9 16:59:49 2011 -0700
glapi: Add API that can create a _glapi_table from a dlfcn handle
Example usage:
void *handle = dlopen(opengl_library_path, RTLD_LOCAL);
struct _glapi_table *disp = _glapi_create_table_from_handle(handle,
"gl");
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
and the only user in mesa was added in
commit f35913b96e
Author: Jeremy Huddleston <jeremyhu@apple.com>
Date: Thu Jun 9 17:29:51 2011 -0700
apple: Use _glapi_create_table_from_handle to initialize our
dispatch table
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14.
v2: Fix typos in commit message
Add missing XORG_GLAPI_OUTPUTS += \ into src/mapi/glapi/gen/Makefile.am
Add glapi_gentable.c to EXTRA_DIST for inclusion in the release
tarball
v3: Fix commit message: s/gl_gentable.c/glapi_gentable.c/
Reported-by: Arlie Davis <arlied@google.com>
Cc: Jeremy Huddleston <jeremyhu@apple.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This patch significantly reduces the size of the libGL.so binary. It does
not change the (externally visible) behavior of libGL.so at all.
gl_gentable.py generates a function, _glapi_create_table_from_handle.
This function allocates a large dispatch table, consisting of 1300 or so
function pointers, and fills this dispatch table by doing symbol lookups
on a given shared library. Previously, gl_gentable.py would generate a
single, very large _glapi_create_table_from_handle function, with a short
cluster of lines for each entry point (function). The idiom it generates
was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
and then a store into the dispatch table. Since this function processes
a large number of entry points, this code is duplicated many times over.
We can encode the same information much more compactly, by using a lookup
table. The previous total size of _glapi_create_table_from_handle on x64
was 125848 bytes. By using a lookup table, the size of
_glapi_create_table_from_handle (and the related lookup tables) is reduced
to 10840 bytes. In other words, this enormous function is reduced by 91%.
The size of the entire libGL.so binary (measured when stripped) itself drops
by 15%.
So the purpose of this change is to reduce the binary size, which frees up
disk space, memory, etc.
size lib/libGL.so.1.2.0 on my system shows (Andreas)
text data bss dec hex filename
565947 11256 2720 579923 8d953 lib/libGL.so.1.2.0 before
469211 21848 2720 493779 788d3 lib/libGL.so.1.2.0 after
v2: Incorporate Matt's feedback.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Tested-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
This allows us to first generate atomic operations for shared
variables using these opcodes, and then later we can lower those to
the shared atomics intrinsics with nir_lower_io.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Previously we were receiving shared variable accesses via a lowered
intrinsic function from glsl. This change allows us to send in
variables instead. For example, when converting from SPIR-V.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Take reading shader outputs into account, and use setFlagsDef for the
carry since we rely on having i->flagsDef being set.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>