Removes the public symbol _glapi_create_table_from_handle from
libGL.so.1.2.0 on all platforms except Darwin.
Since the symbol is not used on other platforms, it makes sense to
build glapi_gentable.c only on Darwin.
As a side effect, this speeds up the build a bit and reduces the size
of libGL.so.1.2.0 as follows:
size lib/libGL.so.1.2.0 on my system shows:

      text   data   bss     dec    hex  filename
    469211  21848  2720  493779  788d3  lib/libGL.so.1.2.0 before
    420988  11240  2720  434948  6a304  lib/libGL.so.1.2.0 after
A little bit of history:
_glapi_create_table_from_handle was introduced in
commit 85937f4c0d
Author: Jeremy Huddleston <jeremyhu@apple.com>
Date: Thu Jun 9 16:59:49 2011 -0700
glapi: Add API that can create a _glapi_table from a dlfcn handle
Example usage:

    void *handle = dlopen(opengl_library_path, RTLD_LOCAL);
    struct _glapi_table *disp =
        _glapi_create_table_from_handle(handle, "gl");
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
and the only user in mesa was added in
commit f35913b96e
Author: Jeremy Huddleston <jeremyhu@apple.com>
Date: Thu Jun 9 17:29:51 2011 -0700
apple: Use _glapi_create_table_from_handle to initialize our
dispatch table
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14.
v2: Fix typos in commit message
Add missing XORG_GLAPI_OUTPUTS += \ into src/mapi/glapi/gen/Makefile.am
Add glapi_gentable.c to EXTRA_DIST for inclusion in the release
tarball
v3: Fix commit message: s/gl_gentable.c/glapi_gentable.c/
Reported-by: Arlie Davis <arlied@google.com>
Cc: Jeremy Huddleston <jeremyhu@apple.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This patch significantly reduces the size of the libGL.so binary. It does
not change the (externally visible) behavior of libGL.so at all.
gl_gentable.py generates a function, _glapi_create_table_from_handle.
This function allocates a large dispatch table, consisting of 1300 or so
function pointers, and fills this dispatch table by doing symbol lookups
on a given shared library. Previously, gl_gentable.py would generate a
single, very large _glapi_create_table_from_handle function, with a short
cluster of lines for each entry point (function). The idiom it
generated was a NULL check, a call to snprintf, a call to
dlsym / GetProcAddress, and then a store into the dispatch table. Since
this function processes some 1300 entry points, that code was
duplicated many times over.
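Each entry point's cluster looked roughly like this (a hedged
reconstruction of the generated C; the exact identifiers in
glapi_gentable.c may differ):

    if (!disp->ClearColor) {
        void **procp = (void **) &disp->ClearColor;
        snprintf(symboln, sizeof(symboln), "%sClearColor", symbol_prefix);
        *procp = dlsym(handle, symboln);
    }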
We can encode the same information much more compactly by using a lookup
table. The previous total size of _glapi_create_table_from_handle on x64
was 125848 bytes. By using a lookup table, the size of
_glapi_create_table_from_handle (and the related lookup tables) is reduced
to 10840 bytes. In other words, this enormous function is reduced by 91%.
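A hedged sketch of the lookup-table encoding (identifiers are
illustrative, not the exact ones emitted into glapi_gentable.c): the
entry-point names are packed into a single string table, and one short
loop performs all the symbol lookups:

    #include <stdint.h>
    #include <stdio.h>
    #include <dlfcn.h>

    /* all entry-point names, concatenated and NUL-separated */
    static const char _entrypoint_names[] =
        "ClearColor\0"
        "ClearDepth\0";   /* ... ~1300 more in the generated file ... */

    /* byte offset of each name within _entrypoint_names */
    static const uint16_t _entrypoint_offsets[] = { 0, 11 };

    #define NUM_ENTRYPOINTS \
        (sizeof(_entrypoint_offsets) / sizeof(_entrypoint_offsets[0]))

    static void
    fill_table(void *handle, const char *symbol_prefix, void **disp)
    {
        char symboln[128];
        for (unsigned i = 0; i < NUM_ENTRYPOINTS; i++) {
            snprintf(symboln, sizeof(symboln), "%s%s", symbol_prefix,
                     _entrypoint_names + _entrypoint_offsets[i]);
            disp[i] = dlsym(handle, symboln);
        }
    }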
The size of the entire libGL.so binary (measured when stripped) itself drops
by 15%.
So the purpose of this change is to reduce the binary size, which frees up
disk space, memory, etc.
size lib/libGL.so.1.2.0 on my system shows (Andreas):

      text   data   bss     dec    hex  filename
    565947  11256  2720  579923  8d953  lib/libGL.so.1.2.0 before
    469211  21848  2720  493779  788d3  lib/libGL.so.1.2.0 after
v2: Incorporate Matt's feedback.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Tested-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
This allows us to first generate atomic operations for shared
variables using these opcodes, and then lower them to the shared
atomics intrinsics later with nir_lower_io.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Previously we received shared variable accesses via a lowered
intrinsic function from GLSL. This change allows us to send in
variables instead, for example when converting from SPIR-V.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Take reading shader outputs into account, and use setFlagsDef for the
carry, since we rely on i->flagsDef being set.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Doing that is clearly a bug. We can't quite assert, as st/mesa may hit
this, but at least increase its visibility a bit.
(For the non-refcounted objects it would be illegal too, but we can't
detect that unless we stored the context ourselves. Plus, those don't
tend to cause random crashes at context or object destruction time...
So just sampler views, surfaces and stream-output targets for now.)
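A minimal sketch of the kind of check this enables, assuming it is
compiled inside gallium where debug_printf and the pipe types are
available (the helper name is hypothetical); sampler views record
their creating context, so a mismatch at destruction time can be
flagged:

    /* Hypothetical helper: warn when a sampler view is destroyed on a
     * context other than the one that created it. */
    static void
    warn_on_context_mismatch(struct pipe_context *ctx,
                             struct pipe_sampler_view *view)
    {
        if (view->context != ctx)
            debug_printf("destroying sampler view with a context "
                         "other than the one it was created with\n");
    }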
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
I removed this mistakenly in 2dbc20e456. I actually thought it should
not be necessary, and a piglit run didn't show any differences, but
this change shouldn't have been in there.
draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER.
The new polygon-mode-facing test indeed shows why this is necessary:
there are lots of invalid reads and writes under valgrind (and crashes
without valgrind), because the pre-pipeline vertex size doesn't match
the post-pipeline vertex size. (Note this won't help much with stages
which don't have the prepare hook but can grow the vertex size, in
particular the wide point stage, but that one isn't used by llvmpipe.)
The test still won't pass, of course, but now it only uses
uninitialized values, which is much less dangerous...
(Albeit I'm pretty sure for i915 it really is not needed anymore as it
doesn't care about the extra outputs and doesn't call
draw_prepare_shader_outputs().)
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch moves uniform location calculation to happen during
link_uniforms; this is possible with the help of the UniformRemapTable,
which holds all the reserved locations.
Location assignment for implicit locations is changed so that we also
utilize the 'holes' that explicit uniform location assignment might
have left in the UniformRemapTable. This makes it possible to fit more
uniforms; previously we were lazy here and wasted space.
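A hedged sketch of the hole scan (names and types are illustrative,
not the exact ones in the linker):

    /* Find `size` consecutive unused slots in the remap table, i.e. a
     * hole left behind by explicit location assignment. Returns the
     * first slot of the hole, or -1 if no hole is big enough. */
    static int
    find_empty_block(struct gl_uniform_storage **remap,
                     unsigned num_entries, unsigned size)
    {
       unsigned run = 0;
       for (unsigned i = 0; i < num_entries; i++) {
          run = remap[i] ? 0 : run + 1;
          if (run == size)
             return i - size + 1;
       }
       return -1; /* no hole found; append past the end instead */
    }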
Fixes the following CTS tests:
ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array
v2: code cleanups, increment NumUniformRemapTable correctly, fix
find_empty_block to work properly and add some more comments.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Fixes a piglit regression after the fixes to the duplicate layout
rules. Previously, catching multiple layouts relied on the code meant
to catch duplicates within a single layout(...); this change triggers
the rules for multiple layouts.
Cc: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Phi handling is somewhat intrinsically tied to the CFG. Moving it here
makes that a bit easier to handle. In particular, we can now do SSA
repair after the phi-node second pass. This fixes 6 CTS tests.
This is a port of Matt's GLSL IR lowering pass to NIR. It's required
because we translate SPIR-V directly to NIR, bypassing GLSL IR.
I haven't introduced a lower_ldexp flag, as I believe all current NIR
consumers would set the flag. i965 wants this, vc4 doesn't implement
this feature, and st_glsl_to_tgsi currently lowers ldexp
unconditionally anyway.
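For reference, the core of the lowering is ldexp(x, exp) = x * 2^exp,
computed by adjusting the exponent field of x directly instead of
calling a libm routine. A minimal sketch of the idea in C (the real
pass also clamps the exponent and handles zero, infinity, NaN and
denormals, all omitted here):

    #include <stdint.h>
    #include <string.h>

    static float
    ldexp_sketch(float x, int exp)
    {
        uint32_t bits;
        memcpy(&bits, &x, sizeof(bits));

        unsigned biased = (bits >> 23) & 0xff;
        if (biased == 0 || biased == 0xff)
            return x; /* zero/denormal/inf/NaN: the real pass handles these */

        /* Add exp to the biased exponent field; assumes the result
         * stays within the normal float range. */
        bits += (uint32_t) exp << 23;
        memcpy(&x, &bits, sizeof(bits));
        return x;
    }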
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
struct.pack('i', val) interprets `val` as a signed 32-bit integer, and
raises struct.error if `val` > INT_MAX. For larger constants, we need
to use 'I', which interprets the value as unsigned.
This patch makes us use 'I' for all values >= 0, and 'i' for negative
values. This should work in all cases.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>