brw_optimize.c's brw_opcodes table was a copy of brw_disasm.c's
opcode_descs table, but with an additional field: is_arith. Now that
I've deleted that, the two are identical. Keep the one in brw_disasm.c.
Reviewed-by: Eric Anholt <eric@anholt.net>
All users of basic block analysis simply create their own local
variables. Nobody uses the visitor-wide field.
Reviewed-by: Eric Anholt <eric@anholt.net>
The old brw_remove_grf_to_mrf_moves() pass is obsolete and replaced by
fs_visitor::compute_to_mrf().
The old brw_remove_duplicate_mrf_moves() pass is obsolete and replaced
by fs_visitor::remove_duplicate_mrf_writes().
The remaining pass, brw_set_dp4_dependency_control(), is currently
unused, but could be, so I'm leaving it for now.
Reviewed-by: Eric Anholt <eric@anholt.net>
At this point, it's just gl_shader_program. Nobody even uses it; even
the program that creates them only returns gl_shader_program pointers.
Reviewed-by: Eric Anholt <eric@anholt.net>
The passthrough pipeline needs to check index values (which might be passed
through) as they can be invalid (which causes crashes and various assertion
failures if the clip code runs). Obviously, rendering won't be well-defined,
but those bogus indices might come directly from apps.
There were already debug printfs which reported the out-of-bounds indices but
we really ought to not crash.
While checking at that point doesn't seem like the most efficient solution,
it seems there isn't really another appropriate function to do it.
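
As an illustration only, a bounds check of this kind could look like the
sketch below. The helper name, the 16-bit index type and the "reject the
whole draw" policy are assumptions made for the example, not the code in
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the draw module): returns true only if
 * every index refers to a vertex that exists, so the caller can bail out
 * before any later stage dereferences a bogus index. */
static bool
indices_in_bounds(const unsigned short *elts, size_t count,
                  unsigned num_vertices)
{
   assert(elts != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (elts[i] >= num_vertices)
         return false;   /* out-of-bounds index supplied by the app */
   }
   return true;
}
```

A caller would run this once per draw and skip (or otherwise sanitize) the
draw when it returns false, which costs one linear pass over the indices.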
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Clean up a few magic numbers and rework the code a bit.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Assert that the CB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.
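
The pattern is roughly the following sketch. The enums and the function
name are hypothetical placeholders, not the driver's real definitions.

```c
#include <assert.h>

/* Hypothetical enums for illustration; not the real hardware values. */
enum api_format { API_FORMAT_RGBA8, API_FORMAT_R16, API_FORMAT_UNKNOWN };
enum hw_cb_format { HW_CB_INVALID = 0, HW_CB_8_8_8_8, HW_CB_16 };

static enum hw_cb_format
translate_cb_format(enum api_format format)
{
   switch (format) {
   case API_FORMAT_RGBA8:
      return HW_CB_8_8_8_8;
   case API_FORMAT_R16:
      return HW_CB_16;
   default:
      /* Debug builds stop here; builds with NDEBUG fall through and
       * return a well-defined INVALID value instead of ~0U. */
      assert(!"unsupported colorbuffer format");
      return HW_CB_INVALID;
   }
}
```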
v2: use INVALID hw format rather than ~0U
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Assert that the DB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.
v2: use INVALID hw format rather than ~0U
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This is necessary for backwards compatibility with pre-SI for stencil.
Fixes a number of stencil-related piglit tests, and real apps using stencil.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes assertion failure with Mesa demo glsl/samplers.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
On Gen6-7, we don't compact clip planes, and nr_userclip_plane_consts
is the last bit set, so iterating from i = 0..nr_userclip_plane_consts
covers all active clip planes and is the right thing to do.
However, that doesn't work at all on Gen4-5. Since we do compact
clip planes, we skip over ones which aren't active (via the continue
statement). We also set nr_userclip_plane_consts to the number of
active clip planes, which means that we end the loop after checking that
many bits. If the set of clip planes wasn't contiguous, this means we'd
fail to find the last few.
By changing the iteration to MAX_CLIP_PLANES, we correctly find all of
the active clip planes.
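
A reduced model of the loop shows the failure. The helper names and the
MAX_CLIP_PLANES value of 8 are assumptions for this sketch, not the
driver code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CLIP_PLANES 8 /* assumed value for this sketch */

/* Hypothetical helper: stores the indices of enabled planes in `out` in
 * compacted order (Gen4-5 style) and returns how many were found.
 * `limit` is the loop bound under test. */
static int
collect_clip_planes(uint32_t enabled_mask, int limit,
                    int out[MAX_CLIP_PLANES])
{
   int compacted = 0;

   assert(limit <= MAX_CLIP_PLANES);
   for (int i = 0; i < limit; i++) {
      if (!(enabled_mask & (1u << i)))
         continue;            /* inactive plane: skipped, not counted */
      out[compacted++] = i;
   }
   return compacted;
}

/* Hypothetical helper: number of set bits, i.e. the active plane count. */
static int
count_active(uint32_t mask)
{
   int n = 0;
   for (; mask; mask &= mask - 1)
      n++;
   return n;
}
```

With planes 1 and 5 enabled, the active count is 2. Using that count as
the loop bound only examines bits 0 and 1 and misses plane 5, while
bounding the loop by MAX_CLIP_PLANES finds both.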
Fixes regressions since 66c8473e02 (replacing the old VS backend) in
Piglit's spec/glsl-1.20/execution/clipping/fixed-clip-enables and
oglconform's mustpass(basic.clip) and userclip(basic.allCases).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56791
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
There's no compaction, so we can drop that code and simply use 'i'.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Since Gen4-5 compacts clip planes and Gen6-7 doesn't, it makes sense to
split them into separate code paths. This patch simply copies the code
to both halves; the next commits will simplify it.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The previous 1023-entry chaining hash table never resized, so it was very
inefficient when there were many objects live. While one could have an even
more efficient implementation than this (keep an array for genned names with
packed IDs, or take advantage of the fact that key == hash or key ==
*(uint32_t *)data to store less data), this is fairly fast, and I want a nice
replacement hash table for other parts of Mesa, too.
It improves Minecraft performance by 12.3% +/- 1.4% (n=9), dropping hash lookups
from 8% of the profile to 0.5%.
I also tested cairo-gl, which should be a pessimal workload for this hash
table: around 247000 FBOs created and destroyed, only around 65 live at any
time, and few lookups of them between creation and destruction. No
statistically significant performance difference at n=76 (mean 20.3/20.4
seconds, sd 2.8/3.2 seconds). If I remove the >20 seconds outliers that
appear to be due to thermal throttling, there's possibly a 0.97% +/- 0.31%
performance win (n=61/59). The choice of cutoff for outliers feels a lot like
cooking the data, but I've gone through this process 3 times for minor
iterations of the code with the same conclusion each time.
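
To make the resizing point concrete, here is a minimal sketch of a table
that doubles as it fills. It uses linear probing with the key as its own
hash; that choice, and every name in it, is an assumption for
illustration and not necessarily how the table in this series works.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical minimal table: key 0 marks an empty slot, so 0 is not a
 * valid key here. Deletion is omitted to keep the sketch short. */
struct name_table {
   uint32_t *keys;
   void **data;
   uint32_t size;      /* always a power of two */
   uint32_t entries;
};

static int
name_table_init(struct name_table *t, uint32_t size)
{
   assert(size >= 2 && (size & (size - 1)) == 0);
   t->keys = calloc(size, sizeof(*t->keys));
   t->data = calloc(size, sizeof(*t->data));
   t->size = size;
   t->entries = 0;
   return t->keys && t->data;
}

static void
name_table_free(struct name_table *t)
{
   free(t->keys);
   free(t->data);
}

/* Linear probing; terminates because the table is never more than
 * half full, so an empty slot always exists. */
static uint32_t
find_slot(const struct name_table *t, uint32_t key)
{
   uint32_t i = key & (t->size - 1);
   while (t->keys[i] != 0 && t->keys[i] != key)
      i = (i + 1) & (t->size - 1);
   return i;
}

static int name_table_insert(struct name_table *t, uint32_t key, void *data);

/* Double the table and reinsert every live entry. */
static int
name_table_grow(struct name_table *t)
{
   struct name_table old = *t;

   if (!name_table_init(t, old.size * 2)) {
      name_table_free(t);
      *t = old;
      return 0;
   }
   for (uint32_t i = 0; i < old.size; i++) {
      if (old.keys[i] != 0)
         name_table_insert(t, old.keys[i], old.data[i]);
   }
   name_table_free(&old);
   return 1;
}

static int
name_table_insert(struct name_table *t, uint32_t key, void *data)
{
   assert(key != 0);
   /* Grow before the table becomes more than half full. */
   if ((t->entries + 1) * 2 > t->size && !name_table_grow(t))
      return 0;

   uint32_t i = find_slot(t, key);
   if (t->keys[i] == 0)
      t->entries++;
   t->keys[i] = key;
   t->data[i] = data;
   return 1;
}

static void *
name_table_lookup(const struct name_table *t, uint32_t key)
{
   if (key == 0)
      return NULL;
   uint32_t i = find_slot(t, key);
   return t->keys[i] == key ? t->data[i] : NULL;
}
```

Because the table grows with the number of live objects, probe lengths
stay short, whereas a fixed 1023-bucket chaining table sees its chains
lengthen in proportion to the object count.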
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Mesa's chaining hash table for object names is slow, and this should be much
faster. I namespaced the functions under _mesa_*, to avoid visibility
troubles that we may have had before with hash_table_* functions.
v2: Move .c file to main/, const a few things, clean up loop conditions,
add/extend some comments.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
sparc/clip.c got moved to sparc/sparc-clip.c to avoid doing this workaround in
the parent directory.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
While this simplifies mesa/Makefile.am, the more important feature of this
commit is allowing a file with the same name to appear in both main/ and
program/.
v2: [chadv] Add changes to Android makefiles.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Signed-off-by: Chad Versace <chad.versace@linux.intel.com> (v2)
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.
This patch moves the rules for libmesa_st_mesa.a from Android.mk to
Android.libmesa_st_mesa.mk.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.
This patch moves the rules for libmesa_dricore.a from Android.mk to
Android.libmesa_dricore.mk.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.
This patch moves the rules for host executable mesa_gen_matypes from
Android.mk to Android.mesa_gen_matypes.mk.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The pair of files src/mesa/Android.mk and src/mesa/Android.gen.mk are too
long and complex to be easily understood. This patch belongs to a series
that decomposes them into several easily digestible makefiles.
This patch moves the rules for the host and target libmesa_glsl_utils.a
from Android.mk to Android.libmesa_glsl_utils.mk.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
They were always used with the corresponding *_FILES variables now that
automake handles rule generation.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>