If an instruction reads from a constant register that contains
immediates using an invalid swizzle, we can avoid generating MOV
instructions to fix up the swizzle by loading the immediates into a
different constant register that can be read using a valid swizzle.
This only affects r300 and r400 cards.
For example:
CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
MAD temp[4].xy, const[0].xy__, const[1].xz__, input[0].xy__;
========== Before this change would be lowered to: =========
CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
MOV temp[0].x, const[1].x___;
MOV temp[0].y, const[1]._z__;
MAD temp[4].xy, const[0].xy__, temp[0].xy__, input[0].xy__;
========== After this change is lowered to: ===============
CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
CONST[2] = { 0.0000 -3.5000 2.5000 0.0000 }
MAD temp[4].xy, const[0].xy__, const[2].yz__, input[0].xy__;
============================================================
This change reduces one of the Lightsmark shaders from 133 to 91
instructions.
v2:
- Fix crash caused by swizzles with only inline constants.
Use per asic golden values.
Programming this register doesn't seem to be strictly
necessary on SI, but programming it wrong leads to
rendering issues or reduced performance so just
go ahead and program the golden values explicitly
to avoid any potential problems down the road.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
For precise lts support I had to do some magic with the library names, which works fine
as long as the libraries from pkg-config are used.
The parts with src/gallium/targets/va-*/Makefile will not apply on the master branch,
but do apply to the 9.0 branch.
NOTE: This is a candidate for the 9.0 branch.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Not maintained since 2008. Doubtful that it's worked in quite a while.
Also see commit 32ac8cb05 which removed VMS stuff from Makefile in 2009.
Cc: Jouk Jansen <j.jansen@tudelft.nl>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
fixes regression introduced in 9078441072
Targets for making lex.yy.c program_parse.tab.c and program_parse.tab.h
got moved into its own Makefile
Reviewed-by: Matt Turner <mattst88@gmail.com>
This was added in version 22 of the GL_ARB_sync spec.
Fixes gles3conform's sync_error_waitsync_timeout test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
All the other range checks on index already return the proper error,
INVALID_VALUE.
Fixes gles3conform's instanced_arrays_invalid test.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
brw_optimize.c's brw_opcodes table was a copy of brw_disasm.c's
opcode_descs table, but with an additional field: is_arith. Now that
I've deleted that, the two are identical. Keep the one in brw_disasm.c.
Reviewed-by: Eric Anholt <eric@anholt.net>
All users of basic block analysis simply create their own local
variables. Nobody uses the visitor-wide field.
Reviewed-by: Eric Anholt <eric@anholt.net>
The old brw_remove_grf_to_mrf_moves() pass is obsolete and replaced by
fs_visitor::compute_to_mrf().
The old brw_remove_duplicate_mrf_moves() pass is obsolete and replaced
by fs_visitor::remove_duplicate_mrf_writes().
The remaining pass, brw_set_dp4_dependency_control(), is currently
unused, but could be, so I'm leaving it for now.
Reviewed-by: Eric Anholt <eric@anholt.net>
At this point, it's just gl_shader_program. Nobody even uses it; even
the program that creates them only returns gl_shader_program pointers.
Reviewed-by: Eric Anholt <eric@anholt.net>
The passthrough pipeline needs to check index values (which might be passed
through) as they can be invalid (which causes crashes and various assertion
failures if the clip code runs). Obviously, rendering won't be well-defined,
but those bogus indices might come directly from apps.
There were already debug printfs which reported the out-of-bounds indices but
we really ought to not crash.
While checking at that point doesn't seem like the most efficient solution,
it seems there isn't really another appropriate function to do it.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Clean up a few magic numbers and rework the code a bit.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Assert the the CB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.
v2: use INVALID hw format rather than ~0U
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Assert that the DB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.
v2: use INVALID hw format rather than ~0U
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This is necessary for backwards compatibility with pre-SI for stencil.
Fixes a number of stencil related piglit tests, and real apps using stencil.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes assertion failure with Mesa demo glsl/samplers.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>