fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 13:20:14 +01:00

Author	SHA1	Message	Date
Marek Olšák	de2e28366a	radeonsi: compile geometry shaders immediately; they have only 1 variant Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	f7a8b6fff5	radeonsi: split out code for deleting si_shader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	e21142087c	radeonsi: move code writing tess factors into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	dc5fc3c2f6	radeonsi: make LLVM IR dumping less messy Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	c1041366db	radeonsi: move a few r600_can_dump_shader calls to where they're needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	b6d5666fbf	radeonsi: remove useless code that handles dx10_clamp_mode "enable-no-nans-fp-math" is a wrong string and there was a disagreement about fixing it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	57271d5364	radeonsi: dump SPI_PS_INPUT values along with shader stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	5a53628f45	radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	9483fcc7f2	radeonsi: don't force gl_SampleMaskIn to 1 for smoothing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	c379c2540b	radeonsi: split PS input interpolation code into its own function This will be used by the fragment shader prolog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	b9126dcda8	radeonsi: implement forcing per-sample_interpolation using the shader key only It was partly a state and partly emulated by shader code, but since we want to do this in a fragment shader prolog, we need to put it into the shader key, which will be used to generate the prolog. This also removes the spi_ps_input states and moves the registers to the PS state. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	4596f3c1b8	radeonsi: remove si_shader::ps_input_interpolate tgsi_shader_info has this too. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	6dda2455c8	radeonsi: move BCOLOR PS input locations after all other inputs BCOLOR inputs were immediately after COLOR inputs. Thus, all following inputs were offset by 1 if color_two_side was enabled, and not offset if it was not enabled, which is a variation that's problematic if we want to have 1 variant per shader and the variant doesn't care about color_two_side (that should be handled by other bytecode attached at the beginning). Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is at location "num_inputs" if it's present. BCOLOR1 is next. This also allows removing si_shader::nparam and si_shader::ps_input_param_offset, which are useless now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	606e4185f3	radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	90cbbe1c12	radeonsi: generate a color_two_side variant only if the shader reads colors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	4bbbaaf191	radeonsi: move si_shader_context initialization into a separate function This will be re-used later. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	a3e9a5f9f8	st/mesa: remove st_is_program_native The default scenario sets GL_TRUE too. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	7046c588eb	st/mesa: unify destroy_program_variants cases for TCS, TES, GS Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Marek Olšák	75be3ee9f9	st/mesa: unify get_variant functions for TCS, TES, GS Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Marek Olšák	b8d31fdedf	st/mesa: unify variants and delete functions for TCS, TES, GS no difference between those Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Brian Paul	fe14110f35	mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT Ilia Mirkin found/fixed the mistake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813 Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 11:27:48 -07:00
Brian Paul	0193e20df5	mesa: rewrite save_CallLists() code When glCallLists() is compiled into a display list, preserve the call as a single glCallLists rather than 'n' glCallList calls. This will matter for an upcoming display list optimization project. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-09 11:27:48 -07:00
Brian Paul	711d5347cf	mesa: add missing error check in _mesa_CallLists() Generate GL_INVALID_VALUE if n < 0. Return early if n==0 or lists==NULL. v2: fix formatting, also check for lists==NULL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-09 11:27:48 -07:00
Brian Paul	b1ddc03633	mesa: whitespace clean-ups in dlist.h And remove 'extern' qualifiers.	2016-02-09 11:27:48 -07:00
Brian Paul	7d18faf8e7	st/mesa: don't allocate bitmap drawing state until needed Most apps don't use glBitmap so don't allocate the bitmap cache or gallium state objects/shaders/etc until the first call to st_Bitmap(). v2: simplify a conditional, per Gustaw Smolarczyk. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:48 -07:00
Brian Paul	a5799de3dc	st/mesa: move the setup_bitmap_vertex_data() code into draw_bitmap_quad() Now all the code to setup the vertex data and draw it is in one place. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:48 -07:00
Brian Paul	130d34ce65	st/mesa: refactor some bitmap drawing code Move setup/restoration of rendering state into helper functions. This makes the draw_bitmap_quad() function much more concise. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:47 -07:00
Ilia Mirkin	922be4eab9	mesa: remove hack to fix up GL_ANY_SAMPLES_PASSED results Both st/mesa and i965 should return a true/false result now, and the only other driver implementing queries (radeon) doesn't support ARB_occlusion_query2 which added that pname. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	7aca4bb9b1	st/mesa: make use of the occlusion predicate query Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	50235ab3ab	nv50: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	0cb1dda36e	nv30: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	0d04ec2fd2	ilo: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2016-02-09 11:59:27 -05:00
Nicolai Hähnle	c260175677	draw: use util_pstipple_* function for stipple pattern textures and samplers This reduces code duplication. Suggested-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-09 10:01:57 -05:00
Nicolai Hähnle	452e51bf1e	draw: use util_pstipple_create_fragment_shader This reduces code duplication. It also adds support for drivers where the fragment position is a system value. Suggested-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-09 10:01:32 -05:00
Marek Olšák	83b4d701c0	winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94019 Tested-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-09 15:26:40 +01:00
Timothy Arceri	1aae5e8ced	nir: remove unused nir_variable fields These are used in GLSL IR to remove unused varyings and match transform feedback variables. There is no need to use these in NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:49:06 +11:00
Timothy Arceri	6235b69134	glsl: remove unrequired forward declaration This was added in `2548092ad8` although I don't see why as it was already in the linker.h header. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:48:55 +11:00
Timothy Arceri	9dd6a4ea79	glsl: clean up and fix bug in varying linking rules The existing code was very hard to follow and has been the source of at least 3 bugs in the past year. The existing code also has a bug for SSO where if we have a multi-stage SSO for example a tes -> gs program, if we try to use transform feedback with gs the existing code would look for the transform feedback varyings in the tes stage and fail as it can't find them. V2: Add more code comments, always try to remove unused inputs to the first stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:22 +11:00
Timothy Arceri	fd0b89ad8d	glsl: simplify ES Vertex/Fragment shader requirements We really just needed to skip the existing ES < 3.1 check if we have a compute shader, all other scenarios are already covered. * No shaders is a link error. * Geom or Tess without Vertex is a link error which means we always require a Vertex shader and hence a Fragment shader. * Finally a Compute shader linked with any other stage is a link error. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:15 +11:00
Timothy Arceri	55fa3c44bc	glsl: simplify required stages for linking rules Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:11 +11:00
Timothy Arceri	20823992b4	glsl: small tidy up now that link_shaders() exits early with 0 shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:07 +11:00
Timothy Arceri	76cfb47207	glsl: don't attempt to link empty program Previously an empty program would go through the entire link_shaders() function and we would have to be careful not to cause a segfault. In core profile also now set link_status to false by generating an error, it was previously set to true. From Section 7.3 (PROGRAM OBJECTS) of the OpenGL 4.5 spec: "Linking can fail for a variety of reasons as specified in the OpenGL Shading Language Specification, as well as any of the following reasons: - No shader objects are attached to program." V2: Only generate an error in core profile and add spec quote (Ian) V3: generate error in ES too, remove previous check which was only applying the rule to GL 4.5/ES 3.1 and above. My understanding is that this spec change is clarifying previously undefined behaviour and therefore should be applied retrospectively. The ES CTS tests for this are in ES 2; I suspect it was passing because it would have generated an error for not having both a vertex and fragment shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:02 +11:00
Matt Turner	371c4b3c48	nir: Recognize open-coded bitfield_reverse. Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a lot of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple for instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 21:20:58 -08:00
Matt Turner	2d0d9755da	nir: Handle large unsigned values in opt_algebraic. The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with return hex(struct.unpack('I', struct.pack('i', self.value))[0]) struct.error: 'i' format requires -2147483648 <= number <= 2147483647 The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy). Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>	2016-02-08 20:38:17 -08:00
Matt Turner	7be8d07732	nir: Do opt_algebraic in reverse order. Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening. instructions in affected programs: 32721 -> 32611 (-0.34%) helped: 106 In programs whose nir_optimize loop count changes (129 of them): before: 1164 optimization loops after: 1071 optimization loops Of the 129 affected, 16 programs' optimization loop counts increased. Prevents regressions and annoyances in the next commits. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	a8f0960816	nir: Recognize product of open-coded pow()s. Prevents regressions in the next commit. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	9f02e3ab03	nir: Add opt_algebraic rules for xor with zero. instructions in affected programs: 668 -> 664 (-0.60%) helped: 4 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Timothy Arceri	3fd4280759	glsl: validate arrays of arrays on empty type declarations Fixes: dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_fragment dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_vertex Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 13:52:52 +11:00
Kenneth Graunke	74f956c416	i965: Use nir_lower_load_const_to_scalar(). I don't know why, but we never hooked up this pass Eric wrote. Otherwise, you can end up with stupid scalarized code such as: vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0) vec4 ssa_8 = ... vec1 ssa_9 = feq ssa_8, ssa_7 vec1 ssa_10 = feq ssa_8.y, ssa_7.y vec1 ssa_11 = feq ssa_8, ssa_7.z vec1 ssa_12 = feq ssa_8.y, ssa_7.w ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions. shader-db on Skylake: total instructions in shared programs: 9121153 -> 9120749 (-0.00%) instructions in affected programs: 32421 -> 32017 (-1.25%) helped: 277 HURT: 69 total cycles in shared programs: 69003364 -> 69000912 (-0.00%) cycles in affected programs: 899186 -> 896734 (-0.27%) helped: 313 HURT: 403 This also prevents regressions when disabling channel expressions. v2: Don't call opt_cse afterwards (requested by Matt). It should happen in the optimization loop below anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-08 18:10:34 -08:00
Timothy Arceri	184afd8fd9	mesa: remove now unused sampler index handling code Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 12:03:02 +11:00
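
Note on `2d0d9755da` (nir: Handle large unsigned values in opt_algebraic): the commit message quotes a `struct.pack('i', ...)` round trip that fails for the constant 0xff00ff00 and says plain `hex()` is enough. The following is a minimal sketch of that behaviour only. The function names `hex_via_pack` and `hex_direct` are hypothetical and are not taken from the Mesa source.

```python
import struct


def hex_via_pack(value):
    # Expression quoted in the commit message: pack as a signed 32-bit
    # integer, unpack as unsigned, then format. Only values in the signed
    # 32-bit range can be packed with the 'i' format.
    return hex(struct.unpack('I', struct.pack('i', value))[0])


def hex_direct(value):
    # Approach the commit message describes: hex() accepts integers of
    # any size, including negative ones.
    return hex(value)


if __name__ == "__main__":
    print(hex_via_pack(-1))
    try:
        hex_via_pack(0xff00ff00)
    except struct.error as exc:
        print("struct.error:", exc)
    print(hex_direct(0xff00ff00))
    print(hex_direct(-1))
```

A negative input to `hex_direct` comes out as a negative literal (for example `-0x1`), which matches the commit's remark that assigning a negative value to an unsigned does what is wanted in C.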

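Note on `371c4b3c48` (nir: Recognize open-coded bitfield_reverse): the log does not show the exact expression the new rule matches. As an illustration only, the sketch below uses the textbook mask-and-shift form of a 32-bit bit reversal, which is assumed here to resemble what shaders open-code; it does include the constant 0xff00ff00 mentioned in `2d0d9755da`. Both function names are hypothetical.

```python
MASK32 = 0xffffffff


def bitfield_reverse_open_coded(x):
    # Textbook open-coded reversal: swap ever smaller groups of bits.
    x &= MASK32
    x = ((x << 16) | (x >> 16)) & MASK32                    # 16-bit halves
    x = ((x & 0x00ff00ff) << 8) | ((x & 0xff00ff00) >> 8)   # bytes
    x = ((x & 0x0f0f0f0f) << 4) | ((x & 0xf0f0f0f0) >> 4)   # nibbles
    x = ((x & 0x33333333) << 2) | ((x & 0xcccccccc) >> 2)   # bit pairs
    x = ((x & 0x55555555) << 1) | ((x & 0xaaaaaaaa) >> 1)   # single bits
    return x & MASK32


def bitfield_reverse_reference(x):
    # Reference: reverse the 32-character binary string.
    return int(format(x & MASK32, '032b')[::-1], 2)


if __name__ == "__main__":
    for value in (0x1, 0x80000000, 0x12345678, 0xff00ff00):
        assert bitfield_reverse_open_coded(value) == bitfield_reverse_reference(value)
        print(hex(value), "->", hex(bitfield_reverse_open_coded(value)))
```

Recognizing such a sequence lets the optimizer replace five shift-and-mask steps with a single reverse operation, which is the kind of instruction-count saving the commit reports.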
77025 commits