fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 13:48:06 +02:00

Author	SHA1	Message	Date
Francisco Jerez	48d6fc5eb6	intel/fs: Initialize mlen for gen7 varying pull constant load messages. This makes the message length available at the IR level, which should save some guesswork in a future commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6643143f6e	intel/eu: Assert that the instruction is send-like in brw_set_desc_ex(). Constructing a descriptor in-place as part of the immediate of an ALU instruction is no longer supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6f81e2b994	intel/eu: Get rid of the return value of brw_send_indirect_message(). The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b3cce4c130	intel/eu: Get rid of the return value of brw_send_indirect_surface_message(). All users of brw_send_indirect_surface_message() should be providing a full descriptor immediate up front by now, this isn't necessary anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	95b5367149	intel/eu: Use descriptor constructors for dataport typed surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	94166cef40	intel/eu: Use descriptor constructors for dataport scattered byte surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	2a9605d610	intel/eu: Use descriptor constructors for dataport untyped surface messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8e707fc2af	intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message(). Instead of the current message_len, response_len and header_present arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b10b4e7c45	intel/eu: Use descriptor constructors for pixel interpolator messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8fa4bc4676	intel/eu: Use descriptor constructors for dataport write messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	2bac890bf5	intel/eu: Use descriptor constructors for dataport read messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	27c211e30f	intel/eu: Use descriptor constructors for sampler messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	1c90ae5acc	intel/eu: Provide desc immediate argument up front to brw_send_indirect_message(). The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	b382bdde1d	TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	c3793d49e4	intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls. This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see `d2eecf0b0b` fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set_*_message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw_*_desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	20b962232b	intel/eu: Define SET_BITS helper more easily reusable than SET_FIELD. Allows specifying a bitfield based on its upper and lower bounds instead of a symbolic field definition, kind of what the current GET_BITS macro is to GET_FIELD. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	d0f589a55b	intel/eu: Define helper to specify the descriptor immediates of a SEND instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	f55884cad3	intel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor. This introduces helpers that can be used to specify or extract the whole descriptor of a SEND message instruction at once. Because the instruction encoding of these is rather awkward on some generations, using the generic brw_inst.h macros doesn't seem like an option. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Iago Toral Quiroga	213491600a	intel/compiler: emit actual barriers for working-group level barriers Until now we have assumed that we could skip emitting these barriers in the general case based on empirical testing and a few assumptions detailed in a comment in the driver code. However, recent CTS tests have shown that we actually need them to produce correct behavior. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 07:46:34 +02:00
Jose Maria Casanova Crespo	cd0afab99b	i965/fs: Enable store_ssbo for 8-bit types. v2: Update comment according to this patch. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	11c904d0d3	intel/compiler: relax brw_eu_validate for byte raw movs When the destination is a BYTE type, allow raw movs even if the stride is not an exact multiple of destination type and exec type, execution type is Word and its size is 2. This restriction was only allowing stride==2 destinations for 8-bit types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	87fc9af3fc	i965/fs: Enable conversions to 8-bit integers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	030472c1f0	i965: Support for 8-bit base types in helper functions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	232ed89802	i965/fs: Register allocator shouldn't use grf127 for sends dest Since Gen8+ Intel PRM states that "r127 must not be used for return address when there is a src and dest overlap in send instruction." This patch implements this restriction creating new grf127_send_hack_node at the register allocator. This node has a fixed assignation to grf127. For vgrf that are used as destination of send messages we create node interferences with the grf127_send_hack_node. So the register allocator will never assign to these vgrf a register that involves grf127. If dispatch_width > 8 we don't create these interferences because all instructions have node interferences between sources and destination. That is enough to avoid the r127 restriction. This fixes CTS tests that raised this issue as they were executed as SIMD8: dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom Shader-db results on Skylake: total instructions in shared programs: 7686798 -> 7686797 (<.01%) instructions in affected programs: 301 -> 300 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 337092322 -> 337091919 (<.01%) cycles in affected programs: 22420415 -> 22420012 (<.01%) helped: 712 HURT: 588 Shader-db results on Broadwell: total instructions in shared programs: 7658574 -> 7658625 (<.01%) instructions in affected programs: 19610 -> 19661 (0.26%) helped: 3 HURT: 4 total cycles in shared programs: 340694553 -> 340676378 (<.01%) cycles in affected programs: 24724915 -> 24706740 (-0.07%) helped: 998 HURT: 916 total spills in shared programs: 4300 -> 4311 (0.26%) spills in affected programs: 333 -> 344 (3.30%) helped: 1 HURT: 3 total fills in shared programs: 5370 -> 5378 (0.15%) fills in affected programs: 274 -> 282 (2.92%) helped: 1 HURT: 3 v2: Avoid duplicating register classes without grf127. Let's use a node with a fixed assignation to grf127 and create interferences to send message vgrf destinations. (Eric Anholt) v3: Update reference to CTS VK_KHR_8bit_storage failing tests. (Jose Maria Casanova) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	0e47ecb29a	intel/compiler: grf127 can not be dest when src and dest overlap in send Implement at brw_eu_validate the restriction from Intel Broadwell PRM, vol 07, section "Instruction Set Reference", subsection "EUISA Instructions", Send Message (page 990): "r127 must not be used for return address when there is a src and dest overlap in send instruction." v2: Style fixes (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	6706b421f0	intel/fs: use uint type for per_slot_offset at GS This helps us to compact original instruction: mul(8) g3<1>D g6<8,8,1>UD 0x00000006UD { align1 1Q }; So now we emit: mul(8) g3<1>UD g6<8,8,1>UD 0x00000006UD { align1 1Q compacted }; Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-09 15:28:48 +02:00
Iago Toral Quiroga	81ca08e030	intel/compiler: remove unused function Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 13:21:48 +02:00
Ian Romanick	f8e54d02f7	intel/compiler: Relax mixed type restriction for saturating immediates At the time of commit `7bc6e455e2` (i965: Add support for saturating immediates.) we thought mixed type saturates would be impossible. We were only thinking about type converting moves from D to F, for example. However, type converting moves w/saturate from F to DF are definitely possible. This change minimally relaxes the restriction to allow cases that I have been able to trigger via piglit tests. Fixes new piglit tests: - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:10 -07:00
Ian Romanick	9626ea497d	i965/vec4: Properly handle sign(-abs(x)) This is achieved by copying the sign(abs(x)) optimization from the FS backend. On Gen7 and earlier platforms, this fixes new piglit tests: - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:07 -07:00
Ian Romanick	88bd37c010	i965/fs: Properly handle sign(-abs(x)) Fixes new piglit tests: - glsl-1.10/execution/fs-sign-neg-abs.shader_test - glsl-1.10/execution/fs-sign-sat-neg-abs.shader_test - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:04 -07:00
Mathieu Bridon	0f7b18fa0d	python: Use the print function In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-06 10:04:22 -07:00
Ian Romanick	965a06dbd7	i965/vec4: Make the vec4_visitor::nir_emit_instr default case unreachable The bug fixed by the previous commit went undetected because extra stderr messages are not flagged by the CI. Copy the solution from fs_visitor::nir_emit_instr and mark the default case unreachable. An alternate solution is to delete the default case so that the compiler will issue a warning. That may require more work since there are other (impossible) cases that exist. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-05 21:13:32 -07:00
Ian Romanick	a4d4787327	intel/compiler: More DCE after lowering Some of the lowering passes, nir_lower_locals_to_regs for example, can cause some previously live code to be dead. This pass in particular leaves a bunch of nir_instr_type_deref instructions floating around. This causes shader-db runs on Gen5 through Haswell to spew tons of messages like: VS instruction not yet implemented by NIR->vec4 UnrealEngine4/EffectsCaveDemo/239.shader_test is one shader that generates these messages. Cleaning up the dead code fixes that. To verify, I did a shader-db before and after. Even though all the messages are gone, the results make my brain hurt. :( Haswell total cycles in shared programs: 411890163 -> 411891145 (<.01%) cycles in affected programs: 57016 -> 57998 (1.72%) helped: 3 HURT: 11 helped stats (abs) min: 2 max: 154 x̄: 96.67 x̃: 134 helped stats (rel) min: 0.08% max: 2.23% x̄: 1.42% x̃: 1.96% HURT stats (abs) min: 18 max: 686 x̄: 115.64 x̃: 20 HURT stats (rel) min: 0.81% max: 7.12% x̄: 1.87% x̃: 0.93% 95% mean confidence interval for cycles value: -51.39 191.67 95% mean confidence interval for cycles %-change: -0.14% 2.46% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: 259114802 -> 259115032 (<.01%) cycles in affected programs: 24034 -> 24264 (0.96%) helped: 1 HURT: 9 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% HURT stats (abs) min: 18 max: 48 x̄: 25.78 x̃: 20 HURT stats (rel) min: 0.80% max: 1.94% x̄: 1.08% x̃: 0.80% 95% mean confidence interval for cycles value: 12.42 33.58 95% mean confidence interval for cycles %-change: 0.54% 1.38% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `5a02ffb733` nir: Rework lower_locals_to_regs to use deref instructions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-05 21:13:21 -07:00
Neil Roberts	2d5ddbe960	i965: Fix output register sizes when variable ranges are interleaved In `6f5abf3146` this code was fixed to calculate the maximum size of an attribute in a separate pass and then allocate the registers to that size. However this wasn’t taking into account ranges that overlap but don’t have the same starting location. For example: layout(location = 0, component = 0) out float a[4]; layout(location = 2, component = 1) out float b[4]; Previously, if ‘a’ was processed first then it would allocate a register of size 4 for location 0 and it wouldn’t allocate another register for location 2 because it would already be covered by the range of 0. Then if something tries to write to b[2] it would try to write past the end of the register allocated for ‘a’ and it would hit an assert. This patch changes it to scan for any overlapping ranges that start within each range to calculate the maximum extent and allocate that instead. Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/ vs-fs-array-interleave-range.shader_test Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `6f5abf3146` "i965: Fix output register sizes when multiple variables share a slot."	2018-07-04 10:57:51 +02:00
Ian Romanick	995d993710	i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible Otherwise we can incorrectly cmod propagate in situations like add(8) g10<1>.xD g2<0>.xD -16D ... cmp.ge.f0(8) null<1>D g2<0>.xD 16D ... (+f0) sel(8) g21<1>.xyUD g14<4>.xyyyUD g18<4>.xyyyUD Sadly, this change hurts quite a few shaders. v2: Refactor writemask compatibility check into a separate function. Suggested by Caio. Ivy Bridge and Haswell had similar results. (Haswell shown) total instructions in shared programs: 12968489 -> 12968738 (<.01%) instructions in affected programs: 60679 -> 60928 (0.41%) helped: 0 HURT: 249 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.44% 0.48% Instructions are HURT. total cycles in shared programs: 409171965 -> 409172317 (<.01%) cycles in affected programs: 260056 -> 260408 (0.14%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.16% 0.18% Cycles are HURT. Sandy Bridge total instructions in shared programs: 10423577 -> 10423753 (<.01%) instructions in affected programs: 40667 -> 40843 (0.43%) helped: 0 HURT: 176 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.46% 0.51% Instructions are HURT. total cycles in shared programs: 146097503 -> 146097855 (<.01%) cycles in affected programs: 503990 -> 504342 (0.07%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.11% 0.13% Cycles are HURT. No changes on any other platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `cd635d149b` i965/vec4: Propagate conditional modifiers from compares to adds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-02 19:19:16 -07:00
Ian Romanick	fb6dc8e894	intel/compiler: Silence unused parameter warnings brw_nir.c src/intel/compiler/brw_nir.c: In function ‘brw_nir_lower_vue_outputs’: src/intel/compiler/brw_nir.c:464:32: warning: unused parameter ‘is_scalar’ [-Wunused-parameter] bool is_scalar) ^~~~~~~~~ src/intel/compiler/brw_nir.c: In function ‘lower_bit_size_callback’: src/intel/compiler/brw_nir.c:610:57: warning: unused parameter ‘data’ [-Wunused-parameter] lower_bit_size_callback(const nir_alu_instr *alu, void *data) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-02 16:17:19 -07:00
Jason Ekstrand	06412bfc98	anv,intel: Enable nir_opt_large_constants for Vulkan According to RenderDoc, this shaves 99.6% of the run time off of the ambient occlusion pass in Skyrim Special Edition when running under DXVK and shaves 92% off the runtime for a reasonably representative frame. When running the actual game, Skyrim goes from being a slide-show to a very stable and playable framerate on my SKL GT4e machine. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:50 -07:00
Francisco Jerez	c2c803be7b	intel/fs: Build 32-wide FS shaders. Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-28 13:25:21 -07:00
Jason Ekstrand	d5e028a57b	intel/fs: Add fields to wm_prog_data for SIMD32 dispatch Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	bcbc7d3a17	intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	7144247c2c	intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	37c1df28c9	intel/fs: Fix Gen6+ interpolation setup for SIMD32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	e208bc3bb7	intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS We can just emit the MOV in the two places where we use this. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	5e3028d826	intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround There's no reason for us to emit it a pile of times and then have a whole pass to clean it up. Just emit it once like we really want. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	40fe108e2b	intel/fs: Generalize the unlit centroid workaround This generalizes the unlit centroid workaround so it's less code and now supports SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1d381731e0	intel/fs: Fix sample id setup for SIMD32. v2 (Jason Ekstrand): - Disallow gl_SampleId in SIMD32 on gen7 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2fd0aed89a	intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	6909aed90e	intel/fs: Implement 32-wide FS payload setup on Gen6+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	f6c4aace22	intel/fs: Extend thread payload layout to SIMD32 And handle 32-wide payload register reads in fetch_payload_reg(). v2 (Jason Ekstrand): - Fix some whitespace and brace placement Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	8f143f70d6	intel/fs: Wrap FS payload register look-up in a helper function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
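
Commits 20b962232b and c3793d49e4 above describe building a message descriptor as a single 32-bit integer, with each field placed by its upper and lower bit bounds. The following is a minimal sketch of that idea, not the driver's code: the helper names (`field_mask`, `set_bits`, `get_bits`, `example_message_desc`) are hypothetical, and the bit positions chosen for the three fields are illustrative assumptions only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with bits [high:low] set, inclusive. Requires low <= high <= 31. */
static inline uint32_t
field_mask(unsigned high, unsigned low)
{
   assert(low <= high && high <= 31);
   /* Build the mask in 64 bits so a full 32-bit-wide field does not overflow. */
   return (uint32_t)(((UINT64_C(1) << (high - low + 1)) - 1) << low);
}

/* Place 'value' into bits [high:low]; asserts that it fits in the field. */
static inline uint32_t
set_bits(uint32_t value, unsigned high, unsigned low)
{
   const uint32_t mask = field_mask(high, low);
   assert(value <= (mask >> low));
   return value << low;
}

/* Extract the field stored in bits [high:low] of 'data'. */
static inline uint32_t
get_bits(uint32_t data, unsigned high, unsigned low)
{
   return (data & field_mask(high, low)) >> low;
}

/* Hypothetical descriptor layout (assumed bit positions, for illustration):
 * message length in [28:25], response length in [24:20], header flag in [19].
 */
static inline uint32_t
example_message_desc(unsigned msg_length, unsigned response_length,
                     bool header_present)
{
   return set_bits(msg_length, 28, 25) |
          set_bits(response_length, 24, 20) |
          set_bits(header_present, 19, 19);
}
```

Because the whole descriptor is one integer, a caller can pass it around as a single argument and read any field back with `get_bits` using the same bounds, which is the property the series relies on when it drops the long per-field argument lists.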

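
Commit 2d5ddbe960 above describes scanning for overlapping output ranges that start inside another range, so that the register allocated at the first location covers the maximum extent. A rough sketch of such a scan is shown here. It is an assumption-laden illustration, not the i965 implementation: the function name `merge_output_ranges` and the array-based representation are hypothetical, and component packing is ignored.

```c
#include <stddef.h>

/* sizes[loc] is the largest number of slots claimed by any variable that
 * starts at location 'loc' (0 if no variable starts there).
 *
 * On return, extent[loc] is the number of slots to allocate for a register
 * starting at 'loc', or 0 if 'loc' is already covered by an earlier range.
 */
static void
merge_output_ranges(const unsigned *sizes, unsigned *extent, size_t count)
{
   for (size_t i = 0; i < count; i++)
      extent[i] = 0;

   size_t loc = 0;
   while (loc < count) {
      if (sizes[loc] == 0) {
         loc++;
         continue;
      }

      /* 'end' is one past the last slot covered so far. */
      size_t end = loc + sizes[loc];

      /* Any range starting inside [loc, end) may push 'end' further out;
       * the loop bound is re-read, so newly covered slots are scanned too.
       */
      for (size_t i = loc + 1; i < end && i < count; i++) {
         if (sizes[i] != 0 && i + sizes[i] > end)
            end = i + sizes[i];
      }

      extent[loc] = (unsigned)(end - loc);
      loc = end;
   }
}
```

For the example in the commit message, `a[4]` starting at location 0 and `b[4]` starting at location 2, the input `sizes = {4, 0, 4, 0, 0, 0}` yields an extent of 6 at location 0 and no separate allocation at location 2, so a write to `b[2]` stays inside the allocated register.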
558 commits