fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 22:18:18 +02:00

Author	SHA1	Message	Date
Caio Marcelo de Oliveira Filho	d438261e05	intel: Add INTEL_DEBUG=nofc for disabling fast clears Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho	9560c9b498	anv: Enable VK_EXT_shader_subgroup_{ballot,vote} Anvil now supports and passes Vulkan CTS tests matching dEQP-VK.subgroups.*.ext_shader_subgroup_ballot.* dEQP-VK.subgroups.*.ext_shader_subgroup_vote.* and crucible tests matching func.shader-ballot.* func.shader-subgroup-vote.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-08 16:34:00 -07:00
Tapani Pälli	e4a826b2c8	anv/android: fix images created with external format support This fixes a case where user first creates image and then later binds it with memory created from AHW buffer. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-08 07:19:05 +03:00
Caio Marcelo de Oliveira Filho	f7ca072ab2	anv: Implement VK_KHR_shader_clock Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 09:12:12 -07:00
Lionel Landwerlin	12bf1308c4	intel/isl: set vertical surface alignment on null surfaces Just following the spec. Somewhat unclear whether this applies to NULL surfaces. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-05 20:54:33 +00:00
Lionel Landwerlin	ff1a5aadbf	intel/isl: set surface array appropriately This doesn't seem to affect anything. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-05 20:54:33 +00:00
Lionel Landwerlin	c445d6f66e	intel/isl: Set null surface format to R32_UINT It appears we never had a test in piglit or deqp sampling from a null surface... It turns out this triggers a hang on IVB only. Updating the null surface format to R32_UINT fixes the hang on ivb and doesn't affect other platforms, so set it by default for all platforms. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-05 20:54:33 +00:00
Lionel Landwerlin	d36763b2a4	intel: fix subslice computation from topology data We're missing the offset of the slice in the subslice mask... This worked for most platforms that don't have first slice fused off because we would reread the same mask from slice0 again and again... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c1900f5b0f` ("intel: devinfo: add helper functions to fill fusing masks values") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869 Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-10-05 23:05:03 +03:00
Lionel Landwerlin	907c2397f0	intel/error2aub: add support for platforms without PPGTT Not much to do to enable this, just make sure to always write to the GGTT :) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 22:31:15 +00:00
Rafael Antognolli	cdc331c6f9	anv/block_pool: Align anv_block_pool state to 64 bits. On 64 bits platforms, some atomic operations like __sync_fetch_and_add() have constant time, but on 32 bits platforms they are implemented with a loop and might take much longer. Additionally, it seems like if their operands are not aligned to 64 bits, they also require extra memory accesses. From the Intel Architecture's Developer Manual Vol. 1, 4.1.1: "A word or doubleword operand that crosses a 4-byte boundary or a quadword operand that crosses an 8-byte boundary is considered unaligned and requires two separate memory bus cycles for access." Forcing the u64 field to be aligned to 64 bits seems to make the unit tests that are stressing this finish much faster. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-03 12:40:33 -07:00
Anuj Phogat	0d60621101	intel/isl/icl: Use halign 8 instead of 4 hw workaround v1 by Topi Pohjolainen v2,v3 by Anuj Phogat: - Apply for gen >= 11 - Remove wa_bug_xxx function - Use helper functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-03 17:18:41 +00:00
Lionel Landwerlin	1c6fdbc83c	intel: fix topology query i915 will report ENODEV on generations prior to Haswell because there is no point in reporting values on those. Those generations predate any fusing that could happen on parts with identical PCI IDs. This query call was previously only triggered on generations that support performance queries, which happen to match the generations for which i915 reports topology, but the commit referenced below started using it on all generations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1860 Cc: <mesa-stable@lists.freedesktop.org> Fixes: `96e1c945f2` ("i965: Move device info initialization to common code") Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-10-02 22:25:44 +00:00
Rafael Antognolli	b9994cb8d5	intel/tools: Fix aubinator usage of rb_tree. The order of comparison has changed, so we need to invert the logic of "insert_left" when using rb_tree_insert_at(). Fixes: `dae33052db` (util/rb_tree: Reverse the order of comparison functions). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-30 13:43:23 -07:00
Jason Ekstrand	6c858b9a91	intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates Without this, we were DCEing flag writes because we didn't think their results were used because we didn't understand that an ANY32 predicate actually read all the flags. Fixes: `df1aec763e` "i965/fs: Define methods to calculate the flag..." Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-27 19:31:43 +00:00
Maya Rashish	e16fadd545	intel/compiler: avoid truncating int64_t to int Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Maya Rashish <maya@netbsd.org>	2019-09-26 17:46:26 +00:00
Lionel Landwerlin	da2d67fc3b	anv: gem-stubs: return a valid fd for anv_gem_userptr() Fixes invalid close(-1) in the unit tests. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-25 22:02:51 +03:00
Andres Gomez	5e87f48f1d	i965/fs: set rounding mode when emitting the flrp instruction flrp was forgotten when already adding the rounding mode for other instructions. Fixes: `ba1e25e1aa` ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions") Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2019-09-24 12:06:59 +03:00
Andres Gomez	6f1468c371	i965/fs: add a comment about how the rounding mode in fmul is set After `1711bf6cf2` ("intel/fs: Generate better code for fsign multiplied by a value"), the conflicts resolution for setting the rounding mode after the fused fmul and fsign optimization is non obvious. Basically, the optimization doesn't really result in a MUL, or any other operation which would need to have the rounding mode set. Hence, we set it just before the actual MUL in the treatment of fmul. Fixes: `ba1e25e1aa` ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions") Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2019-09-24 11:24:15 +03:00
Kenneth Graunke	b9e93db208	intel: Increase Gen11 compute shader scratch IDs to 64. From the MEDIA_VFE_STATE docs: "Starting with this configuration, the Maximum Number of Threads must be set to (#EU * 8) for GPGPU dispatches. Although there are only 7 threads per EU in the configuration, the FFTID is calculated as if there are 8 threads per EU, which in turn requires a larger amount of Scratch Space to be allocated by the driver." It's pretty clear that we need to increase this for scratch address calculations, because the FFTID has a certain bit-pattern. The quote above seems to indicate that we should increase the actual thread count programmed in MEDIA_VFE_STATE as well, but we think the intention is to only bump the scratch space. Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8. Fixes: `5ac804bd9a` ("intel: Add a preliminary device for Ice Lake") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-23 16:59:40 -07:00
Kenneth Graunke	50c0dd8621	Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM" This reverts commit `729de1488f`. It turns out that, although the register is in the logical context, it isn't whitelisted, so we can't actually write it from userspace batch buffers. The write just becomes a noop, which is why we saw no performance changes. I manually whitelisted it, and still observed no performance gains, but it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments on the iris driver. So we might need to fix something before enabling this. To prevent it randomly getting turned on should the kernel ever whitelist this register, we revert the patch for now.	2019-09-23 16:31:23 -07:00
Kenneth Graunke	8489206e9d	intel/genxml: Stop manually scrubbing 'α' -> "alpha" 'α' has never appeared in any genxml files, so there's no need to replace it with the word "alpha". Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-09-23 20:24:54 +00:00
Kenneth Graunke	aa7ac32976	isl: Drop WaDisableSamplerL2BypassForTextureCompressedFormats on Gen11 Gen11 doesn't require us to bypass the L2 cache for BC* images anymore. The documentation is a bit hard to follow on this point, but the Windows driver clearly only applies this workaround on Gen9, and their commit history indicates that this was an intentional change to drop the workaround for Gen11+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 15:35:17 -07:00
Jason Ekstrand	7d861ab812	anv: Advertise VK_KHR_shader_subgroup_extended_types Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Jason Ekstrand	03255da225	intel/fs: Do 8-bit subgroup scan operations in 16 bits Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Jason Ekstrand	651725f7a1	intel/fs: Allow CLUSTER_BROADCAST to do type conversion We can't really handle it in the little-core 64-bit case but it's not really needed there. Where we really want this is for when we need to do 16 -> 8-bit conversions. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Jason Ekstrand	3515c0e9cf	intel/fs: Allow UB, B, and HF types in brw_nir_reduction_op_identity Because byte immediates aren't a thing on GEN hardware, we return a signed or unsigned word immediate in the byte case. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Paulo Zanoni	10532c6831	intel/fs: don't forget the stride at generate_shuffle During generate_shuffle(), when we use byte sized registers we end up with a destination stride of 2. We don't take the stride into consideration when selecting the group offset for the last MOV operation, which means we end up moving things to the wrong place, leaving the last few channels untouched. Take the destination stride into consideration so we don't miss the last channels. v2: Assert this is not necessary for the IVB special case (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 10:57:05 -07:00
Jason Ekstrand	dae33052db	util/rb_tree: Reverse the order of comparison functions The new order matches that of the comparison functions accepted by the C standard library qsort() functions. Being consistent with qsort will hopefully help avoid developer confusion. The only current user of the red-black tree is aub_mem.c which is pretty easy to fix up. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-20 17:37:25 +00:00
Eric Engestrom	3c1a24de07	anv: implement ICD interface v4 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 08:31:58 +00:00
Eric Engestrom	19db95e78e	anv: split instance dispatch table This effectively breaks the instance dispatch table in 2 with entry points using a physical device as first argument getting their own dispatch table. As a result we now have to check instance & physical device dispatch table instead of just the instance dispatch table before. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 08:31:58 +00:00
Jason Ekstrand	0c4e89ad5b	Move blob from compiler/ to util/ There's nothing whatsoever compiler-specific about it other than that's currently where it's used. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 19:56:22 +00:00
Caio Marcelo de Oliveira Filho	fa080f03d3	intel/fs: Add Fall-through comment Reviewed-by: Andres Gomez <agomez@igalia.com>	2019-09-19 10:02:16 -07:00
Arcady Goldmints-Orlov	5ec5fecc26	anv: fix descriptor limits on gen8 Later generations support bindless for samplers, images, and buffers and thus per-stage descriptors are not limited by the binding table size. However, gen8 doesn't support bindless images and thus needs to report a lower per-stage limit so that all combinations of descriptors that fit within the advertised limits are reported as supported by vkGetDescriptorSetLayoutSupport. Fixes test dEQP-VK.api.maintenance3_check.descriptor_set Fixes: `79fb0d27f3` ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 09:10:40 -05:00
Paulo Zanoni	8e614c7a29	intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32 The current code can create functions with a width of 32, which is not supported by our hardware. Add some code to simplify how we express what we want and prevent such cases. For some unknown reason, all the tests I could run seem to work even with these unsupported MOVs. Fixes: `b0858c1cc6` "intel/fs: Add a couple of simple helper opcodes" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:48:27 +00:00
Paulo Zanoni	c99df52873	intel/fs: the maximum supported stride width is 16 There are cases where we try to generate registers with a stride of 32, while the hardware maximum is just 16. This happens, for example, when using 8 bit integers on SIMD32. This results in a crash because the variable 'width' has a value of 32: ../../src/intel/compiler/brw_reg.h:550: brw_reg brw_vecn_reg(unsigned int, brw_reg_file, unsigned int, unsigned int): Assertion `!"Invalid register width"' failed. This change prevents the crash and makes the tests pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:48:27 +00:00
Paulo Zanoni	cebf447d16	intel/fs: roll the loop with the <0,1,0> additions in emit_scan() IMHO the code is easier to understand this way, being explicit that we're doing exactly the same thing every time. No functional changes. v2: Adjust the loop breaking condition (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:47:17 +00:00
Paulo Zanoni	d9ddf5076d	intel/fs: make scan/reduce work with SIMD32 when it fits 2 registers When dealing with uint16_t and uint8_t on SIMD32 we can do all the operations using just 2 registers, so we don't hit the recursion at the beginning of emit_scan(). Because of that, we need to actually compute scan/reduce for channels 31:16. v2: Still missed instructions (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:47:17 +00:00
Kenneth Graunke	0e4a75f917	intel/compiler: Record whether any pull constant loads occur I would like for iris to be able to avoid setting up SURFACE_STATE for UBOs in the common case where all constants are pushed. Unfortunately, we don't know up front whether everything will be pushed: the backend is allowed to demote pushed UBOs to pull loads fairly late in the process. This is probably desirable though, as we'd like the backend to be able to re-pull pushed data to break up long live ranges in response to register pressure. Here we simply add a "are there any pull loads at all" boolean to prog_data, which is a bit crude but at least allows us to skip work in the common "everything pushed" case. We could skip more work by tracking exactly which UBO surfaces are pulled in a bitmask, but I wanted to avoid bringing back the old mark_surface_used() mechanism. Finer-grained tracking could allow us to skip a bit more work when multiple UBOs are in use and /some/ are 100% pushed, but others are accessed via pulls. However, I'm not sure how common this is and it would save at most 4 pull descriptors, so we defer that for now. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	f76a724e06	intel/compiler: Set "Null Render Target" ex_desc bit on Gen11 When there are no color regions (i.e. a depth only pass), we can set the "Null Render Target" bit in the Gen11 RT write extended message descriptor to indicate that it should behave as if it's writing to a null render target, without the need for a binding table entry. This lets drivers avoid setting up that null RT binding table entry, but more importantly means the HW doesn't actually have to bother looking up the surface state. Together with the next patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700 MHz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 14:27:51 -07:00
Samuel Iglesias Gonsálvez	f5dd6dfe01	anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls This adds support for VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and enables the Vulkan and SPIR-V extensions. Also, notice that this includes the updates applied to the VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension VK_KHR_shader_float_controls v4 and Vulkan 1.1.116. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	9b07020a4f	i965/fs: add support for shader float control to remove_extra_rounding_modes() The remove_extra_rounding_modes() optimization will remove duplicated rounding mode changes. v2: - Fix bug in the rounding mode change (Alejandro). v3: - Fix rounding modes. v4: - Updated to renamed shader info member and enum values (Andres). v5: - Simplify flags logic operations (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	9bd88d10d8	i965/fs: set rounding mode when emitting nir_op_f2f32 or nir_op_f2f16 v2: - Consider nir_op_f2f16 case too (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	ba1e25e1aa	i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions v2: - Updated to renamed shader info member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	9da56ffc52	i965/fs: add emit_shader_float_controls_execution_mode() and aux functions We need this function to emit code that later sets up the control register with the defined execution mode for the shader. Therefore, we emit it as the first instruction. v2: - Fix bug in setting the default mode mask in brw_rnd_mode_from_nir(). - Fix support for rounding modes in brw_rnd_mode_from_nir(). v3: - Updated to renamed shader info member and enum values (Andres). v4: - Add actual emission as first instruction of emit_nir_code (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	8a6507b6fe	i965/fs/generator: add new opcode to set float controls modes in control register Before this commit, we had only FPRoundingMode decoration (the per instruction one) that is applied during the SPIR-V handling. In vtn_alu we find out the rounding mode, and generate the code accordingly that later will be used to look for the respective nir_op_f2f16_{rtz,rtne}. Per-instruction gets prioritized because we make them explicit conversions (with RTZ or RTNE nir opcodes) and they will override the default execution mode defined with float controls. However, we need to come back to the mode defined by float controls after the execution of the FP Rounding instruction. Therefore, the new SHADER_OPCODE_FLOAT_CONTROL_MODE opcode will be used to set the default rounding mode and denorms treatment in the whole shader while the pre-existent SHADER_OPCODE_RND_MODE, will be used as prioritized rounding mode in a per-instruction basis. v2: - Fix bug in defining BRW_CR0_FP_MODE_MASK. v3: - Update comment (Caio). v4: - Split the patch into the helper and the new opcode (this one) (Caio). v5: - Add an explanation on the actual purpose and priority of the newly introduced opcode in the commit log (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	28da9558f5	i965/fs/generator: refactor rounding mode helper in preparation for float controls v2: - Fix bug in defining BRW_CR0_FP_MODE_MASK. v3: - Update comment (Caio). v4: - Split the patch into the helper (this one) and the new opcode (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	cdace5b0c6	i965/fs/nir: add nir_op_unpack_half_2x16_split_*_flush_to_zero The denorm mode is set in the control register, no need to do something else. v2: - Add an assert to make sure that we realize if this assumption is broken in the future (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	3c474f8513	intel/nir: do not apply the fsin and fcos trig workarounds for consts If we have fsin or fcos trigonometric operations with constant values as inputs, we will multiply the result by 0.99997 in brw_nir_apply_trig_workarounds, making the result wrong. By adjusting the rules so they do not apply to constant values, we let a later constant-folding pass deal with it. v2: - Do not early constant fold but only apply the trig workaround for non constants (Caio). - Add fixes tag to commit log (Caio). Fixes: `bfd17c76c1` "i965: Port INTEL_PRECISE_TRIG=1 to NIR." Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Sergii Romantsov	2bfcf04345	nir/large_constants: pass after lowering copy_deref v2: by J.Ekstrand suggestion moved lowering of large constants after lowering of copy_deref is done. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111450 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-09-16 11:23:48 +00:00
Lionel Landwerlin	0616b7ac90	vulkan: add vk_x11_strict_image_count option This option strictly allocates the minImageCount given by the application at swapchain creation. This works around applications that do not deal with the fact that the implementation allocates more images than the minimum specified. v2: Add values in default drirc (Bas) v3: specify engine name/version (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-15 15:37:02 +03:00

4625 commits