fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-24 21:28:10 +02:00

Author	SHA1	Message	Date
Jose Maria Casanova Crespo	f1a9936ee1	i965/fs: Add byte scattered write message and fs support v2: (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_write. v3: - Remove leftover newline (Topi Pohjolainen) - Rename brw_data_size to brw_scattered_data_element and use defines instead of an enum (Jason Ekstrand) - Assert scattered write for Gen8+ and Haswell (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	d038deaa40	i965/fs: Add remove_extra_rounding_modes optimization Although from the SPIR-V point of view rounding modes are attached to the operation/destination, on i965 the rounding mode is a state, so we don't need to explicitly set it if the one we want is already set. Taking into account that the default mode is RTE, one possible optimization would be to optimize out the first RTE set for each block. For that to work, we would need to take into account block interrelationships. At this point, it is not worth complicating the optimization for such a small gain. v2: Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode (Curro) v3: Reset optimization for every block. (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	82fa4d45e7	i965/fs: Enable rounding mode on f2f16 ops By default we don't set the rounding mode. We only set round-to-near-even or round-to-zero mode if explicitly set from nir. v2: Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode (Curro) v3: Use new helper brw_rnd_mode_from_nir_op (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	d6cd14f213	i965/fs: Define new shader opcode to set rounding modes Although it is possible to emit them directly as AND/OR on brw_fs_nir, having a specific opcode makes it easier to remove duplicate settings later. v2: (Curro) - Set thread control to 'switch' when using the control register - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode. - Avoid magic numbers setting rounding mode field at control register. v3: (Curro) - Remove redundant and add missing whitespace lines. - Match printing instruction to IR opcode "rnd_mode" v4: (Topi Pohjolainen) - Fix code style. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	ac8d4734f6	i965: Add support for control register Control register cr0 in i965 can be used to change the rounding modes in 32-bit to 16-bit floating-point conversions. From Intel Skylake PRM, vol 07, section "Register and Register Regions", subsection "Control Register" (page 754): "Subregister cr0.0:ud contains normal operation control fields such as the floating-point mode ... " Floating-point Rounding mode is changed at bits 5:4 of cr0.0: "Rounding Mode. This field specifies the FPU rounding mode. It is initialized by Thread Dispatch. 00b = Round to Nearest or Even (RTNE) 01b = Round Up, toward +inf (RU) 10b = Round Down, toward -inf (RD) 11b = Round Toward Zero (RTZ)" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	5d5ee507fb	i965/fs: Handle 32-bit to 16-bit conversions Conversions to 16-bit need alignment between the 16-bit and 32-bit types, so the conversion operations unpack 16-bit types with a stride=2 and then apply a MOV with the conversion. v2 (Jason Ekstrand): - Avoid the general use of stride=2 for 16-bit register types. v3 (Topi Pohjolainen) - Code style fix (Jason Ekstrand) - Now nir_op_f2f16 was renamed to nir_op_f2f16_undef because conversion to f16 with undefined rounding is explicit Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	a05b6f25bf	i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type Note that we don't remove the assert at i965/vec4. At this point half float support is only for the scalar backend. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	75a88d8567	i965: Support for 16-bit base types in helper functions v2: Fixed calculation of scalar size for 16-bit types. (Jason Ekstrand) v3: Fix coding style (Topi Pohjolainen) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	2d28ca7000	i965/vec4: Handle 16-bit types at type_size_xvec4 These types have similar vec4 sizes as their 32-bit counterparts. The vec4 backend doesn't support 16-bit types and probably never will, but this method is called by the scalar backend at fs_visitor::nir_setup_outputs(), so we still need to provide valid vec4 sizes for 16-bit types. In the future, something different should be implemented to avoid this dependency. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jason Ekstrand	8761a04d0d	anv: Add support for the variablePointers feature Not to be confused with variablePointersStorageBuffer which is the subset of VK_KHR_variable_pointers required to enable the extension. This means we now have "full" support for variable pointers. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	32c859125b	anv: Handle nir_intrinsic_vulkan_resource_reindex Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Chad Versace	a932aee7a8	intel/isl: Declare private array as static const The array is isl_drm.c:modifier_info[]. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-04 10:16:33 -08:00
Lionel Landwerlin	2ead8f1690	anv: query CS timestamp frequency from the kernel The reference value in gen_device_info isn't going to be accurate on Gen10+. We should query it from the kernel, which reads a couple of registers to compute the actual value. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-12-04 18:05:20 +00:00
Jason Ekstrand	0a10e3770f	vulkan/wsi: Initialize individual WSI interfaces in wsi_device_init Now that we have anv_device_init/finish functions, there's no reason to have the individual driver do any more work than that. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	2e3e55110b	vulkan/wsi: Drop some unneeded cruft from the API This drops the unneeded callbacks struct as well as the queue_get_family callback we were using before we'd pulled QueuePresent inside. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	c1b1be5196	vulkan/wsi: Add wrappers for all of the surface queries This lets us move wsi_interface to wsi_common_private.h Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	82931dc007	vulkan/wsi: Drop the can_handle_different_gpu parameter from get_support Both anv and radv can handle prime now. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	516dfb34e1	vulkan/wsi: Add a helper for AcquireNextImage Unfortunately, due to the fact that AcquireNextImage does not take a queue, the ANV trick for triggering the fence won't work in general. We leave dealing with the fence up to the caller for now. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Dave Airlie	8ff49951c3	vulkan/wsi: move swapchain create/destroy to common code v2 (Jason Ekstrand): - Rebase - Alter the names of the helpers to better match the vulkan entrypoints - Use the helpers in anv Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	393aa3f6c9	vulkan/wsi: Move get_images into common code This moves bits out of all four corners (anv, radv, x11, wayland) and into the wsi common code. We also switch to using an outarray to ensure we get our return code right. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	1117f843fe	anv/wsi: Enable prime support Now that we're using the same common code as radv, we get prime support for free. Just enable it. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	ac95335b61	anv/wsi: Use the common QueuePresent code Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	e12688f365	vulkan/wsi: Do image creation in common code This uses the mock extension created in a previous commit to tell the driver that the image it's just been asked to create is, in fact, a window system image with whatever assumptions that implies. There was a lot of redundant code between the two drivers to do basically exactly the same thing. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	d50937f137	vulkan/wsi: Implement prime in a completely generic way Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	3dabb4011f	anv/image: Implement the wsi "extension" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	a44744e01d	anv: Require a dedicated allocation for modified images This lets us set the BO tiling when we allocate the memory. This is required for GL to work properly. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	7d19e570e1	anv/image: Add a drm_format_mod field At the moment, this is always initialized to DRM_FORMAT_MOD_INVALID. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	ab18e8e59b	anv: Implement VK_EXT_external_memory_dma_buf This is a modified version of the patch originally sent by Chad Versace. The primary difference is that this version claims that OPAQUE_FD and DMA_BUF are compatible handle types. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	764fc1643c	vulkan/wsi: Add a wsi_device_init function This gives the opportunity to collect some function pointers if we'd like which will be very useful in future. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Daniel Stone	c1163f7b1c	vulkan/wsi: Add a wsi_image structure This is used to hold information about the allocated image, rather than an ever-growing function argument list. v2 (Jason Ekstrand): - Rename wsi_image_base to wsi_image Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Dave Airlie	2cbeb32555	vulkan/wsi: use function ptr definitions from the spec. This just seems cleaner, and we may expand this in future. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	e19c623128	spirv: Convert the supported_extensions struct to spirv_options This is a bit more general and lets us pass additional options into the spirv_to_nir pass beyond what capabilities we support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:09:11 -08:00
Rafael Antognolli	2919adffe9	intel/compiler: Implement WaClearTDRRegBeforeEOTForNonPS. The bspec describes: "WA: Clear tdr register before send EOT in all non-PS shader kernels mov(8) tdr0:ud 0x0:ud {NoMask}" Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-01 11:27:27 -08:00
Vadym Shovkoplias	cdb3eb7174	intel/blorp: Fix possible NULL pointer dereferencing Fix incomplete check of input params in blorp_surf_convert_to_uncompressed() which can lead to NULL pointer dereferencing. Fixes: `5ae8043fed` ("intel/blorp: Add an entrypoint for doing bit-for-bit copies") Fixes: `f395d0abc8` ("intel/blorp: Internally expose surf_convert_to_uncompressed") Reviewed-by: Emil Velikov <emli.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-11-30 16:20:05 +02:00
Vinson Lee	8c1e4b1afc	anv: Check if memfd_create is already defined. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 01:36:46 -08:00
Iago Toral Quiroga	8620f7ebbc	i965/vec4: use a temp register to compute offsets for pull loads 64-bit pull loads are implemented by emitting 2 separate 32-bit pull load messages, where the second message loads from an offset at +16B. That addition of 16B to the original offset should not alter the original offset register used as source for the pull load instruction though, since the compiler might use that same offset register in other instructions (for example, for other pull loads in the shader code that take that same offset as reference). If the pull load is 32-bit then we only need to emit one message and we don't need to do offset calculations, but in that case the optimizer should be able to drop the redundant MOV. Fixes the following test on Haswell: KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103007	2017-11-30 07:57:53 +01:00
Kenneth Graunke	3d68329a65	i965: Move perf_debug and WARN_ONCE back to brw_context.h. These were moved to src/intel/common/gen_debug.h, but they are not common code. They assume that brw_context or gl_context variables exist, named brw or ctx. That isn't remotely true outside of i965. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-28 15:23:16 -08:00
Lionel Landwerlin	349712018b	i965: add a debug option to disable oa config loading This provides a good way to verify we haven't broken using the perf driver on older kernels (which don't have the oa config loading mechanism). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Jason Ekstrand	d7c8c7bd9d	intel/blorp: Drop blorp_resolve_ccs_attachment The only reason why we needed that version was because the Vulkan driver needed to be able to create the surface states so it could handle indirect clear colors. Now that blorp handles them natively, there's no need for the extra entrypoint. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	5bc2849af9	anv: Let blorp handle indirect clear colors for CCS resolves Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	34b95f88e6	anv: Move get_fast_clear_state_address into anv_private.h While we're at it, we break it into two nicely named functions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	8915621882	intel/blorp: Take a range of layers in blorp_ccs_resolve Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	67b676f0c5	intel/blorp: Add initial support for indirect clear colors Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-27 16:22:12 -08:00
Jason Ekstrand	86becfd2de	intel/blorp: Add fast-clear to the special case in MSAA resolves This doesn't go all the way of avoiding the txf_ms if it's fast-cleared, however it does at least make us only do it once. This should improve performance of MSAA resolves in the presence of lots of clear color. Without the patch, enabling fast-clears in the multisampling Sascha demo drops the framerate by about 10%. With this patch, enabling fast-clears increases the demo's framerate by 25%. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:11 -08:00
Jason Ekstrand	dc21c3937c	intel/blorp/blit: Rename blorp_nir_txf_ms_mcs That name is already taken by one of the helpers in blorp_nir_builder.h and, while we haven't moved the guts of blorp_blit.c there yet, we'd like to start using some things from that header. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:19:38 -08:00
Iago Toral Quiroga	f1873956db	i965/vec4: fix splitting of interleaved attributes When we split an instruction that reads an uniform value (vstride 0) we need to respect the vstride on the second half of the instruction (that is, the second half should read the same region as the first). We were doing this already, but we didn't account for stages that have interleaved input attributes which also have a vstride of 0 and need the same treatment. Fixes the following on Haswell: KHR-GL45.enhanced_layouts.varying_locations KHR-GL45.enhanced_layouts.varying_array_locations KHR-GL45.enhanced_layouts.varying_structure_locations Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Andres Gomez <agomez@igalia.com>	2017-11-24 09:24:06 +01:00
Eric Engestrom	1d3944aeeb	genxml: fix assert guards This removes a few hundred warnings on debug builds with asserts off. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-23 09:44:16 +00:00
Lionel Landwerlin	d4c52c5408	anv: flag batch & instruction BOs for capture When the kernel supports flagging our BOs, let's mark batch & instruction BOs for capture so they can be included in the error state. v2: Only add EXEC_CAPTURE if supported (Kristian) v3: Fix operator precedence issue (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-22 22:53:27 +00:00
Lionel Landwerlin	118a8c7587	anv: setup BO flags at state_pool/block_pool creation This will allow to set the flags on any anv_bo created/filled from a state pool or block pool later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-22 22:53:27 +00:00
Kristian H. Kristensen	24609377f9	intel/genxml: Add helpers for determining field type Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-21 11:15:06 -08:00
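
Note on ac8d4734f6 and d6cd14f213: the first commit message places the rounding mode field at bits 5:4 of cr0.0 and lists its four encodings, and the second mentions that the change could be emitted as AND/OR. A minimal C sketch of that bit manipulation might look like the following. All names here (rnd_mode, CR0_RND_MODE_SHIFT, cr0_set_rnd_mode, cr0_get_rnd_mode) are hypothetical and are not taken from the Mesa source; the code works on a plain integer standing in for cr0.0, not on a hardware register.

```c
#include <stdint.h>

/* Encodings as quoted from the PRM in the commit message (hypothetical names). */
enum rnd_mode {
    RND_RTNE = 0, /* 00b: round to nearest or even */
    RND_RU   = 1, /* 01b: round up, toward +inf    */
    RND_RD   = 2, /* 10b: round down, toward -inf  */
    RND_RTZ  = 3  /* 11b: round toward zero        */
};

/* The field occupies bits 5:4 of cr0.0 per the commit message. */
#define CR0_RND_MODE_SHIFT 4
#define CR0_RND_MODE_MASK  (0x3u << CR0_RND_MODE_SHIFT)

/* Clear the field with an AND, then set the new mode with an OR. */
static inline uint32_t cr0_set_rnd_mode(uint32_t cr0, enum rnd_mode mode)
{
    return (cr0 & ~CR0_RND_MODE_MASK) |
           ((uint32_t)mode << CR0_RND_MODE_SHIFT);
}

/* Extract the current rounding mode from a cr0.0 value. */
static inline enum rnd_mode cr0_get_rnd_mode(uint32_t cr0)
{
    return (enum rnd_mode)((cr0 & CR0_RND_MODE_MASK) >> CR0_RND_MODE_SHIFT);
}
```

Clearing the field before setting it leaves the other bits of the value untouched, which is why the sketch uses an AND followed by an OR rather than a plain assignment.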


7038 commits