fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-02-03 17:20:26 +01:00

Author	SHA1	Message	Date
Jason Ekstrand	d2eecf0b0b	intel/compiler/icl: Clear "null render target" bit in extended message descriptor Otherwise all our render target writes go nowhere. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1484876ef7	intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch() Rafael ran piglit with the test code enabled and saw no additional GPU hangs. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	f05e0d9c2a	intel/common/icl: Disable hiz surface sampling On gen11+ AUX_HIZ is not a supported value for surfaces being sampled by the 3D sampler. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	370af9dcc0	intel/common/icl: Add L3 config ICL uses the same L3 configs as CNL, just leaving the SLM configs out. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Matt Turner	f56693af4b	intel/tools/aubinator: Drop platform list from print_help() We all know the platform names, and I don't want to update this list continually. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Caio Marcelo de Oliveira Filho	12c22b897a	anv/pipeline: don't pass constant view index in multiview If view mask has only one bit set, view index is effectively a constant, so doesn't need to be passed to the next stages, just always set it. Part of this was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho	5e7c1d05d4	anv/pipeline: use less instructions for multiview The view_index is encoded in the remainder of dividing instance id by the number of views in the view mask (n). In the general case (handled by the else clause), there is a need to map from 0..n-1 into the number of the view being masked. For that a map is encoded. In the case only the first n bits in the mask are set, the mapping is trivial, 0..n-1 already represent what view is being referred to. That case was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Rafael Antognolli	5297a17571	aubinator_error_decode: Compare only the class_name of the ring. ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to first compare the class name only, then get the instance id. Without this, INSTDONE is not being decoded. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-21 11:35:15 -07:00
Scott D Phillips	cab8df1e3e	intel/tools: aubinator: Catch gen11 "enhanced execlist" submission Different registers are used for execlist submission in gen11, so also watch those. This code only watches element zero of the submit queue, which is all aubdump currently writes. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-21 11:07:15 -07:00
Lionel Landwerlin	7f977d51b3	intel: genxml: add INSTPM/CS_DEBUG_MODE2 registers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 16:58:30 +00:00
Scott D Phillips	d849d36c6c	anv: off-by-one in GetDescriptorSetLayoutSupport Loop was accessing one more than bindingCount elements from pBindings, accessing uninitialized memory. Fixes: `ddc4069122` ("anv: Implement VK_KHR_maintenance3") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 07:58:10 -07:00
Caio Marcelo de Oliveira Filho	f6338c3b85	anv/pipeline: set active_stages early Since the intermediate states of active_stages are not used, i.e. active_stages is read only after all stages were set into it, just set its value before compiling the shaders. This will allow to conditionally run certain passes based on what other shaders are being used, e.g. a certain pass might only be applicable to the vertex shader if there's no geometry or tessellation shader being used. v2: Use vk_to_mesa_shader_stage. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho	318073ce66	anv/pipeline: fail if TCS/TES compile fail v2: Add Fixes tag. (Lionel) Fixes: `e50d4807a3` ("anv: Compile TCS/TES shaders.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Eric Anholt	7db1c09d12	anv: Silence warning about heap_size. We only get VK_SUCCESS if it was initialized, but apparently my compiler doesn't track that far. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:10:05 -07:00
Eric Anholt	d25640c3a3	i965: Silence compiler warning about promoted_constants. We only have a cfg != NULL if we went through one of the paths that set it, but my compiler doesn't figure that out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6411defdcd` ("intel/cs: Re-run final NIR optimizations for each SIMD size")	2018-03-16 15:09:55 -07:00
Eric Anholt	9f89452ea3	anv: Silence compiler warnings about uninitialized bind_offset. This is a legitimate warning: if anv's blorp_alloc_binding_table() throws an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently continue to use this undefined value. The rest of this code doesn't seem very allocation-error-proof, though, either. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:09:47 -07:00
Matt Turner	f3833f1ca7	intel/compiler: Use gen_get_device_info() in test_eu_validate Previously the unit test filled out a minimal devinfo struct. A previous patch caused the test to begin assert failing because the devinfo was not complete. Avoid this by using the real mechanism to create devinfo. Note that we have to drop icl from the table, since we now rely on the name -> PCI ID translation done by gen_device_name_to_pci_device_id(), and ICL's PCI IDs are not upstream yet. Fixes: `f89e735719` ("intel/compiler: Check for unsupported register sizes.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Matt Turner	54db78b196	intel: Add cfl to gen_device_name_to_pci_device_id() Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Rafael Antognolli	f89e735719	intel/compiler: Check for unsupported register sizes. Make sure we don't emit 64 bit types if the hardware doesn't support them. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-16 09:27:16 -07:00
Lionel Landwerlin	51783f3e7d	anv: silence unused variable warning Fixes: `59b0ea0c74` ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:56:26 +00:00
Lionel Landwerlin	0f544a3c51	anv: silence unused function warning on gen11 [84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'. ../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:55:42 +00:00
Karol Herbst	b617bfcccf	compiler: int8/uint8 support OpenCL kernels also have int8/uint8. v2: remove changes in nir_search as Jason posted a patch for that Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-03-14 10:08:42 -04:00
Iago Toral Quiroga	1a0aba7216	anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands `af5f2322d0` addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior: device pname return ---------------------------------------------------------- (..) device core device-level command fp (...) See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter. Fixes test failures in new work-in-progress CTS tests. Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323 v2: - Include reference to github issue (Emil) - Rebased on top of Vulkan 1.1 changes. v3: - Remove the not in the condition and switch the then/else cases (Jason) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Iago Toral Quiroga	a631575ff4	anv/entrypoints: dispatches to VkQueue are device-level v2: - Add trampoline functions (Jason) - Add an assertion for unhandled trampoline cases Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Jordan Justen	24b415270f	intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-09 16:15:58 -08:00
Ian Romanick	1583f49eaa	i965/vec4: Allow CSE on subset VF constant loads v2: Rewrite the code that generates the VF mask. Suggested by Ken. No changes on other platforms. Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13059891 -> 13059884 (<.01%) instructions in affected programs: 431 -> 424 (-1.62%) helped: 7 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -3.39% -0.71% Instructions are helped. total cycles in shared programs: 409260032 -> 409260018 (<.01%) cycles in affected programs: 4228 -> 4214 (-0.33%) helped: 7 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -1.15% 0.07% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	360899d457	i965/vec4: Relax writemask condition in CSE If the previously seen instruction generates more fields than the new instruction, still allow CSE to happen. This doesn't do much, but it also enables a couple more shaders in the next patch. It helped quite a bit in another change series that I have (at least for now) abandoned. v2: Add some extra comentary about the parameters to instructions_match. Suggested by Ken. No changes on Skylake, Broadwell, Iron Lake or GM45. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11780295 -> 11780294 (<.01%) instructions in affected programs: 302 -> 301 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 257308315 -> 257308313 (<.01%) cycles in affected programs: 2074 -> 2072 (-0.10%) helped: 1 HURT: 0 Sandy Bridge total instructions in shared programs: 10506687 -> 10506686 (<.01%) instructions in affected programs: 335 -> 334 (-0.30%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	52c7df1643	i965/fs: Merge CMP and SEL into CSEL on Gen8+ v2: Fix several problems handling inverted predicates. Add a much bigger comment around the BRW_CONDITIONAL_NZ case. v3: Allow uniforms and shader inputs as sources for the original SEL and CMP instructions. This enables a LOT more shaders to receive CSEL merging (5816 vs 8564 on SKL). v4: Report progress. Broadwell and Skylake had similar results. (Broadwell shown) helped: 8527 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70% 95% mean confidence interval for instructions value: -2.51 -2.36 95% mean confidence interval for instructions %-change: -1.15% -1.10% Instructions are helped. total cycles in shared programs: 559442317 -> 558288357 (-0.21%) cycles in affected programs: 372699860 -> 371545900 (-0.31%) helped: 6748 HURT: 1450 helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12 helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70% HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14 HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90% 95% mean confidence interval for cycles value: -179.01 -102.51 95% mean confidence interval for cycles %-change: -2.37% -2.08% Cycles are helped. LOST: 0 GAINED: 6 No changes on earlier platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Kenneth Graunke	70de61594d	i965/fs: Add infrastructure for generating CSEL instructions. v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Jason Ekstrand	c217607b65	anv: Support version overrides While always sketchy to do, this is useful for debugging. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d6b65222df	anv: Enable Vulkan 1.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	03c07ac548	anv: Add support for SPIR-V 1.3 subgroup operations This requires us to bump the subgroup size to 32 for all shader stages because Vulkan requires that to be a physical device query. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8b4a5e641b	intel/fs: Add support for subgroup quad operations NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	2292b20b29	intel/fs: Implement reduce and scan operations Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	4150920b95	intel/fs: Add a helper for emitting scan operations This commit adds a helper to the builder for emitting "scan" operations. Given a binary operation #, a scan takes the vector [a0, a1, ..., aN] and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each channel contains the combination of all previous channels. The sequence of instructions to perform the scan is fairly optimal; a 16-wide scan on a 32-bit type is only 6 instructions. The subgroup scan and reduction operations will be implemented in terms of this. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	b0858c1cc6	intel/fs: Add a couple of simple helper opcodes Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	90c9f29518	i965/fs: Add support for nir_intrinsic_shuffle Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	7cfece820d	i965/fs: Support nir_intrinsic_vote_feq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	44681e4795	nir: Generalize nir_intrinsic_vote_eq The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	974daec495	i965/fs: Implement basic SPIR-V subgroup intrinsics Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	59b0ea0c74	anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER From the Vulkan 1.1 spec: "Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion." Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cbab2d1da5	anv: Implement vkEnumerateInstanceVersion Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	605fd7c0da	anv/device: fail to initialize device if we have queues with unsupported flags This is not strictly necessary since users should not be requesting any flags that are not valid for the list of enabled features requested and we already fail if they attempt to use an unsupported feature, however it is an easy to implement sanity check that would help developers realize that they are doing things wrong, so we might as well do it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	b262f17b15	anv/device: GetDeviceQueue2 should only return queues with matching flags From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure: "The queue returned by vkGetDeviceQueue2 must have the same flags value from this structure as that used at device creation time in a VkDeviceQueueCreateInfo instance. If no matching flags were specified at device creation time then pQueue will return VK_NULL_HANDLE." For us this means no flags at all since we don't support any. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	9c8b40001d	anv: Support querying for protected memory Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	773a51e772	anv: Implement GetDeviceQueue2 This belongs to the protected memory feature but there's nothing about it that's specific to protected memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68df93ecbc	anv: Trivially implement VK_KHR_device_group Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	dfe18be09e	anv: Implement vkCmdDispatchBase This is part of the device groups extension/feature but it's a decent chunk of work in its own right so it's worth breaking into its own patch. The mechanism we use is fairly straightforward: we just push the base work group id into the shader and add it to the work group id we get from dispatch. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ddc4069122	anv: Implement VK_KHR_maintenance3 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	1deb7967c8	anv: Support VkPhysicalDeviceShaderDrawParameterFeatures This advertises the VK_KHR_shader_draw_parameters functionality as a "core optimal feature" in Vulkan 1.1. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
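
Commit 4150920b95 above defines a scan over a binary operation #: the input [a0, a1, ..., aN] becomes [a0, a0 # a1, ..., a0 # a1 # ... # aN]. The following is a minimal scalar sketch of that definition only. It is not the SIMD instruction sequence the commit adds, and the names `binop_fn`, `op_add`, `op_max` and `inclusive_scan` are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: the binary operation "#" on 32-bit channels. */
typedef uint32_t (*binop_fn)(uint32_t, uint32_t);

static uint32_t op_add(uint32_t a, uint32_t b) { return a + b; }
static uint32_t op_max(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Inclusive scan: out[i] = in[0] # in[1] # ... # in[i].
 * Sequential reference version; each channel combines all previous
 * channels with its own value.
 */
static void
inclusive_scan(const uint32_t *in, uint32_t *out, size_t n, binop_fn op)
{
   assert(in != NULL && out != NULL);

   if (n == 0)
      return;

   out[0] = in[0];
   for (size_t i = 1; i < n; i++)
      out[i] = op(out[i - 1], in[i]);
}
```

With `op_add`, the input [1, 2, 3, 4] scans to [1, 3, 6, 10]. The last channel of an inclusive scan holds the combination of every channel, which is one way a reduction can be read off a scan; whether the driver does exactly that is not stated in the log.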


15202 commits
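
Commits 5e7c1d05d4 and 12c22b897a describe how the multiview view index relates to the instance id: it is the remainder of dividing the instance id by the number of views n in the view mask, mapped onto the actual view numbers when the mask is not simply its first n bits. The sketch below illustrates that arithmetic in plain C under stated assumptions. The function names are hypothetical, and where the commit message mentions an encoded map, this version walks the mask bits instead for clarity.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits in a 32-bit view mask. */
static unsigned
popcount32(uint32_t v)
{
   unsigned count = 0;
   while (v) {
      v &= v - 1; /* clear the lowest set bit */
      count++;
   }
   return count;
}

/* Hypothetical helper: recover the view index from an instance id. */
static unsigned
view_index_from_instance(uint32_t view_mask, uint32_t instance_id)
{
   assert(view_mask != 0);

   unsigned n = popcount32(view_mask);
   unsigned compacted = instance_id % n; /* value in 0..n-1 */

   /* Trivial case: only the first n bits are set, so 0..n-1 already
    * names the view.
    */
   if ((view_mask & (view_mask + 1)) == 0)
      return compacted;

   /* General case: map 0..n-1 onto the compacted-th set bit. */
   for (unsigned bit = 0; bit < 32; bit++) {
      if (view_mask & (UINT32_C(1) << bit)) {
         if (compacted == 0)
            return bit;
         compacted--;
      }
   }
   return 0; /* not reached for a non-zero mask */
}
```

When the mask has a single bit set, n is 1, the remainder is always 0, and the result is the same view for every instance id, which matches the observation in 12c22b897a that the view index is then effectively a constant.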