fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 19:58:19 +02:00

Author	SHA1	Message	Date
Francisco Jerez	0a6e46d44d	intel/fs/gen11+: Handle ROR/ROL in lower_simd_width(). Prevents invalid code from being emitted for ROR/ROL instructions in SIMD32 shaders. The problem can be reproduced with the following tests while forcing SIMD32 to be used for fragment shaders: piglit.shaders.glsl-rotate-left piglit.shaders.glsl-rotate-right However the issue could occur in production already with compute shaders and a workgroup size large enough to trigger SIMD32 dispatch. Fixes: `83fdec0f0d` "intel/compiler: Enable the emission of ROR/ROL instructions" Cc: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:00:24 -08:00
Jason Ekstrand	9b71171442	anv: Re-use flush_descriptor_sets in flush_compute_state There's no reason to hand-roll all of the memory re-allocation fall-back code for compute shaders. It's just duplicated complexity. This also makes it more clear in flush_compute_state where the MEDIA_INTERFACE_DESCRIPTOR_LOAD command gets emitted relative to other packets in the command stream. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:45:00 -06:00
Jason Ekstrand	ae72d1238c	anv: Flag descriptors dirty when gl_NumWorkgroups is used Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:45:00 -06:00
Jason Ekstrand	ca6b3b11af	anv: Don't add dynamic state base address to push constants on Gen7 Because Gen7 push constants are already relative to dynamic state base address, they aren't really an address. It's deceptive to return an address from the helper function. Instead, let's leave it as a special-case in the gen7-11 helper; we don't need the helper for code de-duplication for Gen7 anyway. Fixes: `67d2cb3e93` "anv: Add get_push_range_address() helper" Closes: #2323 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:44:06 -06:00
Jason Ekstrand	3dec68e682	genxml: Remove a non-existent HW bit	2020-01-09 18:40:20 -06:00
Lionel Landwerlin	60e0db3bfb	anv: fix intel perf queries availability writes The availability is not written at the location changed in ee6fbb95a74d... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ee6fbb95a7` ("anv: Properly handle host query reset of performance queries") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-09 20:42:36 +02:00
Lionel Landwerlin	4578d4ae52	anv: don't close invalid syncfd semaphore Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-08 18:20:50 +02:00
Jason Ekstrand	b788cccfe2	intel/disasm: Fix decoding of src0 of SENDS There is no instruction field for the register file for src0 because it's always GRF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3309> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3309>	2020-01-08 14:14:16 +00:00
Jason Ekstrand	803fad43c3	intel/nir: Add a memory barrier before barrier() Our barrier instruction does not implicitly do a memory fence but the GLSL barrier() intrinsic is supposed to. The easiest back-portable solution is to just add the NIR barriers. We'll sort this out more properly in later commits. Cc: mesa-stable@lists.freedesktop.org Closes: #2138 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-07 21:52:19 -06:00
Kenneth Graunke	defb3a9465	anv: Only enable EWA LOD algorithm when doing anisotropic filtering. Updated documentation renames "Anisotropic Algorithm" to "LOD Algorithm" and adds a note for Gen9+ saying "The EWA Algorithm should only be enabled for Anisotropic Filtering modes." and indicating that the extra accuracy shouldn't be necessary for other modes, and comes at a cost. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-04 14:27:22 -08:00
Jason Ekstrand	52ad1712ed	anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	b274469daa	intel/blorp: Use the source format when using blorp_copy with HiZ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	95cc5438eb	blorp: Allow reading with HiZ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	4a1093005c	blorp: Stop whacking Z24 depth to BGRA8 The shader code required to do this is int(sat(x) * UINT24_MAX) which isn't really worth all the effort to avoid. Doing the format conversion, on the other hand, prevents us from sampling with HiZ which is something that we very much want on gen8-9 where we can. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Caio Marcelo de Oliveira Filho	75a19186b2	anv: Ignore some CreateInfo structs when rasterization is disabled According to the description of VkGraphicsPipelineCreateInfo(), pViewportState, pMultisampleState, pDepthStencilState and pColorBlendState must be ignored when rasterization is not enabled. This avoids potentially invalid pointers being dereferenced when rasterization is disabled. Tested with `demos_x64 VK_Parameter_Zoo` from Renderdoc repository. v2: Don't store the `raster_enabled` as part of anv_pipeline, just query it from the create info. This avoids storing a state that's only used during pipeline creation. (Jason) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2258 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> [v1] Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-03 13:57:31 -08:00
Caio Marcelo de Oliveira Filho	6755b6315b	anv: Drop unused function parameter Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-03 13:29:49 -08:00
Jason Ekstrand	9bd8000c6c	anv: Drop unneeded struct keywords All VkFoo structs are typedef'd to not need the struct keyword. Leaving it in there is just extra characters and breaks Vulkan's aliasing when stuff gets promoted to core versions. It's better to just never use struct for VkFoo. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-03 11:32:34 -06:00
Kenneth Graunke	d0d28c783d	iris: Set nir_shader_compiler_options::unify_interfaces. This is technically enabling the option in the common intel backend code, but only the st/nir linker uses the option, so it's iris-only. Fixes Piglit's spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out Closes: #2274 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>	2020-01-03 00:41:50 +00:00
Kenneth Graunke	7a9c0fc0d7	intel: Drop Gen11 WaBTPPrefetchDisable workaround This isn't needed on production Icelake hardware. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250>	2020-01-03 00:20:17 +00:00
Jason Ekstrand	ac70442ce1	anv: Properly advertise sampledImageIntegerSampleCounts We support the same set of samples for integer color formats as for non-integer. We've been advertising it wrong since before the initial Vulkan 1.0 release. :-( Fixes: `d689745303` "vk/0.210.0: Rework device features and limits" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-24 08:31:44 -06:00
Ross Zwisler	cabcbb4db0	intel: limit shader geometry on BDW GT1 Similar to the SKL GT1 fix introduced here: `b1ba7ffdbd` we need to limit the .urb.max_entries[MESA_SHADER_GEOMETRY] on BDW GT1 to address failures in these two tests: dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_2d_array The value 690 was found via bisection. 691 is the actual max on the hardware I'm using, but 690 seemed like a nice round number. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173>	2019-12-20 10:47:52 +00:00
Lionel Landwerlin	afdc0121b5	i965/iris/perf: factor out frequency register capture Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>	2019-12-18 14:23:17 +02:00
Caio Marcelo de Oliveira Filho	c61ad77cd2	anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT) For the sake of our testing infrastructure, disable this extension for TGL until we can sort out a hang in Vulkan CTS. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-17 11:07:41 -08:00
Caio Marcelo de Oliveira Filho	766fdeccf9	intel/vec4: Fix lowering of multiplication by 16-bit constant Existing code was ignoring whether the type of the immediate source was signed or not. If the source was signed, it would ignore small negative values but it also would wrongly accept values between INT16_MAX and UINT16_MAX, causing the actual value to later be reinterpreted as a negative number (under 16-bits). Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for older platforms that don't support MUL with 32x32 types and use vec4. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-17 10:45:22 -08:00
Caio Marcelo de Oliveira Filho	2137be22fa	intel/fs: Fix lowering of dword multiplication by 16-bit constant Existing code was ignoring whether the type of the immediate source was signed or not. If the source was signed, it would ignore small negative values but it also would wrongly accept values between INT16_MAX and UINT16_MAX, causing the actual value to later be reinterpreted as a negative number (under 16-bits). Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for platforms that don't support MUL with 32x32 types, including ICL and TGL. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2186 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-17 10:45:22 -08:00
Iván Briano	a649bbffee	anv: Export VK_KHR_buffer_device_address only when really supported Fixes: `1b6991ba1d` ("anv: Implement VK_KHR_buffer_device_address") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>	2019-12-16 19:24:46 +00:00
Iván Briano	0fd93b9589	anv: Export filter_minmax support only when it's really supported Fixes: `bea4d4c78c` ("anv: add VK_EXT_sampler_filter_minmax support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>	2019-12-16 19:24:46 +00:00
Lionel Landwerlin	c056193288	anv: drop unused parameter from apply layout pass Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-16 14:35:25 +02:00
Lionel Landwerlin	7c223cf316	anv: constify pipeline layout in nir passes Was hoping to find potential issues but nothing. Still probably a good idea. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-16 14:35:22 +02:00
Caio Marcelo de Oliveira Filho	c06ba83589	intel/fs: Lower 64-bit MOVs after lower_load_payload() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>	2019-12-14 21:12:21 +00:00
Eric Engestrom	c327245257	anv: drop unused #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:42:40 +00:00
Eric Engestrom	d600b19640	intel/compiler: replace `0` pointer with `NULL` Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:16:20 +00:00
Eric Engestrom	8074f68b3b	intel/compiler: add ASSERTED annotation to avoid "unused variable" warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:16:20 +00:00
Lionel Landwerlin	bd888bc1d6	i965/iris: perf-queries: don't invalidate/flush 3d pipeline Our current implementation of performance queries is fairly harsh because it completely flushes and invalidates the 3d pipeline caches at the beginning and end of each query. An argument can be made that this is how performance should be measured but it probably doesn't reflect what the application is actually doing and the actual cost of draw calls. A more appropriate approach is to just stall the pipeline at scoreboard, so that we measure the effect of a draw call without having the pipeline in a completely pristine state for every draw call. v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-13 11:27:22 +02:00
Lionel Landwerlin	a575b3cd5c	intel/perf: drop batchbuffer flushing at query begin This was initially intended to fix issues with the query timings going occasionally high. It turns out there was a bug in the attribution of OA reports to our context when parsing the OA data. This led to reports flagged with other context IDs to be included in our queries results. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-13 11:27:17 +02:00
Lionel Landwerlin	2c5eb1df68	anv: fix assumptions about temporary fence payload Since `f9a3d9738b` temporary BO_WSI are definitely a thing so we have an assert wrong. Take that opportunity to expand a bit on an existing comment. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `f9a3d9738b` ("anv: Use BO fences/semaphores for AcquireNextImage") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ivan Briano <ivan.briano@intel.com>	2019-12-12 10:10:48 +00:00
Lionel Landwerlin	52bc235f2a	anv: fix fence underlying primitive checks We appear to have got lucky that the only type of temporary fence payload we could have was a syncobj and that would only happen when the type of the permanent payload was also a syncobj. This code was broken if that assumption changed and it did in commit `f9a3d9738b`. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ivan Briano <ivan.briano@intel.com>	2019-12-12 10:10:48 +00:00
Jason Ekstrand	776cfde699	anv: Bump the advertised patch version to 129 We've been keeping up with the spec updates. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 18:52:08 +00:00
Jason Ekstrand	5f5f5019bd	anv: Unconditionally advertise Vulkan 1.1 Vulkan 1.1 requires VK_KHR_external_fence which requires syncobj support to be actually usable. However, it doesn't strictly require that we support any external handle types. We should be able to advertise 1.1 even on old kernels that don't have syncobj support. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 18:52:08 +00:00
Jason Ekstrand	98a83d0fce	anv: Flush the queue on DeviceWaitIdle When we have syncobj_wait, we can trust in WAIT_FOR_SUBMIT but when we don't, we only have BO waits and those aren't quite as nice. This commit adds a flag to _anv_queue_submit to wait for the queue to drain before returning. This gives us the behavior we need to implement DeviceWaitIdle. Fixes: `246261f0ad` "anv: prepare the driver for delayed submissions" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 18:52:08 +00:00
Eric Engestrom	b2dac806f8	intel: add mi_builder_test for gen12 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 15:38:19 +00:00
Kenneth Graunke	69d7782b15	intel/decoder: Make get_state_size take a full 64-bit address and a base i965 wants to use an offset from a base because everything is in a single buffer whose address may be relocated, and all base addresses are set to the start of that buffer. iris wants to use a full 64-bit address, because state lives in separate buffers which may be in the shader, surface, and dynamic memory zones, where addresses grow downward from the top of a 4GB zone. So it's very possible for a 32-bit offset to exist relative to multiple bases, leading to the wrong state size.	2019-12-10 19:10:49 -08:00
Kenneth Graunke	0f2f561a10	anv: Enable Gen11 Color/Z write merging optimization TCCNTLREG contains additional L3 cache write merging optimizations. The default value on my system appears to be: - URB Partial Write Merging (bit 0) - L3 Data Partial Write Merging (bit 2) - TC Disable (bit 3) Windows drivers appear to set bit 1 as well to enable "Color/Z Partial Write Merging". This should solve an issue we were seeing where MRT benchmarks were using substantially more bandwidth than they ought. However, we have not observed it to cause measurable FPS gains. It is unclear whether we should be setting bit 0 or bit 3, so for now we leave those at the hardware default value. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:19:46 -08:00
Kenneth Graunke	0b74f85870	intel/genxml: Add a partial TCCNTLREG definition TCCNTLREG contains additional cache programming settings. In particular, there are several write combining controls we'd like to use. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:19:33 -08:00
Jason Ekstrand	41691ac016	ANV: Stop advertising smoothLines support on gen10+ Reviewed-by: Ivan Briano <ivan.briano@intel.com>	2019-12-10 20:13:56 +00:00
Lionel Landwerlin	5fdea9f401	anv: fix incorrect VMA alignment for CCS main surfaces Maybe finer way of dealing with this requirement would be to increase the number of pdevice->memory.types[] to add a category for special alignment cases. Meanwhile this fixes the problem of CCS surface alignment and it's probably not going to cause issues given the size of our address space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6af8a4acc4` ("anv: Add aux-map translation for gen12+") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:06:54 +00:00
Lionel Landwerlin	dcfe1903c3	anv: fix missing gen12 handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `181be14d43` ("anv: Build for gen12") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:06:54 +00:00
Anuj Phogat	1a32fbd48c	intel: Add pci-ids for Jasper Lake Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-09 12:22:57 -08:00
Anuj Phogat	11fdd5f52c	intel: Add device info for 1x4x6 Jasper Lake Also removing the FIXME comments after matching the numbers with updated documentation. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-09 12:22:56 -08:00
Jason Ekstrand	0f60aa4037	anv: Re-emit all compute state on pipeline switch It's a very odd case to hit in the real world. However, there are some CTS tests which switch back and forth between dispatch and clear without changing the pipeline. Fixes: `bc612536eb` "anv: Emit a dummy MEDIA_VFE_STATE before switching..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-07 04:03:35 +00:00


5405 commits