fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 14:28:22 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	89fdbb6707	panfrost/midgard: Add fcsel_i opcode Whereas a normal fcsel acts on a boolean input in r31.w, the fcsel_i variant acts on an integer input in r31.w, which can be preloaded with an instruction like imov (with the appropriate negate flag on the source). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:15 +00:00
Alyssa Rosenzweig	121417ef1d	panfrost: Implement scissor test This preliminary implementation should handle some basic cases. Future work should scissor the FRAGMENT job as well for efficiency. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:14 +00:00
Alyssa Rosenzweig	bd9446e719	panfrost: Fix viewports Our viewport code hardcoded a number of wrong assumptions, which sort of sometimes worked but was definitely wrong (and broke most of dEQP). This corrects the logic, accounting for flipped-Y framebuffers, which fixes... most of dEQP. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:10 +00:00
Alyssa Rosenzweig	9da4603fb6	panfrost/midgard: Fix b2f32 swizzle for vectors Fixes issues in most of dEQP-GLES2.functional.shaders.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-26 23:35:08 +00:00
Dave Airlie	e77013fb7f	softpipe: fix clears to only clear specified color buffers. This fixes piglit clearbuffer-mixed-format Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-27 07:53:32 +10:00
Dave Airlie	7f7c9425a8	draw/vs: partly fix basevertex/vertex id This gets the basevertex from the draw depending on whether it's an indexed or non-indexed draw. We still fail a transform feedback test for vertex id, as the vertex id is actually an index id and isn't getting translated properly to a vertex id; suggestions on how/where to fix that are welcome. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-27 07:52:28 +10:00
Kristian H. Kristensen	a752422bd4	freedreno/ir3: Track whether shader needs derivatives In `1088b788` ("freedreno/ir3: find # of samplers from uniform vars") we started counting number of samplers based on the uniform vars instead of number of cat5 instructions. We used the number of samplers to determine whether to enable derivatives, but when we only use derivatives and no samplers, that now breaks. Track whether we need derivatives explicitly and use that to enable the state. Fixes: `1088b788` ("freedreno/ir3: find # of samplers from uniform vars") Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-03-25 18:36:48 -07:00
Andre Heider	12f11e6fe6	st/nine: enable csmt per default on iris iris is thread safe, enable csmt for a ~5% performance boost. Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2019-03-25 22:21:19 +01:00
Danylo Piliaiev	c8abe03f3b	i965,iris,anv: Make alpha to coverage work with sample mask From "Alpha Coverage" section of SKL PRM Volume 7: "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in hardware, regardless of the state setting for this feature." From OpenGL spec 4.6, "15.2 Shader Execution": "The built-in integer array gl_SampleMask can be used to change the sample coverage for a fragment from within the shader." From OpenGL spec 4.6, "17.3.1 Alpha To Coverage": "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value is generated where each bit is determined by the alpha value at the corresponding sample location. The temporary coverage value is then ANDed with the fragment coverage value to generate a new fragment coverage value." Similar wording could be found in Vulkan spec 1.1.100 "25.6. Multisample Coverage" Thus we need to compute alpha to coverage dithering manually in shader and replace sample mask store with the bitwise-AND of sample mask and alpha to coverage dithering. The following formula is used to compute final sample mask: m = int(16.0 * clamp(src0_alpha, 0.0, 1.0)) dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) | 0x0808 * (m & 2) | 0x0100 * (m & 1) sample_mask = sample_mask & dither_mask Credits to Francisco Jerez <currojerez@riseup.net> for creating it. It gives a number of ones proportional to the alpha for 2, 4, 8 or 16 least significant bits of the result. GEN6 hardware does not have issue with simultaneous usage of sample mask and alpha to coverage however due to the wrong sending order of oMask and src0_alpha it is still affected by it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-03-25 13:54:55 -07:00
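The dither formula in the commit above can be checked on the CPU. The following is a minimal sketch in plain C, not the shader code the drivers emit; the function names (`clamp01`, `a2c_dither_mask`, `a2c_apply`) are hypothetical, and mapping NaN alpha to 0 is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* clamp(x, 0.0, 1.0) from the commit message; NaN is mapped to 0.0
 * here (an assumption of this sketch) so the int cast stays defined. */
static float clamp01(float x)
{
    if (!(x > 0.0f))
        return 0.0f;
    if (x > 1.0f)
        return 1.0f;
    return x;
}

/* Hypothetical helper: 16-bit dither mask for a given alpha. */
static uint32_t a2c_dither_mask(float src0_alpha)
{
    int m = (int)(16.0f * clamp01(src0_alpha));
    assert(m >= 0 && m <= 16);

    /* The 0xfea80 lookup supplies the pattern for multiples of 4;
     * the two remaining terms add bits for m & 2 and m & 1. */
    return 0x1111u * ((0xfea80u >> (m & ~3)) & 0xfu) |
           0x0808u * (uint32_t)(m & 2) |
           0x0100u * (uint32_t)(m & 1);
}

/* Hypothetical helper: final mask as described in the commit message. */
static uint32_t a2c_apply(uint32_t sample_mask, float src0_alpha)
{
    return sample_mask & a2c_dither_mask(src0_alpha);
}
```

For example, with four samples all covered (`sample_mask = 0xf`) and an alpha of 0.5, `a2c_apply` yields `0xa`, so half of the samples remain set.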
Dave Airlie	551950cacd	draw/gs: fix point size outputs from geometry shader. If the geom shader emits a point size we failed to find it here, use the correct API to look it up. Fixes: tests/spec/glsl-1.50/execution/geometry/point-size-out.shader_test Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-26 05:17:06 +10:00
Dave Airlie	d3836510d2	draw: bail instead of assert on instance count (v2) With indirect rendering it's fine to set the instance count parameter to 0, and expect the rendering to be ignored. Fixes assert in KHR-GLES31.core.compute_shader.pipeline-gen-draw-commands on softpipe v2: return earlier before changing fpstate Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-26 05:16:56 +10:00
Leo Liu	382401aab7	vl/dri3: remove the wait before getting back buffer The wait here is unnecessary since we got a pool of back buffers, and the wait for swap buffer will happen before the present pixmap, at the same time the previous back buffer will be put back to pool for reuse after the check for PresentIdleNotify event Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-03-25 12:20:31 -04:00
Kishore Kadiyala	e1d8057160	android: static link with libexpat with Android O+ In Android O, MESA needs to statically link libexpat so that it's in same VNDK namespace. v2: apply change also to anv driver (Tapani) v3: use += in anv change (Eric Engestrom) Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33 Signed-off-by: Kishore Kadiyala <kishore.kadiyala@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-25 10:11:57 +02:00
Rob Clark	6fd5a7ff8c	freedreno: add ESSL cap Report 320 for a6xx, which isn't quite true (no geom/tess, in particular), but other caps keep the reported GL and GLSL versions correct (3.1 / 3.10 es). But reporting 320 will switch on EXT_gpu_shader5, which is the goal. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-22 16:39:14 -04:00
Rob Clark	de481947d9	gallium: add PIPE_CAP_ESSL_FEATURE_LEVEL Adds a new cap to allow drivers to expose higher shading language versions in GLES contexts, to avoid having to report an artificially low version for the benefit of GL contexts. The motivation is to expose EXT_gpu_shader5 even though a driver may not support all the features needed for the corresponding GL extension (ARB_gpu_shader5). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-03-22 16:39:13 -04:00
Vinson Lee	93c81ca336	swr: Fix build with llvm-9.0. Fix build error after llvm-9.0svn r352827 ("[opaque pointer types] Add a FunctionCallee wrapper type, and use it."). In file included from ./rasterizer/jitter/builder.h:158:0, from swr_shader.cpp:35: ./rasterizer/jitter/gen_builder_meta.hpp: In member function ‘llvm::Value* SwrJit::Builder::VGATHERPD(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, llvm::Value*, const llvm::Twine&)’: ./rasterizer/jitter/gen_builder_meta.hpp:51:117: error: no matching function for call to ‘cast(llvm::FunctionCallee)’ Function* pFunc = cast<Function>(JM()->mpCurrentModule->getOrInsertFunction("meta.intrinsic.VGATHERPD", pFuncTy)); ^ Suggested-by: Philip Meulengracht <the_meulengracht@hotmail.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-03-22 13:13:51 -07:00
Samuel Pitoiset	23d30f4099	spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-22 19:41:46 +01:00
Chris Wilson	db99d02fce	iris: Push heavy memchecker code to DEBUG Invoking VALGRIND_CHECK_MEM_IS_DEFINED pulls in enough code to convince gcc to not inline __gen_uint and results in a lot of packing code ending up out-of-line with lots of stack copying. To ameliorate this, only insert the check inside the packer if DEBUG is defined and instead perform the validation checking before submitting the batch to the kernel. This should give accurate results if --trace-origins=yes is used, and failing that we can recompile in full debug mode to check on insertion. Improve drawoverhead baseline by 25% with a default build with valgrind-dev installed (with effectively no loss of vg coverage). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-22 10:38:03 -07:00
Kenneth Graunke	87f865aab3	iris: Fix batch chaining map_next increment. Caught by Chris Wilson; split out from his valgrind patch.	2019-03-22 09:31:15 -07:00
Rob Clark	dbac1a80d1	freedreno/ir3: rename has_kill to no_earlyz There are other cases where we need to disable early-z, like image writes. So rename to something more generic. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-22 08:53:28 -04:00
Kenneth Graunke	66c100a8d6	iris: Skip resolves and flushes altogether if unnecessary Improves drawoverhead baseline scores by 1.17x.	2019-03-21 20:28:17 -07:00
Kenneth Graunke	365886ebe1	iris: Skip framebuffer resolve tracking if framebuffer isn't dirty Improves drawoverhead baseline score by 1.86x.	2019-03-21 20:28:17 -07:00
Kenneth Graunke	1d05d24b1d	iris: Skip input resolve handling if bindings haven't changed This brings the drawoverhead 16 Tex w/ no state change score from 22% of baseline to 97% of baseline.	2019-03-21 20:28:17 -07:00
Kenneth Graunke	a342f2deb1	iris: Fix util_vma_heap_init size for IRIS_MEMZONE_SHADER Fixes assertions when disabling bucket allocators.	2019-03-21 19:07:17 -07:00
Dave Airlie	9dd92d08a5	softpipe: fix integer texture swizzling for 1 vs 1.0f The swizzling was putting float one in not integer 1. This fixes a lot of arb_texture_view-rendering-formats cases. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:35 +10:00
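The difference between "float one" and "integer 1" is visible in the raw 32-bit value a swizzle writes into the channel. The sketch below is illustrative only and is not softpipe code; `float_bits` and `swizzle_one_bits` are hypothetical names, and it assumes IEEE-754 binary32 floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's storage as a 32-bit integer. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof(f) == sizeof(u));
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Hypothetical helper: raw channel value for a "one" swizzle.
 * Integer formats need the integer 1, not the bit pattern of 1.0f. */
static uint32_t swizzle_one_bits(int is_integer_format)
{
    return is_integer_format ? 1u : float_bits(1.0f);
}
```

Under the IEEE-754 assumption, writing 1.0f into an integer channel would be read back as `0x3f800000` (1065353216) instead of 1.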
Dave Airlie	aae5ba72ab	softpipe: remove shadow_ref assert. I don't think this really buys us anything and TG4 with cubemap arrays falls over because sampler == 2, but otherwise works fine. Fixes: ./bin/textureGather fs shadow r CubeArray repeat on softpipe with ARB_gpu_shader5 enabled. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:29 +10:00
Dave Airlie	8dc8b1361a	softpipe: handle 32-bit bitfield inserts Fixes piglits if ARB_gpu_shader5 is enabled Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:26 +10:00
Dave Airlie	7b7cb1bc35	softpipe: fix 32-bit bitfield extract These didn't deal with the width == 32 case that TGSI is defined with. Fixes piglit tests if ARB_gpu_shader5 is enabled. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-03-22 09:30:21 +10:00
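A width of 32 is an easy case to get wrong in C, because `1u << 32` is undefined for a 32-bit operand, so the usual `(1u << width) - 1` mask cannot be used as-is. The following is a generic sketch of unsigned bitfield extract and insert with that case handled; it is not the softpipe implementation, the function names are hypothetical, and asserting `offset + width <= 32` is a choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set; width == 32 is special-cased
 * because shifting a 32-bit value by 32 is undefined in C. */
static uint32_t field_mask(unsigned width)
{
    assert(width <= 32);
    return width == 32 ? 0xffffffffu : (1u << width) - 1u;
}

/* Extract `width` bits of `value` starting at bit `offset`. */
static uint32_t bitfield_extract_u32(uint32_t value, unsigned offset,
                                     unsigned width)
{
    assert(offset < 32 && width <= 32 - offset);
    return (value >> offset) & field_mask(width);
}

/* Replace `width` bits of `base` at `offset` with the low bits of `insert`. */
static uint32_t bitfield_insert_u32(uint32_t base, uint32_t insert,
                                    unsigned offset, unsigned width)
{
    assert(offset < 32 && width <= 32 - offset);
    uint32_t mask = field_mask(width) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

With `offset = 0` and `width = 32`, extract returns the whole value and insert replaces the whole value.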
Eric Anholt	16f2770eb4	v3d: Upload all of UBO[0] if any indirect load occurs. The idea was that we could skip uploading the constant-indexed uniform data and just upload the uniforms that are variably-indexed. However, since the VS bin and render shaders may have a different set of uniforms used, this meant that we had to upload the UBO for each of them. The first case is generally a fairly small impact (usually the uniform array is the most space, other than a couple of FSes in shader-db), while the second is a larger impact: 3DMMES2 was uploading 38k/frame of uniforms instead of 18k. Given that the optimization is of dubious value, has a big downside, and is quite a bit of code, just drop it. No change in shader-db. No change on 3DMMES2 (n=15).	2019-03-21 14:20:50 -07:00
Eric Anholt	320e96bace	v3d: Move constant offsets to UBO addresses into the main uniform stream. We'd end up with the constant offset in the uniform stream anyway, since they're bigger than small immediates. Avoids the extra uniforms and adds in the shader in favor of just adding once on the CPU. shader-db: total instructions in shared programs: 6496865 -> 6494851 (-0.03%) total uniforms in shared programs: 2119511 -> 2117243 (-0.11%)	2019-03-21 14:20:50 -07:00
Eric Anholt	c36d2793ec	v3d: Rename v3d_tmu_config_data to v3d_unit_data. I want to reuse this for encoding small constant UBO/SSBO offsets into the uniform stream to reduce the extra uniform loads and adds for the small constant offsets.	2019-03-21 14:20:50 -07:00
Karol Herbst	99f202432b	nv50/ir/nir: support gather offsets v2: only emit offsets if those are !0 Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-03-21 02:58:41 +00:00
Rafael Antognolli	e7c8402163	iris: Let blorp update the clear color for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:26 -07:00
Rafael Antognolli	93123417dd	iris: Track fast clear color. v2: Update tracked clear color when we update the surface state. v3: Update all aux surface states when updating the clear color. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:26 -07:00
Rafael Antognolli	5658c661de	iris: Stall on the CPU and resolve predication during fast clears. Only if the clear color/depth is changing. In those cases, it's hard to keep track of the current clear color, and aux state of some layers, when predication is enabled. So simplify everything by stalling on the few cases where we would have a fast clear color change with predication. v2: - fix comment (Ken) - explicitly check for predicate state after resolving it (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:26 -07:00
Rafael Antognolli	ce830a364e	iris: Add iris_resolve_conditional_render(). This function can be used to stall on the CPU and resolve the predicate for the conditional render. It will convert ice->state.predicate from IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query. v2: - return void (Ken) - update the stored condition (Ken) - simplify the code leading to resolve the predicate (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	131b42f0aa	iris: Implement fast clear color. If all the restrictions are satisfied, do a fast clear instead of regular clear. v2: - add perf_debug() when we can't fast clear (Ken) - improve comment: s/miptree/resource/ (Ken) - use swizzle_color_value from blorp (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	7f6344a726	iris: Bring back check for srgb and fast clear color. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	a8b5ea8ef0	iris: Add function to update clear color in surface state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	32c8fa6411	iris: Add helper to convert fast clear color. It needs to be converted to a value that can be used by ISL (and our hardware SURFACE_STATE structure). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	51638cf18a	iris: Fast clear depth buffers. Check and do a fast clear instead of a regular clear on depth buffers. v3: - remove switch with some cases that we shouldn't worry about (Ken) - more parens into the has_hiz check (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	34d00b4410	iris: Use the clear depth when emitting 3DSTATE_CLEAR_PARAMS. Take the clear depth into account when IRIS_DIRTY_DEPTH_BUFFER is marked as dirty. Also update the blorp surface clear color. v2: Use a single if (zres && zres->aux.bo) (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Rafael Antognolli	37f2692591	iris: Allocate buffer space for the fast clear color. Also store clear color in the iris_resource. Always allocate clear color state buffer. v2: - Make clear_color_offset be 64 bits (Ken). - Simplify the logic to decide when to memset the aux buffer (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 16:46:25 -07:00
Dave Airlie	04189565a0	softpipe: fix texture view crashes I noticed we crashed piglit arb_texture_view-rendering-formats when run on softpipe. This fixes the clear tiles to use the surface format not the underlying storage format. This fixes a bunch of srgb piglits as well. Fixes: `396ac41fc2` (softpipe: add integer support) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-03-21 05:06:07 +10:00
Kenneth Graunke	3c3f250456	nvc0: Skip new update barrier bits I added new barrier bits in `220c1dce1e` and made most drivers skip them. I thought nvc0 was already skipping those but missed the else case here, which does something. So make it explicitly skip like I did everywhere else. Thanks to Ilia for catching this. Fixes: `220c1dce1e` gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.	2019-03-20 10:30:32 -07:00
Kenneth Graunke	220c1dce1e	gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits. The glMemoryBarrier() function makes shader memory stores ordered with respect to things specified by the given bits. Until now, st/mesa has ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT, saying that drivers should implicitly perform the needed flushing. This seems like a pretty big assumption to make. Instead, this commit opts to translate them to new PIPE_BARRIER bits, and adjusts existing drivers to continue ignoring them (preserving the current behavior). The i965 driver performs actions on these memory barriers. Shader memory stores go through a "data cache" which is separate from the render cache and other read caches (like the texture cache). All memory barriers need to flush the data cache (to ensure shader memory stores are visible), and possibly invalidate read caches (to ensure stale data is no longer visible). The driver implicitly flushes for most caches, but not for data cache, since ARB_shader_image_load_store introduced MemoryBarrier() precisely to order these explicitly. I would like to follow i965's approach in iris, flushing the data cache on any MemoryBarrier() call, so I need st/mesa to actually call the pipe->memory_barrier() callback. Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on the iris driver. Roland said this looks reasonable to him. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-03-19 23:43:33 -07:00
Tapani Pälli	3e534489ec	iris: mark switch case fallthrough CID: 1444103 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 08:21:50 +02:00
Tapani Pälli	03cbfbd913	iris: initialize num_cbufs Currently initialized only if 'ish' is non-NULL. CID: 1444106 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-20 08:20:09 +02:00
Daniel Stone	d258b787fa	panfrost: Properly align stride Handle buffers whose width is not aligned to 16px by padding the stride and storing it accordingly. This does not reject imports for images whose stride is not sufficiently aligned. v2: make sure bo->stride is set on imported buffers, and add missing variable definition. (Tomeu) Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-20 04:20:42 +00:00
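Padding a stride for a width that is not a multiple of 16 pixels amounts to rounding the width up before multiplying by the pixel size. This is a minimal sketch of that arithmetic, not the panfrost code; `align_up_pot` and `padded_stride_bytes` are hypothetical names, and a fixed `bytes_per_pixel` is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of `align`, which must be a power of two. */
static uint32_t align_up_pot(uint32_t x, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Hypothetical helper: row stride in bytes after padding the width
 * up to a multiple of 16 pixels. */
static uint32_t padded_stride_bytes(uint32_t width_px, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return align_up_pot(width_px, 16u) * bytes_per_pixel;
}
```

For a 100-pixel-wide buffer at 4 bytes per pixel, the width pads to 112 pixels and the stride becomes 448 bytes.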
Eric Anholt	17115da6ad	v3d: Expose the dma-buf modifiers query. This allows DRI3 to pick between UIF and raster according to whether we're pageflipping or not and whether the pageflipping display can do UIF, avoiding copies for the windowed/composited case that previously was forced to linear. Improves windowed glmark2 -b build:use-vbo=false performance by 30.7783% +/- 13.1719% (n=3)	2019-03-19 08:59:01 -07:00

37284 commits