fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 04:58:05 +02:00

Author	SHA1	Message	Date
Boyuan Zhang	23c5e8bc58	radeon/vce: handle newly added parameters Replace the previous hardcoded value with newly defined parameters Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:21 +02:00
Boyuan Zhang	5490068fb1	st/omx: assign previous values to new structure Assign previously hardcoded values for OMX to newly defined structure. As a result, OMX behaviour will not change at all. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:14 +02:00
Boyuan Zhang	b86bf4b568	vl: add parameters for VAAPI encode Allow specifying more parameters in the encoding interface which were previously just hardcoded in the encoder Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-07-14 09:49:07 +02:00
Christian König	9ce52baf7f	st/mesa: fix reference counting bug in st_vdpau Otherwise we leak the resources created for the DMA-buf descriptors. Signed-off-by: Christian König <christian.koenig@amd.com> Cc: 12.0 <mesa-stable@lists.freedesktop.org> Tested-and-Reviewed by: Leo Liu <leo.liu@amd.com> Ack-by: Tom St Denis <tom.stdenis@amd.com>	2016-07-14 09:33:44 +02:00
Eric Anholt	9194473dd2	vc4: Emit resets of the uniform stream at the starts of blocks. If a block might be entered from multiple locations, then the uniform stream will (probably) be at different points, and we need to make sure that it's pointing where we expect it to be. The kernel also enforces that any block reading a uniform resets uniforms, to prevent reading outside of the uniform stream by using looping.	2016-07-13 23:54:15 -07:00
Eric Anholt	44df061aaa	vc4: Add support for scheduling of branch instructions. For now we don't fill the delay slots, and instead just drop in NOPs.	2016-07-13 23:54:15 -07:00
Eric Anholt	a59da513d3	vc4: Move the QPU instructions to schedule into each block. We'll want to schedule them individually, to handle delay slots.	2016-07-13 23:54:15 -07:00
Eric Anholt	37ecc61662	vc4: Disable vc4_opt_vpm in the presence of control flow. It's a really valuable pass currently, but it will be a mess to rewrite for control flow. For now, just disable it if we have multiple blocks present.	2016-07-13 23:54:15 -07:00
Eric Anholt	ee69cfd11d	vc4: Convert vc4_opt_dead_code to work in the presence of control flow. With control flow, we can't be sure that we'll see the uses of a variable before its def as we walk backwards. Given that NIR is eliminating our long chains of dead code, a simple solution for now seems fine. This slightly changes the order of some optimizations, and so an opt_vpm happens before opt_dce, causing 3 dead MOVs to be turned into dead FMAXes in Minecraft: instructions in affected programs: 52 -> 54 (3.85%)	2016-07-13 23:54:15 -07:00
Eric Anholt	4e797bd98f	vc4: Update copy propagation for control flow. Previously, we could assume that a MOV from a temp was always an available copy, because all temps were SSA in NIR, and their non-SSA state in QIR was just due to the fact that they were from a bcsel or pack_unorm_4x8, so we could use the current value of the temp after that series of QIR instructions to define it. However, this is no longer the case with control flow. Instead, we track a new array of MOVs defined within the block that haven't had their source or dest killed yet, and use that primarily. We fall back to looking through the QIR defs array to handle across-block MOVs, but now require that copies from the SSA defs have an SSA src as well.	2016-07-13 23:54:15 -07:00
Samuel Iglesias Gonsálvez	94135e8736	i965/fs: emit DIM instruction to load 64-bit immediates in HSW v2 (Matt): - Use brw_imm_df() as source argument of DIM instruction. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:11:50 +02:00
Samuel Iglesias Gonsálvez	0534863c47	i965/eu: set DF imm value to the source of DIM According to HSW's PRM, vol02b, the DIM instruction has the following restriction: "Restriction : src0 must be immediate. src0 must specify the :f (F, Float) type encoding but is an immediate 64-bit DF (Double Float) value. dst must have type DF." This commit allows to upload the immediate 64-bit DF value to the source of a DIM instruction even when it is of float type encoding. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:06:01 +02:00
Samuel Iglesias Gonsálvez	6e28976d35	i965: enable the emission of the DIM instruction v2 (Matt): - Take a DF source argument for the DIM instruction emission in the visitors. - Indentation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 08:06:01 +02:00
Jason Ekstrand	b9e99282a6	anv: Add a stub for CmdCopyQueryPoolResults on Ivy Bridge Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-13 20:31:27 -07:00
Timothy Arceri	a738732abf	i965: fix compiler warnings for 32bit build Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-07-14 12:03:59 +10:00
Tim Rowley	29f53d7937	Revert "gallium: Force blend color to 16-byte alignment" This reverts commit `d8d6091a84`. Heap allocations may be only 8-byte aligned on 32-bit system, and so having members with 16-byte alignment (such as in the case where pipe_blend_color is embedded in radeonsi's si_context) is undefined behavior which indeed causes crashes when compiled with gcc -O3. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96835 Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com> Acked-by: Chuck Atkins <chuck.atkins@kitware.com>	2016-07-13 13:55:33 -05:00
Jason Ekstrand	48ed8b6f26	isl/state: Add support for handling auxiliary surfaces Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	76e2dcc131	isl: Add an auxiliary surface usage enum Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	3ab3d97ac9	isl: Add support for color control surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	219024b9a7	isl: Add support for multisample compression surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	33dc8549fb	isl: Add support for HiZ surfaces Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	fc3650a0a9	isl: Kill off isl_format_layout::bs Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	1f0433f075	isl: Take bpb rather than bs in tiling_get_info Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	01855d7331	isl: Use bpb in a few places where it's more natural than bs Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	8c76b9bdce	isl: Use bpb for determining YUV image padding When we initially dropped bpb in favor of bs, we accidentally didn't change this one line properly. This brings it back to what it should be. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	cf9ff082b4	isl: Bring back isl_format_layout::bpb A while ago we got rid of the bits-per-block because we thought we didn't need it. We're about to introduce some very useful 1 and 2-bit formats so we really should be able to handle them again. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	0bd3a7e931	isl: Change the physical size of a W-tile to 128x32 Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	4b62c19c32	isl: Rework the way we define tile sizes. This is based on a very long set of discussions between Chad and myself about how we should properly represent HiZ and CCS buffers. The end result of that discussion was that a tiling actually has two different sizes, a logical size in elements, and a physical size in bytes and rows. This commit reworks ISL's pitch and size calculations to work in terms of these two sizes. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	7270bd0607	isl: Rework the way we handle surface padding Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	a52f26d6e8	isl: Use ARRAY_PITCH_SPAN_FULL for depth/stencil surfaces on gen7 We helpfully inserted a PRM quotation about how we need to use ARRAY_PITCH_SPAN_FULL and then set it to COMPACT. Oops... Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	0d48ac627a	isl: Stop multiplying height by block size The row pitch already specifies the size of a row of elements. Multiplying by the block height simply causes us to allocate as much as 12 times more memory than needed for compressed textures. Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	58c1b1088b	isl: Get rid of tiling_get_extent It was unused Reviewed-by: Chad Versace <chad.versace@intel.com>	2016-07-13 11:47:37 -07:00
Jason Ekstrand	49476576dd	nir/spirv: Don't multiply the push constant block size by 4 I have no idea why we were multiplying by 4 before. The offsets we get from SPIR-V are in bytes and so is nir->num_uniforms so there's no need to do any adjustment whatsoever. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "12.0" <mesa-stable@lists.freedesktop.org>	2016-07-13 11:35:29 -07:00
Jason Ekstrand	1eed753ee8	anv/pipeline: Assert that the number of uniforms from NIR fits	2016-07-13 11:35:24 -07:00
Marek Olšák	0f7a6ea5e7	radeonsi: report accurate SGPR and VGPR spills Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	d227dbe272	radeonsi: add a workaround for a compute VGPR-usage LLVM bug v2: use abort(), describe which LLVM version is affected Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	f4d1de7f86	radeonsi: use LLVMGetTypeKind to tell if an input is an array of descriptors just a cleanup Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	785073ed0b	radeonsi: replace !tbaa with !invariant.load no change in generated code thanks to dereferenceable(n) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	348b9a5b1c	radeonsi: set dereferenceable attribute on descriptor arrays This allows moving the loads arbitrarily in the Sinking pass.

26002 shaders in 14643 tests
Totals:
SGPRS: 2080160 -> 2080160 (0.00 %)
VGPRS: 798875 -> 797826 (-0.13 %)
Spilled SGPRs: 108485 -> 79165 (-27.03 %)
Spilled VGPRs: 327 -> 327 (0.00 %)
Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
Code Size: 36127192 -> 35559780 (-1.57 %) bytes
LDS: 767 -> 767 (0.00 %) blocks
Max Waves: 212464 -> 212672 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

PERCENTAGES / App       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR   Scratch  CodeSize  MaxWaves     Waits
(unknown)                     4         .         .         .         .         .         .         .         .
0ad                           6         .         .         .         .         .         .         .         .
alien_isolation            2938         .    0.04 %   -8.53 %         .         .   -0.71 %   -0.06 %         .
anholt                       10         .         .         .         .         .         .         .         .
batman_arkham_origins       589         .   -0.58 %  -79.54 %         .         .   -6.72 %    0.57 %         .
bioshock-infinite          1769         .   -0.65 %  -89.32 %         .         .   -4.73 %    0.48 %         .
borderlands2               3968         .   -0.31 %  -51.21 %         .         .   -4.09 %    0.22 %         .
brutal-legend               338         .   -0.03 %   -2.95 %         .         .   -0.06 %         .         .
civilization_beyond..       116         .         .  -14.17 %         .         .   -0.88 %         .         .
counter_strike_glob..      1142         .         .         .         .         .         .         .         .
dirt-showdown               541         .   -0.56 %  -40.14 %         .   -3.45 %   -1.82 %    0.35 %         .
dolphin                      22         .         .         .         .         .    0.16 %         .         .
dota2                      1747         .         .         .         .         .    0.01 %         .         .
europa_universalis_4         76         .   -0.23 %  -42.11 %         .         .   -0.96 %         .         .
f1-2015                     774         .   -0.09 %  -28.89 %         .         .   -2.60 %    0.09 %         .
furmark-0.7.0                 4         .         .         .         .         .         .         .         .
gimark-0.7.0                 10         .         .         .         .         .         .         .         .
glamor                       16         .         .         .         .         .         .         .         .
humus-celshading              4         .         .         .         .         .         .         .         .
humus-domino                  6         .         .         .         .         .         .         .         .
humus-dynamicbranching       24         .    0.71 %         .         .         .    0.29 %   -0.45 %         .
humus-hdr                    10         .         .         .         .         .         .         .         .
humus-portals                 2         .         .         .         .         .         .         .         .
humus-volumetricfog..         6         .         .         .         .         .         .         .         .
left_4_dead_2              1762         .         .         .         .         .         .         .         .
metro_2033_redux           2670         .   -0.10 %   -7.15 %         .         .   -0.03 %         .         .
nexuiz                       80         .         .         .         .         .         .         .         .
pixmark-julia-fp32            2         .         .         .         .         .         .         .         .
pixmark-julia-fp64            2         .         .         .         .         .         .         .         .
pixmark-piano-0.7.0           2         .         .         .         .         .         .         .         .
pixmark-volplosion-..         2         .         .         .         .         .         .         .         .
plot3d-0.7.0                  8         .         .         .         .         .         .         .         .
portal                      474         .         .         .         .         .         .         .         .
sauerbraten                   7         .         .         .         .         .         .         .         .
serious_sam_3_bfe           392         .         .  -13.20 %         .         .   -1.81 %         .         .
supertuxkart                  4         .         .         .         .         .         .         .         .
talos_principle             324         .   -0.21 %  -18.39 %         .         .   -2.73 %    0.14 %         .
team_fortress_2             808         .         .         .         .         .         .         .         .
tesseract                   430         .    0.08 %  -68.57 %         .         .   -0.45 %         .         .
tessmark-0.7.0                6         .         .         .         .         .         .         .         .
thea                        172         .         .         .         .         .    0.03 %         .         .
ue4_effects_cave            299         .   -0.04 %  -10.15 %         .         .   -0.25 %    0.04 %         .
ue4_elemental               586         .   -0.02 %  -13.93 %         .         .   -0.13 %    0.02 %         .
ue4_lightroom_inter..        74         .   -0.17 %  -70.00 %         .         .   -1.27 %         .         .
ue4_realistic_rende..        92         .         .  -32.58 %         .         .   -0.35 %         .         .
unigine_heaven              322         .    0.12 %  -54.17 %         .         .   -1.42 %   -0.12 %         .
unigine_sanctuary           264         .         .         .         .         .         .         .         .
unigine_tropics             210         .         .         .         .         .         .         .         .
unigine_valley              278         .   -0.15 %  -40.74 %         .         .   -2.00 %    0.09 %         .
unity                        72         .         .         .         .         .    0.03 %         .         .
warsow                      176         .         .         .         .         .         .         .         .
warzone2100                   4         .         .         .         .         .    0.13 %         .         .
witcher2                   1040         .   -0.03 %  -86.28 %         .         .   -0.28 %    0.01 %         .
xcom_enemy_within          1236         .   -0.24 %  -63.54 %         .         .   -0.93 %    0.18 %         .
yofrankie                    82         .   -0.61 % -100.00 %         .         .   -0.83 %    0.41 %         .
---------------------------------------------------------------------------------------------------------------
Total                     26002         .   -0.13 %  -27.03 %         .   -0.24 %   -1.57 %    0.10 %         .

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	6596ecf8c5	gallivm: add helper lp_add_attr_dereferenceable Not sure if this is the right way to do it, but it seems to work. v2: make it a no-op on LLVM <= 3.5 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	bccf9de4df	radeonsi: clean up shader value metadata code No change in behavior. BTW, tbaa_md_kind == 1, which was the magic number in the code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	d7d7e6adbe	radeonsi: remove LLVMNoUnwindAttribute uses always set by gallivm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	c4807505c0	radeonsi: fix a typo in SI_PARAM_LINEAR_* handling introduced in `476e9cee1d` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	f2f573e777	gallium/radeon: normalize the code style no change in behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Marek Olšák	ed3912d0da	radeonsi: just save buffer sizes instead of buffers while recording IBs whole buffer objects are not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-13 19:46:16 +02:00
Jon Turney	fc8139b146	Add c99_alloca.h include to fix compilation on Cygwin Fix compilation on Cygwin, since `50b22354`, by adding c99_alloca.h include, which should know how to portably make the alloca() prototype available. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 16:11:36 +01:00
Topi Pohjolainen	7d29fee4a8	i965/blorp: Cleanup leftovers from push constant disabling Setup for pixel shader push constants is the same as for other stages. Note that on gen8+ the if-else branches were identical and the generation check for packet size redundant. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 12:10:03 +03:00
Topi Pohjolainen	26778da571	i965/blorp/gen7+: Bring back push constant setup This is partial revert of commit `cc2d0e64`. It looks that even though blorp disables a stage the corresponding 3DSTATE_CONSTANT_XS packet is needed to be programmed. Hardware seems to try to fetch the constants even for disabled stages. Therefore care needs to be taken that the constant buffer is set up properly. Blorp will continue to trash it into non-existing such as before. It is possible that this could be omitted on SKL where the constant buffer is considered when the corresponding binding table settings are changed. Bspec: "The 3DSTATE_CONSTANT_* command is not committed to the shader unit until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_* command is parsed." However, as CONSTANT_XS packet itself does not seem to stall on its own, it is safer to emit the packets for SKL also. Possible alternative to blorp trashing could have been to setup defaults in the beginning of each batch buffer. However, hardware doesn't seem to tolerate these packets being programmed multiple times per primitive. Bspec for IVB: "It is invalid to execute this command more than once between 3D_PRIMITIVE commands." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96878 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-07-13 12:09:35 +03:00
Nicolai Hähnle	65d48fcf8c	radeonsi: silence Coverity warning Coverity's analysis is too weak to understand that r600_init_flushed_depth(_, _, NULL) only returns true when flushed_depth_texture was assigned a non-NULL value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 09:52:39 +02:00
Samuel Iglesias Gonsálvez	a2bd7334ed	i965/fs: do d2x lowering before simd splitting So that we can have gen7 split large writes produced by this lowering pass. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-07-13 07:09:41 +02:00


83410 commits