fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-01-29 05:10:23 +01:00

Author	SHA1	Message	Date
Ian Romanick	8eb36c9129	intel/fs: Emit logical-not of operands on Gen8+ On Gen8+ specifying negation of a logical operation such as AND actually performs a logical-not. Take advantage of this to generate fewer instructions. v2: Major rebase. Use nir_src_as_alu_instr. Fix swizzle handling. No changes on any pre-Gen8 platform. Skylake and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 15466902 -> 15466274 (<.01%) instructions in affected programs: 1262953 -> 1262325 (-0.05%) helped: 682 HURT: 4 helped stats (abs) min: 1 max: 5 x̄: 1.02 x̃: 1 helped stats (rel) min: 0.03% max: 2.40% x̄: 0.18% x̃: 0.04% HURT stats (abs) min: 1 max: 62 x̄: 17.50 x̃: 3 HURT stats (rel) min: 0.03% max: 1.89% x̄: 0.53% x̃: 0.10% 95% mean confidence interval for instructions value: -1.10 -0.73 95% mean confidence interval for instructions %-change: -0.19% -0.15% Instructions are helped. total cycles in shared programs: 410996093 -> 410950440 (-0.01%) cycles in affected programs: 144389048 -> 144343395 (-0.03%) helped: 519 HURT: 51 helped stats (abs) min: 1 max: 1060 x̄: 104.46 x̃: 140 helped stats (rel) min: 0.01% max: 10.98% x̄: 0.34% x̃: 0.03% HURT stats (abs) min: 1 max: 4060 x̄: 167.90 x̃: 22 HURT stats (rel) min: <.01% max: 8.20% x̄: 0.96% x̃: 0.25% 95% mean confidence interval for cycles value: -97.16 -63.02 95% mean confidence interval for cycles %-change: -0.32% -0.13% Cycles are helped. total spills in shared programs: 95311 -> 95329 (0.02%) spills in affected programs: 881 -> 899 (2.04%) helped: 0 HURT: 4 total fills in shared programs: 93629 -> 93634 (<.01%) fills in affected programs: 794 -> 799 (0.63%) helped: 1 HURT: 2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	06eaaf2de9	intel/fs: Refactor ALU source and destination handling to a separate function Other places will need to do this soon to properly handle source swizzles. The patch looks a little odd, but the change is pretty straightforward. All of the swizzle and mask handling is moved out, but the code for handling move instructions and vecN instructions remains in nir_emit_alu. I'm not terribly pleased with the "need_dest" parameter, but get_nir_dest is (somewhat surprisingly) destructive. I am open to suggestions of alternatives. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	fb3ca9109c	intel/fs: Handle OR source modifiers in algebraic optimization Found by inspection. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	c9d5bd050c	intel/fs: Relax type matching rules in cmod propagation from MOV instructions To allow cmod propagation from a MOV in a sequence like: and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD mov.nz.f0(16) null<1>F g31<8,8,1>D A similar change to the vec4 backend had no effect. Somewhere between `c1ec582059` and `40fc4b5acd` (1,094 commits) the effectiveness of this patch diminished, and as of commit `d7e0d47b9d` (nir: Add a bunch of b2[if] optimizations) this optimization no longer has any effect on any platform. A later patch "intel/fs: Use De Morgan's laws to avoid logical-not of a logic result on Gen8+," generates some instruction sequences that require this change in order for cmod propagation to make progress. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	eae19f5f19	nir/algebraic: Replace i2b used by bcsel or if-statement with comparison All of the helped shaders are in Deus Ex. I looked at a couple shaders, and they have a pattern like: vec1 32 ssa_373 = i2b32 ssa_345.w vec1 32 ssa_374 = bcsel ssa_373, ssa_20, ssa_0 ... vec1 32 ssa_377 = ine ssa_345.w, ssa_0 if ssa_377 { ... vec1 32 ssa_416 = i2b32 ssa_385.w vec1 32 ssa_417 = bcsel ssa_416, ssa_386, ssa_374 ... } The massive help occurs because the i2b32 is removed, then other passes determine that ssa_374 must be ssa_20 inside the if-statement allowing the first bcsel to also be deleted. v2: Rebase on 1-bit Boolean changes. v3: Fix i2b32 vs ine problem in if-statement replacement. Noticed by Bas. Skylake total instructions in shared programs: 15241394 -> 15186287 (-0.36%) instructions in affected programs: 890583 -> 835476 (-6.19%) helped: 355 HURT: 0 helped stats (abs) min: 1 max: 497 x̄: 155.23 x̃: 149 helped stats (rel) min: 0.09% max: 16.49% x̄: 6.10% x̃: 6.59% 95% mean confidence interval for instructions value: -165.07 -145.39 95% mean confidence interval for instructions %-change: -6.42% -5.77% Instructions are helped. total cycles in shared programs: 373846583 -> 371023357 (-0.76%) cycles in affected programs: 118972102 -> 116148876 (-2.37%) helped: 343 HURT: 14 helped stats (abs) min: 45 max: 118284 x̄: 8332.32 x̃: 6089 helped stats (rel) min: 0.03% max: 38.19% x̄: 2.48% x̃: 1.77% HURT stats (abs) min: 120 max: 4126 x̄: 2482.79 x̃: 3019 HURT stats (rel) min: 0.16% max: 17.37% x̄: 2.13% x̃: 1.11% 95% mean confidence interval for cycles value: -8723.28 -7093.12 95% mean confidence interval for cycles %-change: -2.57% -2.02% Cycles are helped. total spills in shared programs: 32401 -> 23465 (-27.58%) spills in affected programs: 24457 -> 15521 (-36.54%) helped: 343 HURT: 0 total fills in shared programs: 37866 -> 31765 (-16.11%) fills in affected programs: 18889 -> 12788 (-32.30%) helped: 343 HURT: 0 Broadwell and Haswell had similar results. 
(Haswell shown) Haswell total instructions in shared programs: 13764783 -> 13750679 (-0.10%) instructions in affected programs: 1176256 -> 1162152 (-1.20%) helped: 334 HURT: 21 helped stats (abs) min: 1 max: 358 x̄: 42.59 x̃: 47 helped stats (rel) min: 0.09% max: 11.81% x̄: 1.30% x̃: 1.37% HURT stats (abs) min: 1 max: 61 x̄: 5.76 x̃: 1 HURT stats (rel) min: 0.03% max: 1.84% x̄: 0.17% x̃: 0.03% 95% mean confidence interval for instructions value: -43.99 -35.47 95% mean confidence interval for instructions %-change: -1.35% -1.08% Instructions are helped. total cycles in shared programs: 386511910 -> 385402528 (-0.29%) cycles in affected programs: 143831110 -> 142721728 (-0.77%) helped: 327 HURT: 39 helped stats (abs) min: 16 max: 25219 x̄: 3519.74 x̃: 3570 helped stats (rel) min: <.01% max: 10.26% x̄: 0.95% x̃: 0.96% HURT stats (abs) min: 16 max: 4881 x̄: 1065.95 x̃: 997 HURT stats (rel) min: <.01% max: 16.67% x̄: 0.70% x̃: 0.24% 95% mean confidence interval for cycles value: -3375.59 -2686.60 95% mean confidence interval for cycles %-change: -0.92% -0.64% Cycles are helped. total spills in shared programs: 100480 -> 97846 (-2.62%) spills in affected programs: 84702 -> 82068 (-3.11%) helped: 316 HURT: 21 total fills in shared programs: 96877 -> 94369 (-2.59%) fills in affected programs: 69167 -> 66659 (-3.63%) helped: 316 HURT: 9 No changes on Ivy Bridge or earlier platforms. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:42:14 -08:00
Ian Romanick	d2056ab993	intel/vec4: Emit constants for some ALU sources as immediate values In some cases of flow control, the constant propagation is not able to determine that the source of an instruction must be a constant value. When we still have NIR SSA values, we can easily determine this. Emit the immediate value during code generation to possibly avoid spurious loads of constants into registers. I wrote this patch to prevent a couple of trivial regressions in vec4 shaders caused by "nir/algebraic: Replace i2b used by bcsel or if-statement with comparison". The final result was quite a bit better than that... No shader-db changes on any Gen8+ platform. v2: Assert that we never get a negation source modifier on Gen8+. Suggested by Ken. This should never happen because we don't normally use vec4 for Gen8+ (requires an environment variable to force it), and there's no code to generate these negations. Still, erring on the side of caution is better. Haswell total instructions in shared programs: 13776218 -> 13764783 (-0.08%) instructions in affected programs: 663931 -> 652496 (-1.72%) helped: 3495 HURT: 1 helped stats (abs) min: 1 max: 30 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.21% max: 10.00% x̄: 1.79% x̃: 1.49% HURT stats (abs) min: 24 max: 24 x̄: 24.00 x̃: 24 HURT stats (rel) min: 12.24% max: 12.24% x̄: 12.24% x̃: 12.24% 95% mean confidence interval for instructions value: -3.39 -3.15 95% mean confidence interval for instructions %-change: -1.84% -1.75% Instructions are helped. 
total cycles in shared programs: 386818984 -> 386511910 (-0.08%) cycles in affected programs: 20379636 -> 20072562 (-1.51%) helped: 3052 HURT: 476 helped stats (abs) min: 2 max: 12516 x̄: 110.40 x̃: 6 helped stats (rel) min: 0.05% max: 24.68% x̄: 1.58% x̃: 0.69% HURT stats (abs) min: 2 max: 416 x̄: 62.76 x̃: 24 HURT stats (rel) min: 0.10% max: 10.75% x̄: 4.03% x̃: 2.18% 95% mean confidence interval for cycles value: -115.57 -58.51 95% mean confidence interval for cycles %-change: -0.93% -0.73% Cycles are helped. total spills in shared programs: 100482 -> 100480 (<.01%) spills in affected programs: 79 -> 77 (-2.53%) helped: 3 HURT: 1 total fills in shared programs: 96883 -> 96877 (<.01%) fills in affected programs: 85 -> 79 (-7.06%) helped: 4 HURT: 0 Ivy Bridge total instructions in shared programs: 12000562 -> 11990113 (-0.09%) instructions in affected programs: 572581 -> 562132 (-1.82%) helped: 3106 HURT: 0 helped stats (abs) min: 1 max: 30 x̄: 3.36 x̃: 2 helped stats (rel) min: 0.21% max: 10.00% x̄: 1.86% x̃: 1.49% 95% mean confidence interval for instructions value: -3.49 -3.23 95% mean confidence interval for instructions %-change: -1.91% -1.81% Instructions are helped. total cycles in shared programs: 180958504 -> 180664500 (-0.16%) cycles in affected programs: 19991810 -> 19697806 (-1.47%) helped: 2654 HURT: 486 helped stats (abs) min: 2 max: 12516 x̄: 121.61 x̃: 6 helped stats (rel) min: 0.05% max: 20.66% x̄: 1.48% x̃: 0.68% HURT stats (abs) min: 2 max: 396 x̄: 59.18 x̃: 24 HURT stats (rel) min: 0.05% max: 9.62% x̄: 3.82% x̃: 2.16% 95% mean confidence interval for cycles value: -125.62 -61.64 95% mean confidence interval for cycles %-change: -0.76% -0.56% Cycles are helped. 
Sandy Bridge total instructions in shared programs: 10842336 -> 10835438 (-0.06%) instructions in affected programs: 395340 -> 388442 (-1.74%) helped: 1926 HURT: 0 helped stats (abs) min: 1 max: 22 x̄: 3.58 x̃: 2 helped stats (rel) min: 0.10% max: 9.68% x̄: 1.78% x̃: 1.42% 95% mean confidence interval for instructions value: -3.73 -3.43 95% mean confidence interval for instructions %-change: -1.84% -1.72% Instructions are helped. total cycles in shared programs: 154590074 -> 154569050 (-0.01%) cycles in affected programs: 8159932 -> 8138908 (-0.26%) helped: 1670 HURT: 228 helped stats (abs) min: 2 max: 260 x̄: 18.13 x̃: 6 helped stats (rel) min: 0.02% max: 8.70% x̄: 0.74% x̃: 0.28% HURT stats (abs) min: 2 max: 1798 x̄: 40.58 x̃: 14 HURT stats (rel) min: 0.03% max: 12.97% x̄: 1.04% x̃: 0.31% 95% mean confidence interval for cycles value: -13.51 -8.64 95% mean confidence interval for cycles %-change: -0.60% -0.46% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8212357 -> 8206587 (-0.07%) instructions in affected programs: 323664 -> 317894 (-1.78%) helped: 1457 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 3.96 x̃: 3 helped stats (rel) min: 0.33% max: 11.49% x̄: 1.86% x̃: 1.44% 95% mean confidence interval for instructions value: -4.14 -3.78 95% mean confidence interval for instructions %-change: -1.93% -1.78% Instructions are helped. total cycles in shared programs: 187668016 -> 187657422 (<.01%) cycles in affected programs: 14856234 -> 14845640 (-0.07%) helped: 1372 HURT: 83 helped stats (abs) min: 2 max: 24 x̄: 7.92 x̃: 6 helped stats (rel) min: 0.02% max: 1.14% x̄: 0.12% x̃: 0.08% HURT stats (abs) min: 2 max: 14 x̄: 3.20 x̃: 2 HURT stats (rel) min: 0.03% max: 0.60% x̄: 0.12% x̃: 0.12% 95% mean confidence interval for cycles value: -7.65 -6.91 95% mean confidence interval for cycles %-change: -0.11% -0.10% Cycles are helped. 
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-01 12:41:46 -08:00
Eric Engestrom	fc82ea1350	Revert "swr/rast: Archrast codegen updates" This reverts the following commits: `71a76a47cc` "swr/codegen: fix autotools build" `7763e664ce` "meson/swr: replace hard-coded path with current_build_dir()" `773b3ceaca` "swr/rast: Fix autotools and scons codegen" `16e10b8c30` "swr/rast: Add general SWTag statistics" `b45a15a39f` "swr/rast: Add string handling to AR event framework" `8608a747aa` "swr/rast: Add initial SWTag proto definitions" `93cd9905c8` "swr/rast: Cleanup and generalize gen_archrast" The last one in this list broke all the build systems that can build this (meson, autotools & scons). See MR !304 for more details: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/304 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-01 16:46:32 +00:00
Fritz Koenig	12af6b30a3	freedreno/a6xx: Enable UBWC modifier Adding the supported_modifiers allows buffers to be created with UBWC	2019-03-01 15:51:16 +00:00
Fritz Koenig	4715e7a98a	freedreno: UBWC allocator UBWC requires space for a metadata or flag buffer that contains compression data. Each 16x4 tile of image data corresponds to a byte of compression data. This buffer needs to be stored before (at a lower address than) the image buffer in order to match what the display driver expects. This allows the display driver to scan out a UBWC buffer directly.	2019-03-01 15:51:16 +00:00
Fritz Koenig	3e6758a4e7	freedreno/a6xx: UBWC support Universal bandwidth compression (UBWC) reduces memory bandwidth by compressing buffers. This compression takes the form of a full sized image buffer as well as a smaller metadata buffer.	2019-03-01 15:51:16 +00:00
Fritz Koenig	41082446db	freedreno: pass count to query_dmabuf_modifiers query_dmabuf_modifiers needs to know the max number of modifiers that the list will hold.	2019-03-01 15:51:16 +00:00
Eric Engestrom	2793417ec6	anv: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	258e463db5	anv: remove spaces around kwargs assignment pylint complains: > C0326: No space allowed around keyword argument assignment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	7b704fd2fd	anv: drop unused parameter I'm guessing a previous version of this script used an index-based map of entrypoints, but that's not the case anymore. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Eric Engestrom	b503d4e458	anv: simplify chained comparison Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-01 11:20:28 +00:00
Caio Marcelo de Oliveira Filho	1458aa1f78	nir/copy_prop_vars: handle indirect vector elements Differently than the direct case, the indirect array derefs of vector are handled like regular derefs, with the exception that we ignore any vector entry that has SSA values when performing a load. Such SSA values don't help loading of the indirect unless we emit an if-ladder. Copy_derefs are supported for indirects. Also enable two tests that now pass. v2: Remove unnecessary temporaries. Be clearer when identifying the case where copy_entry doesn't help when we are dealing with an indirect array_deref (of a vector). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:55:31 -08:00
Caio Marcelo de Oliveira Filho	6c0de78cc2	nir/copy_prop_vars: prefer using entries from equal derefs When looking up an entry to use, always prefer an equal match, as it is more likely to contain reusable SSA or derefs to propagate. This will be necessary when adding entries with array derefs of vectors, because we don't want the vector if the equal entry (an array deref of that vector) is present. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:55:31 -08:00
Caio Marcelo de Oliveira Filho	61965afd00	nir/copy_prop_vars: add tests for indirect array deref Both on an actual array and on a vector, and an extra test on a vector mixing direct and indirect access. The vector tests are disabled and will be enabled by a later commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:55:31 -08:00
Caio Marcelo de Oliveira Filho	96c32d7776	nir/copy_prop_vars: handle load/store of vector elements When a direct array deref is used on a vector type (for loads and stores), copy_prop_vars is now smart enough to propagate values it knows about. Given a 'vec4 v', storing to v[3] will update the copy entry for v and it is equivalent to a write to v.w. Loading from v[1] will try first to see if there's a known value for v.y -- and drop the load in that case. The copy entries still always refer to the entire vectors, so the operations happen on the parent deref (the 'vector') and the values are fixed accordingly. It might be the case now that certain entries have not only different SSA defs in each element but also those come from different components than they are set to, because stores to individual elements always come from a SSA definition with a single component. Tests related to these cases are now enabled. v2: Instead of asserting on invalid indices, "load" an undef and remove the store. (Jason) v3: Merge code path for the cases of is_array_deref_of_vector into the regular code path. Add a base_index parameter to value_set_from_value. (code changes by Jason) v4: Removed the get_entry_for_deref helper, now being used only once. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:50:05 -08:00
Caio Marcelo de Oliveira Filho	33dafdc024	nir/copy_prop_vars: use NIR_MAX_VEC_COMPONENTS Also replace uses of 0xf with the appropriate full mask created from the number of components. Note that an increase of MAX might make us change how the data is stored later on, but for now at least we make sure the pass is not hardcoded. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:50:05 -08:00
Caio Marcelo de Oliveira Filho	e84c841fb0	nir/copy_prop_vars: rename/refactor store_to_entry helper The name reflected this function role back when the pass also did dead write elimination. So rename it to what it does now, which is setting a value using another value; and narrow the argument list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 23:50:05 -08:00
Christian Gmeiner	6c61449251	etnaviv: fix compile warnings Fixes the following compile warnings: [591/629] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_context.c.o'. ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_context.c: In function 'etna_cmd_stream_reset_notify': ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_context.c:334:22: warning: unused variable 'entry' [-Wunused-variable] struct set_entry entry; ^~~~~ [604/629] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_resource.c.o'. ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_resource.c: In function 'etna_resource_used': ../../src/ac_mesa/src/gallium/drivers/etnaviv/etnaviv_resource.c:649:22: warning: unused variable 'entry' [-Wunused-variable] struct set_entry entry; ^~~~~ Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-03-01 08:45:05 +01:00
Christian Gmeiner	64813541d5	etnaviv: fix resource usage tracking across different pipe_context's A pipe_resource can be shared by all the pipe_context's hanging off the same pipe_screen. Changes from v2 -> v3: - add locking with mtx_*() to resource and screen (Marek) Changes from v3 -> v4: - drop rsc->lock, just use screen->lock for the entire serialization (Marek) - simplify etna_resource_used() flush condition, which also prevents potentially flushing resources twice (Marek) - don't remove resources from screen->used_resources in etna_cmd_stream_reset_notify(), they may still be used in other contexts and may need flushing there later on (Marek) Changes from v4 -> v5: - Fix coding style issues reported by Guido Changes from v5 -> v6: - Add missing locking in etna_transfer_map(..) (Boris) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Marek Vasut <marex@denx.de> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-03-01 08:08:56 +01:00
Christian Gmeiner	f1061fa577	etnaviv: enable ETC2 texture compression support for HALTI0 GPUs Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Christian Gmeiner	5d09325c1c	etnaviv: hook-up etc2 patching Changes v1 -> v2: - Avoid the GPU sampling from the resource that gets mutated by the transfer map by setting DRM_ETNA_PREP_WRITE. Changes v2 -> v3: - make use of likely(..) - drop minor optimization regarding rsc->layout == ETNA_LAYOUT_LINEAR - better documentation why DRM_ETNA_PREP_WRITE is needed Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Christian Gmeiner	d8177f6233	etnaviv: keep track of mapped bo address Saves us from calling etna_bo_map(..) and saves us from doing the same offset calcs for map() and unmap() operations. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Christian Gmeiner	5bb4e6956d	etnaviv: implement ETC2 block patching for HALTI0 ETC2 is supported with HALTI0, however that implementation is buggy in hardware. The blob driver does per-block patching to work around this. We need to swap colors for t-mode etc2 blocks. Changes v2 -> v3: - Drop redundant format check Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Lucas Stach <l.stach@pengutronix.de>	2019-03-01 08:02:17 +01:00
Jason Ekstrand	e8f863e718	intel/compiler: Re-prefix non-logical surface opcodes with VEC4 The scalar back-end uses SHADER_OPCODE_SEND for all surface messages so we no longer need the non-logical opcodes there. Prefix them VEC4 so it's clear that they're only used by the vec4 back-end. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	95ae400abc	intel/schedule_instructions: Move some comments Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	aeaba24fcb	intel/compiler: Drop unused surface opcodes The unused typed surface read/write support in the vec4 back-end has been dropped and the fs back-end now uses SHADER_OPCODE_SEND for all image and buffer ops. There's no reason to keep these opcodes around anymore. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	a04c737215	intel/fs: Get rid of the IMAGE_SIZE opcode Since switching to SHADER_OPCODE_SEND for image operations, we no longer need the non-logical opcode. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	10b7d14c31	intel/vec4: Drop dead code for handling typed surface messages Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	9d437f9482	intel/fs: Drop the fs_surface_builder All of the actual abstraction (except possibly setting size_written) happens as part of the logical opcodes. The only thing that the surface builder is providing at this point is extra levels of functions to call through. I'm going to be adding bindless image support soon and all the extra abstraction here is just getting in the way. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	494a0543e6	intel/fs: Re-order logical surface arguments It makes more sense to start at the surface then move on to the address and then the data. Also, this is a really good test of whether or not we got all the places that use the sources by explicit integer number. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jason Ekstrand	94f8fd9a0c	intel/fs: Add an enum type for logical sampler inst sources Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-02-28 16:58:20 -06:00
Jose Fonseca	838c0485e0	scons: Workaround failures with MSVC when using SCons 3.0.[2-4]. This change applies the workaround suggested by Bill Deegan on the affected SCons versions. It also adds a comment with the URL explaining why we were customizing the decider and max_drift in the first place, as I had forgotten all about it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109443 Tested-by: liviuprodea@yahoo.com Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-02-28 21:26:15 +00:00
Kristian H. Kristensen	87c2e8cbc9	freedreno: Fix a couple of warnings Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-28 10:43:53 -08:00
Kristian H. Kristensen	a5a19d1bc8	freedreno/a6xx: Don't zero SO buffer addresses Just disable SO in VPC_SO_BUF_CNTL. Less noise in dumps. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-28 10:43:53 -08:00
Kristian H. Kristensen	7dee916105	freedreno/a6xx: Only output MRT control for used framebuffers Not much of an optimization, but makes for less noise in the command buffer dumps. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-02-28 10:43:53 -08:00
Eric Engestrom	df5cd51259	gitlab-ci: install xmllint to validate 00-mesa-defaults.conf Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-28 17:30:48 +00:00
Eric Engestrom	bb6b691c57	driconf: add DTD to allow the drirc xml (00-mesa-defaults.conf) to be validated This DTD can be used to validate the drirc xml: $ xmllint --noout --valid 00-mesa-defaults.conf Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-02-28 17:30:44 +00:00
Eric Engestrom	4c3b293242	vulkan: use VkBase{In,Out}Structure instead of a custom struct VkBaseInStructure and VkBaseOutStructure are part of vulkan_core.h (which is part of vulkan.h) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-02-28 16:25:59 +00:00
Lionel Landwerlin	add4b8930a	vulkan/overlay: add support for fps output in file Also make the sampling period configurable. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-28 12:40:57 +00:00
Lionel Landwerlin	b6b275212d	vulkan/overlay: rework option parsing Makes adding new options easier. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-28 12:40:57 +00:00
Lionel Landwerlin	4e29a1d36a	vulkan/overlay: fix min/max computations This shouldn't be conditional on the acquire time being visible. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-02-28 12:40:57 +00:00
Emil Velikov	7ad1a05c83	egl/sl: use kms_swrast with vgem instead of a random GPU VGEM and kms_swrast were introduced to work with one another. All we do is CPU rendering to dumb buffers. There is no reason to carve out GPU memory, increasing the memory pressure on a device that could make a better use of it. Note: - The original code did not work out of the box, since the dumb buffer ioctls are not exposed to render nodes. - This requires libdrm commit 3df8a7f0 ("xf86drm: fallback to MODALIAS for OF less platform devices") - The non-kms, swrast is unaffected by this change. v2: - elaborate what and how is/isn't working (Eric) - simplify driver_name handling (Eric) v3: - move node_type outside of the loop (Eric) - kill no longer needed DRM_RENDER_DEV_NAME define Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-02-28 12:05:03 +00:00
Emil Velikov	218c7b5aca	egl/sl: use drmDevice API to enumerate available devices This provides for a more comprehensive iteration and slightly more straight-forward codebase. v2: - s/dpy/disp/ - keep original 64 devices (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-02-28 12:02:38 +00:00
Emil Velikov	893421f315	egl/sl: split out swrast probe into separate function Make the code a bit easier to read. As a bonus point this makes it obvious that we forgot to call _eglAddDevice() for the device - do so. v2: - s/dpy/disp/ (Eric) - free(driver_name) on dri2_load_driver_swrast() failure (Eric) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-02-28 12:02:19 +00:00
Juan A. Suarez Romero	b43b55d461	nir/spirv: return after emitting a branch in block When emitting a branch in a block, it does not make sense to continue processing further instructions, as they will not be reachable. This fixes a nasty case with a loop containing a branch whose then-part and else-part both exit the loop: %1 = OpLabel OpLoopMerge %2 %3 None OpBranchConditional %false %2 %2 %3 = OpLabel OpBranch %1 %2 = OpLabel [...] We know that block %1 will always branch to block %2, which is the merge block for the loop. And thus a break is emitted. If we keep processing further instructions, we will be processing the branch conditional and thus emitting the proper NIR conditional, which leads to instructions after the break. This fixes dEQP-VK.graphicsfuzz.continue-and-merge. CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-02-28 09:47:06 +01:00
Eric Engestrom	0c3287e94d	egl/android: replace magic 0=CbCr,1=CrCb with simple enum Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-02-28 07:44:46 +00:00
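
Commit 4715e7a98a (freedreno: UBWC allocator) above says that each 16x4 tile of image data maps to one byte of compression data, and that the metadata buffer sits at a lower address than the image buffer. The arithmetic can be sketched roughly as below. This is only an illustration, not the driver's code: the struct, the function names, the `meta_align` parameter, and the choice to measure tiles in pixels are all assumptions made for the example, and the real allocator may apply additional pitch and alignment rules.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical layout description, not a Mesa structure. */
struct ubwc_layout_sketch {
    uint32_t meta_size;    /* bytes of metadata (flag) buffer, unpadded */
    uint32_t image_offset; /* where image data starts in the allocation */
};

/* Assumption: tiles are counted in pixels and the image data starts at
 * the metadata size rounded up to meta_align (a made-up parameter). */
static struct ubwc_layout_sketch
compute_ubwc_layout_sketch(uint32_t width, uint32_t height,
                           uint32_t meta_align)
{
    struct ubwc_layout_sketch l;
    uint32_t tiles_x = (width + 15) / 16; /* tiles are 16 wide */
    uint32_t tiles_y = (height + 3) / 4;  /* tiles are 4 tall */

    l.meta_size = tiles_x * tiles_y;      /* one byte per 16x4 tile */
    /* Metadata comes first (lower address), image data follows it. */
    l.image_offset = align_pot(l.meta_size, meta_align);
    return l;
}
```

Under these assumptions, a 1920x1080 surface needs 120 x 270 = 32400 bytes of metadata, and with a 4096-byte alignment the image data would start at offset 32768.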

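Commits 33dafdc024 and 96c32d7776 above mention replacing hardcoded 0xf masks with a full mask derived from the number of components, and treating a store to v[3] as a write to v.w. A minimal sketch of that mask arithmetic follows. The helper names and the `SKETCH_MAX_VEC_COMPONENTS` constant are hypothetical stand-ins chosen for this example (the value 4 is an assumption), not the identifiers used by the pass.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for NIR_MAX_VEC_COMPONENTS; the value 4 is an assumption. */
#define SKETCH_MAX_VEC_COMPONENTS 4

/* Mask with one bit set per component, e.g. 3 components -> 0x7. */
static uint32_t full_mask_sketch(unsigned num_components)
{
    assert(num_components >= 1 &&
           num_components <= SKETCH_MAX_VEC_COMPONENTS);
    return (1u << num_components) - 1u;
}

/* Write mask for a direct array deref of a vector: v[index]. */
static uint32_t element_mask_sketch(unsigned index)
{
    assert(index < SKETCH_MAX_VEC_COMPONENTS);
    return 1u << index;
}
```

With these helpers, a vec4 gets the familiar 0xf as its full mask, and a store to v[3] touches only bit 3 (0x8), the same bit a write to v.w would set.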

115447 commits