fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-04 20:38:06 +02:00

Author	SHA1	Message	Date
Krzysztof Raszkowski	07adc47460	gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format. Reject the new formats in swr to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-12-03 16:51:24 +00:00
Rob Clark	b31637c453	gitlab-ci: disable junit results for deqp They don't seem to be hugely useful, and seem to be bogging down gitlab. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-03 08:46:39 -08:00
Jason Ekstrand	b1f37688ba	anv: Set up SBE_SWIZ properly for gl_Viewport gl_Viewport is also in the VUE header so we need to whack the read offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that case as well. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-03 16:20:50 +00:00
Michel Dänzer	0c88d5952a	gitlab-ci: Update to current ci-templates master Fixes skopeo copy failures. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-03 16:03:31 +01:00
Samuel Pitoiset	f63a3132e8	ac/llvm: fix atomic var operations if source isn't a deref Fixes some CTS regressions. Fixes: `e61a826f39` ("ac/llvm: fix pointer type for global atomics") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-03 09:41:33 +01:00
Neil Armstrong	dde734030b	Add support for T820 CI Jobs Tomeu: - Small rebase fixups Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 06:44:08 +01:00
Dave Airlie	502548a09c	gallivm/llvmpipe: add support for front facing in sysval. This wires up the front facing value as a sysval, I'd like to remove the other facing code but I'd need to confirm VMware don't use it first. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-03 15:29:04 +10:00
Dave Airlie	f52cdaa517	llvmpipe/images: handle undefined atomic without crashing just return 0 for unbound atomic operations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-03 15:29:04 +10:00
Alyssa Rosenzweig	71dd52e056	panfrost: Remove blend shader hack This is no longer used. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	c707b4d0f9	gitlab-ci: Test Panfrost on T720 GPUs Now that the Mali T720 GPU is supported at the same level as the T760, test it on PINE64 H64 boards. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	6d05e38a96	gitlab-ci: Remove non-default skips from Panfrost During the past months, Panfrost has matured considerably and several tests stopped being flaky or failing at all. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	b655be7252	panfrost: White list the Mali T720 Support for this GPU is equal now to that of T760, so whitelist it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	8555bffafd	pan/midgard: Splatter on fragment out Make sure that the fragment is complete when writing it out. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	ab81a23d36	panfrost: Simplify shader patching We need to always upload anyway. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	6ddaa5558a	panfrost: Simplify draw_flags Fixes dEQP-GLES3.functional.primitive_restart.*. Note the 0x18000 value is accidentally somehow enabling primitive restart for some reason. I'm not sure where this value came from but let's not. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	9fb0904712	panfrost: Implement pan_tiler for non-hierarchy GPUs The algorithm is as described. Nothing fancy here, just need to add some new code paths depending on which model we're running on. Tomeu: - Also disable tiling when !hierarchy and !vertex_count - Avoid creating polygon lists smaller than the minimum when vertex_count > 0 but tile size smaller than 16 byte - Take into account tile size when calculating polygon list size for !hierarchy - Allow 0-sized tiles in a single dimension Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	63cd5b8198	panfrost: Add information about T720 tiling We've figured out most of the big pieces, and though it looks faintly like other Midgards, it's much simpler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	6887ff4e79	panfrost: Add quirks system to cmdstream Similarly to how it's already done in the compiler, add a way to express differences between GPU models that need to be taken into account when assembling the cmdstream. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Ian Romanick	fbd5359a0a	nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select Reviewed-by: Matt Turner <mattst88@gmail.com> All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14660366 -> 14653437 (-0.05%) instructions in affected programs: 316166 -> 309237 (-2.19%) helped: 905 HURT: 10 helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6 helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97% 95% mean confidence interval for instructions value: -7.91 -7.23 95% mean confidence interval for instructions %-change: -4.46% -3.99% Instructions are helped. total cycles in shared programs: 228571646 -> 228549759 (<.01%) cycles in affected programs: 56239919 -> 56218032 (-0.04%) helped: 681 HURT: 216 helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10 helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65% HURT stats (abs) min: 1 max: 320 x̄: 42.09 x̃: 14 HURT stats (rel) min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49% 95% mean confidence interval for cycles value: -41.51 -7.29 95% mean confidence interval for cycles %-change: -0.80% -0.49% Cycles are helped. LOST: 1 GAINED: 0	2019-12-02 16:46:20 -08:00
Ian Romanick	780b5c1037	nir/algebraic: Simplify some Inf and NaN avoidance code Since a is non-negative, neither fsqrt nor frsq should return NaN. frsq should only return Inf when fsqrt returns 0. The changes are pretty small, but this turns a few hundred hurt shaders in the next patch into helped shaders. An alternative to the intBitsToFloat is to import numpy and do np.finfo(np.float32).max. That's more explicit, but we may also want to have specific bit encodings of float values later. I could be convinced either way, but intBitsToFloat(0x7f7fffff) was what I implemented first. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14661140 -> 14661104 (<.01%) instructions in affected programs: 7520 -> 7484 (-0.48%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.52% -0.47% Instructions are helped. total cycles in shared programs: 228585416 -> 228584806 (<.01%) cycles in affected programs: 56321 -> 55711 (-1.08%) helped: 32 HURT: 0 helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10 helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65% 95% mean confidence interval for cycles value: -28.32 -9.80 95% mean confidence interval for cycles %-change: -1.63% -0.54% Cycles are helped. 
Sandy Bridge total cycles in shared programs: 152991077 -> 152991075 (<.01%) cycles in affected programs: 11525 -> 11523 (-0.02%) helped: 2 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -5.27 4.27 95% mean confidence interval for cycles %-change: -0.16% 0.15% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45.	2019-12-02 16:46:20 -08:00
Ian Romanick	d15344c0f5	intel/compiler: Increase nir_opt_peephole_select threshold I tried 2, 4, 6, 8, and 10. 8 seemed to be the sweet spot across all Intel platforms. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14736141 -> 14661140 (-0.51%) instructions in affected programs: 2272413 -> 2197412 (-3.30%) helped: 8416 HURT: 140 helped stats (abs) min: 1 max: 1152 x̄: 8.99 x̃: 6 helped stats (rel) min: 0.13% max: 42.55% x̄: 4.15% x̃: 3.20% HURT stats (abs) min: 1 max: 140 x̄: 4.73 x̃: 1 HURT stats (rel) min: 0.03% max: 3.44% x̄: 0.87% x̃: 0.60% 95% mean confidence interval for instructions value: -9.36 -8.17 95% mean confidence interval for instructions %-change: -4.14% -3.99% Instructions are helped. total cycles in shared programs: 231560416 -> 228585416 (-1.28%) cycles in affected programs: 126536021 -> 123561021 (-2.35%) helped: 7092 HURT: 1898 helped stats (abs) min: 1 max: 419320 x̄: 519.02 x̃: 159 helped stats (rel) min: <.01% max: 77.25% x̄: 13.52% x̃: 11.77% HURT stats (abs) min: 1 max: 14518 x̄: 371.91 x̃: 36 HURT stats (rel) min: <.01% max: 103.23% x̄: 5.92% x̃: 2.55% 95% mean confidence interval for cycles value: -514.34 -147.50 95% mean confidence interval for cycles %-change: -9.69% -9.14% Cycles are helped. 
total spills in shared programs: 5763 -> 5848 (1.47%) spills in affected programs: 1797 -> 1882 (4.73%) helped: 13 HURT: 13 total fills in shared programs: 17163 -> 16931 (-1.35%) fills in affected programs: 7214 -> 6982 (-3.22%) helped: 22 HURT: 19 total sends in shared programs: 730410 -> 730246 (-0.02%) sends in affected programs: 2705 -> 2541 (-6.06%) helped: 114 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.60% max: 20.00% x̄: 7.26% x̃: 5.88% 95% mean confidence interval for sends value: -1.55 -1.33 95% mean confidence interval for sends %-change: -7.90% -6.62% Sends are helped. LOST: 4 GAINED: 0 Sandy Bridge total instructions in shared programs: 10760511 -> 10724637 (-0.33%) instructions in affected programs: 961305 -> 925431 (-3.73%) helped: 3734 HURT: 110 helped stats (abs) min: 1 max: 151 x̄: 9.66 x̃: 8 helped stats (rel) min: 0.14% max: 41.21% x̄: 4.93% x̃: 3.95% HURT stats (abs) min: 1 max: 20 x̄: 1.68 x̃: 1 HURT stats (rel) min: 0.12% max: 5.41% x̄: 0.88% x̃: 0.52% 95% mean confidence interval for instructions value: -9.76 -8.91 95% mean confidence interval for instructions %-change: -4.90% -4.63% Instructions are helped. total cycles in shared programs: 153359411 -> 152991077 (-0.24%) cycles in affected programs: 11615401 -> 11247067 (-3.17%) helped: 2725 HURT: 1138 helped stats (abs) min: 1 max: 2844 x̄: 164.27 x̃: 80 helped stats (rel) min: 0.02% max: 48.60% x̄: 7.47% x̃: 3.91% HURT stats (abs) min: 1 max: 4351 x̄: 69.69 x̃: 25 HURT stats (rel) min: 0.02% max: 40.00% x̄: 3.39% x̃: 1.47% 95% mean confidence interval for cycles value: -103.18 -87.52 95% mean confidence interval for cycles %-change: -4.57% -3.97% Cycles are helped. 
total sends in shared programs: 584038 -> 583855 (-0.03%) sends in affected programs: 3512 -> 3329 (-5.21%) helped: 157 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1 helped stats (rel) min: 2.38% max: 25.00% x̄: 6.52% x̃: 6.06% 95% mean confidence interval for sends value: -1.26 -1.07 95% mean confidence interval for sends %-change: -7.17% -5.87% Sends are helped. LOST: 23 GAINED: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122617 -> 8111592 (-0.14%) instructions in affected programs: 380503 -> 369478 (-2.90%) helped: 912 HURT: 86 helped stats (abs) min: 1 max: 129 x̄: 12.19 x̃: 9 helped stats (rel) min: 0.30% max: 39.21% x̄: 3.69% x̃: 2.57% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.12% max: 3.64% x̄: 0.54% x̃: 0.36% 95% mean confidence interval for instructions value: -12.00 -10.10 95% mean confidence interval for instructions %-change: -3.56% -3.10% Instructions are helped. total cycles in shared programs: 188509780 -> 188534398 (0.01%) cycles in affected programs: 7211542 -> 7236160 (0.34%) helped: 859 HURT: 132 helped stats (abs) min: 2 max: 690 x̄: 46.59 x̃: 16 helped stats (rel) min: 0.01% max: 26.76% x̄: 1.53% x̃: 0.33% HURT stats (abs) min: 2 max: 1592 x̄: 489.67 x̃: 618 HURT stats (rel) min: 0.03% max: 185.92% x̄: 23.35% x̃: 6.26% 95% mean confidence interval for cycles value: 9.58 40.10 95% mean confidence interval for cycles %-change: 0.65% 2.93% Cycles are HURT.	2019-12-02 16:46:20 -08:00
Ian Romanick	e342d6970b	nir/opt_peephole_select: Don't count some unary operations In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into another instruction as either source or destination modifiers. Counting them as instructions means that some if-statements won't get converted to selects. For example, vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x /* succs: block_1 block_2 */ if ssa_25 { block block_1: /* preds: block_0 */ vec1 32 ssa_26 = fabs ssa_24 vec1 32 ssa_27 = fneg ssa_26 vec1 32 ssa_28 = fabs ssa_20 vec1 32 ssa_29 = fneg ssa_28 vec1 32 ssa_30 = fmul ssa_27, ssa_29 vec1 32 ssa_31 = fsat ssa_30 /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ block_1 isn't really 6 instructions, but it will be counted that way. Most callers of the peephole_select pass use either 1 or 8. It's very easy to blow way past either of these limits with things that are really only one or two actual instructions. I also tried some fancier things like making sure the fsat was of another SSA def from the same block, but the simple test was actually better. The i965 back-end SEL peephole pass still helps ~700 shaders in shader-db with this change. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14743694 -> 14738910 (-0.03%) instructions in affected programs: 156575 -> 151791 (-3.06%) helped: 1204 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3 helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55% 95% mean confidence interval for instructions value: -4.12 -3.82 95% mean confidence interval for instructions %-change: -5.35% -4.95% Instructions are helped. 
total cycles in shared programs: 231749141 -> 231602916 (-0.06%) cycles in affected programs: 2818975 -> 2672750 (-5.19%) helped: 876 HURT: 322 helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220 helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44% HURT stats (abs) min: 1 max: 1188 x̄: 38.27 x̃: 20 HURT stats (rel) min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70% 95% mean confidence interval for cycles value: -130.47 -113.64 95% mean confidence interval for cycles %-change: -14.85% -12.72% Cycles are helped. total sends in shared programs: 730495 -> 730491 (<.01%) sends in affected programs: 46 -> 42 (-8.70%) helped: 2 HURT: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122757 -> 8122617 (<.01%) instructions in affected programs: 14716 -> 14576 (-0.95%) helped: 46 HURT: 1 helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3 helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59% 95% mean confidence interval for instructions value: -3.42 -2.54 95% mean confidence interval for instructions %-change: -3.28% -1.62% Instructions are helped. total cycles in shared programs: 188510100 -> 188509780 (<.01%) cycles in affected programs: 58994 -> 58674 (-0.54%) helped: 32 HURT: 1 helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6 helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68% 95% mean confidence interval for cycles value: -16.34 -3.06 95% mean confidence interval for cycles %-change: -2.46% -0.15% Cycles are helped.	2019-12-02 16:46:19 -08:00
Jordan Justen	e277009d8d	iris: Allow max dynamic pool size of 2GB for gen12 Reworks: * Adjust comment to list the state packets that curro found to be affected. Fixes: `8125d7960b` ("intel/dev: Add preliminary device info for Tigerlake") Cc: 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-12-02 16:34:12 -08:00
Marek Olšák	7730d583c2	radeonsi/gfx10: fix the vertex order for triangle strips emitted by a GS Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-02 18:22:27 -05:00
Marek Olšák	91da6a98e7	radeonsi/gfx10: simplify some duplicated NGG GS code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-02 18:22:25 -05:00
Jonathan Gray	4913215d14	util/u_thread: don't restrict u_thread_get_time_nano() to __linux__ pthread_getcpuclockid() and clock_gettime() are also available on at least OpenBSD, FreeBSD, NetBSD, DragonFly, Cygwin. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 17:23:49 -05:00
Jonathan Gray	c91997b6c4	util/futex: use futex syscall on OpenBSD Make use of the futex syscall added in OpenBSD 6.2. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 17:23:49 -05:00
Kenneth Graunke	dbe923bff9	meson: Add a "prefer_iris" build option Enabling this option makes Intel Gen8-11 hardware load the 'iris' driver by default instead of the older 'i965' driver. Regardless of how this option is set, users can still override which driver the loader selects via two methods. The first is to create a ~/.drirc or /etc/drirc file with the following snippet: <driconf> <device driver="loader" kernel_driver="i915"> <option name="dri_driver" value="i965" /> </device> </driconf> The other option is to set an environment variable: export MESA_LOADER_DRIVER_OVERRIDE=i965 For now, "prefer_iris" defaults to i965 (the historical choice). A separate future patch will change the default driver to iris. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1893 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-02 12:56:27 -08:00
Jonathan Marek	bebfb17a2b	turnip: fix display wsi fence timing out Fixes: `df9f2adf` ("turnip: add display wsi") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-02 14:29:47 -05:00
Rhys Perry	5404b7aaa3	nir/lower_io_to_vector: don't create arrays when not needed Some backends require that there are no array varyings. If there were no arrays in the input shader, the pass shouldn't have to create new ones. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103 Fixes: `bcd14756ee` ('nir/lower_io_to_vector: add flat mode') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-02 17:45:01 +00:00
Rhys Perry	01cacdb71e	aco: fix block_kind_discard s_andn2 definition to exec Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop because a loop exit phi can then be fixed to exec, removing copies and improving jump threading. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-02 16:56:24 +00:00
Rhys Perry	0e8da9f607	aco: handle loop exit and IF merge phis with break/discard ACO considers discards jumps and creates edges in the CFG for them but NIR does neither of these. This can be fixed instead by keeping track of whether a side of an IF had a break/discard, but this doesn't solve the issue with discards affecting loop exit phis. So this reworks phi handling a bit. Fixes these tests: dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop dEQP-VK.graphicsfuzz.loop-call-discard dEQP-VK.graphicsfuzz.complex-nested-loops-and-call Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-02 16:56:19 +00:00
Rhys Perry	06fc83989c	aco: validate the CFG Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-02 16:56:05 +00:00
Alejandro Piñeiro	b6fd679a9e	mesa/main/util: moving gallium u_mm to util, remove main/mm Right now there are two copies of mm: * mesa/main/mm.[ch] * gallium/auxiliary/util/u_mm.[ch] At some point they split, and from the commit message it was not clear why it was not possible to have only one copy at a common place. Taking into account that was several years ago, I'm assuming that it was not possible then. This change would allow having one copy of the same code, and also being able to use that code outside of mesa/main or gallium, if needed. This commit moves u_mm and removes mm, as u_mm has slightly more changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-12-02 13:59:28 +01:00
Rhys Perry	35fab1ba33	radv: set writes_memory for global memory stores/atomics Fixes: `13ab63bb62` ('radv: Implement VK_EXT_buffer_device_address.') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-02 11:47:12 +00:00
Rhys Perry	a814f3d8a7	ac/llvm: improve sync scope for global atomics Stronger ordering is implemented in SPIRV->NIR with barriers. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-02 10:48:27 +00:00
Rhys Perry	e61a826f39	ac/llvm: fix pointer type for global atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-02 10:48:18 +00:00
Kenneth Graunke	1d416ffd09	iris: Map FXT1 texture formats This exposes GL_TDFX_texture_compression_FXT1 support. It's ancient, only Intel GPUs appear to support it, and I seriously doubt anybody uses it. But i965 supports it, and it's trivial to do, so we may as well support it in the new iris driver as well. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-01 22:55:56 -08:00
Kenneth Graunke	1bdd342b60	st/mesa: Add GL_TDFX_texture_compression_FXT1 support Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format unification work. This was really most of the work of implementing the extension. We just need to handle it in a couple of places and expose the extension. v2: Reject the new formats in llvmpipe_is_format_supported to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2019-12-01 22:55:21 -08:00
Dave Airlie	3e21e17b2f	nir/samplers: don't zero samplers_used/txf. This allows this pass to be run multiple times and the results are just or'ed together. It fixes one test on llvmpipe nir, and regresses none. Suggested by Kenneth Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 09:15:55 +10:00
Samuel Pitoiset	0eb78a078e	aco: drop useless lowering of deref operations for shared memory Moved to RADV. No pipeline-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 21:58:25 +01:00
Samuel Pitoiset	c105e6169c	radv,ac/nir: lower deref operations for shared memory This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered. pipeline-db (NAVI10/LLVM): SGPRS: 9043 -> 9051 (0.09 %) VGPRS: 7272 -> 7292 (0.28 %) Code Size: 638892 -> 621628 (-2.70 %) bytes LDS: 1333 -> 1331 (-0.15 %) blocks Max Waves: 1614 -> 1608 (-0.37 %) Found this while glancing at some F12019 shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-29 21:58:18 +01:00
Daniel Schürmann	b690543851	aco: fix a couple of value numbering issues Fixes: `3a20ef4a32` 'aco: refactor value numbering' Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-29 21:54:27 +01:00
Daniel Schürmann	8861a82be7	aco: don't split live-ranges of linear VGPRs Fixes: `93c8ebfa78` 'aco: Initial commit of independent AMD compiler' Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-29 21:54:27 +01:00
Rhys Perry	73783ed389	aco: implement global atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:02 +00:00
Rhys Perry	389ee819c0	aco: improve FLAT/GLOBAL scheduling Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:02 +00:00
Rhys Perry	cc742562c1	aco: don't enable store_global for helper invocations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:02 +00:00
Rhys Perry	31e68e230f	aco: fix SADDR with FLAT on GFX10 The reference guide is incorrect and SADDR is actually used with FLAT on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	082e3a68fa	aco: fix assembly of FLAT/GLOBAL atomics They can take both a definition and data operand Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	f1381e6715	aco: fix GFX10 opcodes for some global/flat atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
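
The nir/algebraic entry above (780b5c1037) mentions choosing intBitsToFloat(0x7f7fffff) over np.finfo(np.float32).max. As a rough, standalone illustration of why the two are interchangeable, the sketch below reinterprets that bit pattern as an IEEE 754 binary32 value using only the Python standard library. The helper name int_bits_to_float is hypothetical and is not taken from the Mesa sources.

```python
import struct


def int_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value.

    Hypothetical helper for illustration only; not Mesa's implementation.
    """
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 0x7f7fffff has the largest finite exponent and an all-ones mantissa,
# so it decodes to the largest finite 32-bit float.
FLT_MAX = int_bits_to_float(0x7F7FFFFF)

# One step higher in the encoding (0x7f800000) is +Inf.
POS_INF = int_bits_to_float(0x7F800000)

print(FLT_MAX, POS_INF)
```

Spelling the constant as a bit pattern keeps the encoding explicit, which matches the commit's remark that specific bit encodings of float values may be wanted later.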


118124 commits
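
The nir/opt_peephole_select entry above (e342d6970b) argues that fsat, fneg, fabs, ineg, and iabs usually fold into other instructions as modifiers, so counting them overstates the size of a block. The sketch below only illustrates that counting idea on the block_1 example from the commit message. It is a simplified assumption, not the pass's actual logic: the function name, the flat list of opcode strings, and the skip_foldable flag are all hypothetical.

```python
# Opcodes the commit message lists as typically folding into source or
# destination modifiers of another instruction.
FOLDABLE_UNARY_OPS = {"fsat", "fneg", "fabs", "ineg", "iabs"}


def count_block_cost(opcodes, skip_foldable=True):
    """Count instructions in a block, optionally ignoring foldable unary ops.

    Hypothetical sketch; the real pass applies further conditions.
    """
    return sum(
        1
        for op in opcodes
        if not (skip_foldable and op in FOLDABLE_UNARY_OPS)
    )


# block_1 from the commit message: two fabs/fneg pairs, one fmul, one fsat.
block_1 = ["fabs", "fneg", "fabs", "fneg", "fmul", "fsat"]

naive = count_block_cost(block_1, skip_foldable=False)
adjusted = count_block_cost(block_1)

print(naive, adjusted)
```

Under this simplified count, block_1 drops from six instructions to one, which is why it would fit under the limits of 1 or 8 that the commit says most callers use.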