fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-16 02:00:36 +02:00

Author	SHA1	Message	Date
Rob Clark	c19c4bf488	freedreno/ir3: fix crash Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z Fixes: `0d240c2214` freedreno/ir3: don't fetch unused tex components Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	3e8e033f4c	freedreno: also set DUMP flag on shaders If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP flag as a hint to kernel that this is a useful buffer to snapshot for debug dumps. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	4cd016b5d6	freedreno: debug GEM obj names With a recent enough kernel, set debug names for GEM BOs, which will show up in $debugfs/gem Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	7ef722861b	freedreno/drm: sync uapi and enable softpin Pull in updated UAPI and use kernel API version to enable softpin. Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to signal to kernel that cmdstream buffers are useful to dump for debugging/cmdstream-traces. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Eric Anholt	4407e688cd	nir: Move intel's half-float image store lowering to nir_format.h. I needed the same function for v3d. This was originally in `d3e046e76c` ("nir: Pull some of intel's image load/store format conversion to nir_format.h") before we made a mistake about simplifying the function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 12:24:26 -08:00
Eric Anholt	3a417a044e	Revert "intel: Simplify the half-float packing in image load/store lowering." This reverts commit `06fbcd2cd5`. nir_pack_half_2x16_split isn't vectorizable, it's 1-component only, thus why we had this split-scalar code in the first place. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 12:24:24 -08:00
Eric Anholt	c2c44dba7a	nir: Print the format of image variables. This helps a lot when debugging image load/store lowering on large testcases. Unfortunately the Mesa enum name stuff is under src/mesa and we can't get at it from the compiler. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 12:24:12 -08:00
Eric Anholt	19ffcba161	mesa/st: Expose compute shaders when NIR support is advertised. We have a NIR path, and V3D doesn't have TGSI input for compute (only what TTN can handle for the various gallium-internal shaders). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-12-13 11:44:47 -08:00
Dave Airlie	b3f2b03ece	radv/xfb: fix counter buffer bounds checks. If we gave this function 0 counter buffers, we'd still try and access pCounterBuffers[0] as this check was incorrect. Fixes crash with ext_transform_feedback-pipeline-basic-primgen on zink on radv. Fixes: `677b496b6` (radv: fix begin/end transform feedback with 0 counter buffers.) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-13 19:27:05 +00:00
Jason Ekstrand	9ebc00f32e	i965: Enable nir_opt_idiv_const for 32 and 64-bit integers The pass should work for all bit sizes but it's less clear that the extra instructions are worth it on small integers. Also, the hardware doesn't do mul_high on anything other than 32-bit integers and, absent any decent mechanism for testing the pass on 8 and 16-bit types, it's probably best to just leave it disabled for now. Shader-db results on Sky Lake: total instructions in shared programs: 15105795 -> 15111403 (0.04%) instructions in affected programs: 72774 -> 78382 (7.71%) helped: 0 HURT: 265 Note that hurt here actually means helped because we're getting rid of integer quotient operations (which are a send on some platforms!) and replacing them with fairly cheap ALU ops. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	455ec7327d	i965/vec4: Implement nir_op_uadd_sat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Ian Romanick	e639d39faf	i965/fs: Implement nir_op_uadd_sat Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	74492ebad9	nir: Add a pass for lowering integer division by constants It's a reasonably well-known fact in the world of compilers that integer divisions by constants can be replaced by a multiply, an add, and some shifts. This commit adds such an optimization to NIR for the easiest case of udiv. Other division operations will be added in following commits. In order to provide some additional driver control, the pass takes a minimum bit size to optimize. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Ian Romanick	090e282407	nir: Add a saturated unsigned integer add opcode Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	39198a1238	nir/lower_int64: Add support for [iu]mul_high Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	9525971e2b	nir: Allow [iu]mul_high on non-32-bit types Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Emil Velikov	a95ec13879	glx: mandate xf86vidmode only for "drm" dri platforms Currently we have the three dri "platforms" - drm, apple and windows. Since xf86vidmode is a thing only for the drm one, adjust the preprocessor guards and correctly check for the dependency. v2: terminate the GLX_USE_WINDOWSGL hunk Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Fixes: `5bc509363b` ("glx: make xf86vidmode mandatory for direct rendering") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-13 17:38:19 +00:00
Alejandro Piñeiro	c7bdcd67aa	nir: remove unused variable To avoid the following warning: ./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable] nir_shader *ns = impl->function->shader; Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-13 16:35:21 +01:00
Erik Faye-Lund	e888f28d1f	virgl: work around bad assumptions in virglrenderer Virglrenderer does the wrong thing when given an instance divisor; it tries to use the element-index rather than the binding-index as the argument to glVertexBindingDivisor(). This worked fine as long as there was a 1:1 relationship between elements and bindings, which was the case until `19a91841c3` "st/mesa: Use Array._DrawVAO in st_atom_array.c.". So let's detect instance divisors, and restore a 1:1 relationship in that case. This will make old versions of virglrenderer behave correctly. For newer versions, we can consider making a better interface, where the instance divisor isn't specified per element, but rather per binding. But let's save that for another day. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `19a91841c3` "st/mesa: Use Array._DrawVAO in st_atom_array.c." Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Erik Faye-Lund	8447b64238	virgl: wrap vertex element state in a struct This just has one member for now; the handle. But this is about to change. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Erik Faye-Lund	b702ff5378	virgl: simplify virgl_hw_set_index_buffer Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Erik Faye-Lund	00143a6241	virgl: simplify virgl_hw_set_vertex_buffers Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Juan A. Suarez Romero	0991085f66	docs: update calendar, add news item and link release notes for 18.2.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-12-13 15:45:20 +01:00
Juan A. Suarez Romero	e0b0995dcf	docs: add sha256 checksums for 18.2.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `e90429cc6d`)	2018-12-13 15:42:49 +01:00
Juan A. Suarez Romero	c8a17b45ea	docs: add release notes for 18.2.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `419ee20097`)	2018-12-13 15:42:46 +01:00
Samuel Pitoiset	5088ba2aeb	radv: don't check if format is depth in radv_image_can_enable_htile() This is always TRUE if htile_size is not 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:21 +01:00
Samuel Pitoiset	eb0034fe28	radv: check if addrlib enabled HTILE in radv_image_can_enable_htile() When htile_size is 0, we can't enable HTILE. This doesn't change anything, except not calling radv_image_alloc_htile(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:19 +01:00
Samuel Pitoiset	d8325f1f07	radv: switch on EOP when primitive restart is enabled with triangle strips Otherwise, Yakuza hangs the GPU with DXVK. We don't know if linestrip and pointlist are affected, so my point is to do that only for triangle strips. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:16 +01:00
Samuel Pitoiset	74cf3b627c	radv: allow to skip DCC decompressions with the new predicate Feral games aren't affected because they don't decompress DCC. F1 2018 has one DCC decompression per frame, but I don't see any performance improvements. This new predicate will probably be more useful for DCC/MSAA. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:14 +01:00
Samuel Pitoiset	3a5adc2879	radv: add a predicate for reflecting DCC decompression state It's somewhat similar to the FCE predicate. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:10 +01:00
Jordan Justen	c506eae53d	i965/compute: Emit GPGPU_WALKER in genX_state_upload Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 22:28:06 -08:00
Jordan Justen	1b85c605a6	i965/genX_state: Add register access functions Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 22:28:02 -08:00
Eric Anholt	06fbcd2cd5	intel: Simplify the half-float packing in image load/store lowering. This was noted by Jason in review when I tried to make a helper for the old path. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 16:09:48 -08:00
Eric Anholt	d3e046e76c	nir: Pull some of intel's image load/store format conversion to nir_format.h I needed the same functions for v3d. Note that the color value in the Intel lowering has already been cut down to image.chans num_components. v2: Drop the half float one, since it was a 1-liner after cleanup. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 16:09:43 -08:00
Eric Anholt	19c7cba2ab	nir: Add some more consts to the nir_format_convert.h helpers. Most of the bits were constant, but a few were missed. Avoids warnings from v3d's upcoming static const bits declarations. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 16:09:37 -08:00
Timothy Arceri	9e6b39e1d5	nir: detect more induction variables This allows loop analysis to detect induction variables that are incremented in both branches of an if rather than in a main loop block. For example: loop { block block_1: /* preds: block_0 block_7 */ vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20 vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4 vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4 vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21 vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22 vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9 vec1 32 ssa_14 = ige ssa_8, ssa_5 /* succs: block_2 block_3 */ if ssa_14 { block block_2: /* preds: block_1 */ break /* succs: block_8 */ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 32 ssa_15 = ilt ssa_6, ssa_8 /* succs: block_5 block_6 */ if ssa_15 { block block_5: /* preds: block_4 */ vec1 32 ssa_16 = iadd ssa_8, ssa_7 vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000 */) /* succs: block_7 */ } else { block block_6: /* preds: block_4 */ vec1 32 ssa_18 = iadd ssa_8, ssa_7 vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000 */) /* succs: block_7 */ } block block_7: /* preds: block_5 block_6 */ vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18 vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4 vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19 /* succs: block_1 */ } Unfortunately GCM could move the addition out of the if for us (making this patch unrequired) but we still cannot enable the GCM pass without regressions. This unrolls a loop in Rise of The Tomb Raider. vkpipeline-db results (VEGA): Totals from affected shaders: SGPRS: 88 -> 96 (9.09 %) VGPRS: 56 -> 52 (-7.14 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2168 -> 4560 (110.33 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4 -> 4 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211	2018-12-13 10:58:35 +11:00
Timothy Arceri	c03d6e61cc	nir: reword code comment Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:58:35 +11:00
Timothy Arceri	48b40380e3	nir: in loop analysis track actual control flow type This will allow us to improve analysis to find more induction variables. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:58:35 +11:00
Danylo Piliaiev	5921a19d4b	nir: add if opt opt_if_loop_last_continue() Removing the last continue can allow more loops to unroll. Also inserting code into the if branch can allow the various if opts to progress further. The insertion of some loops into the if branch also reduces VGPR use in some shaders. vkpipeline-db results (VEGA): Totals from affected shaders: SGPRS: 6552 -> 6576 (0.37 %) VGPRS: 6544 -> 6532 (-0.18 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 481952 -> 478032 (-0.81 %) bytes LDS: 13 -> 13 (0.00 %) blocks Max Waves: 241 -> 242 (0.41 %) Wait states: 0 -> 0 (0.00 %) Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 168 -> 168 (0.00 %) VGPRS: 144 -> 140 (-2.78 %) Spilled SGPRs: 157 -> 157 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 8524 -> 8488 (-0.42 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 7 -> 7 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: (Timothy Arceri): - allow for continues in either branch - move any trailing loops inside the if as well as blocks. - leave nir_opt_trivial_continues() to actually remove the continue. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211	2018-12-13 10:58:35 +11:00
Timothy Arceri	721566bddb	nir: rework force_unroll_array_access() Here we rework force_unroll_array_access() so that we can reuse the induction variable detection in a following patch. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:39:51 +11:00
Timothy Arceri	48135f175c	nir: factor out some of the complex loop unroll code to a helper Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:34:48 +11:00
Jordan Justen	7fe4e0ad5d	docs: Document GitLab merge request process (email alternative) This documents a process for using GitLab Merge Requests as a second way to submit code changes for Mesa. Only one of the two methods is allowed for each patch series. We will not require all patches to be emailed. Some code changes may be reviewed and merged without any discussion on the mesa-dev email list. v2: * No longer require email. Allow submitter to choose email or a GitLab merge request. * Various feedback from Brian, Daniel, Dylan, Eric, Erik, Jason, Matt, Michel and Rob. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Rob Clark <robdclark@gmail.com>	2018-12-12 10:05:29 -08:00
Rhys Kidd	ff6f1dd0d3	meson: libfreedreno depends upon libdrm (for fence support) Error message building freedreno Gallium driver with meson: ../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory #include <libsync.h> Fixes: `4aa69cc425` ("meson: build freedreno") Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 09:01:06 -08:00
Jason Ekstrand	ca98902d09	nir: Document the function inlining process This has thrown a few people off recently and it's good to have the process and all the rationale for it documented somewhere. A comment at the top of nir_inline_functions seems as good a place as any. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-12-12 08:32:32 -06:00
Jason Ekstrand	5749c0ebc4	intel/blorp: Assert that we don't re-layout a compressed surface Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-12 08:32:32 -06:00
Jason Ekstrand	e4fdc650f1	anv/pipeline: Set the correct binding count for compute shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-12-12 08:32:25 -06:00
Samuel Pitoiset	2ac6d55f38	radv: bump reported version to 1.1.90 After going through the spec changelog, it looks like RADV is up to date. Note that ANV also reports 1.1.90. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-12 13:51:16 +01:00
Erik Faye-Lund	f856f50194	virgl: force linear texturing support When I made sure that half-float texture-filtering was required for ES3, I didn't realize that virgl doesn't report support for this correctly. This regressed the GLES version available on top of several drivers, including i965 from 3.2 to 2.0. This is going to need protocol changes to fix properly, so let's just restore the previous behavior by enabling floating-point filtering unconditionally for now. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `fcf9fcee3c` "mesa/main: do not require float-texture filtering for es3" Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-12-12 11:44:47 +01:00
Iago Toral Quiroga	3918943211	intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments The implementation of these opcodes in the generator assumes that their arguments are packed, and it generates register regions based on that assumption. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 08:09:45 +01:00
Jason Ekstrand	a10a450db2	anv: Advertise support for MinLod on Skylake+ These are usually used for dealing with sparse resources but there's no reason why we can't hook them up before we have sparse. We have the hardware; let's light it up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
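
Note on `74492ebad9` and `090e282407`: the two commit messages describe replacing unsigned division by a constant with "a multiply, an add, and some shifts" and adding a saturated unsigned add. The standalone C sketch below illustrates both ideas on plain 32-bit integers. It is not the NIR pass and uses no Mesa API; the function names (mul_high_u32, udiv3, udiv7, uadd_sat_u32) are made up for illustration, and the magic constants are the textbook ones for divisors 3 and 7, not values taken from the Mesa source.

```c
#include <stdint.h>

/* High 32 bits of the 64-bit product a * b (what a "mul_high" op yields). */
static inline uint32_t mul_high_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* n / 3 using one multiply and one shift.
 * 0xAAAAAAAB is ceil(2^33 / 3); the total shift is 33 (32 + 1). */
static inline uint32_t udiv3(uint32_t n)
{
    return mul_high_u32(n, 0xAAAAAAABu) >> 1;
}

/* n / 7 needs the extra add step: the exact multiplier does not fit in
 * 32 bits, so the missing top bit is folded in as ((n - t) >> 1) + t. */
static inline uint32_t udiv7(uint32_t n)
{
    uint32_t t = mul_high_u32(n, 0x24924925u);
    return (((n - t) >> 1) + t) >> 2;
}

/* Saturating unsigned add: clamp to UINT32_MAX instead of wrapping. */
static inline uint32_t uadd_sat_u32(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;           /* wraps modulo 2^32 on overflow */
    return sum < a ? UINT32_MAX : sum;
}
```

Divisor 3 shows the simple multiply-and-shift case, while divisor 7 shows why the add is sometimes required. A real pass would compute the multiplier and shift amounts from the divisor rather than hard-coding them.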


106217 commits