fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 09:08:07 +02:00

Author	SHA1	Message	Date
Ilia Mirkin	e683a797c6	nvc0: collapse output slots to have adjacent registers The hardware skips over unallocated slots, so we have to make sure those registers are packed together. Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-27 00:10:39 -05:00
Karol Herbst	ef308d4007	nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107 Currently, while inserting barriers, writes and reads to FILE_FLAGS aren't considered. This can lead to WaR hazards in some situations. Together with the previous commit, this fixes shaders with instructions like this: mad u32 $r2 $r4 $r11 $r2 mad u32 { $r5 $c0 } $r4 $r10 $r6 mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0 Affects OpenCL CTS tests on Maxwell+: basic/test_basic intmath_long basic/test_basic intmath_long2 basic/test_basic intmath_long4 v2: only put barriers on instructions which actually read flags Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-02-26 14:41:58 +01:00
Karol Herbst	2f07f823c9	nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse In the sched data calculator we have to track first use of defs by iterating over all defs of an instruction, not just the first one. v2: fix minGRP and maxGRP values Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-02-26 14:41:58 +01:00
Marek Olšák	8799eaed99	radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers The effect of the last 13 commits on user SGPR counts: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:19 +01:00
Marek Olšák	3fa7a59d69	radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input so that it can be removed and replaced with inline VBO descriptors, and the pointer can be packed in unused bits of VBO descriptors. This also removes the pointer from merged TES-GS where it's useless. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:08 +01:00
Marek Olšák	c78640ce31	radeonsi: set correct num_input_sgprs for VS prolog in merged shaders We need to take num_input_sgprs from VS, not the second shader. No apps suffered from this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:05 +01:00
Marek Olšák	f852b24ce0	radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:03 +01:00
Marek Olšák	8d6e6b1d7c	radeonsi: don't use struct si_descriptors for vertex buffer descriptors VBO descriptor code will change a lot one day. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:00 +01:00
Dave Airlie	0cc5be7741	r600: fix tgsi clock last setting On cayman this was hitting an assert later, which probably wasn't seen on non-cayman due to having the t slot. Fixes: `9041730d1` (r600: add support for ARB_shader_clock.)	2018-02-26 11:05:45 +10:00
Dave Airlie	4d72a1efea	r600: add time lo/hi debugging output. This just adds these to the debug prints.	2018-02-26 11:05:26 +10:00
Timothy Arceri	22430224fe	radeonsi/nir: enable lowering of fpow Lowering fpow in NIR rather than LLVM can be beneficial. Polaris results: Totals from affected shaders: SGPRS: 124928 -> 124896 (-0.03 %) VGPRS: 68616 -> 68332 (-0.41 %) Spilled SGPRs: 394 -> 413 (4.82 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3668912 -> 3658368 (-0.29 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 18575 -> 18593 (0.10 %) Wait states: 0 -> 0 (0.00 %) Fixes: `d6b7539206` "ac/nir: remove emission of nir_op_fpow" Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	1a757c9c97	gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info Seems to have not been used since `16be87c904` Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	9f7c940840	radeonsi/nir: fix loading of doubles for tess varyings Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	81f9d03807	radeonsi/nir: fix lds store in tcs outputs handling We were ignoring the channel offset. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Gert Wollny	c7cadcbda4	r600: Take ALU_EXTENDED into account when evaluating jump offsets ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU clause within an IF-JUMP or ELSE branch is ALU_EXTENDED, the target jump offset needs to be adjusted accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-26 10:29:48 +10:00
Marek Olšák	fb410ae392	radeonsi: remove si_descriptors parameter from emit_shader_pointer functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	63ea0a00a3	radeonsi: preload the tess offchip ring in TES so that it's not done multiple times in branches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	2d03c4cac8	radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer! Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	190e064e63	radeonsi: move 2nd-shader descriptor pointers into s[0:1] If 32-bit pointers are supported, both pointers can be moved into s[0:1] and then ESGS has exactly the same user data SGPR declarations as VS. If 32-bit pointers are not supported, only one pointer can be moved into s[0:1]. In that case, the 2nd pointer is moved before TCS constants, so that the location is the same in HS and GS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	1d1df76d2b	radeonsi: change si_descriptors::shader_userdata_offset type to short We will want to use SH registers outside of user data SGPRs, like the GFX9 special SGPRs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	fca7dee9c6	radeonsi: put both tessellation rings into 1 buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	d2963d8b5f	radeonsi: move tessellation ring info into si_screen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	41895c26d3	radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits For a later patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Karol Herbst	f0b39779a0	nvir: dont optimize mad with subops to shladd Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-24 18:48:13 +01:00
Eric Anholt	b4b4ada761	broadcom/vc5: Fix layout of 3D textures. Cube maps are entire miptrees repeated, while 3D textures store all the layers of each level next to each other. Fixes tex3d and tex-miplevel-selection GL2:texture() 3D.	2018-02-23 15:07:26 -08:00
Eric Anholt	97dc077303	broadcom/vc5: Ignore unused usage flags in is_format_supported. Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to match. Just drop the whole retval == usage thing and return early when we hit a known unsupported case. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 15:07:18 -08:00
Emil Velikov	14a2c87c41	swr: remove dead LLVM code paths LLVM requirement was bumped to 4.0.0 with earlier commit. Hence any code tailored for older versions is now unreachable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-02-23 19:17:31 +00:00
Eric Anholt	5980a41c0f	broadcom/vc4: Remove the retval==usage check in is_format_supported(). This got us into trouble recently, so just remove it entirely.	2018-02-23 08:42:13 -08:00
Eric Anholt	bc3d16e633	broadcom/vc4: Add support for YUV textures using unaccelerated blits. Previously we would assertion fail about having no hardware format. This is enough to get kmscube -M nv12-2img working.	2018-02-23 08:42:13 -08:00
Eric Anholt	c824a045ea	broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows. When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.	2018-02-23 08:42:13 -08:00
Eric Anholt	6deb158ec1	broadcom/vc4: Add pipe_reference debugging for vc4_bos. Trying to track down the YUV EGLImage use-after-free, it helps to see what the mystery objects are that are being refcounted.	2018-02-23 08:42:13 -08:00
Eric Anholt	34ea1aca92	broadcom/vc4: Remove dead vc4_bo_set_reference(). It would be broken if NULL was passed to it anyway, since it wouldn't participate in screen->bo_handles management.	2018-02-23 08:42:13 -08:00
Eric Anholt	a49738290c	broadcom/vc4: Use pipe_resource_reference in sampler views. Improves u_debug_refcount output.	2018-02-23 08:42:13 -08:00
Eric Anholt	0c1dd9dee0	broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride. This is part of supporting YUV textures -- MMAL will be handing us a single GEM BO with the planes at offsets within it, and MMAL-decided stride.	2018-02-23 08:42:13 -08:00
Eric Anholt	978b884afc	broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported(). We were failing the retval == usage check at the end. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 08:42:13 -08:00
Lucas Stach	8df11f3fad	etnaviv: fix in-place resolve tile count TS tiles map to a fixed amount of bytes in the color/depth surface, so the blocksize of the format needs to be taken into account when calculating the number of tiles to fill. The simplest fix is to just use the layer stride, which is the surface size in bytes. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	add23b59c9	etnaviv: switch magic single buffer state to "3" Some of the 16bit formats misrender with missing tiles with the current "2" state. As all the previously working formats also work with the "3" state, just always use that one. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	8befc11186	etnaviv: add debug switch to disable single buffer feature This feature has caused some trouble already. Add a debug switch to allow users to quickly check if a specific issue is caused by this feature. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-02-23 15:34:31 +01:00
Christian Gmeiner	e72062b66d	etnaviv: npot_tex_any_wrap needs one bit only Reduces size of struct etna_specs from 100 to 94 bytes. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 09:38:16 +01:00
Ilia Mirkin	d73f1f2ad8	nv50,nvc0: fix integer MS resolves using 2d engine We don't want filtering for integer textures, same as depth/stencil. Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	33ce3569c5	nvc0: fix writing query results into buffer We need to mark the range as valid, and validate the resource using a helper to ensure that the buffer status is marked properly. Fixes some CTS pipeline stats query tests, and KHR-GL45.direct_state_access.queries_functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	f6e4f95668	nv50,nvc0: fix clear buffer acceleration Two things were off: - valid range was not updated, which could affect waiting for future maps - fencing was done manually instead of using the *_resource_validate helper, which resulted in a missed dirty buffer flag being set Fixes: KHR-GL45.direct_state_access.buffers_clear Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Chuck Atkins	540e49e105	glx: Properly handle cases where screen creation fails This fixes a segfault exposed by `a29d63ecf7` which occurs when swr is used on an unsupported architecture. v2: re-work to place logic in xmesa_init_display Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: mesa-stable@lists.freedesktop.org Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-22 10:20:32 -05:00
Timothy Arceri	86098696fc	radeonsi/nir: collect more accurate output_usagemask Fixes assert in the glsl-1.50-gs-max-output-components piglit test. Note that the double handling will only work for doubles that don't take up multiple slots i.e. double and dvec2. However dual slot double handling is an existing bug which is made no worse by this patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	79dc94828a	radeonsi/nir: disable GLSL IR loop unrolling Delaying unrolling and allowing NIR to do it instead has been shown to result in better code in drivers such as i965. shader-db results appear to show the same is true for radeonsi. The other advantage is that using NIR unrolling improves compile times significantly. Totals from affected shaders: SGPRS: 9624 -> 10016 (4.07 %) VGPRS: 6800 -> 6464 (-4.94 %) Spilled SGPRs: 0 -> 2 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 359176 -> 332264 (-7.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1355 -> 1432 (5.68 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	e6269ffc2e	radeonsi/nir: fix tess varying loads for doubles Fixes the following piglit tests: tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	6d338d757f	ac/radeonsi: pass type to load_tess_varyings() We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Marek Olšák	b494ed168c	radeonsi: don't flush when si_eliminate_fast_color_clear is no-op	2018-02-21 20:03:11 +01:00
Marek Olšák	5f55f4c59f	radeonsi: make texture_discard_cmask/eliminate functions non-static	2018-02-21 20:03:11 +01:00
James Zhu	81dd4a7637	radeonsi: enable uvd encode for HEVC main Enable UVD encode for HEVC main profile Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00

33785 commits