fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-11 06:28:09 +02:00

Author	SHA1	Message	Date
Roland Scheidegger	a2611ffe4b	r200: fix bgrx8/xrgb8 blits Since `779cabfc7d` the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never choose them, but we get that format from the dri buffers (at least I assume we got it from there). This is untested but essentially addressing the same bug as for radeon. (I don't think that the second entry per le/be table is actually necessary, but shouldn't hurt...) Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-17 01:04:09 +01:00
Roland Scheidegger	983614dbed	radeon: fix bgrx8/xrgb8 blits Since `d21320f625` the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never choose them, but we get that format from the dri buffers (at least I assume we got it from there). This caused lots of piglit regressions (and probably lots of trouble outside piglit too). This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900. Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-17 01:01:38 +01:00
Jason Ekstrand	de54b4b18f	anv: Only include the pack headers where needed Previously, we were including gen7_pack.h, gen75_pack.h, and gen8_pack.h in anv_private.h. As we add more gens, this is going to become untenable. This commit moves things around so that we only use the pack headers when and if we need them.	2015-11-16 12:29:09 -08:00
Jason Ekstrand	cb9e2305f8	anv/cmd_buffer: Move gen-specific stuff into the appropriate files	2015-11-16 12:10:11 -08:00
Ian Romanick	c40a88b6c5	meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required Previously GL_FRAMEBUFFER was used. However, if GL_EXT_framebuffer_blit is supported (note: it is supported by every Mesa driver), this is sometimes an alias for GL_DRAW_FRAMEBUFFER (getters) and sometimes an alias for both GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER (setters). As a result, the code saved one binding but modified both. If the bindings were different, the GL_READ_FRAMEBUFFER would be incorrect on exit. Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test. Ideally this function would use DSA functions and not modify the binding at all. However, that would be a much more intrusive change because _mesa_meta_bind_fbo_image would also need to be modified. _mesa_meta_bind_fbo_image has a lot of callers. Much of this code is about to get a major rework due to bug #92363, so I don't think it matters too much. In fact, I discovered this bug while working on the other bug. Le bon temps! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-11-16 10:30:10 -08:00
Matt Turner	d564b5b58e	nir/glsl: Fix copy-n-paste mistakes from commit `213f864`. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-16 09:05:53 -08:00
Alex Deucher	00f554abba	radeonsi: enable optimal raster config setting for fiji (v2) Requires proper kernel tiling configuration so check the tiling config registers. v2: send the right version of the patch Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-16 10:09:47 -05:00
Alex Deucher	5b37d8b50c	radeonsi: use proper GRBM_GFX_INDEX offset for CI+ The offset is different on CI and newer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-16 10:09:34 -05:00
Neil Roberts	2ca018cb65	docs: Add 16x MSAA on i965 to the release notes Signed-off-by: Neil Roberts <neil@linux.intel.com>	2015-11-16 14:36:27 +01:00
Emil Velikov	1780a562bc	nv50: add missing header into the sources list Otherwise it won't end up in the tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-16 10:49:14 +00:00
Juan A. Suarez Romero	40c2acef5c	nir/glsl_to_nir: use _mesa_fls() to compute num_textures Replace the current loop by a direct call to _mesa_fls() function. It also fixes an implicit bug in the current code where num_textures seems to be one value less than it should be when sh->Program->SamplersUsed > 0. For instance, num_textures is 0 instead of 1 when sh->Program->SamplersUsed is 1. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-16 09:24:28 +01:00
Iago Toral Quiroga	3f34afa0aa	nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV. v2: remove the check for source modifiers from is_move() (Jason) v3: Put the check for source modifiers back into is_move() since this function is called from copy_prop_alu_src(). Add source modifiers checks to is_vec() instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-16 08:11:13 +01:00
Jason Ekstrand	22d024e031	nir/spirv: Add support for separate samplers and textures This gets tricky in a few places because we have to pass vtn_sampled_image values through OpAccessChain, but it works ok. At some point, it probably needs to be cleaned up but it doesn't occur to me exactly how to do that at the moment. We'll see how this approach goes.	2015-11-14 22:32:54 -08:00
Ilia Mirkin	ff17b3ccf4	nv50,nvc0: disable render condition around clear_* functions Only the regular "clear" call is supposed to respect the render condition. The rest should ignore it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 20:15:22 -05:00
Kenneth Graunke	d2f089ba17	i965: Introduce a MOV_INDIRECT opcode. The geometry and tessellation control shader stages both read from multiple URB entries (one per vertex). The thread payload contains several URB handles which reference these separate memory segments. In GLSL, these inputs are represented as per-vertex arrays; the outermost array index selects which vertex's inputs to read. This array index does not necessarily need to be constant. To handle that, we need to use indirect addressing on GRFs to select which of the thread payload registers has the appropriate URB handle. (This is before we can even think about applying the pull model!) This patch introduces a new opcode which performs a MOV from a source using VxH indirect addressing (which allows each of the 8 SIMD channels to select distinct data.) Based on a patch by Jason Ekstrand. v2: Rename from INDIRECT_THREAD_PAYLOAD_MOV to MOV_INDIRECT; make it a bit more generic. Use regs_read() instead of hacking up the register allocator. (Suggested by Jason Ekstrand.) v3: Fix regs_read() to be more accurate for small unaligned regions. Also rebase on Matt's work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v3] Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> [v1]	2015-11-14 16:41:37 -08:00
Samuel Pitoiset	848fa3101d	nv50: add support for performance metrics on G84+ Currently only one metric is exposed but more will be added later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:46 +01:00
Samuel Pitoiset	6a9c151dbb	nv50: add compute-related MP perf counters on G84+ These compute-related MP performance counters have been reverse engineered using CUPTI which is part of NVIDIA CUDA. As for nvc0, we use a compute kernel to read out those performance counters, and the command stream to configure them. Note that Tesla only exposes 4 MP performance counters, while Fermi has 8. Only G84+ is supported because G80 is an old and weird card. Tested on G84, G96, G200, MCP79 and GT218 with glxgears, glxspheres64, xonotic-glx, heaven and valley. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:42 +01:00
Samuel Pitoiset	ff72440b40	nv50: implement basic compute support This adds the ability to launch simple compute kernels like the one I will use to read out MP performance counters in the upcoming patch. This compute support is based on the work of Francisco Jerez (aka curro) that he did as part of his EVoC project in 2011/2012 to get OpenCL working on Tesla. His original work can be found here: https://github.com/curro/mesa/commits/nv50-compute I made some improvements to the original code, like fixing the simultaneous use of 3D and COMPUTE, improving global buffer binding, and making the code closer to what nvc0 already does. This compute support has been tested by Pierre Moreau and myself with some compute kernels. This is a step towards OpenCL. Speaking of which, it seems that compute programs overlap fragment programs when both are in use. To fix this, we need to re-validate fragment programs when binding compute programs and vice versa. Note that textures, samplers and surfaces still need to be implemented. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:15 +01:00
Samuel Pitoiset	7167a058ba	nv50: free interpolation parameters in nv50_program_destroy() As for nvc0, we need to free memory allocated by interpolation parameters. This fixes a memory leak spotted by valgrind. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:16:12 +01:00
Jason Ekstrand	002db3ee15	anv/cmd_buffer: Add a default descriptor type case This silences a bunch of compiler warnings.	2015-11-14 09:16:55 -08:00
Jason Ekstrand	e9dba80430	anv/apply_pipeline_layout: Handle separate samplers and textures	2015-11-14 09:00:35 -08:00
Samuel Pitoiset	69271bba06	nvc0: reduce the number of GPR used when reading MP perf counters No need to allocate more GPR than used in the compute kernel which reads MP performance counters on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-14 17:38:57 +01:00
Jason Ekstrand	b5d4027c35	Merge branch 'wip/i965-separate-sampler-tex' into vulkan	2015-11-14 08:23:27 -08:00
Jason Ekstrand	c7d504ad93	i965/vec4: Plumb separate surfaces and samplers through from NIR	2015-11-14 08:05:31 -08:00
Jason Ekstrand	3dd84822df	i965/vec4: Separate the sampler from the surface in generate_tex	2015-11-14 08:05:31 -08:00
Jason Ekstrand	c09e140b65	i965/fs: Plumb separate surfaces and samplers through from NIR	2015-11-14 08:04:47 -08:00
Jason Ekstrand	c2a373ec85	i965/fs: Separate the sampler from the surface in generate_tex	2015-11-14 08:01:50 -08:00
Jason Ekstrand	b169bb902a	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the sampler and leaves the texture alone as it did before and nir_lower_samplers assumes this. However, backends can, if they wish, assume that they are separate because nir_lower_samplers sets both texture and sampler index (they are the same in this case).	2015-11-14 07:57:31 -08:00
Jason Ekstrand	1469ccb746	Merge remote-tracking branch 'mesa-public/master' into vulkan This pulls in Matt's big compiler refactor.	2015-11-14 07:56:10 -08:00
Ilia Mirkin	f94e1d9738	nouveau: don't expose HEVC decoding support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-14 10:32:10 -05:00
Jason Ekstrand	e8f51fe4de	anv/gen8: Subtract 1 from num_elements when setting up buffer surface state	2015-11-13 22:50:54 -08:00
Jason Ekstrand	91bc4e7cec	anv/pipeline: Don't free blend states that don't exist Compute pipelines don't need a blend state so we shouldn't be unconditionally freeing it.	2015-11-13 21:49:41 -08:00
Jason Ekstrand	c1733886a6	nir/spirv: Add support for SSBO stores This only handles vector stores, not component-of-a-vector stores.	2015-11-13 21:41:52 -08:00
Jason Ekstrand	c68e28d766	nir/spirv: Refactor vtn_block_load We pull the offset calculations out into their own function so we can re-use it for stores.	2015-11-13 21:32:00 -08:00
Jason Ekstrand	99494b96f0	nir/spirv: Add support for image_load_store	2015-11-13 17:54:43 -08:00
Jason Ekstrand	164b3ca164	nir/builder: Add a nir_ssa_undef helper	2015-11-13 17:54:43 -08:00
Jason Ekstrand	ffbc31d13b	nir/spirv: Add support for creating image variables	2015-11-13 17:54:43 -08:00
Jason Ekstrand	453239f6a5	nir/spirv: Add support for image types	2015-11-13 17:54:43 -08:00
Jason Ekstrand	0572444a0e	nir/types: Add image type helpers	2015-11-13 17:54:43 -08:00
Jason Ekstrand	d5ba7a26d9	glsl/types: Add a get_image_instance helper	2015-11-13 17:54:43 -08:00
Vinson Lee	3a0fef0005	nir: Silence GCC maybe-uninitialized warnings. nir/nir_control_flow.c: In function ‘split_block_cursor.isra.11’: nir/nir_control_flow.c:460:15: warning: ‘after’ may be used uninitialized in this function [-Wmaybe-uninitialized] _after = after; ^ nir/nir_control_flow.c:458:16: warning: ‘before’ may be used uninitialized in this function [-Wmaybe-uninitialized] _before = before; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-13 16:19:11 -08:00
Kenneth Graunke	5480bbd90e	i965: Add a SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT opcode. We need to use per-slot offsets when there's non-uniform indexing, as each SIMD channel could have a different index. We want to use them for any non-constant index (even if uniform), as it lives in the message header instead of the descriptor, allowing us to set offsets in GRFs rather than immediates. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-11-13 16:11:02 -08:00
Kenneth Graunke	511de1a80c	glsl: Allow implicit int -> uint conversions for the % operator. GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit conversion rule and updated the rules for modulus to use them. (In earlier languages, none of the implicit conversion rules did anything relevant, so there was no point in applying them.) This allows expressions such as: int foo; uint bar; uint mod = foo % bar; Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-13 16:09:58 -08:00
Kenneth Graunke	a4ba476c30	i965: Print input/output VUE maps on INTEL_DEBUG=vs, gs. I've been carrying around a patch to do this for the last few months, and it's been exceedingly useful for debugging GS and tessellation problems. I've caught lots of bugs by inspecting the interface expectations of two adjacent stages. It's not that much spam, so I figure we may as well just print it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com>	2015-11-13 16:08:51 -08:00
Kenneth Graunke	f88c175a29	i965: Make convert_attr_sources_to_hw_regs handle stride == 0. This makes expressions like component(fs_reg(ATTR, n), 7) get a proper <0,1,0> region instead of the invalid <0,8,0>. Nobody uses this today, but I plan to. v2: Rebase on Matt's changes; simplify. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2015-11-13 15:17:58 -08:00
Kenneth Graunke	26f9469a46	nir: Add helpers for getting input/output intrinsic sources. With the many variants of IO intrinsics, particular sources are often in different locations. It's convenient to say "give me the indirect offset" or "give me the vertex index" and have it just work, without having to think about exactly which kind of intrinsic you have. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:46 -08:00
Kenneth Graunke	d12bde0944	nir: Don't lower TCS outputs to temporaries. We'd like to shadow these when possible, but the current code doesn't work properly for TCS outputs. For now, disable it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:46 -08:00
Kenneth Graunke	134728fdae	nir: Allow output reads and add the relevant intrinsics. Normally, we rely on nir_lower_outputs_to_temporaries to create shadow variables for outputs, buffering the results and writing them all out at the end of the program. However, this is infeasible for tessellation control shader outputs. Tessellation control shaders can generate multiple output vertices, and write per-vertex outputs. These are arrays indexed by the vertex number; each thread only writes one element, but can read any other element - including those being concurrently written by other threads. The barrier() intrinsic synchronizes between threads. Even if we tried to shadow every output element (which is of dubious value), we'd have to read updated values in at barrier() time, which means we need to allow output reads. Most stages should continue using nir_lower_outputs_to_temporaries(), but in theory drivers could choose not to if they really wanted. v2: Rebase to accommodate Jason's review feedback. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:41 -08:00
Kenneth Graunke	c51d7d5fe3	nir/lower_io: Introduce nir_store_per_vertex_output intrinsics. Similar to nir_load_per_vertex_input, but for outputs. This is not useful in geometry shaders, but will be useful in tessellation shaders. v2: Change stage_uses_per_vertex_outputs() to is_per_vertex_output(), taking a nir_variable (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:10 -08:00
Kenneth Graunke	0df452cd0d	nir/lower_io: Use load_per_vertex_input intrinsics for TCS and TES. Tessellation control shader inputs are an array indexed by the vertex number, like geometry shader inputs. There aren't per-patch TCS inputs. Tessellation evaluation shaders have both per-vertex and per-patch inputs. Per-vertex inputs get the new intrinsics; per-patch inputs continue to use the ordinary load_input intrinsics, as they already work like we want them to. v2: Change stage_uses_per_vertex_inputs into is_per_vertex_input(), which takes a variable (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:10 -08:00
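
Commit 40c2acef5c in the list above replaces a loop with a find-last-set call so that num_textures is 1, not 0, when the SamplersUsed mask is 1. The following is a minimal sketch of that off-by-one, not the code from the commit: `find_last_set` is a hypothetical stand-in for a helper with `_mesa_fls()`-style semantics (assumed here to return the 1-based position of the highest set bit, or 0 for an empty mask), and `highest_bit_index` is an illustrative loop that yields one less than the count for a non-empty mask.

```c
#include <assert.h>

/* Hypothetical stand-in for a find-last-set helper (assumed semantics):
 * returns the 1-based position of the most significant set bit,
 * or 0 when no bit is set. */
static unsigned find_last_set(unsigned mask)
{
    unsigned pos = 0;
    while (mask != 0) {
        pos++;
        mask >>= 1;
    }
    return pos;
}

/* Illustrative loop (NOT the original Mesa code): returns the 0-based
 * index of the most significant set bit. Used as a count, it is one too
 * small for any non-empty mask. */
static unsigned highest_bit_index(unsigned mask)
{
    unsigned idx = 0;
    while ((mask >> 1) != 0) {
        idx++;
        mask >>= 1;
    }
    return idx;
}

/* Number of texture slots needed to cover every sampler bit in the mask. */
static unsigned num_textures_from_mask(unsigned samplers_used)
{
    return find_last_set(samplers_used);
}
```

With a mask of 1 (only sampler 0 in use), `num_textures_from_mask` gives 1, whereas the index-style loop gives 0, which is the discrepancy the commit message describes.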


75862 commits