fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-06 13:48:06 +02:00

Author	SHA1	Message	Date
Dave Airlie	dc68b920df	radeonsi/ac: move frag interp emission code to shared llvm code. This code should be used in radv, so move it to a shared location in advance of doing that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-02 08:24:53 +10:00
Timothy Arceri	b940b2fd16	st/mesa: inline get_mesa_program() In the past I've gotten this function confused with the one in ir_to_mesa.cpp of the same name. Now that the affected flag setting has moved into a helper it makes sense just to inline this remaining code. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-02 08:31:28 +11:00
Timothy Arceri	a7050ea1f9	st/mesa: create set_prog_affected_state_flags() helper This will be used when restoring tgsi from the on-disk shader cache. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-02 08:31:28 +11:00
Timothy Arceri	8d3d8a6d4e	st/mesa: st_atom_shader.c C99 tidy up Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-02 08:31:28 +11:00
Timothy Arceri	f3e2428a7a	st/mesa: remove pre C99 statement block for variable declaration Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-02-02 08:31:28 +11:00
Jason Ekstrand	0c114f2cf0	isl: Add assertions for render target swizzle restrictions Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 12:07:54 -08:00
Boyuan Zhang	f90ccf48bc	st/va: add h264 constrained baseline profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Boyuan Zhang	d596bd29ec	st/vdpau: add h264 constrained baseline profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Boyuan Zhang	c29191eea8	radeon/uvd: add h264 constrained baseline support Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Boyuan Zhang	22841ec84a	vl: add h264 constrained baseline profile Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-02-01 14:32:32 -05:00
Bas Nieuwenhuizen	f5f8eb2c7c	radv: Enable VK_KHR_shader_draw_parameters. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen	cf8a11c1ba	radv: Pass draw index to shader. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen	80f4331ed1	radv/ac: Add draw index support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-02-01 19:49:40 +01:00
Robert Foss	25f2d3c1d3	i965: Prevent coverity warning Add assert checking that num_sources is never larger than 3. This prevents Coverity from concluding that the unhandled cases of num_sources not being 0-3 are relevant. Coverity-Id: 1399480-1399489 Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-01 16:47:05 +00:00
Lionel Landwerlin	875b15eec4	spirv: add SPV_KHR_shader_draw_parameters support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 15:08:33 +00:00
Lionel Landwerlin	bd46040162	compiler: add missing enums for debug Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 15:08:30 +00:00
Emil Velikov	1e8fd790e1	docs: add news item and link release notes for 13.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-01 11:21:59 +00:00
Emil Velikov	f2391e8134	docs: add sha256 checksums for 13.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `6bfc352f5a`)	2017-02-01 11:20:28 +00:00
Emil Velikov	7b6931e7fb	docs: add release notes for 13.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `3255d10da4`)	2017-02-01 11:20:27 +00:00
Michel Dänzer	31136eae3a	winsys/radeon: Allow visible VRAM size > 256MB with kernel driver >= 2.49 The kernel driver reports correct values now. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2017-02-01 16:38:14 +09:00
Tapani Pälli	58828fe4ae	android: add vulkan build for intel fixes to issues spotted by Emil Velikov: - set ANV_TIMESTAMP correctly - fix typo with VULKAN_GEM_FILES v2: update to use Makefile.sources under vulkan instead of having own v3: update to changes to generate from vk.xml (commit `c7fc310`) v4: remove 'hw' relative path cleanups, remove unnecessary cruft review from Emil Velikov: - move to vulkan folder - remove timestamp gen, no longer necessary - more cleanups Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-02-01 07:58:49 +02:00
Ilia Mirkin	62b8f494fa	mesa: use same is_color_attachment trick to discern error cases All the other calls to retrieve the attachment have been covered except this one - return the proper error for attachment points that are valid enums but out of bound for the driver. Fixes GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_attachment Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-31 22:12:57 -05:00
Jason Ekstrand	92128590bc	anv: Improve flushing around STATE_BASE_ADDRESS It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS actually is. We know from experimentation that we need to flush the render cache prior to emitting STATE_BASE_ADDRESS and invalidate the texture cache afterwards. The only thing the PRM says is that, on gen8+ we're supposed to invalidate the state cache after STATE_BASE_ADDRESS but experimentation has indicated that doing so does nothing whatsoever. Since we don't really know, let's do just a bit more flushing in the hopes that this won't be a problem again. In particular: 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't really know whether or not it's pipelined. 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS is a compute shader. 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS because the state may be getting cached there (we don't really know). Reported-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	f1f9794118	anv: Flush render cache before STATE_BASE_ADDRESS on gen7 We had no good reason for not doing this on gen7 before but we didn't know it was needed. Recently, when trying to update to Vulkan CTS version 1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear to be STATE_BASE_ADDRESS related. This commit fixes them. Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	4871930451	isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell This causes hangs on Broadwell if you try to render to it. I have no idea how we managed to not hit this earlier. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	a0348b5a0b	intel/blorp: Handle clearing of A4B4G4R4 on all platforms Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Tom Stellard	226a2c6d6e	radeonsi: Fix build on LLVM < 3.9 v2 This was broken by: `e0cc0a614c` v2: - Use preprocessor macro Tested-by: Mark Janes <mark.a.janes@intel.com>	2017-02-01 02:10:00 +00:00
Bas Nieuwenhuizen	798ae37cc9	radv: Enable Float64 support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen	441ee1e65b	radv/ac: Implement Float64 SSBO loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen	bb1ce63002	radv/ac: Implement Float64 UBO loads. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:29 +01:00
Bas Nieuwenhuizen	03724af262	radv/ac: Implement Float64 load/store var. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen	91074bb11b	radv/ac: Implement Float64 SSBO stores. No f16 support as I'm not quite sure about alignment yet. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen	29577b2123	radv/ac: Add core Float64 support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-01 01:09:05 +01:00
Rob Herring	01e18b21d1	vc4: Enable Neon on arm android builds Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 14:06:21 -08:00
Rob Herring	83107acb7b	vc4: fix arm64 build with Neon The addition of Neon assembly breaks on arm64 builds because the assembly syntax is different. For now, restrict Neon to ARMv7 builds. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 14:06:19 -08:00
Rob Herring	6d92f32852	vc4: Make Neon inline assembly clang compatible clang throws an error on "%r2" and similar. I couldn't find any documentation on what "%r?" is supposed to mean and I've never seen any use like that as far as I remember. The parameter is supposed to be cpu_stride and just %2/%3 should be sufficient. There's no need for trailing ";" either, so remove those, too. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-01-31 14:06:09 -08:00
Tom Stellard	e0cc0a614c	radeonsi: Set datalayout on the llvm module This prevents LLVM from using sext instructions for local memory offsets and allows the backend to fold immediate offsets into the instruction. This also prevents some incorrect code generation for ptrtoint and inttoptr instructions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-31 20:39:30 +00:00
Francisco Jerez	11e9ebbf15	nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:33:33 -08:00
Francisco Jerez	013d40d1ce	glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:33:33 -08:00
Francisco Jerez	7215375c44	nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. See "glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity." for the rationale, but note that the instruction count benefit discussed there is somewhat less important for the SPIRV implementation, because the current code already emitted no control flow instructions -- Still this saves us one hardware instruction per scalar component on Intel SKL hardware. Fixes the following Vulkan CTS tests on Intel hardware: dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4 Note that most of the test-cases above expect IEEE-compliant handling of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so except for the last two the test-cases above weren't expected to pass yet. The reason they do is that the i965 back-end implementation of the NIR fmin and fmax instructions is not quite GLSL-compliant (it complies with IEEE 754 recommendations though), because fmin/fmax of a NaN and a non-NaN argument currently always return the non-NaN argument, which causes atan() to flush NaN to one and return the expected value. The front-end should probably not be relying on this behavior for correctness though because other back-ends are likely to behave differently -- A follow-up patch will handle the atan2(±∞, ±∞) corner cases explicitly. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:33:27 -08:00
Francisco Jerez	e9ffd12827	glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% less instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:32:45 -08:00
Francisco Jerez	69042a5be4	i965/fs: Fix nir_op_fsign of absolute value. This does point at the front-end emitting silly code that could have been optimized out, but the current fsign implementation would emit bogus IR if abs was set for the argument (because it would apply the abs modifier on an unsigned integer type), and we shouldn't rely on the upper layer's optimization passes for correctness. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:32:43 -08:00
Francisco Jerez	7ec3af3f8f	glsl/ir_builder: Add rcp builder. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:43 -08:00
Francisco Jerez	6643a97de3	glsl: Fix constant evaluation of the rcp op. Will avoid a regression in a future commit that introduces some additional rcp operations. According to the GLSL 4.10 specification: "Dividing by 0 results in the appropriately signed IEEE Inf." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:43 -08:00
Francisco Jerez	e81130d7a1	mesa/program: Translate csel operation from GLSL IR. This will be used internally by the GLSL front-end in order to implement some built-in functions. Plumb it through MESA IR for back-ends that rely on this translation pass. v2: Add comment. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:42 -08:00
Wladimir J. van der Laan	56314f5baf	etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers This fixes rendering of full-screen quads (and other screen-filling geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op on other hardware. - It looks like SE_CLIP registers were not set at all. I'm amazed that rendering worked without them. Emit them to avoid issues on gc3000. - Define constants ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119) ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111) ETNA_SE_CLIP_MARGIN_RIGHT (0xffff) ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff) These demarcate the margin (fixp16) between the computed sizes and the value sent to the chip. I have set these to the numbers used by the Vivante driver for gc2000. I am not sure whether any old hardware was relying on the old numbers, or whether those were just a guess. But if so, these need to be moved to the _specs structure. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-31 19:29:23 +01:00
Wladimir J. van der Laan	fe3bb8cdb5	etnaviv: Generate new sin/cos instructions on GC3000 Shaders using sin/cos instructions were not working on GC3000. The reason for this turns out to be that these chips implement sin/cos in a different way (but using the same opcodes): - Need their input scaled by 1/pi instead of 2/pi. - Output an x and y component, which need to be multiplied to get the result. - tex_amode needs to be set to 1. Add a new bit to the compiler specs and generate these instructions as necessary. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-31 19:29:16 +01:00
Nanley Chery	33e0c5d003	anv/cmd_buffer: Use the proper depth input attachment surface state Commit `2852efcda4` moved the location of the depth input attachment surface state from the render pass to the image view, but failed to update the surface state location used when emitting the binding table. Fix this by loading the surface state from the correct location. Fixes: dEQP-VK.renderpass.formats.d16_unorm.input.* dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.* dEQP-VK.renderpass.formats.d32_sfloat.input.* dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.* dEQP-VK.renderpass.attachment_allocation.input_output.93 dEQP-VK.renderpass.attachment_allocation.input_output.92 dEQP-VK.renderpass.attachment_allocation.input_output.82 dEQP-VK.renderpass.attachment_allocation.input_output.46 Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-01-31 09:00:50 -08:00
Bartosz Tomczyk	fc27181f9e	glsl: fix heap-buffer-overflow The `end+1` skips the ']', whereas the `strlen+1` includes the final '\0' in the move to terminate the string. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-31 15:58:52 +01:00
Wladimir J. van der Laan	658568941d	etnaviv: Cannot render to rb-swapped formats Exposing rb swapped (or other swizzled) formats for rendering would involve swizzling in the pixel shader. This is not the case at the moment, so reject requests for creating such surfaces. (GPUs that need an extra resolve step anyway due to multiple pixel pipes, such as gc2000, might also do this swap in the resolve operation. But this would be tricky to keep track of) CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-01-31 09:28:28 +01:00
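
The pointer arithmetic described in commit fc27181f9e ("glsl: fix heap-buffer-overflow") can be illustrated with a minimal sketch. This is not the Mesa code: the helper name `strip_first_subscript` and the in-place buffer handling are assumptions made for illustration only. It shows why `end + 1` skips the ']' and why `strlen(...) + 1` moves the trailing '\0' along with the tail.

```c
#include <string.h>

/* Hypothetical helper (not from Mesa): removes the first "[...]" group
 * from a NUL-terminated name, editing the buffer in place. */
static void strip_first_subscript(char *name)
{
    char *open = strchr(name, '[');
    if (open == NULL)
        return;                     /* no subscript: nothing to do */

    char *end = strchr(open, ']');
    if (end == NULL)
        return;                     /* unterminated subscript: leave as-is */

    /* end + 1 points just past ']', i.e. at the tail to keep.
     * strlen(end + 1) is the tail length without its terminator, so the
     * extra + 1 copies the final '\0' too and the result stays terminated.
     * memmove is used because source and destination overlap. */
    memmove(open, end + 1, strlen(end + 1) + 1);
}
```

With a buffer holding `block[3].member`, the call leaves `block.member`. Measuring the length from `end + 1` rather than from an earlier position is what keeps the move inside the original allocation.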


88712 commits