fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-17 04:28:28 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	1c1653d7b0	isl: Validate row pitch of stencil surfaces. Also, silence an obnoxious finishme that started occurring for all GL applications which use stencil after the i965 ISL conversion. v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using combined depth-stencil. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `5563872dbf`)	2017-08-11 20:59:28 +01:00
Jason Ekstrand	b1514579c2	intel/isl: Don't align the height of the last array slice We were calculating the total height of 2D surfaces by multiplying the row pitch by the number of slices. This means that we actually request slightly more space than actually needed since the padding on the last slice is unnecessary. For tiled surfaces this is not likely to make a difference. For linear surfaces, on the other hand, this means we may require additional memory. In particular, this makes the i965 driver reject EGL imports of buffers which do not have this extra padding. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "17.2" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4d27c6095e`)	2017-08-11 20:52:22 +01:00
Jason Ekstrand	dc63c715cb	intel/isl: Stop padding surfaces The docs contain a bunch of commentary about the need to pad various surfaces out to multiples of something or other. However, all of those requirements are about avoiding GTT errors due to missing pages when the data port or sampler accesses slightly out-of-bounds. However, because the kernel already fills all the empty space in our GTT with the scratch page, we never have to worry about faulting due to OOB reads. There are two caveats to this: 1) There is some potential for issues with caches here if extra data ends up in a cache we don't expect due to OOB reads. However, because we always trash the entire cache whenever we need to move anything between cache domains, this shouldn't be an issue. 2) There is a potential issue if a surface gets placed at the very top of the GTT by the kernel. In this case, the hardware could potentially end up trying to read past the top of the GTT. If it nicely wraps around at the 48-bit (or 32-bit) boundary, then this shouldn't be an issue thanks to the scratch page. If it doesn't, then we need to come up with something to handle it. Up until some of the GL move to ISL, having the padding code in there just caused us to harmlessly use a bit more memory in Vulkan. However, now that we're using ISL sizes to validate external dma-buf images, these padding requirements are causing us to reject otherwise valid images due to the size of the BO being too small. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "17.2" <mesa-stable@lists.freedesktop.org> (cherry picked from commit `c15b92ce11`)	2017-08-11 20:52:19 +01:00
Jason Ekstrand	9d9ea2c5a4	anv/formats: Allow sampling on depth-only formats on gen7 We can't sample from depth-stencil formats on gen7, but we can sample from depth-only formats. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024 Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `06d3115bb9`)	2017-08-11 20:52:16 +01:00
Jason Ekstrand	e4371d14f1	anv: Stop advertising VK_KHX_multiview We don't want to advertise experimental extensions in actual releases. However, there's no harm in leaving the code lying around in the tree.	2017-08-05 00:09:26 +01:00
Dave Airlie	6efb8d79a9	intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed. If dual object compile fails (as seems to happen with virgl a fair bit, and does piglit even have any tests for it?), we end up not restarting the pull params, so we call vec4_visitor::move_uniform_array_access_to_pull_constant a second time and it runs over the ends of the alloc. Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test running inside virgl on ivybridge. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `271fa3a684`)	2017-08-05 00:09:25 +01:00
Iago Toral Quiroga	3d0960e761	anv: only expose up to 28 vertex attributes The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs. However, the maximum allowed value of "Vertex URB Entry Read Length" in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements. Because we also need to reserve a vertex buffer to upload VertexIndex/InstanceIndex and another to upload DrawID when needed, we can only expose 28. Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `31f1863ace`)	2017-07-27 18:56:45 +01:00
Iago Toral Quiroga	bdbd8ab517	anv/cmd_buffer: fix off by one error in assertion Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `a848e693ef`)	2017-07-27 18:56:44 +01:00
Emil Velikov	a955622c1a	intel/blorp: ship blorp_genX_exec.h within the tarball Fixes: `c9cb37b2a6` ("intel/blorp: Add a partial resolve pass for MCS") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5d47dd9c2a`)	2017-07-24 16:59:32 +01:00
Jason Ekstrand	6874b953f6	anv/image: zalloc image views This allows us to avoid some extra zeroing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-22 21:41:12 -07:00
Jason Ekstrand	a1cad8218e	anv/image: Use vk_zalloc instead of an explicit memset Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-22 21:41:12 -07:00
Jason Ekstrand	1e32c8303a	anv: Separate surface states by layout instead of aux_usage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-22 21:41:12 -07:00
Jason Ekstrand	628bfaf1c6	intel/isl: Add some sanity checks for compressed surfaces Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-22 21:41:12 -07:00
Jason Ekstrand	5de4209f91	intel/isl: Add a helper to get a subimage surface We already have a helper for doing this in BLORP, this just moves the logic into ISL where we can share it with other components. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-22 21:41:12 -07:00
Jason Ekstrand	72bc38cfc5	anv: Get rid of some unused function declarations Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-07-22 21:41:12 -07:00
Jason Ekstrand	d4de403f91	intel/isl: Add a helper for determining if a color is 0/1 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 20:59:22 -07:00
Jason Ekstrand	b26b2490e5	intel/blorp: Allow blorp_copy on sRGB formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 20:59:22 -07:00
Jason Ekstrand	fb86ac94cb	intel/isl/format: Add an srgb_to_linear helper Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 20:59:22 -07:00
Jason Ekstrand	44e9d65757	intel/isl/format: Dedent the template in gen_format_layout.py This makes it much easier to edit the template and doesn't really dirty the python all that much. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 20:59:22 -07:00
Jason Ekstrand	268ba028dc	intel/isl: Add an aux state for "partial clear" Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 20:59:22 -07:00
Jason Ekstrand	c9cb37b2a6	intel/blorp: Add a partial resolve pass for MCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 20:59:22 -07:00
Nanley Chery	67027ddf3f	anv: Predicate fast-clear resolves Image layouts only let us know that an image may be fast-cleared. For this reason we can end up with redundant resolves. Testing has shown that such resolves can measurably hurt performance and that predicating them can avoid the penalty. v2: - Introduce additional resolve state management function (Jason Ekstrand). - Enable easy retrieval of fast clear state fields. v3: Use more descriptive field enums (Jason) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	8e2729fbb8	intel/blorp: Allow BLORP calls to be predicated Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	be516ba9b1	anv/cmd_buffer: Skip some input attachment transitions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	597ff919e7	anv: Stop resolving CCS implicitly With an earlier patch from this series, resolves are additionally performed on layout transitions. Remove the now unnecessary implicit resolves within render passes. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	5ba93e6f5a	anv: Transition more color buffer layouts v2: Expound on comment for the pipe controls (Jason Ekstrand). v3: - Cast base_layer to uint64_t to avoid overflow. - Remove "seems" from the pipe control comment. - Fix clamp of layer_count (Jason Ekstrand). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	a899747eb3	anv/cmd_buffer: Warn about not enabling CCS_E Use the performance warning infrastructure to provide helpful information when testing applications. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	9c9f63d1c7	anv/cmd_buffer: Move aux_usage assignment up For readability, bring the assignment of CCS closer to the assignment of NONE and MCS. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	62d72bb5d0	anv/cmd_buffer: Always enable CCS_D in render passes The lifespan of the fast-clear data will surpass the render pass scope. We need CCS_D to be enabled in order to invalidate blocks previously marked as cleared and to sample cleared data correctly. v2: Avoid refactoring. v3: Allow CCS_D for subpass resolves. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	8e532aa028	anv/cmd_buffer: Disable CCS on gen7 color attachments upfront The next patch enables the use of CCS_D even when the color attachment will not be fast-cleared. Catch the gen7 case early to simplify the changes required. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	9fd1f2aa3c	anv/cmd_buffer: Ensure fast-clear values are current v2: Rewrite functions, change location of synchronization. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:10 -07:00
Nanley Chery	0b16600056	anv/gpu_memcpy: Add a lighter-weight GPU memcpy function We'll be performing a GPU memcpy in more places to copy small amounts of data. Add an alternate function that thrashes less state. v2: - Make a new function (Jason Ekstrand). - Move the #define into the function. v3: - Update the function name (Jason). - Update comments. v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	dcff5ab9f1	anv/cmd_buffer: Restrict fast clears in the GENERAL layout v2: Remove ::first_subpass_layout assertion (Jason Ekstrand). v3: Allow some fast clears in the GENERAL layout. v4: Remove extra '\|\|' and adjust line break (Jason Ekstrand). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	9ffe87122b	anv/cmd_buffer: Don't partially fast clear image layers v2: Don't pass in the command buffer (Jason Ekstrand). v3: Remove an incorrect assertion and an if condition for gen7. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	07cc2ec9db	anv/cmd_buffer: Initialize the clear values buffer v2: Rewrite functions. v3 (Jason Ekstrand): - Don't set ResourceMinLOD. - Fix clamp of level_count. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	88200e87f6	anv/image: Append CCS/MCS with a fast-clear state buffer v2: Update comments, function signatures, and add assertions. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	325ecffc62	anv/image: Disable CCS if the image doesn't support rendering Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	01db9a74c6	intel/isl: Add surface state clear value information This will be used to load and store clear values from surface state objects. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-22 20:12:09 -07:00
Nanley Chery	b178e239dd	anv: Transition MCS buffers from the undefined layout v2: Define MCS buffers with any sample count (Jason) Cc: <mesa-stable@lists.freedesktop.org> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-07-22 20:12:09 -07:00
Jason Ekstrand	f793c57cc5	intel/isl: Tighten up restrictions for CCS on gen7 It may technically be possible to enable some sort of fast-clear support for at least the base slice of a 2D array texture on gen7. However, it's not documented to work, we've never tried to do it in GL, and we have no idea what the hardware does if you turn on CCS_D with arrayed rendering. Let's just play it safe and disallow it for now. If someone really cares that much about gen7 performance, they can come along and try to get it working later.	2017-07-22 20:12:07 -07:00
Jason Ekstrand	20533e0da7	anv/blorp: Assert isl_surf_init success in do_buffer_copy Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 08:21:27 -07:00
Jason Ekstrand	cf39fb06e3	anv/blorp: Explicitly set row_pitch in do_buffer_copy We have a very specific row pitch that we want and we don't want ISL to be changing it on us so just be explicit about it. Fixes: `a40f043034` Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 08:20:07 -07:00
Kenneth Graunke	30d6bc470a	i965: Set lower_vote_trivial in vector_nir_options_gen6 too. There's a second struct for Gen6+. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-07-21 18:09:01 -07:00
Topi Pohjolainen	fbfc6a2f67	intel/isl/gen7: Don't allow multisampled surfaces with valign2 The same constraint appears later as an assert in isl_gen7_choose_image_alignment_el(), so catch it earlier in order to return an error instead of crashing. Needed to avoid crashes with piglit tests on IVB and HSW: arb_internalformat_query2.image_format_compatibility_type pname checks arb_internalformat_query2.all internalformat_<x>_type pname checks arb_internalformat_query2.max dimensions related pname checks arb_copy_image.arb_copy_image-formats --samples=2/4/6/8 arb_texture_float.multisample-fast-clear gl_arb_texture_float Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	df9bb8dc05	intel/isl/gen7: Allow msaa with signed integer formats These formats are already allowed by the i965 GL driver, and the feature seems to work just fine. There are tests for multisampled rendering in piglit: tests/spec/ext_framebuffer_multisample which can be patched to try 16I/32I in addition to GL_RGBA8I. IvyBridge passed all tests with all sample numbers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	abb84e3f2d	intel/isl/gen7: Allow msaa with 128-bit formats These formats are already allowed by the i965 GL driver, and the feature seems to work just fine. There are tests for multisampled rendering in piglit: tests/spec/ext_framebuffer_multisample which can be patched to try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I. IvyBridge passed all tests with all sample numbers and even with 128-bit formats. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	514d68576d	intel/isl: Allow 1D surfaces with compressed formats Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Topi Pohjolainen	a40f043034	intel/isl: Align non-tiled horizontally by cache line in order to support blit engine. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-07-22 00:14:16 +03:00
Matt Turner	069bf7c907	i965/fs: Match destination type to size for ballot No use in taking a 64-bit value when we know the high 32-bits are zero.	2017-07-20 16:56:50 -07:00
Matt Turner	1038d385a9	nir: Reduce destination size of ballot intrinsic when possible Some hardware, like i965, doesn't support group sizes greater than 32. In that case, we can reduce the destination size of the ballot intrinsic, which will simplify our code generation. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-07-20 16:56:49 -07:00
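
The arithmetic in the message for commit 3d0960e761 ("anv: only expose up to 28 vertex attributes") can be checked with a short sketch. This is only an illustration of the numbers quoted in that message, not code from the driver. All names below are hypothetical, and reading the factor of 8 as "GRFs per unit of read length" is an assumption drawn from the message's "15 * 8 = 120".

```python
# Worked check of the limits quoted in the commit message for 3d0960e761.
# Every name here is hypothetical; none of them comes from the Mesa source.

EU_GRF_COUNT = 128                 # "The EU limit of 128 GRFs"
GRFS_PER_VERTEX_ELEMENT = 4        # "32 vertex elements of 4 GRFs"
MAX_READ_LENGTH_SIMD8 = 15         # max "Vertex URB Entry Read Length" in SIMD8
GRFS_PER_READ_LENGTH_UNIT = 8      # assumption: the 8 in "15 * 8 = 120"
RESERVED_ELEMENTS = 2              # VertexIndex/InstanceIndex, plus DrawID


def max_exposed_vertex_attributes() -> int:
    """Return the attribute count that follows from the quoted limits."""
    # Bound from the register file alone: 128 / 4 = 32 elements.
    grf_bound = EU_GRF_COUNT // GRFS_PER_VERTEX_ELEMENT
    # Bound from the URB read length: 15 * 8 = 120 GRFs, so 120 / 4 = 30.
    readable_grfs = MAX_READ_LENGTH_SIMD8 * GRFS_PER_READ_LENGTH_UNIT
    read_bound = readable_grfs // GRFS_PER_VERTEX_ELEMENT
    # The tighter bound wins, and two elements are held back for the driver.
    return min(grf_bound, read_bound) - RESERVED_ELEMENTS


if __name__ == "__main__":
    print(max_exposed_vertex_attributes())
```

The read-length bound (30) is tighter than the register-file bound (32), which is why the message arrives at 28 rather than 30 after the two reserved elements are subtracted.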

2014 commits