Commit graph

93 commits

Author SHA1 Message Date
Jason Ekstrand
075ed20614 intel/blorp: Explicitly flush all allocated state
Found by inspection.  However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
e233db6e93 intel/blorp: Swizzle clear colors on the CPU
It's trivial to swizzle clear colors on the CPU, easily deals with the
hardware restrictions for render target swizzles, and makes swizzled
clears work on all hardware as opposed to just HSW+.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-13 09:24:43 -08:00
Emil Velikov
a04cb3f8a5 intel/blorp: do not return const data by get_px_size_sa()
Not much point in the const qualifier since we provide a copy to the
user. Resolves the following -Wignored-qualifiers warning.

src/intel/blorp/blorp_blit.c:1857:8: warning: 'const' type qualifier on
return type has no effect [-Wignored-qualifiers]

v2: keep const qualifier of local variable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-10 11:47:12 +00:00
Jason Ekstrand
7e6a9d9c4b intel/isl: Add a formats_are_ccs_e_compatible helper
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-02 13:33:43 -08:00
Jason Ekstrand
a0348b5a0b intel/blorp: Handle clearing of A4B4G4R4 on all platforms
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-31 18:49:44 -08:00
Topi Pohjolainen
542bb85049 intel/blorp/dbg: Name blit shaders for easy recognition in dumps
Blorp clears already have an equivalent.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Jason Ekstrand
817f9e3b17 intel/blorp/copy: Properly handle clear colors for CCS_E images
In order to handle CCS_E, we stomp the image format to a UINT format and
then do some bitcasting logic in the shader.  This works fine since SKL
render compression only considers the channel layout of the format and
not the format itself.  In order for this to work on images that have
been fast-cleared, we need to also convert the clear color so that, when
interpreted as UINT, it provides the same bit value as it would have in
the original format.  This fixes a bunch of OpenGL ES CTS tests for
copy_image when we start using CCS more aggressively.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-21 10:34:09 -08:00
Lionel Landwerlin
bac6fe5c77 blorp: remove unnecessary struct declaration
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 16:46:21 +00:00
Nanley Chery
f357af0c90 intel/blorp_clear: Add gen8 HiZ clearing functions
Add an entry point for the optimized gen8 BLORP HiZ sequence. commit
c9eaf12de2 fixed a bug that was
unknowingly worked around by forcing additional clear rectangle
alignment restrictions not specified in the PRMs. Now that the bug is no
longer present, omit the additional alignment restrictions.

v2: Adjust code comment about padding

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-12 20:52:19 -08:00
Nanley Chery
09948151ab intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP
We'll be switching to layout-transition based resolves which can occur
outside of a render pass. Add this sequence to BLORP, as using BLORP
will enable emitting depth stencil state outside of a render pass (among
other benefits). The depth buffer extent is ignored to enable eventual
usage in VkCmdClearAttachments().

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-12 20:52:19 -08:00
Jordan Justen
097c9dc2d4 intel/blorp_blit: Fix max blit size for gen6
Fixes ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_stencil_blit

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-12-26 08:50:21 -08:00
Jordan Justen
d6526d7247 intel/blorp_blit: Add split_blorp_blit_debug switch
Enabling this debug switch causes surface shrinking to happen by
default, and lowers the surface size limit which causes blorp blits to
be split.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-07 09:00:49 -08:00
Jordan Justen
da381ae647 intel/blorp_blit: Enable splitting large blorp blits
Detect when the surface sizes are too large for a blorp blit. When it
is too large, the blorp blit will be split into a smaller operation
and attempted again.

For gen7, this fixes the cts test:

ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit

It will also enable us to increase our renderable size from 8k x 8k to
16k x 16k.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-07 09:00:49 -08:00
Jordan Justen
efea8e7244 intel/blorp_blit: Move RGB=>R conversion to follow blit splitting
In blorp_copy, when RGB surfaces are copied, we convert the
destination surface to a Red only surface, but 3 times as wide. This
introduces an implicit restriction of "mod 3" for the destination
width.

It is easier to handle the blorp split buffer offsetting with the
original RGB surface, and do the RGB=>R after this.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-07 09:00:49 -08:00
Jordan Justen
edf3113aed intel/blorp_blit: Adjust blorp surface parameters for split blits
If try_blorp_blit() previously returned that a blit was too large,
shrink_surface_params() will be used to update the surface parameters
for the smaller blit so the blit operation can proceed.

v2:
 * Use double instead of float. (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-07 09:00:49 -08:00
Jordan Justen
12e0a6e259 intel/blorp_blit: Split blorp blits if they are too large
We rename do_blorp_blit() to try_blorp_blit(), and add a return error
if the surface size for the blit is too large. Now, do_blorp_blit() is
rewritten to try to split the blit into smaller operations if
try_blorp_blit() fails.

Note: In this commit, try_blorp_blit() will always attempt to blit and
never return an error, which matches the previous behavior. We will
enable the size checking and splitting in a future commit.

The motivation for this splitting is that in some cases when we
flatten an image, it's dimensions grow, and this can then exceed the
programmable hardware limits. An example is w-tiled+MSAA blits.

v2:
 * Use double instead of float. (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-07 09:00:49 -08:00
Jordan Justen
b74d4f6ca0 intel/blorp_blit: Create structure for src & dst coordinates
This will be useful for splitting blits into smaller sizes.

We also make the coordinates of type double rather than float. Since
we will be splitting and scaling the coordinates, we might require
extra precision in the calculations.

v2:
 * Use double instead of float. (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-07 09:00:49 -08:00
Topi Pohjolainen
f19e0967c9 intel/blorp: Fix rectangle size for level-not-zero resolves
Needed to prevent gpu hangs when mip-mapped compression gets
enabled.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-23 11:06:52 +02:00
Jason Ekstrand
2b5644e94d intel/blorp: Properly handle color compression in blorp_copy
Previously, blorp copy operations were CCS-unaware so you had to perform
resolves on the source and destination before performing the copy.  This
commit makes blorp_copy capable of handling CCS-compressed images without
any resolves.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-17 12:03:24 -08:00
Jason Ekstrand
89f9c46a74 intel/blorp: Always use UINT formats on SKL+
Many of these UINT formats aren't available prior to Sky Lake so we used
UNORM formats.  Using UINT formats is a bit nicer because it guarantees we
don't run into rounding issues.  Also, we will need it in the next commit
for handling copies with CCS enabled.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-17 12:03:24 -08:00
Jason Ekstrand
1ba2f05bc0 intel/blorp: Take a fast_clear_op in ccs_resolve
Eventually, we may want to just have a single blorp_ccs_op function that
does both clears and resolves.  For now we'll stick to just making the
ccs_resolve function we have now a bit more configurable.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-17 12:03:24 -08:00
Pohjolainen, Topi
7c560e8ccc intel/blorp: Add plumbing for color resolve slice details
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-17 12:03:24 -08:00
Jason Ekstrand
72878f9f53 intel/blorp: Add a clear_attachments entrypoint
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
0aea29cc1c intel/blorp: Add capability to use pre-baked binding tables
When a pre-baked binding table is requested, no binding table is created,
instead the binding table offset (relative to surface state base address)
provided by the user is used verbatim.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
f7f768d195 intel/blorp: Add support for vertex shaders
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
768c8dd718 intel/blorp: Use an actual chunk of vertex buffer for the VUE header
We're about to start passing other things in as a sort of "VS header" for
vertex shaders and we need a place to put them.  Since we want the instance
id to be one of them, it makes sense to have one vec4 that's either VUE
header or VS header.  Always uploading some handy zeros makes the code a
bit simpler.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
8c8095c260 blorp/exec: Use uint32_t for copying varying data
Some things may not be floats and intel CPUs are known for mangling bits
when a float type is used for copying integers.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
21943c35f7 intel/blorp: Handle NIR clear inputs the same way as blit inputs
By using offsetof() we can ensure that adding fiels to wm_inputs is always
safe as long as we maintain alignment.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
570a0e844b intel/blorp: Remove NIR support for uniforms
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
99b436ae5c intel/blorp: Add a shader type to make keys more unique
Depending on how the driver using blorp implements its shader caching,
there is a small chance of shader collisions due to identical keys between
blit and clear programs.  Adding a small shader type at the top of the key
alleviates this problem.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
1acebeb191 intel/blorp: Make the number of samples an explicit parameter
Previously, we always inferred it from params->dst which meant that
references to params->dst were scattered all throughout the state upload
code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
b3bc806855 intel/isl: Add some basic info about RENDER_SURFACE_STATE to isl_device
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:10:26 -08:00
Jason Ekstrand
1587ac1edc intel/genxml: Make 3DSTATE_WM more consistent across gens
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-11-16 10:09:03 -08:00
Jason Ekstrand
fb02d2d13b intel/genxml: Make some 3DSTATE_PS fields more consistent
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2016-11-16 10:08:58 -08:00
Jordan Justen
615ccf44cf intel/blorp: Use designated initializers in surf_convert_to_single_slice
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-15 22:51:19 -08:00
Jason Ekstrand
406cd9d126 intel/blorp: Emit all the binding tables
At least on Sky Lake, after emitting 3DSTATE_CONSTANT_*, you are required
to re-emit the 3DSTATE_BINDING_TABLE_POINTERS packet for the corresponding
stage.  If you don't, double-buffering may fail and you may get the wrong
constants.  It turns out that you need to do this even if you have no push
constants to speak of or else the next 3DSTATE_CONSTANT packet you emit for
that stage may not work correctly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2016-11-08 08:32:55 -08:00
Jason Ekstrand
4306c10a88 intel/blorp: Pass a brw_stage_prog_data to upload_shader
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2016-11-02 09:32:19 -07:00
Jason Ekstrand
058304f081 intel/blorp: Use wm_prog_data instead of hand-rolling our own
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2016-11-02 09:32:15 -07:00
Timothy Arceri
5857c3082e intel/blorp: remove stale comment
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-28 19:51:08 +11:00
Jason Ekstrand
43dadb6edd intel/blorp: Rework our usage of ralloc when compiling shaders
Previously, we were creating the shader with a NULL ralloc context and then
trusting in blorp_compile_fs to clean it up.  The only problem was that
blorp_compile_fs didn't clean up its context properly so we were leaking.
When I went to fix that, I realized that it couldn't because it has to
return the shader binary which is allocated off of that context and used by
the caller.  The solution is to make blorp_compile_fs take a ralloc
context, allocate the nir_shaders directly off that context, and clean it
all up in whatever function creates the shader and calls blorp_compile_fs.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>
2016-10-27 22:46:13 -07:00
Jason Ekstrand
ab92480272 intel/blorp: Rename compile_nir_shader to compile_fs
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-27 22:46:13 -07:00
Jason Ekstrand
4964a5149b intel/blorp: Fix a couple asserts around image copy rectangles
With dealing with rectangles in compressed images, you can have a width or
height that isn't a multiple of the corresponding compression block
dimension but only if that edge of your rectangle is on the edge of the
image.  When we call convert_to_single_slice, it creates an 2-D image and a
set of tile offsets into that image.  When detecting the right-edge and
bottom-edge cases, we weren't including the tile offsets so the assert
would misfire.  This caused crashes in a few UE4 demos

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98431
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Tested-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-10-27 13:45:39 -07:00
Timothy Arceri
91d61fbf7c i965: rewrite brw_setup_vue_interpolation()
Here brw_setup_vue_interpolation() is rewritten not to use the InterpQualifier
array in gl_fragment_program which will allow us to remove it.

This change also makes the code which is only used by gen4/5 more self contained
as it now has its own gen5_fragment_program struct rather than storing the map
in brw_context. This means the interpolation map will only get processed once
and will get stored in the in memory cache rather than being processed everytime
the fs changes.

Also by calling this from the fs compile code rather than from the upload code
and using the interpolation assigned there we can get rid of the
BRW_NEW_INTERPOLATION_MAP flag.

It might not seem ideal to add a gen5_fragment_program struct however by the end
of this series we will have gotten rid of all the brw_{shader_stage}_program
structs and replaced them with a generic brw_program struct so there will only
be two program structs which is better than what we have now.

V2: Don't remove BRW_NEW_INTERPOLATION_MAP from dirty_bit_map until the following
patch to fix build error.

V3 - Suggestions by Jason:
- name struct gen4_fragment_program rather than gen5_fragment_program
- don't use enum with memset()
- create interp mode set helper and simplify logic to call it
- add assert when calling function to show prog will never be NULL for
 gen4/5 i.e. no Vulkan

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-26 14:29:36 +11:00
Timothy Arceri
e1af20f18a nir/i965/anv/radv/gallium: make shader info a pointer
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.

There are other advantages such as being able to reuse the
shader info populated by GLSL IR.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-26 14:29:36 +11:00
Jason Ekstrand
d80c0307ea intel/blorp: Add a flag to make blorp not re-emit dept/stencil buffers
In Vulkan, we want to be able to use blorp to perform clears inside of a
render pass.  If blorp stomps the depth/stencil buffers packets then we'll
have to re-emit them.  This gets tricky when secondary command buffers get
involved.  Instead, we'll simply guarantee that the depth and stencil
buffers we pass to blorp (if any) match those already set in the hardware.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00
Jason Ekstrand
0cabf93b80 intel/blorp: Add an entrypoint for clearing depth and stencil
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00
Jason Ekstrand
82a2c49c5f intel/blorp: Emit a NULL render target for depth/stencil-only operations
This never mattered before because the only time we used blorp
depth/stencil only was to do HiZ operations on gen6-7.  It may have worked
in that case (and maybe it didn't) but slow depth clears actually do depth
rendering so they need a valid render target.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00
Jason Ekstrand
b324c38ae3 intel/blorp: Allow for running without a PS on gen8+
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00
Jason Ekstrand
81be7be119 intel/blorp: Add an "enabled" bit to surface_info
This gives a slightly smarter way to check whether or not a particular
surface exists than looking at the address.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00
Jason Ekstrand
bc4bb5a7e3 intel/blorp: Emit more complete DEPTH_STENCIL state
This should now set the pipeline up properly for doing depth and/or stencil
clears by plumbing through depth/stencil test values.  We are now also
emitting color calculator state for blorp operations without an actual
shader because that is where the stencil reference value goes pre-SKL.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00