Due to complications with things such as URB setup on gen4-5, it's
easier to keep gen4 support in blorp completely internal to i965. This
makes things a bit awkward because that means there's a file in i965
that includes blorp_priv.h but it's either that or have a file in blorp
that includes brw_context.h.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
As part of enabling support for SF programs, we plumb the SF URB size
through to emit_urb_config. For now, it's always zero but, on gen4, it
may be something larger.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
It isn't supported prior to gen6 and, on gen6+, NIR will fuse the fmul
and fadd into an ffma automatically for us anyway.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Gen5 and earlier can't do non-normalized coordinates so we need to
compensate in the shader. Fortunately, it's pretty easy plumb through.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
It's not needed for blorp_copy because it already overrides formats.
It's also not needed for blorp_clear because it clears stencil as
stencil.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Commit e1af20f18a changed the shader_info
from being embedded into being just a pointer. The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct. This, however, has
caused a few problems:
1) There are many things which generate NIR without GLSL. This means
we have to support both NIR shaders which come from GLSL and ones
that don't and need to have an info elsewhere.
2) The solution to (1) raises all sorts of ownership issues which have
to be resolved with ralloc_parent checks.
3) Ever since 00620782c9, we've been
using nir_gather_info to fill out the final shader_info. Thanks to
cloning and the above ownership issues, the nir_shader::info may not
point back to the gl_shader anymore and so we have to do a copy of
the shader_info from NIR back to GLSL anyway.
All of these issues go away if we just embed the shader_info in the
nir_shader. There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
All callers of isl_surf_init() that set 'min_row_pitch' wanted to
request an *exact* row pitch, as evidenced by nearby asserts, but isl
lacked API for doing so. Now that isl has an API for that, update the
code to use it.
v2: Assert that isl_surf_init() succeeds because the callers assume
it. [for jekstrand]
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
The NIR story on conversion opcodes is a mess. We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random. This commit re-organizes things and makes them all
consistent:
- All non-bool conversion opcodes now have the explicit size in the
destination and are named <src_type>2<dst_type><size>.
- Integer <-> integer conversion opcodes now only come in i2i and u2u
forms (i2u and u2i have been removed) since the only difference
between the different integer conversions is whether or not they
sign-extend when up-converting.
- Boolean conversion opcodes all have the explicit size on the bool and
are named <src_type>2<dst_type>.
Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako. This will make adding
int8, int16, and float16 versions much easier when the time comes.
Reviewed-by: Eric Anholt <eric@anholt.net>
Not much point in the const qualifier since we provide a copy to the
user. Resolves the following -Wignored-qualifiers warning.
src/intel/blorp/blorp_blit.c:1857:8: warning: 'const' type qualifier on
return type has no effect [-Wignored-qualifiers]
v2: keep const qualifier of local variable.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
In order to handle CCS_E, we stomp the image format to a UINT format and
then do some bitcasting logic in the shader. This works fine since SKL
render compression only considers the channel layout of the format and
not the format itself. In order for this to work on images that have
been fast-cleared, we need to also convert the clear color so that, when
interpreted as UINT, it provides the same bit value as it would have in
the original format. This fixes a bunch of OpenGL ES CTS tests for
copy_image when we start using CCS more aggressively.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Enabling this debug switch causes surface shrinking to happen by
default, and lowers the surface size limit which causes blorp blits to
be split.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Detect when the surface sizes are too large for a blorp blit. When it
is too large, the blorp blit will be split into a smaller operation
and attempted again.
For gen7, this fixes the cts test:
ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit
It will also enable us to increase our renderable size from 8k x 8k to
16k x 16k.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In blorp_copy, when RGB surfaces are copied, we convert the
destination surface to a Red only surface, but 3 times as wide. This
introduces an implicit restriction of "mod 3" for the destination
width.
It is easier to handle the blorp split buffer offsetting with the
original RGB surface, and do the RGB=>R after this.
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
If try_blorp_blit() previously returned that a blit was too large,
shrink_surface_params() will be used to update the surface parameters
for the smaller blit so the blit operation can proceed.
v2:
* Use double instead of float. (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We rename do_blorp_blit() to try_blorp_blit(), and add a return error
if the surface size for the blit is too large. Now, do_blorp_blit() is
rewritten to try to split the blit into smaller operations if
try_blorp_blit() fails.
Note: In this commit, try_blorp_blit() will always attempt to blit and
never return an error, which matches the previous behavior. We will
enable the size checking and splitting in a future commit.
The motivation for this splitting is that in some cases when we
flatten an image, it's dimensions grow, and this can then exceed the
programmable hardware limits. An example is w-tiled+MSAA blits.
v2:
* Use double instead of float. (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will be useful for splitting blits into smaller sizes.
We also make the coordinates of type double rather than float. Since
we will be splitting and scaling the coordinates, we might require
extra precision in the calculations.
v2:
* Use double instead of float. (Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Previously, blorp copy operations were CCS-unaware so you had to perform
resolves on the source and destination before performing the copy. This
commit makes blorp_copy capable of handling CCS-compressed images without
any resolves.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Many of these UINT formats aren't available prior to Sky Lake so we used
UNORM formats. Using UINT formats is a bit nicer because it guarantees we
don't run into rounding issues. Also, we will need it in the next commit
for handling copies with CCS enabled.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
By using offsetof() we can ensure that adding fiels to wm_inputs is always
safe as long as we maintain alignment.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Depending on how the driver using blorp implements its shader caching,
there is a small chance of shader collisions due to identical keys between
blit and clear programs. Adding a small shader type at the top of the key
alleviates this problem.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Previously, we always inferred it from params->dst which meant that
references to params->dst were scattered all throughout the state upload
code.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Previously, we were creating the shader with a NULL ralloc context and then
trusting in blorp_compile_fs to clean it up. The only problem was that
blorp_compile_fs didn't clean up its context properly so we were leaking.
When I went to fix that, I realized that it couldn't because it has to
return the shader binary which is allocated off of that context and used by
the caller. The solution is to make blorp_compile_fs take a ralloc
context, allocate the nir_shaders directly off that context, and clean it
all up in whatever function creates the shader and calls blorp_compile_fs.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0, 13.0" <mesa-stable@lists.freedesktop.org>
With dealing with rectangles in compressed images, you can have a width or
height that isn't a multiple of the corresponding compression block
dimension but only if that edge of your rectangle is on the edge of the
image. When we call convert_to_single_slice, it creates an 2-D image and a
set of tile offsets into that image. When detecting the right-edge and
bottom-edge cases, we weren't including the tile offsets so the assert
would misfire. This caused crashes in a few UE4 demos
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98431
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Tested-by: "Eero Tamminen" <eero.t.tamminen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
The result of this calculation goes into an fma() in the shader and we
would like it to be as precise as possible. The division in particular
was a source of imprecision whenever dst1 - dst0 was not a power of two.
This prevents regressions in some of the new Vulkan CTS tests for blitting
using a filtering of NEAREST.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
While it can be useful, the field has substantial limtations. In
particular, the bittom 2 or 3 bits is missing so your offset always has to
be a multiple of 4 or 8. While surface alignments usually work out to make
this ok, when you start trying to fake compressed surfaces as uncompressed
(which we will want to do) this falls apart. The easiest solution is to
simply align all offsets to a tile boundary and munge the regions we're
copying to account for the intratile offset.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The convert_to_single_slice operation is *mostly* idempotent. The only
non-repeatable thing it does is that, when it sets the intratile offset
fields, it just overwrites them instead of doing a += operation. This is
supposed to be ok because we have an early return at the top that should
make it bail of the surface is already a single slice. Unfortunately, the
if condition has been broken ever since it was first added in 96fa98c18.
This commit fixes the condition and adds an assert to ensure we don't stomp
any non-zero intratile offsets.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
If we use the view format, it may be an uncompressed view of a compressed
image which throws things off. Since we're computing offsets of images, we
want the actual surface offset anyway.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We're going to use it for more than just stencil textures
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This should be more compact than the enum isl_channel_select[4] that we
were using before. It's also very convenient because we already had such a
structure in the Vulkan driver we just needed to pull it over.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>