This helper encodes more details, specifically about Haswell, than the
previous asserts in isl_surface_state.c.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.
v4:
- Add struct to gen11 too (Jason, Jordan)
- Add field for Converted Clear Color to gen11 (Jason)
- Add clear_color_state_offset to differentiate from
clear_value_offset.
- Fix all the places where clear_value_size was used.
v5 (Jason):
- Split genxml changes to another commit.
- Remove unnecessary gen checks.
- Bring back missing offset increment to init_fast_clear_color().
v6 (Jason):
- On init_fast_clear_color, change:
addr.offset += 4 => sdi.Address.offset += i * 4
- Use GEN_GEN instead of GEN_VERSIONx10.
[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
ISL already offers functions to fill out most kinds of SURFACE_STATE,
so why not handle null surfaces too?
Null surfaces are simple, so we can just take the dimensions, rather
than an entirte fill structure.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Also, silence an obnoxious finishme that started occurring for all
GL applications which use stencil after the i965 ISL conversion.
v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using
separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using
combined depth-stencil.
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We were calculating the total height of 2D surfaces by multiplying the
row pitch by the number of slices. This means that we actually request
slightly more space than actually needed since the padding on the last
slice is unnecessary. For tiled surfaces this is not likely to make a
difference. For linear surfaces, on the other hand, this means we may
require additional memory. In particular, this makes the i965 driver
reject EGL imports of buffers which do not have this extra padding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
The docs contain a bunch of commentary about the need to pad various
surfaces out to multiples of something or other. However, all of those
requirements are about avoiding GTT errors due to missing pages when the
data port or sampler accesses slightly out-of-bounds. However, because
the kernel already fills all the empty space in our GTT with the scratch
page, we never have to worry about faulting due to OOB reads. There are
two caveats to this:
1) There is some potential for issues with caches here if extra data
ends up in a cache we don't expect due to OOB reads. However,
because we always trash the entire cache whenever we need to move
anything between cache domains, this shouldn't be an issue.
2) There is a potential issue if a surface gets placed at the very top
of the GTT by the kernel. In this case, the hardware could
potentially end up trying to read past the top of the GTT. If it
nicely wraps around at the 48-bit (or 32-bit) boundary, then this
shouldn't be an issue thanks to the scratch page. If it doesn't,
then we need to come up with something to handle it.
Up until some of the GL move to ISL, having the padding code in there
just caused us to harmlessly use a bit more memory in Vulkan. However,
now that we're using ISL sizes to validate external dma-buf images,
these padding requirements are causing us to reject otherwise valid
images due to the size of the BO being too small.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
We already have a helper for doing this in BLORP, this just moves the
logic into ISL where we can share it with other components.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This will be used to load and store clear values from surface state
objects.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7. However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering. Let's just play it safe and disallow it for now. If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.
If we allow the size to be more than 2^32, then we should compute it
in 64bit arithmetic otherwise we might run into overflow issues.
CID: 1412892, 1412891
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This reverts commit 8aaa13467d, which was
based on an incorrect assumption. Unlike the restriction placed on image
views in the Vulkan API, OpenGL allows you to render to texture views
whose formats differ from the originals.
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
v2 (Jason Ekstrand):
- Remove Vulkan-specific terminology from the commit title.
- Replace '== 7' with '<= 7' to hint that this is a new feature on BDW+.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
V2: Start using gen10 functions isl_gen10*(), gen10_blorp_exec()
gen10_init_atoms() (Jason)
Remove Vulkan changes. Do them later in a separate patch.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Frequently, get_image_offset_sa is combined with get_intratile_offset_sa
so it makes sense to have a single helper to do both. If the caller
doesn't want the intratile offsets, it can simply pass NULL and ISL will
assert that they are 0.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The only surface layout for which slice0 makes any sense is GEN4_2D.
Move all of the slice0 stuff into isl_calc_phys_total_extent_el_gen4_2d
and make the others trivially return the total size in surface elements.
As a side-effect, array_pitch_el_rows is now returned from these helpers
as well.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We've already implicitly been using a physical total size in surface
elements. This just centralizes things a bit.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Over 90% of the function only applies to ISL_DIM_LAYOUT_GEN4_2D anyway
so we can just handle the other two as special cases at the top. The
two "generic" cases below the switch only apply on gen9 and above and
only to 3D or CCS surfaces. This implies that they only apply to
surfaces with ISL_DIM_LAYOUT_GEN4_2D. Making them look generic is a
lie.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We were only using it for validating that we don't use Ys/Yf on gen8 and
earlier. Removing it from isl_tiling_get_info lets us remove it from a
bunch of other things that had no business needing a hardware
generation.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Gen4 cube maps are a 2-D surface with ISL_DIM_LAYOUT_GEN4_3D which is a
bit weird but accurate none the less.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
On Iron Lake, the packets exist but we never emit them so there's no
need for us to ask the driver to make batch space for them.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This gets rid of one piece of ugliness with the way ISL handles surface
emitting surface states. I've never liked that hand-rolled table but it
was the best we had at the time.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The caller does so by setting the new field
isl_surf_init_info::row_pitch.
v2: Validate the requested row_pitch.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Validate that isl_surf::row_pitch fits in the below bitfields,
if applicable based on isl_surf::usage.
RENDER_SURFACE_STATE::SurfacePitch
RENDER_SURFACE_STATE::AuxiliarySurfacePitch
3DSTATE_DEPTH_BUFFER::SurfacePitch
3DSTATE_HIER_DEPTH_BUFFER::SurfacePitch
v2:
-Add a Makefile dependency on generated header genX_bits.h.
v3:
- Test ISL_SURF_USAGE_STORAGE_BIT too. [for jekstrand]
- Drop explicity dependency on generated header. [for emil]
v4:
- Rebase for new gen_bits_header.py script.
- Replace gen_10x with gen_device_info*.
v5:
- Drop FINISHME for validation of GEN9 1D row pitch. [for jekstrand]
- Reformat bit tests. [for jekstrand]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)
The calculations of row_pitch, the row pitch's alignment, surface size,
and base_alignment were mixed together. This patch moves the calculation
of row_pitch and its alignment to occur before the calculation of
surface_size and base_alignment.
This simplifies a follow-on patch that adds a new member, 'row_pitch',
to struct isl_surf_init_info.
v2:
- Also extract the row pitch alignment.
- More helper functions that will later help validate the row pitch.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
isl has a giant comment that explains the hardware's padding
requirements. (Hint: Cache lines and page faults). But the comment is in
the wrong place, in isl_calc_linear_row_pitch(), which is unrelated to
padding.
The important parts of that comment were copied to
isl_apply_surface_padding() long ago. So drop the misplaced comment.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The PRMs state that this packet is 16 DWORDS long. Ensure that the last
three DWORDS are zeroed as required by the hardware when allocating a
null surface state.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
v2: Instead of having the same block in isl_gen7,8,9.c add it
once into isl.c::isl_choose_image_alignment_el() instead.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
v3 (Jason Ekstrand): Add a comment explaining why
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
The isl_surf_init call that each of these helpers make can, in theory,
fail. We should propagate that up to the caller rather than just
silently ignoring it.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>