3D surfaces in Skylake are stored with ISL_DIM_LAYOUT_GEN4_2D. Any
delta in the logical z offset causes an equivalent delta in the
surface's array layer.
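
A minimal sketch of that mapping, for illustration only. The struct and
function names are hypothetical and are not part of isl; the sketch assumes
one array layer per logical z slice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an image position; not an isl type. */
struct image_pos {
   uint32_t array_layer;
   uint32_t z_offset_px;
};

/* For a 3D surface stored with a 2D-style layout, fold the logical z
 * offset into the array layer so that no residual z offset remains. */
static struct image_pos
fold_z_into_layer(struct image_pos pos)
{
   /* Guard against wrap-around of the 32-bit layer index. */
   assert(pos.z_offset_px <= UINT32_MAX - pos.array_layer);

   struct image_pos out = { pos.array_layer + pos.z_offset_px, 0 };
   return out;
}
```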
Test isl_surf_get_image_intratile_offset_el() in the tests:
test_bdw_2d_r8g8b8a8_unorm_512x512_array01_samples01_noaux_tiley0
test_bdw_2d_r8g8b8a8_unorm_1024x1024_array06_samples01_noaux_tiley0
When calculating row pitch, the row's width in samples must be divided
by the format's block width. The commit below accidentally removed the
division.
commit eea2d4d059
Author: Chad Versace <chad.versace@intel.com>
Date: Tue Jan 5 14:28:28 2016 -0800
Subject: isl: Don't align phys_slice0_sa.width twice
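
The division can be sketched roughly as below. The helper name and its
parameters are hypothetical, not the isl code itself, and rounding up to a
whole block is an assumption made here so that unaligned widths still work.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: row pitch in bytes for a row that is width_sa
 * samples wide, where each format block covers block_width samples
 * horizontally and occupies block_bytes bytes. */
static uint32_t
row_pitch_bytes(uint32_t width_sa, uint32_t block_width, uint32_t block_bytes)
{
   assert(block_width > 0);

   /* Convert the width from samples to blocks. Rounding up (an
    * assumption of this sketch) keeps a partial block as a whole one. */
   uint32_t width_el = (width_sa + block_width - 1) / block_width;

   return width_el * block_bytes;
}
```

For an uncompressed format the block width is 1 and the division is a no-op,
which is why dropping it only shows up with block-compressed formats.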
The if/then/else block was bogus, as it can only take a scalar
condition, and we need to select component-wise. The GLSL IR
implementation of atan2 handles this by looping over components,
but I decided to try and do it vector-wise, and messed up.
For now, just bcsel. It means that we do the atan1 math even if
all components hit the quick case, but it works, and presumably
at least one component will hit the expensive path anyway.
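
As a rough illustration of what a component-wise select does, in plain C
rather than NIR builder code. The function name is hypothetical. Both input
vectors are fully computed before the select, which mirrors the cost noted
above.

```c
#include <assert.h>
#include <stddef.h>

/* Component-wise select in the spirit of bcsel: for each component,
 * take a[i] when cond[i] is nonzero and b[i] otherwise. A scalar
 * if/else could only pick one whole vector. */
static void
select_per_component(const int *cond, const float *a, const float *b,
                     float *out, size_t n)
{
   assert(cond && a && b && out);

   for (size_t i = 0; i < n; i++)
      out[i] = cond[i] ? a[i] : b[i];
}
```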
We were botching this for negative numbers - floor of a negative rounds
the wrong way. Additionally, both results are supposed to retain the
sign of the original.
To fix this, just take the abs of both values, then put the sign back.
There's probably a better way to do this, but this works for now.
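
A sketch of the abs-then-restore-sign approach in plain C, assuming the two
results are the whole and fractional parts of the input. The names are
hypothetical and this is not the actual lowering code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result pair: whole and fractional parts of a value. */
struct split_parts {
   float whole;
   float frac;
};

/* Split x so that both parts keep the sign of x. Working on |x| avoids
 * the problem that floor(-2.5) is -3 rather than -2. */
static struct split_parts
split_keep_sign(float x)
{
   float a = x < 0.0f ? -x : x;

   /* Keep the value in int32 range so the cast below is well defined. */
   assert(a < 2147483648.0f);

   /* For a non-negative value, truncation is the same as floor. */
   float whole = (float)(int32_t)a;
   float frac = a - whole;

   /* Put the original sign back on both results. */
   struct split_parts r = {
      x < 0.0f ? -whole : whole,
      x < 0.0f ? -frac : frac,
   };
   return r;
}
```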
The table has this marked as unsupported on all gens, but I don't really
believe that given how early it is in the table. I've tested and it seems
to work on Broadwell. The Bspec says that it should be renderable on SKL+
but alpha blending is questionable.
Side note: We really need to audit the format table again.
The X component of the offset is set to the layer index times layer
height, which is obviously bogus. Return the vertical offset of the
slice as the Y component instead. Fixes a few image load/store tests that
use 1D arrays on SKL when forcing it to fall back to untyped reads and
writes.
It's mostly the same and contains some non-trivial logic, so it really
should be shared. Also, we're about to make some modifications here that
we would really like to share.
For unspills (scratch reads), we can just set WE_all all the time because
we always unspill into a new GRF. For spills, we have two options: If the
instruction has a 32-bit-per-channel destination and "normal" regioning,
then we just do a regular write and it will interleave channels from
different control-flow paths properly. If, on the other hand, the
regioning is non-normal, then we have to unspill, run the instruction, and
spill afterwards. In this second case, we need to do the spill with
WE_all.
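
The spill-side decision can be summarized by a sketch like the one below. The
enum and function are hypothetical, not the actual register allocator code.
Sending a destination that is not 32 bits per channel down the second path is
an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the two spill options described above. */
enum spill_strategy {
   SPILL_PLAIN_WRITE,              /* regular write, channels interleave */
   SPILL_UNSPILL_RUN_SPILL_WE_ALL, /* unspill, run, then spill with WE_all */
};

static enum spill_strategy
choose_spill_strategy(unsigned dst_bits_per_channel, bool normal_regioning)
{
   assert(dst_bits_per_channel > 0);

   /* A 32-bit-per-channel destination with normal regioning can use a
    * regular write. */
   if (dst_bits_per_channel == 32 && normal_regioning)
      return SPILL_PLAIN_WRITE;

   /* Everything else takes the unspill/run/spill route (assumed here
    * to also cover destinations that are not 32 bits per channel). */
   return SPILL_UNSPILL_RUN_SPILL_WE_ALL;
}
```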