There should be no functional change here because Broxton and CHV are
both gt1. Without this code however, it might seem like broxton support
is missing.
While here, put the gt1 check in front to hopefully short-circuit the
condition for the mobile cases.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
From the less man page:
"Warning: when the -r option is used, less cannot keep track of the
actual appearance of the screen (since this depends on how the
screen responds to each type of control character). Thus, various
display problems may result, such as long lines being split in the
wrong place."
Lines which are too long to fit in the terminal would be word wrapped,
but unfortunately less would get confused about which line it was on,
and text would be drawn on top of other text. The most noticable case
was shader assembly, which is frequently too wide for an 80 character
terminal, and thus would be drawn on top of the following state packets,
making them completely unreadable.
Using -R instead of -r fixes this problem by only allowing color escape
sequences. (Notably, Git's implicit pager invocation uses -R.)
Unfortunately, it means our "clear to the end of the line" hack for
extending the blue bar headers won't work anymore.
Word wrapping usually isn't terribly readable, anyway, so we also add
the -S option (chop long lines) to restrict it to the terminal width.
(You can hit the left and right arrow keys to scroll sideways.)
Then, for a new blue bar hack, we can use a printf specifier to pad
the command packet names to be 80 characters long (arbitrarily), which
extends them "far enough" to look good, and doesn't require us to use
ioctls to determine the terminal width.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sirisha Gandikota <sirisha.gandikota@intel.com>
(warning: 'const' type qualifier on return type has no effect)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
We already do those checks in filter_tiling. There's no good reason to
repeat them in choose_msaa_layout. If anything they should have been
asserts and not "return false" checks. Also, this check was causing us to
outright reject multisampled HiZ surfaces which wasn't intended.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces
respectively. There's no reason why we should go through filter_tiling and
it's much easier to always get HiZ and CCS right if we just handle them
directly.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
HiZ buffers can be multisampled and, on Broadwell and earlier, simply using
interleaved multisampling with a compression block size of 8x4 samples
yields the correct HiZ surface size calculations. Unfortunately,
choose_msaa_layout was rejecting multisampled HiZ buffers because of format
checks. Now that we have a simple helper for determining if a format
supports multisampling, that's an easy enough issue to fix.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Compressed 1-D textures are not well-defined thing in either GL or Vulkan.
However, auxiliary surfaces are treated as compressed textures in ISL and
we can do HiZ and CCS with 1-D so we need to be able to create them. In
order to prevent actually using them (the docs say no), we assert in the
state setup code.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
The assertion that a format is uncompressed in the multisample layouts
isn't quite right. What we really want to assert is that the format
supports multisampling which is a bit more complicated query. We also want
to assert that it has a block size of 1x1 since we do nothing with the
block size in the phys_level0_sa assignment.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
gcc-4 and earlier don't allow compound literals where a constant
is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
when populating the anv_formats[] array. There are a few ways around
it: First one would be -std=c89/gnu89, but the rest of the code
depends on c99 so it's not really an option. The second option
would be to upgrade to gcc-5+ where the compiler behaviour was relaxed
a bit [1]. And the third option is just to avoid using compound
literals. I chose the last option since it keeps gcc-4 and earlier
working.
[1] https://gcc.gnu.org/gcc-5/porting_to.html
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 7ddb21708c ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixed the way the values that span two Dwords are decoded.
Based on the start and end indices of the field, the Dwords
are fetched and decoded accordingly.
v2: rename dw to qw in gen_field_iterator_next
and remove extra white space (Anuj)
v3: change all instances of dw to qw (Anuj)
Earlier, 64-bit fields (such as most pointers on Gen8+)
weren't decoded correctly. gen_field_iterator_next seemed
to walk one DWord at a time, sets v.dw, and then passes it
to field(). So, even though field() takes a uint64_t, we're
passing it a uint32_t (which gets promoted, so the top 32
bits will always be zero). This seems pretty bogus... (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
From the Vulkan spec:
Size is the number of bytes to fill, and must be either a multiple of 4,
or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer.
If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a
multiple of 4, then the nearest smaller multiple is used.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Make gen_device_info a mutable structure so we can update the fields that
can be refined by querying the kernel (like subslices and EU numbers).
This patch does not make any functional change, it just makes
gen_get_device_info() fill a structure rather than returning a const
pointer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reproduces this commit :
commit 0fb85ac08d
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Mon Jun 6 21:37:34 2016 -0700
i965: Use the correct number of threads for compute shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reproduces this commit :
commit 2213ffdb4b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Mon Jun 6 21:37:34 2016 -0700
i965: Allocate scratch space for the maximum number of compute threads.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Gen6 only has one additional restriction over Gen7+, so we just add it
to the existing gen7 function (which actually covers later gens too).
This should stop FINISHME spew when running GL on Sandybridge.
v2: Fix bytes per block vs. bits per block confusion (Jason) and
rename function to gen6_filter_tiling (Jason and Chad).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Without this bit set, the value in "L3 Atomic Disable" won't get applied by
the hardware so we won't properly get L3 atomic caching.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of
the dEQP-VK.image.atomic_operations.* tests on HSW
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT
at the GPU initialization time and never sets it again. The GL driver sets
it every time the framebuffer changes. Originally, blorp set it to the
size of the drawing area but meant we had to set it back in the Vulkan
driver. Instead, we can easily just do that in the GL driver's blorp_exec
implementation and not set it in blorp core.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However,
now that Vulkan and GL use the same sample positions, we can set up
3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Earlier, the loop pretends to loop over instructions from "start" to "end",
but the callers always pass 8192 for end, which is some huge bogus
value. The real loop termination condition is send-with-EOT or 0. (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Skylake adds new SENDS and SENDSC opcodes, which should be
handled in the send-with-EOT check. Make an is_send() helper
that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken)
v2: Make is_send() much more crispier, Mix declaration and
code to make the code compact (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Remove the float/dword union and use the iter->p[f->start / 32]
directly as printf formatter %08x expects uint32_t (Ken)
v2: Make the cleanup much more crispier (Ken)
Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Since Vulkan doesn't allow single-slice 3D storage images, we need to just
set the base_array_layer and array_len to the full size of the 3-D LOD.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
This reverts commit 3943888c94. It turns out
that commit was pretty-much bogus since it breaks binding a 3-D texture as a
2-D storage image. The correct fix for the Vulkan CTS tests needs to be in
the Vulkan driver itself rather than ISL.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Everything that we were once using the blit2d framework for is now done
with blorp.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This is a lot cleaner and easier to read than the old piles of if
statements.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>