Commit graph

15202 commits

Author SHA1 Message Date
Jason Ekstrand
0176c6a692 intel/isl: Allow non-2D HiZ surfaces
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
4e397c6c75 intel/isl: Add a detailed comment about multisampling with HiZ
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
c3bd711411 intel/isl: Remove tiling checks from choose_msaa_layout
We already do those checks in filter_tiling.  There's no good reason to
repeat them in choose_msaa_layout.  If anything they should have been
asserts and not "return false" checks.  Also, this check was causing us to
outright reject multisampled HiZ surfaces which wasn't intended.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
69d3bb9915 intel/isl: Handle HiZ and CCS tiling more directly
The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces
respectively.  There's no reason why we should go through filter_tiling and
it's much easier to always get HiZ and CCS right if we just handle them
directly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
b1311a48e0 intel/isl: Allow multisampling with ISL_FORMAT_HiZ
HiZ buffers can be multisampled and, on Broadwell and earlier, simply using
interleaved multisampling with a compression block size of 8x4 samples
yields the correct HiZ surface size calculations.  Unfortunately,
choose_msaa_layout was rejecting multisampled HiZ buffers because of format
checks.  Now that we have a simple helper for determining if a format
supports multisampling, that's an easy enough issue to fix.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
baade41a5c intel/isl: Allow creation of 1-D compressed textures
Compressed 1-D textures are not well-defined thing in either GL or Vulkan.
However, auxiliary surfaces are treated as compressed textures in ISL and
we can do HiZ and CCS with 1-D so we need to be able to create them.  In
order to prevent actually using them (the docs say no), we assert in the
state setup code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
f82166578f intel/isl: Fix up asserts in calc_phys_level0_extent_sa
The assertion that a format is uncompressed in the multisample layouts
isn't quite right.  What we really want to assert is that the format
supports multisampling which is a bit more complicated query.  We also want
to assert that it has a block size of 1x1 since we do nothing with the
block size in the phys_level0_sa assignment.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Jason Ekstrand
5637f3f120 intel/isl: Add a format_supports_multisampling helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-10-03 14:53:01 -07:00
Ville Syrjälä
2fef0d108a anv/formats: Fix build on gcc-4 and earlier
gcc-4 and earlier don't allow compound literals where a constant
is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
when populating the anv_formats[] array. There are a few ways around
it: First one would be -std=c89/gnu89, but the rest of the code
depends on c99 so it's not really an option. The second option
would be to upgrade to gcc-5+ where the compiler behaviour was relaxed
a bit [1]. And the third option is just to avoid using compound
literals. I chose the last option since it keeps gcc-4 and earlier
working.

[1] https://gcc.gnu.org/gcc-5/porting_to.html

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 7ddb21708c ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-03 15:45:28 +03:00
Timothy Arceri
eaf147cb46 i965: rename max_ds_* variable to max_tes_*
Using consistent naming allows us to create macros more easily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-03 15:29:58 +11:00
Timothy Arceri
b67633ce5e i965: rename max_hs_* variables to max_tcs_*
Using consistent naming allows us to create macros more easily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-03 15:29:51 +11:00
Sirisha Gandikota
8e3e9d74b5 aubinator: Fix the decoding of values that span two Dwords
Fixed the way the values that span two Dwords are decoded.
Based on the start and end indices of the field, the Dwords
are fetched and decoded accordingly.

v2: rename dw to qw in gen_field_iterator_next
and remove extra white space (Anuj)

v3: change all instances of dw to qw (Anuj)

Earlier, 64-bit fields (such as most pointers on Gen8+)
weren't decoded correctly.  gen_field_iterator_next seemed
to walk one DWord at a time, sets v.dw, and then passes it
to field(). So, even though field() takes a uint64_t, we're
passing it a uint32_t (which gets promoted, so the top 32
bits will always be zero). This seems pretty bogus... (Ken)

Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-26 11:18:52 -07:00
Nayan Deshmukh
b3827819aa aubinator: fix resource leak
CovID: 1373370

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-25 12:32:48 -07:00
Nicolas Koch
f17948a30a anv: Check for VK_WHOLE_SIZE in anv_CmdFillBuffer
From the Vulkan spec:

   Size is the number of bytes to fill, and must be either a multiple of 4,
   or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer.
   If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a
   multiple of 4, then the nearest smaller multiple is used.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-23 00:20:16 -07:00
Lionel Landwerlin
6b21728c4a anv: get rid of duplicated values from gen_device_info
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-23 10:12:06 +03:00
Lionel Landwerlin
bc24590f0c intel/i965: make gen_device_info mutable
Make gen_device_info a mutable structure so we can update the fields that
can be refined by querying the kernel (like subslices and EU numbers).

This patch does not make any functional change, it just makes
gen_get_device_info() fill a structure rather than returning a const
pointer.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-23 10:11:59 +03:00
Lionel Landwerlin
b8162d6b6e anv: pipeline: use correct number of thread for compute
Reproduces this commit :

commit 0fb85ac08d
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Mon Jun 6 21:37:34 2016 -0700

    i965: Use the correct number of threads for compute shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-21 12:01:06 +03:00
Lionel Landwerlin
f2d43b44d7 anv: allocator: correct scratch space for haswell
This reproduces this commit :

commit 2213ffdb4b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Mon Jun 6 21:37:34 2016 -0700

    i965: Allocate scratch space for the maximum number of compute threads.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-21 12:01:06 +03:00
Lionel Landwerlin
09394ee6cf anv: device: calculate compute thread numbers using subslices numbers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-21 12:01:06 +03:00
Lionel Landwerlin
792d77165b aubinator: add a custom handler for immediate register load
Transforming this :

0x00c77084:  0x11000001:  MI_LOAD_REGISTER_IMM
0x00c77088:  0x0000b020 : Dword 1
    Register Offset: 0x0000b020
    0x00c7708c:  0x00880038 : Dword 2
    Data DWord: 8912952

Into this:

0x007880f0:  0x11000001:  MI_LOAD_REGISTER_IMM
0x007880f4:  0x0000b020 : Dword 1
    Register Offset: 0x0000b020
    0x007880f8:  0x00080040 : Dword 2
    Data DWord: 524352
register L3CNTLREG2 (0xb020) : 0x80040
    SLM Enable: 0
    URB Allocation: 32
    URB Low Bandwidth: 0
    RO Allocation: 32
    RO Low Bandwidth: 0
    DC Allocation: 0
    DC Low Bandwidth: 0

v2: Drop unused arguments (Sirisha)
    Print out register name

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2016-09-20 10:47:21 +01:00
Kenneth Graunke
081f21f29b isl: Finish tiling filtering for Gen6.
Gen6 only has one additional restriction over Gen7+, so we just add it
to the existing gen7 function (which actually covers later gens too).

This should stop FINISHME spew when running GL on Sandybridge.

v2: Fix bytes per block vs. bits per block confusion (Jason) and
    rename function to gen6_filter_tiling (Jason and Chad).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-15 21:21:50 -07:00
Jason Ekstrand
ed65e6ef49 nir: Add a flag to lower_io to force "sample" interpolation
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-15 13:31:43 -07:00
Jason Ekstrand
89a96c8f43 anv/cmd_buffer: Set the L3 atomic disable mask bit in CHICKEN3 on HSW
Without this bit set, the value in "L3 Atomic Disable" won't get applied by
the hardware so we won't properly get L3 atomic caching.

Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of
the dEQP-VK.image.atomic_operations.* tests on HSW

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-09-14 17:53:16 -07:00
Jason Ekstrand
a814e18c96 intel/blorp: Stop setting 3DSTATE_DRAWING_RECTANGLE
The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT
at the GPU initialization time and never sets it again.  The GL driver sets
it every time the framebuffer changes.  Originally, blorp set it to the
size of the drawing area but meant we had to set it back in the Vulkan
driver.  Instead, we can easily just do that in the GL driver's blorp_exec
implementation and not set it in blorp core.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 17:51:16 -07:00
Jason Ekstrand
b56f509ee0 intel/blorp: Emit 3DSTATE_MULTISAMPLE directly
Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE.  However,
now that Vulkan and GL use the same sample positions, we can set up
3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 17:51:16 -07:00
Jason Ekstrand
c779ad3e66 intel: Move Vulkan sample positions to common code
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 17:51:16 -07:00
Sirisha Gandikota
aa7b410592 aubinator: Remove bogus "end" parameter in gen_disasm_disassemble()
Earlier, the loop pretends to loop over instructions from "start" to "end",
but the callers always pass 8192 for end, which is some huge bogus
value. The real loop termination condition is send-with-EOT or 0. (Ken)

Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-13 16:32:42 -07:00
Sirisha Gandikota
1ab92d80a8 aubinator: Make gen_disasm_disassemble handle split sends
Skylake adds new SENDS and SENDSC opcodes, which should be
handled in the send-with-EOT check. Make an is_send() helper
that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken)

v2: Make is_send() much more crispier, Mix declaration and
code to make the code compact (Ken)

Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-13 16:32:39 -07:00
Sirisha Gandikota
5d2440532f aubinator: Simplify print_dword_val() method
Remove the float/dword union and use the iter->p[f->start / 32]
directly as printf formatter %08x expects uint32_t (Ken)

v2: Make the cleanup much more crispier (Ken)

Signed-off-by: Sirisha Gandikota <Sirisha.Gandikota@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-13 16:32:24 -07:00
Jason Ekstrand
1eebb60917 anv/image: Set correct base_array_layer and array_len for storage images
Since Vulkan doesn't allow single-slice 3D storage images, we need to just
set the base_array_layer and array_len to the full size of the 3-D LOD.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-13 14:45:49 -07:00
Jason Ekstrand
106709db7b Revert "intel/isl: Ignore base_array_layer and array_len for 3D storage..."
This reverts commit 3943888c94.  It turns out
that commit was pretty-much bogus since it breaks binding a 3-D texture as a
2-D storage image.  The correct fix for the Vulkan CTS tests needs to be in
the Vulkan driver itself rather than ISL.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-13 14:45:15 -07:00
Jason Ekstrand
330104464f anv: Use blorp for doing MSAA resolves
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-13 12:40:13 -07:00
Jason Ekstrand
6bcb1f753e anv: Use blorp for ClearColorImage
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-13 12:40:13 -07:00
Jason Ekstrand
57e87862eb anv: Delete meta_blit2d
Everything that we were once using the blit2d framework for is now done
with blorp.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-13 12:40:13 -07:00
Jason Ekstrand
36286ccb96 anv/blorp: Add a gcd_pow2_u64 helper and use it for buffer alignments
This is a lot cleaner and easier to read than the old piles of if
statements.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-09-13 12:40:13 -07:00
Jason Ekstrand
af5d30de55 anv: Use blorp for CopyBuffer and UpdateBuffer
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-09-13 12:40:13 -07:00
Jason Ekstrand
0f1ca5407a anv: Use blorp for CopyImage
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
58593f24cb anv: Use blorp for CopyBufferToImage
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
f07f44a5bc anv: Use blorp for CopyImageToBuffer
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
9f44745eca anv: Use blorp to implement VkBlitImage
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
52fa3e8347 anv: Make image_get_surface_for_aspect_mask const
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
8f780af968 anv: Add initial blorp support
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
1fe8bf82b2 intel/anv: Use #defines for all __gen_ helpers
This allows us to #undef them later if we don't want them to persist

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
4a6c9e20b8 anv: Generalize emit_urb_setup
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
8cb144bd93 anv/pipeline: Roll compute_urb_partition into emit_urb_setup
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
823ab83432 intel/blorp: Use #defines for all __gen_ helpers
This allows us to #undef them later if we don't want them to persist

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
c0b9776cd6 intel/isl: Divide QPitch by 2 for 3-D stencil textures on SKL+
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
00e79cec99 isl/state: Don't set QPitch for GEN4_3D surfaces
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
cb780c9ccf intel/blorp: Rework alloc_binding_table
The original blorp_alloc_binding_table helper was supposed to return the
binding table offset and map along with the surface state maps.  This isn't
quite what we want, however.  What we really want is the binding table
offsets, surface state offsets, and surface state maps.  In the GL driver,
the binding table map *is* an array of surface state offsets.  However, in
Vulkan, this isn't quite true as the entries in the binding table are
surface state offsets combined with another binding table block offset.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:11 -07:00
Jason Ekstrand
6ac469a6c3 anv/allocator: Use VG_NOACCESS_WRITE in anv_bo_pool_free
Previously, we were relying on the fact that VALGRIND_MEMPOOL_FREE came
later on in the function to prevent "link->bo = bo" from causing an invalid
write.  However, in the case where the size requested by the user is very
small (less than sizeof(struct anv_bo)), this isn't sufficient.  Instead,
we should call VALGRIND_MEMPOOL_FREE early and then use VG_NOACCESS_WRITE.
We do, however, have to call VALGRIND_MEMPOOL_FREE after reading bo_in
because it may be stored in the bo itself.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-13 10:44:03 -07:00