Commit graph

2683 commits

Author SHA1 Message Date
Anuj Phogat
a86c0a08df anv/icl: Don't use DISPATCH_MODE_SIMD4X2
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
cd5fc634a8 anv/icl: Don't use SingleVertexDispatch
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
6e3940b3cf anv/icl: Don't set ResetGatewayTimer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:32 -08:00
Anuj Phogat
41a4c2c8e8 anv/icl: Add #define genX
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Anuj Phogat
413d475b44 anv/icl: Add gen11 mocs defines
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-16 11:10:31 -08:00
Anuj Phogat
ba3cbee6c5 intel/common/icl: Add has_sample_with_hiz flag in gen_device_info
Sampling from hiz is enabled in i965 for GEN9+ but this feature has
been removed from gen11. So, this new flag will be useful to turn
the feature on/off for different gen h/w. It will be used later
in a patch adding device info for gen11.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
9c144dc81e i965/icl: Add assertions to check dispatch mode is SIMD8
SIMD4x2 dispatch mode has been removed in GEN11. We're not using
it anyways in Mesa. Adding few asserts to make it explicit.

Use GEN_GEN macro in place of devinfo->gen (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
e9ad5c9a5d i965/icl: Update the comment for maximum number of threads per PSD
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:56 -08:00
Anuj Phogat
772a75be46 intel/icl: Do StateCacheInvalidation for indirect clear color
StateCacheInvalidation is required on all gen7+ platforms. We
don't need to update this check for every new gen h/w unless
this requirement is changed. So, dropping the check for latest
gen h/w.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
bff24e2173 intel/isl/icl: Build and use gen11 surface state emit functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
0427bd4954 intel/isl/icl: Add the maximum surface size limit
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
c68ede0be7 intel/genxml/icl: Update genx_bits header
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-15 16:14:55 -08:00
Anuj Phogat
165a68b05a intel/genxml/icl: Generate packing headers
Move build system changes in to one patch (Ken, Emil)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-02-15 16:14:55 -08:00
Anuj Phogat
7ed27d8cbf intel/genxml/icl: Add gen11.xml
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-15 16:14:55 -08:00
Dylan Baker
7d0e342af2 meson: add convenience variable for anv_extensions.py depdendency
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:07 -08:00
Dylan Baker
0e617c04f1 meson: use depend_files for adding extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: dd088d4bec ("anv/extensions: Generate a header file with extension tables")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:46:04 -08:00
Dylan Baker
b03969a5ad meson: use depend_files to track extra file dependencies
cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: f939940809 ("anv: Split anv_extensions.py into two files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:56 -08:00
Dylan Baker
384bff13e0 Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py"
This reverts commit 10d1b0be8e.

This is unnecessary, the depend_files argument is for adding
dependencies on files that are not part of the input, which is already
done.

cc: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 10d1b0be8e
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-15 09:45:40 -08:00
Anuj Phogat
0cd37f9178 isl: Don't use surface format R32_FLOAT for typed atomic integer operations
From Skylake PRM Surface Formats section:

   "The surface format for the typed atomic integer operations must
    be R32_UINT or R32_SINT."

Fixes an error and a piglit GPU hang in simulation environment.
Piglit test: gl45-imageAtomicExchange-float.shader_test

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.co
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>
2018-02-14 16:30:05 -08:00
Jason Ekstrand
8534af44e4 intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 13:17:26 -08:00
Rafael Antognolli
fcae3d1a9a anv/gen10: Remove warning message.
Gen10 seems pretty stable so far, remove "alpha support" message.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "18.0" mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-02-14 10:11:01 -08:00
Iago Toral Quiroga
cb9dbd6dec i965/compiler: clean up nir_intrinsic_load_input for vertex shaders
This code to re-set the type of the source and destination is not
necessary since we never manipulate the types. Looks like a
left over from a time where we had to retype to float temporarily
to handle 64-bit inputs.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Iago Toral Quiroga
4917d38321 intel/compiler: fix first_component for 64-bit types on vertex inputs
Divide it by two as we do for other stages. This is because the
component layout qualifier is always in 32-bit units.

Fixes issues in a new CTS test (still WIP):
KHR-GL45.enhanced_layouts.varying_double_components

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-02-14 12:00:14 +01:00
Jason Ekstrand
4c77e21c81 anv: Move setting current_pipeline to cmd_state_init
We were setting current_pipeline to UINT32_MAX and then calling
cmd_cmd_state_reset which memsets the entire state struct to 0 which
implicitly resets current_pipeline to 3D.  I have no idea how this
hasn't caused everything to explode.

Fixes: cd3feea745 "anv/cmd_buffer: Rework anv_cmd_state_reset"
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-12 15:18:23 -08:00
Jason Ekstrand
f37bd726c7 anv: Don't resolve or ambiguate non-existent layers
The previous code was trying to avoid non-existent layers by taking a
MAX with anv_image_aux_layers.  Unfortunately, it wasn't taking into
account that layer_count starts at base_layer which may not be zero.
Instead, we need to subtract base_layer from anv_image_aux_layers with
a guard against roll-over.

Fixes: de3be61801 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-12 15:14:57 -08:00
Kenneth Graunke
bd87bd178c anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.
The kernel used to have execbuf parameters to program the INSTPM bit
for whether 3DSTATE_CONSTANT_* should be relative to dynamic state
base address or an absolute address.  However, they never worked in
the presence of hardware contexts, so I deleted them a while back.

It doesn't make sense to set this flag, as it doesn't exist anymore.
It also never did anything anyway - the flag is zero, so |'ing it in
did nothing.  The default is relative anyway.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-02-12 07:00:41 -08:00
Grazvydas Ignotas
9b9a89cd79 intel/compiler: fix 64bit value prints on 32bit
Fix the following:
warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but
argument 3 has type ‘uint64_t {aka long long unsigned int}.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-02-10 17:59:02 +02:00
Jason Ekstrand
8f20cf166e intel/blorp: Use isl_aux_op instead of blorp_hiz_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1e941a0528 intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1810f965c8 anv: Allow fast-clearing the first slice of a multi-slice image
Now that we're tracking aux properly per-slice, we can enable this for
applications which actually care.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
de3be61801 anv/cmd_buffer: Rework aux tracking
This commit completely reworks aux tracking.  This includes a number of
somewhat distinct changes:

 1) Since we are no longer fast-clearing multiple slices, we only need
    to track one fast clear color and one fast clear type.

 2) We store two bits for fast clear instead of one to let us
    distinguish between zero and non-zero fast clear colors.  This is
    needed so that we can do full resolves when transitioning to
    PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear
    values in all sorts of places we wouldn't normally.

 3) We now track compression state as a boolean separate from fast clear
    type and this is tracked on a per-slice granularity.

The previous scheme had some issues when it came to individual slices of
a multi-LOD images.  In particular, we only tracked "needs resolve"
per-LOD but you could do a vkCmdPipelineBarrier that would only resolve
a portion of the image and would set "needs resolve" to false anyway.
Also, any transition from an undefined layout would reset the clear
color for the entire LOD regardless of whether or not there was some
clear color on some other slice.

As far as full/partial resolves go, he assumptions of the previous
scheme held because the one case where we do need a full resolve when
CCS_E is enabled is for window-system images.  Since we only ever
allowed X-tiled window-system images, CCS was entirely disabled on gen9+
and we never got CCS_E.  With the advent of Y-tiled window-system
buffers, we now need to properly support doing a full resolve of images
marked CCS_E.

v2 (Jason Ekstrand):
 - Fix an bug in the compressed flag offset calculation
 - Treat 3D images as multi-slice for the purposes of resolve tracking

v3 (Jason Ekstrand):
 - Set the compressed flag whenever we fast-clear
 - Simplify the resolve predicate computation logic

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2cbfcb205e anv/cmd_buffer: Move the mi_alu helper higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
2e69045c4d anv/image: Simplify some verbose commennts
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
f0523f70ef anv: Use blorp_ccs_ambiguate instead of fast-clears
Even though the blorp pass looks a bit on the sketchy side, the end
result in the Vulkan driver is very nice.  Instead of having this weird
case where you do a fast clear and then maybe have to resolve, we just
do the ambiguate and are done with it.  The ambiguate does exactly what
we want of setting all the CCS values to 0 which puts it into the
pass-through state.

This should also improve performance a bit in certain cases.  For
instance, if we did a transition from UNDEFINED to GENERAL for a surface
that doesn't have CCS enabled all the time, we would end up doing a
fast-clear and then a full resolve which ends up touching every byte in
the main surface as well as the CCS.  With the ambiguate pass, that
transition only touches the CCS.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
84fd2ebfbc anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
3ef8c4b2f5 anv/cmd_buffer: Pull the undefined layout condition into the if
Now that this isn't a multi-case if and it's just the one case, it's a
bit clearer if the condition is just part of the if instead of being
pulled out into a boolean variable.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
857b5b5a7f intel/blorp: Add a CCS ambiguation pass
This pass performs an "ambiguate" operation on a CCS-compressed surface
by manually writing zeros into the CCS.  On gen8+, ISL gives us a fairly
detailed notion of how the CCS is laid out so this is fairly simple to
do.  On gen7, the CCS tiling is quite crazy but that isn't an issue
because we can only do CCS on single-slice images so we can just blast
over the entire CCS buffer if we want to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
13b621d6fd anv: Only fast clear single-slice images
The current strategy we use for managing resolves has an issues where we
track clear colors and the need for resolves per-LOD but we still allow
resolves of only a subset of the slices in any given LOD and doing so
sets the "needs resolve" flag for that LOD to false while leaving the
remaining layers unresolved.  This patch is only the first step and does
not, by itself fix anything.  However, it's fairly self-contained and
splitting it out means any performance regressions should bisect to this
nice obvious commit rather than to the giant "rework aux tracking"
commit.

Nanley and I did some testing and none of the applications we tested
even tried to fast-clear anything other than the first slice of an
image.  The test was done by adding a printf right before we call
blorp_fast_clear if we were every going to touch any slice other than
the first with a fast-clear.  Due to the way the original code was
structured, this would not have included applications which only cleared
a subset of layers.  The applications tested were:

 * All Sascha Willems demos
 * Aztec Ruins
 * Dota 2
 * The Talos Principle
 * Mad Max
 * Warhammer 40,000: Dawn of War III
 * Serious Sam Fusion 2017: BFE

While not the full list of shipping applications, it's a pretty good
spread and covers most of the engines we've seen running on our driver.
If this is ever shown to be a performance problem in the future, we can
reconsider our strategy.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
571ed588ac anv/cmd_buffer: Add a mark_image_written helper
Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline.  This will allow us to
properly mark the aux state so that we can handle resolves correctly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
9876d6f0ef anv/blorp: Add src/dst_level helper variables in CmdCopyImage
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
c180c2c868 anv/cmd_buffer: Add an anv_genX_call macro
This is copied and pasted from the similar macro we added to ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
ab7543b13d anv/cmd_buffer: Generalize transition_color_buffer
This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts.  This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.

There is a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time.  When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier.  The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.  This patch doesn't actually fix
that bug yet but it takes us the first step in that direction by making
us actually pick the correct resolve op.  In order to handle all of the
cases, we need more detailed aux tracking.

v2 (Jason Ekstrand):
 - Make a few more things const
 - Use the anv_fast_clear_support enum

v3 (Jason Ekstrand):
 - Move an assert and add a better comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
151771b390 anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
bea7373c92 anv/image: Support color aspects in layout_to_aux_usage
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
b09464db42 anv/image: Add a helper for determining when fast clears are supported
v2 (Jason Ekstrand):
 - Return an enum instead of a boolean

v3 (Jason Ekstrand):
 - Return ANV_FAST_CLEAR_NONE instead of false (Topi)
 - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE
 - Add documentation for the enum values

v4 (Jason Ekstrand):
 - Remove a dead comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1f7eee6bc1 anv/image: Update a comment
This got lost in all of the aspect vs. plane rebasing of YCBCR.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
5c38ab8f07 anv/blorp: Rework HiZ ops to look like MCS and CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
1d473e26f2 anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
42f1668a54 anv/blorp: Rework image clear/resolve helpers
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS.  This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00
Jason Ekstrand
482c24783e intel/isl: Codify AUX operations in an enum
Right now, we have different entrypoints and enums in blorp for these
different operations.  This provides us a central enum which we can
begin to transition to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-08 16:35:31 -08:00