5984 commits

Author SHA1 Message Date
Lionel Landwerlin
e3893ee204 intel/dump_gpu: add support for MMAP_OFFSET ioctl
Our driver started using this method to mmap the BOs and we need to
hook it to track the dirtiness of the BO.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7528>
2020-11-10 13:25:11 +00:00
Anuj Phogat
3c4e43e72b intel: Pointer to SCISSOR_RECT array should be 64B aligned
v2: Apply the workaround to all gen hardware

Ref: GEN:BUG:1409725701
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7463>
2020-11-09 21:29:04 +00:00
Jason Ekstrand
68092df8d8 intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:

  - Byte operations don't actually execute with a byte type.  The
    execution type for byte operations is actually word.  (I don't know
    if this has implications for the HW implementation.  Probably?)

  - Destinations are required to be strided out to at least the
    execution type size.  This means that B-type operations always have
    a stride of at least 2.  This wreaks havoc on the back-end in
    multiple ways.

  - Thanks to the strided destination, we don't actually save register
    space by storing things in bytes.  We could, in theory, interleave
    two byte values into a single 2B-strided register but that's both a
    pain for RA and would lead to piles of false dependencies pre-Gen12
    and on Gen12+, we'd need some significant improvements to the SWSB
    pass.

  - Also thanks to the strided destination, all byte writes are treated
    as partial writes by the back-end and we don't know how to copy-prop
    them.

  - On Gen11, they added a new hardware restriction that byte types
    aren't allowed in the 2nd and 3rd sources of instructions.  This
    means that we have to emit B->W conversions all over to resolve
    things.  If we emit said conversions in NIR, instead, there's a
    chance NIR can get rid of some of them for us.

We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us.  It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus.  There is still a bit we have to handle in the back-end.  In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.
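
The idea of doing 8-bit arithmetic at 16 bits can be sketched in plain C. This is only a model of the widen/operate/narrow pattern, not the NIR pass itself; the function names are hypothetical.

```c
#include <stdint.h>

/* Model of a lowered 8-bit unsigned add: widen both sources to 16 bits,
 * do the operation at 16 bits, then keep only the low 8 bits. */
static uint8_t iadd8_via_16(uint8_t a, uint8_t b)
{
   uint16_t wa = a;                       /* zero-extend to 16 bits */
   uint16_t wb = b;
   uint16_t sum = (uint16_t)(wa + wb);    /* 16-bit add */
   return (uint8_t)sum;                   /* narrow back to 8 bits */
}

/* Model of a lowered 8-bit signed max: signed ops need sign extension so
 * that the 16-bit comparison orders values the same way as the 8-bit one. */
static int8_t imax8_via_16(int8_t a, int8_t b)
{
   int16_t wa = a;                        /* sign-extend to 16 bits */
   int16_t wb = b;
   int16_t m = wa > wb ? wa : wb;         /* 16-bit compare and select */
   return (int8_t)m;                      /* narrow back to 8 bits */
}
```

The extra conversions are the cost mentioned above; whether they survive depends on what later optimization passes can fold away.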

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand
b98f0d3d7c intel/nir: Lower 8-bit scan/reduce ops to 16-bit
We can't really support these directly on any platform.  May as well let
NIR lower them.  The NIR lowering is potentially one more instruction
for scan/reduce ops thanks to not being able to do the B->W conversion
as part of SEL_EXEC.  For imax/imin exclusive scan, it's yet another
instruction thanks to the extra imax/imin NIR has to insert to deal with
the fact that the first live channel will contain the identity value
which, when signed, will cast wrong.  However, it does let us drop some
complexity from our back-end so it's probably worth it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand
3ad2d85995 intel/nir: Refactor lower_bit_size_callback
We want to use it for more than just ALU.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand
2c4b47184d nir/lower_bit_size: Pass a nir_instr to the callback
This way we can start supporting more than just ALU ops.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Iván Briano
0f96a9ab3b anv: restrict number of subgroups per group
We are limited to 64 threads per dispatched group, regardless of what
num_cs_threads claims, so advertise that limit correctly.

Fixes (on TGL and up):
dEQP-VK.subgroups.size_control.compute.required_subgroup_size_min
and other *.required_subgroup_size_min tests.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7453>
2020-11-05 10:43:06 -08:00
Marcin Ślusarz
44925a8a55 intel/tools: add missing new lines to few remaining fail_if users
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
2020-11-05 12:07:51 +00:00
Marcin Ślusarz
c323d7c2a7 intel/tools: refactor logging to be easier to follow by static analyzers
Refactor out the part of fail_if function that never returns into
NORETURN function and put the condition check outside.

Addresses many false positive warnings by Coverity.
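
A minimal sketch of this pattern, assuming GCC/Clang attribute syntax. The names fail, fail_if and checked_div are hypothetical and the real helpers in intel/tools may differ.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* The part that never returns lives in its own function, marked noreturn
 * so that static analyzers can see that control flow ends here. */
static void __attribute__((noreturn, format(printf, 1, 2)))
fail(const char *fmt, ...)
{
   va_list args;

   va_start(args, fmt);
   vfprintf(stderr, fmt, args);
   va_end(args);
   exit(EXIT_FAILURE);
}

/* The condition check stays at the call site, outside the noreturn helper. */
#define fail_if(cond, ...) \
   do { if (cond) fail(__VA_ARGS__); } while (0)

/* Example caller: an analyzer can now prove b != 0 at the division. */
static int checked_div(int a, int b)
{
   fail_if(b == 0, "division by zero\n");
   return a / b;
}
```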

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
2020-11-05 12:07:51 +00:00
Marcin Ślusarz
f0061277c0 intel/tools: handle some failures
Addresses "Dereference null return value" issues reported by Coverity.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
2020-11-05 12:07:51 +00:00
Marcin Ślusarz
cd9907e7d3 anv: remove dead code from anv_create_cmd_buffer
pool can't be NULL at this point, because it was already
dereferenced earlier.

Addresses "Dereference before null check" issue reported by Coverity.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
2020-11-05 12:07:51 +00:00
Marcin Ślusarz
d13b7d6591 intel/tools: allow --color option to be used without arg
There's already code handling that case and help text also says
it's possible.

Found, because Coverity complained about optarg NULL check,
suggesting optarg can be NULL for other options, where it's not
possible. IOW, a false positive led me to find an unrelated issue.
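
For illustration, an option whose argument may be omitted is declared with optional_argument in getopt_long, and that is the one case where optarg can legitimately be NULL. The sketch below is an assumption-laden model: parse_color and the WHEN values it accepts are hypothetical and not taken from the tool.

```c
#include <getopt.h>
#include <stddef.h>
#include <string.h>

enum color_mode { COLOR_NEVER = 0, COLOR_ALWAYS = 1, COLOR_AUTO = 2 };

/* Returns the selected mode, or -1 on a parse error. */
static int parse_color(int argc, char **argv)
{
   static const struct option opts[] = {
      { "color", optional_argument, NULL, 'c' },
      { NULL, 0, NULL, 0 }
   };
   int mode = COLOR_NEVER;
   int c;

   optind = 1; /* restart scanning so the function can be called again */
   while ((c = getopt_long(argc, argv, "", opts, NULL)) != -1) {
      switch (c) {
      case 'c':
         /* optarg is NULL when "--color" is given without "=WHEN". */
         if (optarg == NULL || strcmp(optarg, "auto") == 0)
            mode = COLOR_AUTO;
         else if (strcmp(optarg, "always") == 0)
            mode = COLOR_ALWAYS;
         else if (strcmp(optarg, "never") == 0)
            mode = COLOR_NEVER;
         else
            return -1;
         break;
      default:
         return -1;
      }
   }
   return mode;
}
```

With required_argument instead, "--color" alone is rejected by getopt_long before the NULL check is ever reached, which matches the symptom described above.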

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7449>
2020-11-05 12:07:51 +00:00
Caio Marcelo de Oliveira Filho
5d5f3e3a47 intel/fs: Implement nir_intrinsic_{load,store}_shared_block_intel
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
9fe158e1d1 intel/fs: Implement nir_intrinsic_{load,store}_ssbo_block_intel
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
d372abe397 intel/fs: Add surface OWORD BLOCK opcodes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
296137df53 intel/fs: Implement nir_intrinsic_{load,store}_global_block_intel
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
d3d2b73fa3 intel/fs: Add A64 OWORD BLOCK opcodes
Based on a patch for OWORD BLOCK READ from Jason Ekstrand.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448>
2020-11-04 20:24:48 +00:00
Jason Ekstrand
3cc58e6470 nir: Add and use some deref mode helpers
NIR derefs currently have exactly one variable mode.  This is about to
change so we can handle OpenCL generic pointers.  In order to transition
safely, we need to audit every deref->mode check.  This commit adds a
set of helpers that provide more nuanced mode checks and converts most
of NIR to use them.

For simple cases, we add nir_deref_mode_is and nir_deref_mode_is_one_of
helpers.  These can be used in passes which don't have to bother with
generic pointers and just want to know what mode a thing is.  If the
pass ever encounters generic pointers in a way that this check would be
unsafe, it will assert-fail to alert developers that they need to think
harder about things and fix the pass.

For more complex passes which require a more nuanced understanding of
modes, we add nir_deref_mode_may_be and nir_deref_mode_must_be helpers
which accurately describe the compiler's best knowledge about the given
deref.  Unfortunately, we may not be able to exactly identify the mode
in a generic pointers scenario so we have to be very careful when we use
these.  Conversion of these passes is left to later commits.

For the case of mass lowering of a particular mode (nir_lower_explicit_io
is one good example), we add nir_deref_mode_is_in_set.  This is also
pretty assert-happy like nir_deref_mode_is but is for a set containment
comparison on deref modes where you expect the deref to either be all-in
or all-out.
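
The difference between these helper families can be modeled with a plain bitmask. This is a simplified sketch that assumes a deref carries a set of possible modes (as it would once generic pointers exist); the type and function names are hypothetical stand-ins, not the NIR API.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned mode_mask; /* hypothetical stand-in for a mode bitfield */
enum { MODE_GLOBAL = 1u << 0, MODE_SHARED = 1u << 1, MODE_TEMP = 1u << 2 };

struct deref { mode_mask modes; }; /* every mode the pointer might have */

/* True if the deref could be in at least one of the given modes. */
static bool mode_may_be(const struct deref *d, mode_mask modes)
{
   assert(d->modes != 0);
   return (d->modes & modes) != 0;
}

/* True if the deref cannot be in any mode outside the given set. */
static bool mode_must_be(const struct deref *d, mode_mask modes)
{
   assert(d->modes != 0);
   return (d->modes & ~modes) == 0;
}

/* Simple check for a single mode; asserts if the answer is ambiguous. */
static bool mode_is(const struct deref *d, mode_mask mode)
{
   assert(mode != 0 && (mode & (mode - 1)) == 0); /* exactly one bit */
   if (mode_may_be(d, mode))
      assert(d->modes == mode);
   return d->modes == mode;
}

/* Set containment; asserts unless the deref is all-in or all-out. */
static bool mode_is_in_set(const struct deref *d, mode_mask modes)
{
   if (mode_may_be(d, modes))
      assert(mode_must_be(d, modes));
   return mode_may_be(d, modes);
}
```

A deref that might be either global or shared satisfies mode_may_be for MODE_SHARED but not mode_must_be, and calling mode_is on it with MODE_SHARED trips the assertion, which is the "think harder" signal described above.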

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
2020-11-03 22:18:28 +00:00
Jason Ekstrand
3f0a29fffb nir/builder: Add a nir_ieq_imm helper
This shows up surprisingly often.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
2020-11-03 22:18:28 +00:00
Marcin Ślusarz
21ffacff8c intel/compiler: remove branch weight heuristic
As a result of this patch, compiler chooses SIMD32 shaders more
frequently.

Current logic is designed to avoid regressions from enabling SIMD32 at
all costs, even though the cases where regression can happen are probably
for smaller draw calls (far away from the camera and thus smaller).

In Intel perf CI this patch improves FPS in:
- gfxbench5 alu2:      21.92% (gen9), 23.7%  (gen11)
- synmark OglShMapVsm:  3.26% (gen9),  4.52% (gen11)
- gfxbench5 car chase:  1.34% (gen9),  1.32% (gen11)
No observed regressions there.

In my testing, it also improves FPS in:
- The Talos Principle:   2.9% (gen9)

The other 16 games I tested had very minor changes in performance
(2/3 positive, but not significant enough to list here).

Note: this patch harms synmark OglDrvState (which is not in Intel perf
CI) by ~2.9%, but this benchmark renders multiple scenes from other
workloads (including OglShMapVsm, which is helped in standalone mode)
in tiny rectangles. Rendering so small drastically changes branching
statistics, which favors smaller SIMD modes. I assume this matters
only in micro-benchmarks, as in real workloads more expensive (with
more uniform branching behavior) draw calls dominate.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7137>
2020-11-03 10:49:04 +00:00
Marcin Ślusarz
06764e0e5d intel/compiler: use C++ template instead of preprocessor
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7382>
2020-11-03 10:42:29 +00:00
Marcin Ślusarz
e3f6a9ea36 intel: remove dead code
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7353>
2020-11-02 19:58:56 +00:00
Marcin Ślusarz
b5e2c58ad8 anv: always annotate memory returned from anv_gem_mmap
anv_bo_pool_alloc expects that the memory returned by anv_gem_mmap
was annotated using VALGRIND_MALLOCLIKE_BLOCK, but anv_gem_mmap_offset
didn't do that. Move annotation from anv_gem_mmap_legacy to common
code.

Fixes: 4abf0837cd ("anv: Add support for new MMAP_OFFSET ioctl.")

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7381>
2020-11-02 19:52:11 +00:00
Lionel Landwerlin
b03c86a71f intel/dev: Bump Max EU per subslice/dualsubslice
This isn't a problem right now because the previous max would give the
same result when aligned to a byte (8bits).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7288>
2020-10-30 08:22:26 +00:00
Caio Marcelo de Oliveira Filho
ce0b72a13a intel/fs: Don't emit_uniformize when getting a constant SSBO index
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7340>
2020-10-29 21:54:01 +00:00
Marcin Ślusarz
daec83c7d6 intel/genxml: don't generate identical code for different branches
Quiets 16 Coverity warnings like:

CID 1403401: Identical code for different branches (IDENTICAL_BRANCHES)

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7351>
2020-10-29 12:49:36 +00:00
Marcin Ślusarz
e96f33cd30 intel/tools: fix invalid type in argument to printf
$2 is exp2, exp2 is defined to be llint and llint is defined to be
unsigned long long int.

Fixes error reported by Coverity:
CID 1451141: Invalid type in argument to printf format specifier (PRINTF_ARGS)
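
As a generic illustration of this class of warning (not the code from the assembler tool), the conversion specifier has to match the promoted argument type; for unsigned long long that is %llu. The helper name format_imm is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats an unsigned long long value; "%llu" matches the argument type,
 * whereas "%d" or "%lu" would be flagged as a printf argument mismatch. */
static int format_imm(char *buf, size_t len, unsigned long long value)
{
   return snprintf(buf, len, "%llu", value);
}
```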

Fixes: 70308a5a8a ("intel/tools: New i965 instruction assembler tool")

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7351>
2020-10-29 12:49:36 +00:00
Jordan Justen
06cf838cbd intel/mi_builder: Support gen11 command-streamer based register offsets
Reworks:
 * Automatically apply to any register in the range 0x2000 - 0x4000

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5466>
2020-10-27 16:11:12 -07:00
Jordan Justen
d399c3e861 intel/dev: Add device info for ADL-S
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7322>
2020-10-27 20:42:38 +00:00
Jordan Justen
8d03cfae7c anv: Drop warning about gen12 not being supported
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7333>
2020-10-27 11:04:25 -07:00
Lionel Landwerlin
d1ea49d924 anv: report latest extension spec versions
In many cases those revisions happened even before the first public
release of the spec and we just forgot to update our numbers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7136>
2020-10-25 19:12:35 +02:00
Sagar Ghuge
ddca93ddf7 anv: Enable stencil buffer compression on Gen12+
v2: (Nanley Chery)
- Fix condition check.
- Move aux_usage assignment after add_aux_state_tracking_buffer method.

v3: (Nanley Chery)
- Move stencil condition close to depth block.

v4: (Nanley Chery)
- Add DEBUG_NO_RBC condition.

v5: (Nanley Chery)
- Don't add CCS plane explicitly.
- Use isl_surf_supports_ccs.

v6:
- Simplify condition (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
dc22d6b3ab anv: Pass correct stencil aux usage during MSAA resolve
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
993a2a7122 anv: Return optimal aux state for stencil buffer compression
v2:
- Assert on aux_supported. (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
815e6c8ef4 anv: Don't track clear bo for stencil buffer compression
On Gen12+, stencil buffer compression does not support fast clear so we
don't have to track clear address for it.

v2:
- Use isl_aux_usage_has_fast_clears (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
d34ab5071a anv: Get aux usage from plane while clearing stencil buffer
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
c76ebc0c7a anv: Set stencil_aux_usage flag
v2: Use image aux usage (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
be2ca24da5 anv: Handle compressed stencil buffer transition on Gen12+
Handle compressed stencil buffer transition from one layout to another
on gen12+.

When stencil compression is enabled, we have to initialize buffer via
stencil clear (HZ_OP) before any renderpass.

v2:
- Pass predicate bit false to anv_image_ccs_op (Nanley Chery)

v3:
- update aspect assertion (Nanley Chery)

v4:
- Make state decision based on anv_layout_to_aux_state instead of
  anv_layout_to_aux_usage (Sagar Ghuge)

v5:
- No need to handle stencil CCS resolve case (Jason Ekstrand)
- Initialize buffer using HZ_OP (Nanley Chery)

v6: (Nanley Chery)
- Pass correct layer/level count.
- Remove local variable.

v7:
- Skip stencil initialization with HZ_OP packet if followed by fast
  clear. (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Sagar Ghuge
c2a4102848 anv: Return number of layers/levels attached to anv_image
Instead of checking the auxiliary surface's ISL surf to return the
surface levels/layers, we can return the anv_image parameters.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2942>
2020-10-22 21:42:36 +00:00
Ian Romanick
67956689bb nir: Rename replicated-result dot-product instructions
All these instructions replicate the result of an N-component dot-product
to a vec4.  Naming them fdot_replicatedN gives the impression that they are
some sort of abstract dot-product that replicates the result to a vecN.
They also deviate from fdph_replicated... which nobody would reasonably
consider naming fdot_replicatedh.

Naming these opcodes fdotN_replicated more closely matches what they
are, and it matches the pattern of fdph_replicated.

I believe that the only reason these opcodes were named this way was
because it simplified the implementation of the binop_reduce function in
nir_opcodes.py.  I made some fairly simple changes to that function, and
I think the end result is ok.

The bulk of the changes come from the sed rename:

    sed --in-place -e 's/fdot_replicated\([234]\)/fdot\1_replicated/g' \
        $(grep -r 'fdot_replicated[234]' src/)

v2: Use a named parameter to binop_reduce instead of using
isinstance(name, str).  Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5725>
2020-10-22 18:00:19 +00:00
Lionel Landwerlin
ea32691257 anv: fix source/destination layers for 3D blits
When blitting from source depth range [0-3] into destination depth
range [0-2], we'll have to use a source layer that is in between 2
layers of the 3D source image.

Other than having an incorrect formula, we're also using integers, which
prevents us from using the right source layer.
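
To make the problem concrete, here is one plausible linear mapping from a destination layer to a fractional source z coordinate. It is a sketch under stated assumptions: the helper name and the exact formula are hypothetical and are not necessarily what the driver uses.

```c
/* Maps the center of destination layer dst_layer to a source z coordinate,
 * scaling linearly from the destination range onto the source range. */
static float src_z_for_dst_layer(unsigned dst_layer,
                                 float src_z0, float src_z1,
                                 float dst_z0, float dst_z1)
{
   float scale = (src_z1 - src_z0) / (dst_z1 - dst_z0);

   return src_z0 + ((float)dst_layer + 0.5f - dst_z0) * scale;
}
```

Under this mapping, a blit from source range [0-3] to destination range [0-2] samples at z = 0.75 and z = 2.25, both of which fall between source layer centers. Truncating to an integer would pick layers 0 and 2 and lose that information.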

v2: Drop + 0.5 on application offsets

v3: Reuse num_layers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
2020-10-22 15:46:51 +00:00
Lionel Landwerlin
87934f02f9 blorp: allow blits with floating point source layers
The current blorp API only allows source layers for 3D images to be
integers. That is causing problems with the Vulkan API where we need
to be able to use a 3D layer that could be in between 2 layers.

This change allows a floating point value to be passed for blits and
internally sets up the input parameters to pass floating point values
to kernels.

v2: Use tex op to determine what type the coordinates are (Jason)
    Drop setting params->z (Lionel)

v3: Fix nir_texop_txf_ms_mcs op not considered as having integer coords (Lionel)

v4: Fix incorrect test on nir_texop_txf_ms_mcs (Ivan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3458
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
2020-10-22 15:46:51 +00:00
Lionel Landwerlin
e067078fcd blorp: identify copy kernels in NIR
This was useful in identifying blit vs copy kernels.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6909>
2020-10-22 15:46:51 +00:00
Jason Ekstrand
2015a109ff anv,iris: Use the data cache for UBO pulls on Gen12+
Now that we have the HDC, using the data cache for UBO pulls seems to
help things quite a bit:

    GTA V DXVK              104.0%
    Talos Principle GL      102.8%
    Rise of Tomb Raider VK  102.8%
    Dark Souls 3 DXVK       101.4%
    Witcher3 DXVK           101.3%
    Bioshock Infinite GL    100.5%
    Doom 2016 VK            97.7%

Doom is a bit of a loss but this helps enough other stuff that it's
probably worth the hit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7230>
2020-10-20 19:54:29 +00:00
Lionel Landwerlin
afeb0c3022 genxml: drop gen10
Finishing off the job started in !6899

v2: Remove remaining gen10_pack.h include (Sagar)

v3: Forgot isl gen10 removal (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7185>
2020-10-20 07:56:40 +00:00
Caio Marcelo de Oliveira Filho
8dd03a7c12 anv: Advertise VK_KHR_shader_terminate_invocation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7221>
2020-10-19 19:38:35 +00:00
Kenneth Graunke
aca31baafc isl: Enable Tigerlake HDC:L1 caches via MOCS in various cases.
Thanks to Felix Degrood for discovering that we missed enabling this
additional caching on Tigerlake!  Felix also benchmarked the changes.

We now use MOCS 48 (HDC:L1 + L3 + LLC) for render targets, textures,
and pull constant buffers.  We leave storage buffers & images, as well
as stateless messages, using the previous MOCS 2 value.  We can't use
HDC:L1 with atomics, and we don't know a priori whether storage buffers
will be used with atomics or not.  Similarly, the Vulkan buffer device
address feature allows atomics to be performed on buffers via stateless
messages, and we only can control MOCS at the base address level, so
we can't do much there.

This is closer to what the Windows Vulkan and OpenGL drivers do,
though it isn't quite the same - they also disable LLC in some cases,
but we observed this to have noticeable performance regressions when
we tried (though a couple titles benefited).  We may try experimenting
with that in the future.

Improves performance in a number of titles:

- Unreal Engine 4 Shooter Demo   [VK]: 11.8%
- Witcher 3                    [DXVK]:  3.9%
- Rise of the Tomb Raider        [VK]:  1.5%
- Shadow of the Tomb Raider      [VK]:  1.0%
- Grand Theft Auto V           [DXVK]:  0.8%

We did not observe any performance regressions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
2020-10-19 19:18:11 +00:00
Kenneth Graunke
02fe825a61 isl, anv, iris: Add a centralized helper to select MOCS based on usage
On Gen12+, we can enable additional caches in certain usage situations.
This routes that decision making to a central place in ISL, based on
surface usage flags, and updates both drivers to use it.  (i965 doesn't
need to change because it doesn't support Gen12.)

We continue handling the "external" decision via an anv_mocs() wrapper
for now, since we store that flag in anv_bo, which isl doesn't know
about.  (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
actually sure that would be cleaner.)

This patch should not have any functional nor performance effects, as
we continue selecting the exact same MOCS values for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
2020-10-19 19:18:11 +00:00
Kenneth Graunke
103ad427bc anv: Set only one ISL usage bit (RT/texture) for CopyBuffer sources
Most uses of this function deal with destination buffers, but for
copy_buffer_to_image, the buffer is the source, and isn't rendered
to.  We should avoid setting ISL_SURF_USAGE_RENDER_TARGET_BIT.
Also, we should avoid setting ISL_SURF_USAGE_TEXTURE_BIT for the
destination, which isn't sampled from.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
2020-10-19 19:18:10 +00:00
Nanley Chery
3c87ac1f60 isl: Fix the aux-map encoding for D24_UNORM_X8
Bspec: 53911 now defines the encoding for this format.

Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7198>
2020-10-19 15:58:43 +00:00