Commit graph

5405 commits

Author SHA1 Message Date
Jonathan Marek
7870d71459 anv: use common nir_convert_ycbcr
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: D Scott Phillips <d.scott.phillips@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
2020-04-20 22:01:43 +00:00
Rafael Antognolli
4abf0837cd anv: Add support for new MMAP_OFFSET ioctl.
v2: Update getparam check (Ken).

[jordan.l.justen@intel.com: use 0 offset for MMAP_OFFSET]

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Rafael Antognolli
0d387da083 anv: Add anv_device parameter to anv_gem_munmap.
Also update all of its callers.

On the next commit, the device will be used by anv_gem_munmap to choose
whether we need to call the valgrind code or not, depending on which
type of mmap we are using.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1675>
2020-04-20 10:59:06 -07:00
Caio Marcelo de Oliveira Filho
c76f2292b5 intel/fs,vec4: Properly account SENDs in IVB memory fence
Change brw_memory_fence to return the number of messages emitted, and
use that to update the send_count statistic in code generation.

This will fix the book-keeping for IVB since the memory fences will
result in two SEND messages.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4646>
2020-04-20 09:29:09 -07:00
Jason Ekstrand
969aeb6a93 anv: Apply any needed PIPE_CONTROLs before emitting state
Push constants in particular can get picked up by the hardware at weird
times that happen *before* 3DPRIMITIVE.  Therefore, we need to flush
before we emit all our state to ensure that any data they may pick up is
in memory in time.  This fixes an app which does vkCmdCopyBuffers
immediately followed by a vkCmdBeginRenderPass and vkCmdDraw which uses
the destination of the copy as a UBO which we push.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4601>
2020-04-19 02:41:22 +00:00
Jason Ekstrand
ffc84eac0d anv: Move vb_emit setup closer to where it's used in flush_state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4601>
2020-04-19 02:41:22 +00:00
Albert Astals Cid
06c5875fd6 Fix promotion of floats to doubles
Use the f variants of the math functions if the input parameter is a
float, saves converting from float to double and running the double
variant of the math function for gaining no precision at all

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3969>
2020-04-18 19:55:45 +00:00
Lionel Landwerlin
f27c707585 anv: skip writing perfcntr in results on Gen12+
We were not capturing the register already so don't bother writing the
delta in the results (we were previously doing a delta between two 0
values).

v2: Fix unused function warning

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4586>
2020-04-18 13:32:27 +03:00
Lionel Landwerlin
086ea1ac7e intel/perf: Enable MDAPI queries for Gen12
We're missing the cases for gen12 leading to those metrics going
missing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15b7b56eb2 ("intel/perf: add TGL support")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4586>
2020-04-18 02:04:09 +03:00
Ian Romanick
f7d620f47d intel/compiler: Fixup operands in fs_builder::emit() that takes array
The versions that take a specific number of operands will do various
fixups depending on the platform and the opcode.  However, the version
that takes an array of sources did not.  This makes all version operate
similarly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>
2020-04-17 08:21:47 -07:00
Ian Romanick
39ad0c2af8 intel/compiler: CSEL can do saturate
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>
2020-04-17 08:21:46 -07:00
Ian Romanick
5afaa407c1 intel/compiler: Only GE and L modifiers are commutative for SEL
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>
2020-04-17 08:21:43 -07:00
Ian Romanick
a80e44902f intel/compiler: Silence unused parameter warning in update_inst_scoreboard
src/intel/compiler/brw_fs_scoreboard.cpp: In function ‘void {anonymous}::update_inst_scoreboard(const fs_visitor*, const ordered_address*, const fs_inst*, unsigned int, {anonymous}::scoreboard&)’:
src/intel/compiler/brw_fs_scoreboard.cpp:793:45: warning: unused parameter ‘shader’ [-Wunused-parameter]
  793 |    update_inst_scoreboard(const fs_visitor *shader, const ordered_address *jps,
      |                           ~~~~~~~~~~~~~~~~~~^~~~~~

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>
2020-04-17 08:21:42 -07:00
Ian Romanick
c836295dfd intel/compiler: Silence unused parameter warning in fs_live_variables::setup_one_read
src/intel/compiler/brw_fs_live_variables.cpp: In member function ‘void brw::fs_live_variables::setup_one_read(brw::fs_live_variables::block_data*, fs_inst*, int, const fs_reg&)’:
src/intel/compiler/brw_fs_live_variables.cpp:56:67: warning: unused parameter ‘inst’ [-Wunused-parameter]
   56 | fs_live_variables::setup_one_read(struct block_data *bd, fs_inst *inst,
      |                                                          ~~~~~~~~~^~~~

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>
2020-04-17 08:21:40 -07:00
Ian Romanick
62f70a353f intel/compiler: Silence unused parameter warnings in vec4_tcs_visitor
In file included from src/intel/compiler/brw_vec4_tcs.cpp:31:
src/intel/compiler/brw_vec4_tcs.h: In member function ‘virtual void brw::vec4_tcs_visitor::emit_urb_write_header(int)’:
src/intel/compiler/brw_vec4_tcs.h:74:43: warning: unused parameter ‘mrf’ [-Wunused-parameter]
   74 |    virtual void emit_urb_write_header(int mrf) {}
      |                                       ~~~~^~~
src/intel/compiler/brw_vec4_tcs.h: In member function ‘virtual brw::vec4_instruction* brw::vec4_tcs_visitor::emit_urb_write_opcode(bool)’:
src/intel/compiler/brw_vec4_tcs.h:75:57: warning: unused parameter ‘complete’ [-Wunused-parameter]
   75 |    virtual vec4_instruction *emit_urb_write_opcode(bool complete) { return NULL; }
      |                                                    ~~~~~^~~~~~~~

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4582>
2020-04-17 08:21:37 -07:00
Jason Ekstrand
030e5ceac4 intel/blorp: Delete an unused enum
This was lying around from back when BLORP write to fs_visitor directly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4606>
2020-04-17 15:01:10 +00:00
Jason Ekstrand
d0d039a4d3 anv: Emit pushed UBO bounds checking code in the back-end compiler
This commit fixes performance regressions introduced by e03f965280
in which we started bounds checking our push constants.  This added a
LOT of shader code to shaders which use the robustBufferAccess feature
and led to substantial spilling.  The checking we just added to the FS
back-end is far more efficient for two reasons:

 1. It can be done at a whole register granularity rather than per-
    scalar and so we emit one SIMD8 SEL per 32B GRF rather than one
    SIMD16 SEL (executed as two SELs) for each component loaded.

 2. Because we do it with NoMask instructions, we can do it on whole
    pushed GRFs without splatting them out to SIMD8 or SIME16 values.
    This means that robust buffer access no longer explodes our register
    pressure for no good reason.

As a tiny side-benefit, we're now using can use AND instead of SEL which
means no need for the flag and better scheduling.

Vulkan pipeline database results on ICL:

    Instructions in all programs: 293586059 -> 238009118 (-18.9%)
    SENDs in all programs: 13568515 -> 13568515 (+0.0%)
    Loops in all programs: 149720 -> 149720 (+0.0%)
    Cycles in all programs: 88499234498 -> 84348917496 (-4.7%)
    Spills in all programs: 1229018 -> 184339 (-85.0%)
    Fills in all programs: 1348397 -> 246061 (-81.8%)

This also improves the performance of a few apps:

 - Shadow of the Tomb Raider: +4%
 - Witcher 3: +3.5%
 - UE4 Shooter demo: +2%

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4447>
2020-04-17 14:48:06 +00:00
Jason Ekstrand
eb5a10ff63 intel/cfg: Add first/last_block helpers
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4447>
2020-04-17 14:48:06 +00:00
Jason Ekstrand
029471c3c4 intel/batch_decoder: Stop printing to stdout
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
2020-04-16 17:26:16 +00:00
Jason Ekstrand
b8acf9a3d4 anv: Report correct SLM size
Fixes: d787a2d0 "anv: Implement VK_KHR_pipeline_executable_properties"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
2020-04-16 17:26:16 +00:00
Jason Ekstrand
e003104605 intel: Add _const versions of prog_data cast helpers
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4597>
2020-04-16 17:26:16 +00:00
Jason Ekstrand
26a1adce5b anv: Fix UBO range detection in anv_nir_compute_push_layout
This fixes two bugs:  First, if the same block index showed up twice, we
only pick the first one.  Second, we weren't multiplying by 32.  This
didn't show up in tests because RBA testing is garbage.  Found while
looking at shaders from the UE4 Shooter demo.

Fixes: e03f9652 "anv: Bounds-check pushed UBOs when..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4578>
2020-04-15 21:51:55 +00:00
Jason Ekstrand
b2e4157143 anv: Advertise SEND count through VK_EXT_pipeline_executable_properties
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4578>
2020-04-15 21:51:55 +00:00
Caio Marcelo de Oliveira Filho
db74ad0696 intel/compiler: Remove cs_prog_data->threads
At this point all drivers are doing this math on their own -- since
most of them need to cover the variable group size case, in which at
compile time the group size (and number of threads) is not defined.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
2020-04-09 19:23:20 -07:00
Caio Marcelo de Oliveira Filho
928f5f5434 anv: Stop using cs_prog_data->threads
Move the calculation to helper functions -- similar to what GL already
needs to do.

This is a preparation for dropping this field since this value is
expected to be calculated by the drivers now for variable group size
case.  And also the field would get in the way of brw_compile_cs
producing multiple SIMD variants (like FS).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
2020-04-09 19:23:12 -07:00
Plamena Manolova
c77dc51203 intel/compiler: Add support for variable workgroup size
Add new builtin parameters that are used to keep track of the group
size.  This will be used to implement ARB_compute_variable_group_size.

The compiler will use the maximum group size supported to pick a
suitable SIMD variant.  A later improvement will be to keep all SIMD
variants (like FS) so the driver can select the best one at dispatch
time.

When variable workgroup size is used, the small workgroup optimization
is disabled as it we can't prove at compile time that the barriers
won't be needed.

Extracted from original i965 patch with additional changes by
Caio Marcelo de Oliveira Filho.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
2020-04-09 19:23:12 -07:00
Caio Marcelo de Oliveira Filho
c54fc0d07b intel/compiler: Replace cs_prog_data->push.total with a helper
The push.total field had three values but only one was directly
used (size).  Replace it with a helper function that explicitly takes
the cs_prog_data and the number of threads -- and use that in the
drivers.

This is a preparation for ARB_compute_variable_group_size where the
number of threads (hence the total size for push constants) is not
defined at compile time (not cs_prog_data->threads).

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4504>
2020-04-09 19:23:12 -07:00
Caio Marcelo de Oliveira Filho
cf54785239 anv/gen12: Lower VK_KHR_multiview using Primitive Replication
Identify if view_index is used only for position calculation, and use
Primitive Replication to implement Multiview in Gen12.  This feature
allows storing per-view position information in a single execution of
the shader, treating position as an array.

The shader is transformed by adding a for-loop around it, that have an
iteration per active view (in the view_mask).  Stores to the position
now store into the position array for the current index in the loop,
and load_view_index() will return the view index corresponding to the
current index in the loop.

The feature is controlled by setting the environment variable
ANV_PRIMITIVE_REPLICATION_MAX_VIEWS, which defaults to 2 if unset.
For pipelines with view counts larger than that, the regular
instancing will be used instead of Primitive Replication.  To disable
it completely set the variable to 0.

v2: Don't assume position is set in vertex shader; remove only stores
    for position; don't apply optimizations since other passes will
    do; clone shader body without extract/reinsert; don't use
    last_block (potentially stale). (Jason)

    Fix view_index immediate to contain the view index, not its order.
    Check for maximum number of views supported.
    Add guard for gen12.

v3: Clone the entire shader function and change it before reinsert;
    disable optimization when shader has memory writes. (Jason)

    Use a single environment variable with _DEBUG on the name.

v4: Change to use new nir_deref_instr.
    When removing stores, look for mode nir_var_shader_out instead
    of the walking the list of outputs.
    Ensure unused derefs are removed in the non-position part of the
    shader.
    Remove dead control flow when identifying if can use or not
    primitive replication.

v5: Consider all the active shaders (including fragment) when deciding
    that Primitive Replication can be used.
    Change environment variable to ANV_PRIMITIVE_REPLICATION.
    Squash the emission of 3DSTATE_PRIMITIVE_REPLICATION into this patch.
    Disable Prim Rep in blorp_exec_3d.

v6: Use a loop around the shader, instead of manually unrolling, since
    the regular unroll pass will kick in.
    Document that we don't expect to see copy_deref or load_deref
    involving the position variable.
    Recover use_primitive_replication value when loading pipeline from
    the cache.
    Set VARYING_SLOT_LAYER to 0 in the shader.  Earlier versions were
    relying on ForceZeroRTAIndexEnable but that might not be
    sufficient.
    Disable Prim Rep in cmd_buffer_so_memcpy.

v7: Don't use Primitive Replication if position is not set, fallback
    to instancing; change environment variable to be
    ANV_PRIMITVE_REPLICATION_MAX_VIEWS and default it to 2 based on
    experiments.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
2020-04-07 17:16:09 +00:00
Caio Marcelo de Oliveira Filho
395de69b1f intel/fs: Allow multiple slots for position
Change brw_compute_vue_map() to also take the number of pos slots.  If
more than one slot is used, the VARYING_SLOT_POS is treated as an
array.

When using Primitive Replication, instead of a single position, the
VUE must contain an array of positions.  Padding might be
necessary (after clip distance) to ensure rest of attributes start
aligned.

v2: Add note about array in the commit message and assert that
    pos_slots >= 1 to make clear 0 is invalid. (Jason)
    Move padding to be after the clip distance.

v3: Apply the correct offset when gathering the sources from outputs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
2020-04-07 17:16:09 +00:00
Caio Marcelo de Oliveira Filho
afa5447312 intel/gen12: Add XML description for 3DSTATE_PRIMITIVE_REPLICATION
v2: Use groups for the 16-element arrays "Viewport Offset"
    and "RTAI Offset". (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2313>
2020-04-07 17:16:09 +00:00
Jason Ekstrand
991c426160 intel/nir: Enable load/store vectorization
This commit enables the I/O vectorization pass that was originally
written for ACO for Intel drivers.  We enable it for UBOs, SSBOs, global
memory, and SLM.  We only enable vectorization for the scalar back-end
because it vec4 makes certain alignment assumptions.

Shader-db results with iris on ICL:

    total instructions in shared programs: 16077927 -> 16068236 (-0.06%)
    instructions in affected programs: 199839 -> 190148 (-4.85%)
    helped: 324
    HURT: 0
    helped stats (abs) min: 2 max: 458 x̄: 29.91 x̃: 4
    helped stats (rel) min: 0.11% max: 38.94% x̄: 4.32% x̃: 1.64%
    95% mean confidence interval for instructions value: -37.02 -22.80
    95% mean confidence interval for instructions %-change: -5.07% -3.58%
    Instructions are helped.

    total cycles in shared programs: 336806135 -> 336151501 (-0.19%)
    cycles in affected programs: 16009735 -> 15355101 (-4.09%)
    helped: 458
    HURT: 154
    helped stats (abs) min: 1 max: 77812 x̄: 1542.50 x̃: 75
    helped stats (rel) min: <.01% max: 34.46% x̄: 5.16% x̃: 2.01%
    HURT stats (abs)   min: 1 max: 22800 x̄: 336.55 x̃: 20
    HURT stats (rel)   min: <.01% max: 17.11% x̄: 2.12% x̃: 1.00%
    95% mean confidence interval for cycles value: -1596.83 -542.49
    95% mean confidence interval for cycles %-change: -3.83% -2.82%
    Cycles are helped.

    total sends in shared programs: 814177 -> 809049 (-0.63%)
    sends in affected programs: 15422 -> 10294 (-33.25%)
    helped: 324
    HURT: 0
    helped stats (abs) min: 1 max: 256 x̄: 15.83 x̃: 2
    helped stats (rel) min: 1.33% max: 67.90% x̄: 21.21% x̃: 15.38%
    95% mean confidence interval for sends value: -19.67 -11.98
    95% mean confidence interval for sends %-change: -23.03% -19.39%
    Sends are helped.

    LOST:   7
    GAINED: 2

Most of the helped shaders were in the following titles:

 - Doom
 - Deus Ex: Mankind Divided
 - Aztec Ruins
 - Shadow of Mordor
 - DiRT Showdown
 - Tomb Raider (Rise, I think)

Five of the lost programs are SIMD16 shaders we lost from dirt showdown.
The other two are compute shaders in Aztec Ruins which switched from
SIMD8 to SIMD16.

Vulkan pipeline-db stats on ICL:

    Instructions in all programs: 296780486 -> 293493363 (-1.1%)
    Loops in all programs: 149669 -> 149669 (+0.0%)
    Cycles in all programs: 90999206722 -> 88513844563 (-2.7%)
    Spills in all programs: 1710217 -> 1730691 (+1.2%)
    Fills in all programs: 1931235 -> 1958138 (+1.4%)

By far the most help was in the Tomb Raider games.  A couple of Batman
games with DXVK were also helped.  In Shadow of the Tomb Raider:

    Instructions in all programs: 41614336 -> 39408023 (-5.3%)
    Loops in all programs: 32200 -> 32200 (+0.0%)
    Cycles in all programs: 1875498485 -> 1667034831 (-11.1%)
    Spills in all programs: 196307 -> 214945 (+9.5%)
    Fills in all programs: 282736 -> 307113 (+8.6%)

Benchmarks of real games I've done on this patch:

 - Rise of the Tomb Raider: +3%
 - Shadow of the Tomb Raider: +10%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Jason Ekstrand
c1bcb025db intel/nir: Lower memory access bit sizes later
We're about to do load/store vectorization right before this but we need
that to happen after we've done a round of optimization.  Otherwise,
we'll be getting unoptimized NIR in from ANV and the vectorizer won't be
able to do anything with it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Jason Ekstrand
4c8b100388 anv: Improve brw_nir_lower_mem_access_bit_sizes
This commit makes us take both bit size and alignment into account so
that we can properly handle cases such as when we have a 32-bit store
to an 8-bit-aligned address.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Jason Ekstrand
c643979228 intel/fs: Choose memory message type based on bit size
Thanks to the NIR vectorizing pass, we're about to see alignments that
are higher than the bit size.  Previously, we could use either and we
just happened to choose alignment (probably the wrong choice) so it's
harmless to switch to detecting based on bit size.  This commit changes
things to take both into account which is more accurate to what the
messages we're using do.  We also beef up the asserts and make them more
consistent, more accurate, and more complete.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4367>
2020-04-03 20:26:54 +00:00
Lionel Landwerlin
b38c32a573 intel/aub_viewer: fix access to freed memory
Windows closed while we're displaying them might lead to invalid
memory accessed, so use the safe iterators on the list of windows.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4430>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4430>
2020-04-03 15:46:24 +03:00
Jason Ekstrand
5cc27d59a1 anv/image: Use align_u64 for image offsets
The ALIGN functions in util/u_math.h work on uintptr_t whose size
changes depending on your platform.  Use ones which take an explicit
64-bit type instead to avoid 32-bit platform issues.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4414>
2020-04-02 15:08:42 +00:00
Juan A. Suarez Romero
191ced539a anv/pipeline: allow more than 16 FS inputs
A fragment shader can have more than 16 inputs, so SBE emission should
deal with all of them.

This fixes dEQP-VK.pipeline.max_varyings.*

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
2020-04-01 23:36:28 +00:00
Juan A. Suarez Romero
460de2159e intel/compiler: store the FS inputs in WM prog data
Store the fragment shader inputs in the program data so we can use them
later when required without needing the NIR shader.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
2020-04-01 23:36:28 +00:00
Juan A. Suarez Romero
67c7cabd7f anv: use urb_setup_attribs in SBE
Avoid looping over all VARYING_SLOT_MAX urb_setup arrray entries.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2010>
2020-04-01 23:36:28 +00:00
Danylo Piliaiev
e47bf7dadf anv: Do not sample from 3d depth image with HiZ
For Gen8-11, there are some restrictions around sampling from HiZ.
The Skylake PRM docs for RENDER_SURFACE_STATE::AuxiliarySurfaceMode
say:

    "If this field is set to AUX_HIZ, Number of Multisamples must
    be MULTISAMPLECOUNT_1, and Surface Type cannot be SURFTYPE_3D."

Fixes: dEQP-VK.geometry.layered.3d.*.readback

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2720
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4409>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4409>
2020-04-01 20:12:29 +00:00
Ian Romanick
62795475e8 nir/algebraic: Distribute source modifiers into instructions
There are three main classes of cases that are helped by this change:

1. When the negation is applied to a value being type converted (e.g.,
   float(-x)).  This could possibly also be handled with more clever
   code generation.

2. When the negation is applied to a phi node source (e.g., x = -(...);
   at the end of a basic block).  This was the original case that caught
   my attention while looking at shader-db dumps.

3. When the negation is applied to the source of an instruction that
   cannot have source modifiers.  This includes texture instructions and
   math box instructions on pre-Gen7 platforms (see more details below).

In many these cases the negation can be propagated into the instructions
that generate the value (e.g., -(a*b) = (-a)*b).

In addition to the operations implemtned in this patch, I also tried:

 - frcp - Helped 6 or fewer shaders on Gen7+, and hurt just as many on
   pre-Gen7.  On Gen6 and earlier, frcp is a math box instruction, and
   math box instructions cannot have source modifiers.

   I suspect this is why so many more shaders are helped on Gen6 than on
   Gen5 or Gen7.  Gen6 supports OpenGL 3.3, so a lot more shaders
   compile on it.  A lot of these shaders may have things like cos(-x)
   or rcp(-x) that could result in an explicit negation instruction.

 - bcsel - Hurt a few shaders with none helped.  bcsel operates on
   integer sources, so the fabs or fneg cannot be a source modifier in
   the bcsel itself.

 - Integer instructions - No changes on any Intel platform.

Some notes about the shader-db results below.

 - On Tiger Lake, a single Deus Ex fragment shader is hurt for both
   spills and fills.

 - On Haswell, a different Deus Ex fragment shader is hurt for both
   spills and fills.

 - On GM45, the "LOST: 1" and "GAINED: 1" is a single Left4Dead 2
   (very high graphics settings, lol) fragment shader that upgrades
   from SIMD8 to SIMD16.

v2: Add support for fsign.  Add some patterns that remove redundant
negations and redundant absolute value rather than trying to push them
down the tree.

Tiger Lake
total instructions in shared programs: 17611333 -> 17586465 (-0.14%)
instructions in affected programs: 3033734 -> 3008866 (-0.82%)
helped: 10310
HURT: 632
helped stats (abs) min: 1 max: 35 x̄: 2.61 x̃: 1
helped stats (rel) min: 0.04% max: 16.67% x̄: 1.43% x̃: 1.01%
HURT stats (abs)   min: 1 max: 47 x̄: 3.21 x̃: 2
HURT stats (rel)   min: 0.04% max: 5.08% x̄: 0.88% x̃: 0.63%
95% mean confidence interval for instructions value: -2.33 -2.21
95% mean confidence interval for instructions %-change: -1.32% -1.27%
Instructions are helped.

total cycles in shared programs: 338365223 -> 338262252 (-0.03%)
cycles in affected programs: 125291811 -> 125188840 (-0.08%)
helped: 5224
HURT: 2031
helped stats (abs) min: 1 max: 5670 x̄: 46.73 x̃: 12
helped stats (rel) min: <.01% max: 34.78% x̄: 1.91% x̃: 0.97%
HURT stats (abs)   min: 1 max: 2882 x̄: 69.50 x̃: 14
HURT stats (rel)   min: <.01% max: 44.93% x̄: 2.35% x̃: 0.74%
95% mean confidence interval for cycles value: -18.71 -9.68
95% mean confidence interval for cycles %-change: -0.80% -0.63%
Cycles are helped.

total spills in shared programs: 8942 -> 8946 (0.04%)
spills in affected programs: 8 -> 12 (50.00%)
helped: 0
HURT: 1

total fills in shared programs: 9399 -> 9401 (0.02%)
fills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

Ice Lake
total instructions in shared programs: 16124348 -> 16102258 (-0.14%)
instructions in affected programs: 2830928 -> 2808838 (-0.78%)
helped: 11294
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.07% max: 17.65% x̄: 1.32% x̃: 0.93%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 3.45% max: 4.00% x̄: 3.72% x̃: 3.72%
95% mean confidence interval for instructions value: -1.99 -1.93
95% mean confidence interval for instructions %-change: -1.34% -1.29%
Instructions are helped.

total cycles in shared programs: 335393932 -> 335325794 (-0.02%)
cycles in affected programs: 123834609 -> 123766471 (-0.06%)
helped: 5034
HURT: 2128
helped stats (abs) min: 1 max: 3256 x̄: 43.39 x̃: 11
helped stats (rel) min: <.01% max: 35.79% x̄: 1.98% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2634 x̄: 70.63 x̃: 16
HURT stats (rel)   min: <.01% max: 49.49% x̄: 2.73% x̃: 0.62%
95% mean confidence interval for cycles value: -13.66 -5.37
95% mean confidence interval for cycles %-change: -0.69% -0.48%
Cycles are helped.

LOST:   0
GAINED: 2

Skylake
total instructions in shared programs: 14949240 -> 14927930 (-0.14%)
instructions in affected programs: 2594756 -> 2573446 (-0.82%)
helped: 11000
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.91
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 324829346 -> 324821596 (<.01%)
cycles in affected programs: 121566087 -> 121558337 (<.01%)
helped: 4611
HURT: 2147
helped stats (abs) min: 1 max: 3715 x̄: 33.29 x̃: 10
helped stats (rel) min: <.01% max: 36.08% x̄: 1.94% x̃: 1.00%
HURT stats (abs)   min: 1 max: 2551 x̄: 67.88 x̃: 16
HURT stats (rel)   min: <.01% max: 53.79% x̄: 3.69% x̃: 0.89%
95% mean confidence interval for cycles value: -4.25 1.96
95% mean confidence interval for cycles %-change: -0.28% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Broadwell
total instructions in shared programs: 14971203 -> 14949957 (-0.14%)
instructions in affected programs: 2635699 -> 2614453 (-0.81%)
helped: 10982
HURT: 2
helped stats (abs) min: 1 max: 12 x̄: 1.93 x̃: 1
helped stats (rel) min: 0.07% max: 18.75% x̄: 1.39% x̃: 0.94%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.76% max: 4.76% x̄: 4.76% x̃: 4.76%
95% mean confidence interval for instructions value: -1.97 -1.90
95% mean confidence interval for instructions %-change: -1.42% -1.37%
Instructions are helped.

total cycles in shared programs: 336215033 -> 336086458 (-0.04%)
cycles in affected programs: 127383198 -> 127254623 (-0.10%)
helped: 4884
HURT: 1963
helped stats (abs) min: 1 max: 25696 x̄: 51.78 x̃: 12
helped stats (rel) min: <.01% max: 58.28% x̄: 2.00% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3401 x̄: 63.33 x̃: 16
HURT stats (rel)   min: <.01% max: 39.95% x̄: 2.20% x̃: 0.70%
95% mean confidence interval for cycles value: -29.99 -7.57
95% mean confidence interval for cycles %-change: -0.89% -0.71%
Cycles are helped.

total fills in shared programs: 24905 -> 24901 (-0.02%)
fills in affected programs: 117 -> 113 (-3.42%)
helped: 4
HURT: 0

LOST:   0
GAINED: 16

Haswell
total instructions in shared programs: 13148927 -> 13131528 (-0.13%)
instructions in affected programs: 2220941 -> 2203542 (-0.78%)
helped: 8017
HURT: 4
helped stats (abs) min: 1 max: 12 x̄: 2.17 x̃: 1
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.40% x̃: 0.93%
HURT stats (abs)   min: 1 max: 7 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 4.76% x̄: 2.73% x̃: 2.91%
95% mean confidence interval for instructions value: -2.21 -2.13
95% mean confidence interval for instructions %-change: -1.43% -1.37%
Instructions are helped.

total cycles in shared programs: 321221791 -> 321079870 (-0.04%)
cycles in affected programs: 126886055 -> 126744134 (-0.11%)
helped: 4674
HURT: 1729
helped stats (abs) min: 1 max: 23654 x̄: 56.47 x̃: 16
helped stats (rel) min: <.01% max: 53.22% x̄: 2.13% x̃: 1.05%
HURT stats (abs)   min: 1 max: 3694 x̄: 70.58 x̃: 18
HURT stats (rel)   min: <.01% max: 63.06% x̄: 2.48% x̃: 0.90%
95% mean confidence interval for cycles value: -33.31 -11.02
95% mean confidence interval for cycles %-change: -0.99% -0.78%
Cycles are helped.

total spills in shared programs: 19872 -> 19874 (0.01%)
spills in affected programs: 21 -> 23 (9.52%)
helped: 0
HURT: 1

total fills in shared programs: 20941 -> 20941 (0.00%)
fills in affected programs: 62 -> 62 (0.00%)
helped: 1
HURT: 1

LOST:   0
GAINED: 8

Ivy Bridge
total instructions in shared programs: 11875553 -> 11853839 (-0.18%)
instructions in affected programs: 1553112 -> 1531398 (-1.40%)
helped: 7304
HURT: 3
helped stats (abs) min: 1 max: 16 x̄: 2.97 x̃: 2
helped stats (rel) min: 0.07% max: 15.25% x̄: 1.62% x̃: 1.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.05% max: 3.33% x̄: 2.44% x̃: 2.94%
95% mean confidence interval for instructions value: -3.04 -2.90
95% mean confidence interval for instructions %-change: -1.65% -1.59%
Instructions are helped.

total cycles in shared programs: 178246425 -> 178184484 (-0.03%)
cycles in affected programs: 13702146 -> 13640205 (-0.45%)
helped: 4409
HURT: 1566
helped stats (abs) min: 1 max: 531 x̄: 24.52 x̃: 13
helped stats (rel) min: <.01% max: 38.67% x̄: 2.14% x̃: 1.02%
HURT stats (abs)   min: 1 max: 356 x̄: 29.48 x̃: 10
HURT stats (rel)   min: <.01% max: 64.73% x̄: 1.87% x̃: 0.70%
95% mean confidence interval for cycles value: -11.60 -9.14
95% mean confidence interval for cycles %-change: -1.19% -0.99%
Cycles are helped.

LOST:   0
GAINED: 10

Sandy Bridge
total instructions in shared programs: 10695740 -> 10667483 (-0.26%)
instructions in affected programs: 2337607 -> 2309350 (-1.21%)
helped: 10720
HURT: 1
helped stats (abs) min: 1 max: 49 x̄: 2.64 x̃: 2
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.54% x̃: 1.13%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.04% max: 1.04% x̄: 1.04% x̃: 1.04%
95% mean confidence interval for instructions value: -2.69 -2.58
95% mean confidence interval for instructions %-change: -1.57% -1.51%
Instructions are helped.

total cycles in shared programs: 153478839 -> 153416223 (-0.04%)
cycles in affected programs: 22050900 -> 21988284 (-0.28%)
helped: 5342
HURT: 2200
helped stats (abs) min: 1 max: 1020 x̄: 20.34 x̃: 16
helped stats (rel) min: <.01% max: 24.05% x̄: 1.51% x̃: 0.86%
HURT stats (abs)   min: 1 max: 335 x̄: 20.93 x̃: 6
HURT stats (rel)   min: <.01% max: 20.18% x̄: 1.03% x̃: 0.30%
95% mean confidence interval for cycles value: -9.18 -7.42
95% mean confidence interval for cycles %-change: -0.82% -0.71%
Cycles are helped.

Iron Lake
total instructions in shared programs: 8114882 -> 8105574 (-0.11%)
instructions in affected programs: 1232504 -> 1223196 (-0.76%)
helped: 4109
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.27 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 0.99% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.94% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.31 -2.21
95% mean confidence interval for instructions %-change: -1.01% -0.96%
Instructions are helped.

total cycles in shared programs: 188504036 -> 188466296 (-0.02%)
cycles in affected programs: 31203798 -> 31166058 (-0.12%)
helped: 3447
HURT: 36
helped stats (abs) min: 2 max: 92 x̄: 11.03 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.21% x̃: 0.13%
HURT stats (abs)   min: 2 max: 30 x̄: 7.33 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.18% x̃: 0.10%
95% mean confidence interval for cycles value: -11.16 -10.51
95% mean confidence interval for cycles %-change: -0.22% -0.20%
Cycles are helped.

LOST:   0
GAINED: 1

GM45
total instructions in shared programs: 4989697 -> 4984531 (-0.10%)
instructions in affected programs: 703952 -> 698786 (-0.73%)
helped: 2493
HURT: 2
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.05% max: 8.33% x̄: 1.03% x̃: 0.66%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.95% max: 4.35% x̄: 2.65% x̃: 2.65%
95% mean confidence interval for instructions value: -2.13 -2.01
95% mean confidence interval for instructions %-change: -1.07% -0.99%
Instructions are helped.

total cycles in shared programs: 128929136 -> 128903886 (-0.02%)
cycles in affected programs: 21583096 -> 21557846 (-0.12%)
helped: 2214
HURT: 17
helped stats (abs) min: 2 max: 92 x̄: 11.44 x̃: 8
helped stats (rel) min: <.01% max: 5.41% x̄: 0.24% x̃: 0.13%
HURT stats (abs)   min: 2 max: 8 x̄: 4.24 x̃: 4
HURT stats (rel)   min: 0.01% max: 1.65% x̄: 0.20% x̃: 0.09%
95% mean confidence interval for cycles value: -11.75 -10.88
95% mean confidence interval for cycles %-change: -0.25% -0.22%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick
d2b4f3f137 intel/vec4: Allow late copy propagation on vec4
This change incurs a small amount of hurt now, but it enables a lot of
benefit on vec4 shaders on the next commit.  nir_opt_algebraic_late
converts dph, dot3, etc. to dhp_replicated, dot_replicated3, etc.  In
the process, it introduces extra moves.  If the original NIR contained

        vec1 32 ssa_45 = fdot4 ssa_51, ssa_44
        vec1 32 ssa_46 = fneg ssa_45

nir_opt_algebraic_late will produce

        vec4 32 ssa_18 = fdot_replicated4 ssa_1, ssa_15
        vec1 32 ssa_19 = mov ssa_18.x
        vec1 32 ssa_17 = fneg ssa_19

The algebraic pass added in the next commit can't see through the move
to know that the fneg applies to a fdot_replicated4.

Haswell, Ivy Bridge, and Sandybridge had similar results. (Haswell shown)
total cycles in shared programs: 187077604 -> 187079858 (<.01%)
cycles in affected programs: 350132 -> 352386 (0.64%)
helped: 174
HURT: 194
helped stats (abs) min: 2 max: 124 x̄: 23.60 x̃: 16
helped stats (rel) min: 0.12% max: 15.88% x̄: 4.98% x̃: 3.86%
HURT stats (abs)   min: 2 max: 164 x̄: 32.78 x̃: 16
HURT stats (rel)   min: 0.17% max: 22.82% x̄: 6.46% x̃: 0.86%
95% mean confidence interval for cycles value: 2.04 10.21
95% mean confidence interval for cycles %-change: 0.17% 1.93%
Cycles are HURT.

No shader-db changes on any other Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Lionel Landwerlin
88c046a6d3 isl: don't warn in physical extent calculation for yuv formats
Those format have correct descriptions already with the exception of
the planar format. In that case we introduce an assert.

This fine because we don't use the planar format in any of our
drivers. There are restrictions on how the addresses of the 2 planes
are relative to one another which make this annoying. The sampler is
also more limited than what we can do with a shader snippet.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
2020-03-31 15:59:21 +00:00
Lionel Landwerlin
015f08dd43 isl: set bpb for Y8_UNORM
This isn't a format we use in any of the drivers but for consistency
just give it a correct bpb.

We also set the luminance in the G channel. We can't actually use this
format with the 3D sampler (only media).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2999>
2020-03-31 15:59:21 +00:00
Jason Ekstrand
896a7c28eb anv/allocator: Use util_dynarray for blocks in anv_state_stream
When we originally wrote a bunch of the allocation data structures, we
re-used the GPU memory for CPU-side data structures.  It's a bit more
memory efficient and usually ok.  However, this has a couple of
problems:

 1. It makes it MUCH more likely that the GPU will accidentlly stomp
    CPU-side data structures and cause nearly impossible to debug
    crashes.

 2. With discrete GPUs, the memory will be mapped somehow and that map
    may be across the BAR so it could have horribly slow CPU access.
    This is bad for our CPU-side data structures.

In the case of anv_state_stream, it also made the data structure
massively more complex than it needed to be.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
2020-03-31 08:12:07 +00:00
Jason Ekstrand
63bec07e14 anv: Account for the header in anv_state_stream_alloc
If we have an allocation that's exactly the block size, we end up
computing a new block size to allocate that's exactly the block size,
add in the header, and then assert fail.  When computing the block size,
we need to account for the header.

Fixes: 955127db93 "anv/allocator: Add support for large stream..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4336>
2020-03-31 08:12:07 +00:00
Jason Ekstrand
4e80151c5d anv: Set alignments on descriptor and constant loads
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
2020-03-30 15:46:19 +00:00
Jason Ekstrand
2cb9cc56d5 intel/nir: Run copy-prop and DCE after lower_bool_to_int32
No shader-db impact on ICL with iris.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
2020-03-30 15:46:19 +00:00
Eric Engestrom
8970b7839a intel: drop unused include directories
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00
Eric Engestrom
79af30768d meson: inline inc_common
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00