Commit graph

3813 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
5299c9cbcc anv: skip bit6 swizzle detection in Gen8+
It is always false on Gen8+.  Also, move the variable definition near
its use.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Danylo Piliaiev
64d3b148fe anv: Fix VK_EXT_transform_feedback working with varyings packed in PSIZ
Transform feedback did not set correct SO_DECL.ComponentMask for
varyings packed in VARYING_SLOT_PSIZ:
 gl_Layer         - VARYING_SLOT_LAYER    in VARYING_SLOT_PSIZ.y
 gl_ViewportIndex - VARYING_SLOT_VIEWPORT in VARYING_SLOT_PSIZ.z
 gl_PointSize     - VARYING_SLOT_PSIZ     in VARYING_SLOT_PSIZ.w

Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 15:30:43 +00:00
Danylo Piliaiev
d76e777988 anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment
From the Vulkan 1.0.98 spec for vkCmdClearAttachments:

"If the aspectMask member of any element of pAttachments contains
VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that
element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED,
or must be a valid color attachment."

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-04 14:49:50 +02:00
Jason Ekstrand
48ed2a7bb0 anv: Implement VK_EXT_buffer_device_address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-01 17:09:42 -06:00
Jason Ekstrand
e644ed468f intel/fs: Implement nir_intrinsic_global_atomic_*
eviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
a91f392073 intel/fs: Use SENDS for A64 writes on gen9+
eviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
1c25bf4373 intel/fs: Implement load/store_global with A64 untyped messages
eviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
b4f0d062cd intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode
Previously, we only applied the fix to shaders with a dispatch mode of
SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16
instructions.  If you have a SIMD8 instruction in a SIMD16 shader,
neither would trigger and the restriction could still be hit.

Fixes: 232ed89802 "i965/fs: Register allocator shoudn't use grf127..."
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:11:00 -06:00
Jason Ekstrand
79724a0756 intel/fs: Properly handle 64-bit types in LOAD_PAYLOAD
By just assigning dst.type to src[i].type, we ensure that the offset at
the end of the loop actually offsets it by the right number of
registers.  Otherwise, we'll get into a case where we copy with a Q type
and then offset with a D type and things get out of sync.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:10:57 -06:00
Jason Ekstrand
f02914a991 intel/fs/cse: Split create_copy_instr into three cases
Previously, we tried to combine all cases where the instruction being
CSE'd writes to more than one MOV worth of registers into one case with
a bit of special casing for LOAD_PAYLOAD.  This commit splits things so
that LOAD_PAYLOAD is entirely it's own case.  This makes tweaking the
LOAD_PAYLOAD case simpler in the next commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:10:40 -06:00
Jason Ekstrand
f409a08e5f intel/nir: Add global support to lower_mem_access_bit_sizes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 16:08:29 -06:00
Oscar Blumberg
fea5b8e5ad intel/fs: Fix memory corruption when compiling a CS
Missing check for shader stage in the fs_visitor would corrupt the
cs_prog_data.push information and trigger crashes / corruption later
when uploading the CS state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-01 10:53:33 -08:00
Dylan Baker
4052142de7 meson: remove -std=c++11 from intel/tools
for meson all C++ code is already compiled as C++11, so it's
unnecessary. It's also the wrong way to do this, if we really needed
this the correct way is to set:

```meson
executable(
  ...
  override_options : ['cpp_std=c++11'],
)
```

Which ensures not only that the correct syntax for the current
compiler is used, but also that meson doesn't create arguments like
`-std=c++14 ... -std=c++11`

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Dylan Baker
8e49b32f63 meson: fix style in intel/tools
The `:` in options should always have one space before and after `foo
: bar`, and lists do not get spaces around the braces: `[foo]` not `[
foo ]`

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Dylan Baker
d93d53fa72 meson: remove build_by_default : true
Which is and has always been the default. This is largely an artifact
of how the building of these tools was controlled when the meson build
was originally created.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-31 21:42:16 +00:00
Jason Ekstrand
a920979d4f intel/fs: Use split sends for surface writes on gen9+
Surface reads don't need them because they just have the one address
payload.  With surface writes, on the other hand, we can put the address
and the data in the different halves and avoid building the payload all
together.

The decrease in register pressure and added freedom in register
allocation resulting from this change reduces spilling enough to improve
the performance of one customer benchmark by about 2x.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
014edff0d2 intel/fs: Add interference between SENDS sources
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
eab1c55590 intel/fs: Support SENDS in SHADER_OPCODE_SEND
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cca199fd85 intel/disasm: Properly disassemble split sends
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8babaa84e8 intel/eu: Add support for the SENDS[C] messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d6a6e10390 intel/inst: Indent some code
We're about to add some more if cases so let's have the giant re-indent
in it's own patch to make review easier.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d96969120d intel/inst: Fix the ia16_addr_imm helpers
These have clearly never seen any use.... On gen8, the bottom 4 bits are
missing so we need to shift them off before we call set_bits and shift
again when we get the bits.  Found by inspection.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
e46fb33143 intel/disasm: Rework SEND decoding to use descriptors
Instead of fetching the information out of the instruction directly,
fetch the descriptor and then pluck the information out of the
descriptor.  The current scheme works ok for SEND but with SENDS, it all
falls to pieces because the descriptor is completely shuffled around.

This commit doesn't actually convert everything.  One notable exception
is URB messages which don't even use descriptors in emit_urb_WRITE yet.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
13a6fabc62 intel/eu: Add more message descriptor helpers
We want to be able to extract data from descriptors as well as unify a
bit of the descriptor construction.

One of the unifications we do is to unify the read/write and dataport
descriptors.  On gen4-5, read/write are substantially different and the
read descriptors change between gen4 and gen4.x.  On gen6, they unified
layouts between read, write, and dataport.  Then, on gen8, they added
one bit to the message type field but left it reserved MBZ for
read/write messages.  This commit chooses to treat that as if they
expanded the field everywhere and just didn't have enough enum values
for read/write to bother with the extra bit.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
c3aa436bfe intel/eu/validate: SEND restrictions also apply to SENDC
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
fee6bd8d8e intel/eu: Use GET_BITS in brw_inst_set_send_ex_desc
It's a bit more readable

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
b284d222db intel/fs: Use SHADER_OPCODE_SEND for varying UBO pulls on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8514eba693 intel/fs: Use SHADER_OPCODE_SEND for texturing on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
f547cebbe0 intel/fs: Use a logical opcode for IMAGE_SIZE
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d2d3e04501 intel/fs: Use SHADER_OPCODE_SEND for surface messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
7f1cf046cd intel/fs: Add a generic SEND opcode
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
ba3c5300f9 intel/eu: Rework surface descriptor helpers
This commit pulls the surface descriptor helpers out into brw_eu.h and
makes them no longer depend on the codegen infrastructure.  This should
allow us to use them directly from the IR code instead of the generator.
This change is unfortunately less mechanical than perhaps one would like
but it should be fairly straightforward.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
5b17379631 intel/eu: Add has_simd4x2 bools to surface_write functions
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
2ce93b88c0 intel/fs: Take an explicit exec size in brw_surface_payload_size()
Instead of magically falling back to SIMD8 for atomics and typed
messages on Ivy Bridge, explicitly figure out the exec size and pass
that into brw_surface_payload_size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cf42b0f9e2 intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()
Like all the other sends, it's just mlen * REG_SIZE.

Fixes: 3cbc02e469 "intel: Use TXS for image_size when we have..."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
009c0bd840 intel/defines: Explicitly cast to uint32_t in SET_FIELD and SET_BITS
If you pass a bool in as the value to set, the C standard says that it
gets converted to an int prior to shifting.  If you try to set a bool to
bit 31, this lands you in undefined behavior.  It's better just to add
the explicit cast and let the compiler delete it for us.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
077b9557a4 intel/fs: Get rid of fs_inst::equals
There are piles of fields that it doesn't check so using it is a lie.
The only reason why it's not causing problem is because it has exactly
one user which only uses it for MOV instructions (which aren't very
interesting) and only on Sandy Bridge and earlier hardware.  Just get
rid of it and inline it in the one place that it's actually used.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Matt Turner
18b467c066 intel/compiler: Add a file-level description of brw_eu_validate.c
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-26 10:33:22 -08:00
Tapani Pälli
540939ecee android: fix build issues with libmesa_anv_gen* libraries
We need this include path to find nir/nir_xfb_info.h.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-01-25 15:21:06 +02:00
Andrii Simiklit
4759bb2fcf intel/batch-decoder: fix a vb end address calculation
According to the loop implementation (in 'ctx_print_buffer' function),
which advances dword by dword over vertex buffer(vb),
the vb size should be aligned by 4 bytes too.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-25 15:12:30 +02:00
Andrii Simiklit
db39a44f10 intel/batch-decoder: fix vertex buffer size calculation for gen<8
It should be incremented by one according to
how it is calculated by 'emit_vertex_buffer_state':
  "\#if GEN_GEN < 8
      .BufferAccessType = step_rate ? INSTANCEDATA : VERTEXDATA,
      .InstanceDataStepRate = step_rate,
   \#if GEN_GEN >= 5
      .EndAddress = ro_bo(bo, end_offset - 1),
   \#endif
   \#endif"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-25 15:12:07 +02:00
Eric Engestrom
9af77fcf98 anv: drop always-successful VkResult
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-25 09:45:27 +00:00
Rafael Antognolli
f2ece26601 anv/allocator: Avoid race condition in anv_block_pool_map.
Accessing bo->map and then pool->center_bo_offset without a lock is
racy. One way of avoiding such race condition is to store the bo->map +
center_bo_offset into pool->map at the time the block pool is growing,
which happens within a lock.

v2: Only set pool->map if not using softpin (Jason).
v3: Move things around and only update center_bo_offset if not using
softpin too (Jason).

Cc: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Ian Romanick <idr@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109442
Fixes: fc3f588320
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-24 17:39:40 -08:00
Matt Turner
e166003cb7 intel/compiler: Reset default flag register in brw_find_live_channel()
emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its
flag_subreg set, so that the IR knows which flag is accessed. However
the flag is only used on Gen7 in Align1 mode.

To avoid setting unnecessary bits in the instruction words, get the
information we need and reset the default flag register. This allows
round-tripping through the assembler/disassembler.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-23 22:48:29 -08:00
Jason Ekstrand
ac0f8a6ea0 anv: Implement transform feedback queries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
7f4d9bb7b8 genxml: Add SO_PRIM_STORAGE_NEEDED and SO_NUM_PRIMS_WRITTEN
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
673f33c77d anv: Implement CmdBegin/EndQueryIndexed
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
2be89cbd82 anv: Implement vkCmdDrawIndirectByteCountEXT
Annoyingly, this requires that we implement integer division on the
command streamer.  Fortunately, we're only ever dividing by constants so
we can use the mulh+add+shift trick and it's not as bad as it sounds.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
36ee2fd61c anv: Implement the basic form of VK_EXT_transform_feedback
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
39925d60ec anv: Add pipeline cache support for xfb_info
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00