Name the options to "Pixel Location":
- PIXLOC_CENTER -> CENTER
- PIXLOC_UL_CORNER -> UL_CORNER
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This makes genxml create the right struct types, and generate the right
batch commands.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Both GS and SOL have these fields. Some were ReorderEnable = true,
some were ReorderMode = REORDER_TRAILING, and some were just TRAILING.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Use an alias, so we can set the same value as the #define's.
v3:
- Call it "SO Buffer MOCS" to follow the most common naming scheme.
- Add alias for gen7 and gen75 too (Ken).
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fill out "Attribute Active Component Format" possible values.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There are two variants:
- Clip Enable
- CLIP Enable (on gen6)
Rename everything to Clip Enable.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Add some more details to Gen4 and Gen45 and add what is needed
in Gen5 XML. This commit overwrite the previous work done on Gen4
and Gen45 as it contains more instructions and fixes some mistakes.
However, comments (dword boundaries) are lost in the process.
v3:
- Set the type of some fields, instead of prefix. Also fix the
SAMPLER_BORDER_COLOR_STATE fields of gen5.xml.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
This implementation allocates a 4k BO for each semaphore that can be
exported using OPAQUE_FD and uses the kernel's already-existing
synchronization mechanism on BOs.
Reviewed-by: Chad Versace <chadversary@chromium.org>
This just stubs things out. Real external semaphore support will come
with VK_KHX_external_semaphore_fd.
Reviewed-by: Chad Versace <chadversary@chromium.org>
No piglit regressions.
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
After successful drmGetDevices2() call, drmFreeDevices() needs to be called.
Fixes: 743315f2 "radv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.
Fixes: 743315f2 "radv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
v2 (Jason Ekstrand):
- Take a view_mask rather than a whole subpass
- Build the view mask into the VS shader key
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
We want to insert more lowering code that may insert system values and
we need to gather info after that lowering.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Shader hashing is very closely related to shader compilation. Putting
them right next to each other in anv_pipeline makes it easier to verify
that we're actually hashing everything we need to be hashing. The only
real change (other than the order of hashing) is that we now hash in the
shader stage.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
That pass hasn't existed since dd4db84640
but the prototype stuck around for no reason.
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
We always use only single element.
v2: Change single element arrays to variables
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The regioning parameters are now properly set by convert_to_hw_regs()
and we don't need to fix them in the generator. That latter fix
previously done in the generator was strictly speaking wrong for any
non-identity regions.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
On gen7, the swizzles used in DF align16 instructions works for element
size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that
in the rest of the code and prepare the instructions for this (scalarize_df()),
we need to set it to two again.
However, for DF align1 instructions, a width of 2 is wrong as we are not
reading the data we want. For example, an uniform would have a region of
<0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access
to the first 4.
This patch sets the default one to 4 and then modifies the width of
align16 instruction's DF sources when we translate the logical swizzle
to the physical one.
v2:
- Remove conditional (Curro).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
From IVB PRM, vol4, part3, "General Restrictions on Regioning
Parameters":
"If ExecSize = Width and HorzStride ≠ 0, VertStride must
be set to Width * HorzStride."
In next patch, we are going to modify the region parameter for
uniforms and vgrf. For uniforms that are the source of
DF align1 instructions, they will have <0, 4, 1> regioning and
the execsize for those instructions will be 4, so they will break
the regioning rule. This will be the same for VGRF sources where
we use the vstride == 0 exploit.
As we know we are not going to cross the GRF boundary with that
execsize and parameters (not even with the exploit), we just fix
the vstride here.
v2:
- Move is_align1_df() (Curro)
- Refactor exec_size == width calculation (Curro)
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
This fixes:
dEQP-VK.glsl.builtin.precision.min.*
dEQP-VK.glsl.builtin.precision.max.*
dEQP-VK.glsl.builtin.precision.clamp.*
The problem is the hw doesn't compare denorms properly,
so we have to flush them, even though the spec says
flushing is optional, if you don't flush the results
should be correct.
The -pro driver changes the shader float mode,
it would be nice if llvm could grow that perhaps.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
SPIR-V defines the f32->f16 operation as flushing denormals to 0,
this compares the class using amd class opcode.
Thanks to Matt Arsenault for figuring it out.
This fix is VI+ only, add a TODO for SI/CIK.
This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Having it in the winsys didn't work when multiple devices use
the same winsys, as we then have multiple contexts per queue,
and each context counts separately.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 7b9963a28f "radv: Enable userspace fence checking."
Loop unroll asserts if it hits a sub, we don't really want
to lower subs as llvm handles these things, but do this for
now, until we can fix loop unroll to work with subs.
Fixes: 14ae0bfa5 (radv: Add NIR loop unrolling)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All of the dynamic states apply to rasterization & fragment processing,
so we don't need to set them if we don't rasterize.
We don't clear the dirty flags for them though, so we don't miss any
updates for the next pipeline with rasterization.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 76603aa90b "radv: Drop the default viewport when 0 viewports are given."
This still doesn't give us complete pWaitDstStageMask support,
but it should provide enough to be correct if not as efficent as
possible.
If we have wait semaphores we must flush between submits and
flush the shaders as well.
This fixes the remaining fails in:
dEQP-VK.synchronization.op.single_queue.semaphore.*ssbo*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I don't see any reasons why vector_elements is 1 for images and
0 for samplers. This increases consistency and allows to clean
up some code a bit.
This will also help for ARB_bindless_texture.
No piglit regressions with RadeonSI.
This time the Intel CI system doesn't report any failures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>