Commit graph

106651 commits

Author SHA1 Message Date
Kenneth Graunke
caa0aebd01 iris: Flag IRIS_DIRTY_BINDINGS_XS on constant buffer rebinds
We upload a new SURFACE_STATE for the UBO/SSBO in question, which
means that we need new binding tables as well.
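
A minimal sketch of the dirty-flag pattern described above, assuming a per-stage bit layout where each shader stage owns one "bindings" bit. The names sketch_stage, SKETCH_DIRTY_BINDINGS_VS and sketch_flag_cbuf_rebind() are hypothetical and the bit positions are invented for illustration; they are not iris's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage enum; only the ordering matters for the shift. */
enum sketch_stage {
    SKETCH_STAGE_VERTEX = 0,
    SKETCH_STAGE_TESS_CTRL,
    SKETCH_STAGE_TESS_EVAL,
    SKETCH_STAGE_GEOMETRY,
    SKETCH_STAGE_FRAGMENT,
    SKETCH_STAGE_COMPUTE,
    SKETCH_STAGE_COUNT
};

/* Hypothetical bit position: one "bindings dirty" bit per stage,
 * laid out consecutively starting at bit 8. */
#define SKETCH_DIRTY_BINDINGS_VS (UINT64_C(1) << 8)

/* Called when a constant buffer is rebound for `stage`: the new
 * SURFACE_STATE means the binding table for that stage is stale too. */
static uint64_t sketch_flag_cbuf_rebind(uint64_t dirty, enum sketch_stage stage)
{
    assert(stage < SKETCH_STAGE_COUNT);
    return dirty | (SKETCH_DIRTY_BINDINGS_VS << stage);
}
```

Shifting a base bit by the stage index keeps the rebind path free of a per-stage switch, at the cost of requiring the bits to stay contiguous.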

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 15:44:22 -07:00
Bas Nieuwenhuizen
4b7e7956f0 radv: Add DFSM support.
Apparently we already enabled it without having support ...

Not sure if we also need to set disable_start_of_prim when the PS
has memory writes, but this mirrors radeonsi.

Doubles fillrate in my dual_quad_bench from ~16 pixels/cycle to
~32 pixels/cycle on a Raven.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen
0fa2740059 radv: Disable dfsm by default even on Raven.
With it actually implemented, Talos on low is still 3% slower.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen
f2dffb395f radv: Only break batch on framebuffer change with dfsm.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 21:28:51 +00:00
Connor Abbott
57e0bb8ccc nir/opt_if: Fix undef handling in opt_split_alu_of_phi()
The pass assumed that "Most ALU ops produce an undefined result if any
source is undef" which is completely untrue. Due to how we lower if
statements to selects and then optimize on those selects later, we
simply cannot make that assumption. In particular this pass tried to
replace an ior of undef and true, which had been generated by
optimizing a select which itself came from flattening an if statement,
to undef causing a miscompilation for a CTS test with radeonsi NIR.
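
To illustrate why the quoted assumption fails for ior, here is a small sketch that models a boolean which may be undefined. The tri_bool type and both fold functions are hypothetical stand-ins, not NIR code: fold_ior_naive() applies the "any undef source gives undef" rule, while fold_ior() keeps the result defined whenever one operand alone decides it.

```c
#include <assert.h>

/* Hypothetical three-state boolean: false, true, or undefined. */
typedef enum { VAL_FALSE, VAL_TRUE, VAL_UNDEF } tri_bool;

/* The rule the pass assumed: any undef source makes the result undef. */
static tri_bool fold_ior_naive(tri_bool a, tri_bool b)
{
    if (a == VAL_UNDEF || b == VAL_UNDEF)
        return VAL_UNDEF;
    return (a == VAL_TRUE || b == VAL_TRUE) ? VAL_TRUE : VAL_FALSE;
}

/* A sound rule: a true operand forces the result to true regardless
 * of what value the undef operand happens to take. */
static tri_bool fold_ior(tri_bool a, tri_bool b)
{
    if (a == VAL_TRUE || b == VAL_TRUE)
        return VAL_TRUE;
    if (a == VAL_UNDEF || b == VAL_UNDEF)
        return VAL_UNDEF;
    return VAL_FALSE;
}

/* Usage: the naive rule throws away a result that is fully defined. */
void check_ior_folding(void)
{
    assert(fold_ior(VAL_UNDEF, VAL_TRUE) == VAL_TRUE);
    assert(fold_ior_naive(VAL_UNDEF, VAL_TRUE) == VAL_UNDEF);
}
```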

We fix this by always doing what the non-undef path did, i.e. duplicate
the instruction twice. If there are cases where the instruction before
the loop can be folded away due to having an undef source, we should add
these to opt_undef instead.

The comment above the pass says that if the phi source from before the
loop is undef, and we can fold the instruction before the loop to undef,
then we can ignore sources of the original instruction that don't
dominate the block before the loop because we don't need them to create
the instruction before the loop. This is incorrect, because the
instruction at the bottom of the loop would get those sources from the
wrong loop iteration. The code never actually did what the comment said,
so we only have to update the comment to match what the pass actually
does. We also update the example to more closely match what most actual
loops look like after vtn and peephole_select.

There are no shader-db changes with i965, radeonsi NIR, or radv. With
anv and my vkpipeline-db there's only one change:

total instructions in shared programs: 14125290 -> 14125300 (<.01%)
instructions in affected programs: 2598 -> 2608 (0.38%)
helped: 0
HURT: 1

total cycles in shared programs: 2051473437 -> 2051473397 (<.01%)
cycles in affected programs: 36697 -> 36657 (-0.11%)
helped: 1
HURT: 0

Fixes
KHR-GL45.shader_subroutine.control_flow_and_returned_subroutine_values_used_as_subroutine_input
with radeonsi NIR.
2019-09-18 17:18:34 -04:00
Eric Engestrom
a1de3011f3 gl: drop incorrect pkg-config file for glvnd
Akin to 1a25980c46 ("egl: drop incorrect pkg-config file for
glvnd") and b01524fff0 ("meson: don't build libGLES*.so with
GLVND"), removes a pkg-config file that shouldn't have been there in
the first place, but was needed because of that GLVND bug.

Now that the glvnd bug has been fixed, it became apparent that nobody
remembered to remove this gl.pc pkg-config file, so let's do just that :)

Suggested-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-18 22:16:51 +01:00
Andres Gomez
d9760f8935 nir/opcodes: Clear variable names confusion
Having Python and C variables share a name in the same block of code
makes that code a bit confusing to read. Make it explicit that the
Python bit_size variable refers to the destination bit size.

Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 23:59:07 +03:00
Rhys Perry
b3f71685d9 radv: never kill a NGG GS shader
Seems to fix a hang with excessive vertex emissions when NGG is used for
GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 19:26:58 +00:00
Samuel Pitoiset
99c186fbbe radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS
No GS copy shader if a pipeline enables NGG GS.

This fixes
dEQP-VK.pipeline.executable_properties.graphics.*geometry_stage*.

Fixes: 86864eedd2 ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 21:19:28 +02:00
Marek Olšák
fe7aa271a9 radeonsi: include drm_fourcc.h to fix the build
2019-09-18 14:52:25 -04:00
Marek Olšák
00e29816e7 radeonsi: implement pipe_screen::resource_get_param
v2: return DRM_FORMAT_MOD_INVALID from the function

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2019-09-18 14:43:01 -04:00
Marek Olšák
d307aa56f9 gallium: extend resource_get_param to be as capable as resource_get_handle
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-18 14:41:30 -04:00
Marek Olšák
aae35fbd3a ac: move ac_get_num_physical_vgprs into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
0692ae34e9 ac: move ac_get_num_physical_sgprs into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
ca43006fd2 ac: move ac_get_max_wave64_per_simd into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
deab3a23f6 ac: move num_sdp_interfaces into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Marek Olšák
2c62b461e9 ac: move PBB MAX_ALLOC_COUNT into radeon_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-18 14:39:06 -04:00
Jonathan Marek
05da025f35 etnaviv: fix two-sided stencil
* Set missing STENCIL_CONFIG_EXT2 bits
* Swap stencil sides when rendering CCW
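
The second bullet can be sketched roughly as follows. This assumes the hardware's "front" stencil state always applies to one fixed winding, so the driver swaps the two sides when the API declares CCW as front-facing; that reading is an assumption, not something the commit states. The struct layout and sketch_resolve_stencil() are hypothetical and do not reflect etnaviv's register programming.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-side stencil state. */
struct sketch_stencil_side {
    unsigned func;
    unsigned ref;
};

struct sketch_stencil {
    struct sketch_stencil_side front;
    struct sketch_stencil_side back;
};

/* Returns the state to program: sides are swapped when front is CCW. */
static struct sketch_stencil
sketch_resolve_stencil(struct sketch_stencil api, bool front_ccw)
{
    struct sketch_stencil hw = api;
    if (front_ccw) {
        hw.front = api.back;
        hw.back = api.front;
    }
    return hw;
}

/* Usage: with CCW front faces the two sides trade places. */
void check_stencil_swap(void)
{
    struct sketch_stencil api = { { 1, 10 }, { 2, 20 } };
    struct sketch_stencil hw = sketch_resolve_stencil(api, true);
    assert(hw.front.ref == 20 && hw.back.ref == 10);
}
```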

Fixes the following deqp tests (which were 99% failing):
dEQP-GLES2.functional.fragment_ops.depth_stencil.*

Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-09-18 12:49:10 -04:00
Samuel Pitoiset
68820007fd radv: fix loading 64-bit GS inputs
We have to load two 32-bit integers and cast them correctly.

This fixes crashes with gs-double-interpolator.vk_shader_test.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 17:16:36 +02:00
Bas Nieuwenhuizen
7999e10cab tu: Set up glsl types.
Addresses this assert:

deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type *glsl_type::get_interface_instance(const glsl_struct_field *, unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed.

running dEQP-VK.api.smoke.triangle .

Fixes: 624789e370 "compiler/glsl: handle case where we have multiple users for types"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-18 16:51:18 +02:00
Samuel Pitoiset
46b7512b0a radv: fix writing depth/stencil clear values to image
Use the fastest way only if both aspects are used. Oops.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111728
Fixes: 218ce34962 ("radv: add mipmap support for the clear depth/stencil values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-18 13:27:46 +02:00
Haihao Xiang
8a9b81ab9d i965: support AYUV/XYUV for external import only
Fixes: 89785e2d56 ("i965: add support for sampling from AYUV")
Fixes: 7cab8d3661 ("i965: Add support for sampling from XYUV images")
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-09-18 12:07:23 +03:00
Boris Brezillon
1e483a87bc panfrost: Allocate tiler and scratchpad BOs per-batch
If we want to execute several batches in parallel they need to have
their own tiler and scratchpad BOs. Let's move those objects to
panfrost_batch and allocate them on a per-batch basis.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:40:17 +02:00
Boris Brezillon
0eec73a800 panfrost: Add FBO BOs to batch->bos earlier
If we want the batch dependency tracking to work correctly we must
make sure all BOs are added to the batch->bos set early enough. Adding
FBO BOs when generating the fragment job is clearly too late. Add a
panfrost_batch_add_fbo_bos helper and call it in the clear/draw path.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:37:56 +02:00
Boris Brezillon
5a4d095f9b panfrost: Add the panfrost_batch_create_bo() helper
This helper automates the panfrost_bo_create()+panfrost_batch_add_bo()+
panfrost_bo_unreference() sequence that's done for all per-batch BOs.
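
The create + add + unreference sequence can be sketched with a toy reference-counted BO, as below. All sketch_* names, the fixed-size array and the plain int refcount are hypothetical simplifications; the real panfrost helpers take different arguments and track BOs differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_BOS 16

/* Hypothetical reference-counted buffer object. */
struct sketch_bo {
    int refcnt;
    size_t size;
};

/* Hypothetical batch holding one reference per tracked BO. */
struct sketch_batch {
    struct sketch_bo *bos[SKETCH_MAX_BOS];
    unsigned num_bos;
};

static struct sketch_bo *sketch_bo_create(size_t size)
{
    struct sketch_bo *bo = calloc(1, sizeof(*bo));
    if (bo) {
        bo->refcnt = 1; /* reference owned by the caller */
        bo->size = size;
    }
    return bo;
}

static void sketch_bo_unreference(struct sketch_bo *bo)
{
    if (--bo->refcnt == 0)
        free(bo);
}

static void sketch_batch_add_bo(struct sketch_batch *batch, struct sketch_bo *bo)
{
    assert(batch->num_bos < SKETCH_MAX_BOS);
    bo->refcnt++; /* the batch takes its own reference */
    batch->bos[batch->num_bos++] = bo;
}

/* The helper: afterwards the batch holds the only reference. */
static struct sketch_bo *
sketch_batch_create_bo(struct sketch_batch *batch, size_t size)
{
    struct sketch_bo *bo = sketch_bo_create(size);
    if (!bo)
        return NULL;
    sketch_batch_add_bo(batch, bo);
    sketch_bo_unreference(bo); /* drop the creation reference */
    return bo;
}

/* Drops the batch's references, freeing BOs nobody else holds. */
static void sketch_batch_release(struct sketch_batch *batch)
{
    for (unsigned i = 0; i < batch->num_bos; i++)
        sketch_bo_unreference(batch->bos[i]);
    batch->num_bos = 0;
}
```

The returned pointer stays valid only as long as the batch does, which is the point of a per-batch BO: callers never have to remember the final unreference.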

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:37:31 +02:00
Boris Brezillon
9af4aeaaf7 panfrost: Don't return imported/exported BOs to the cache
We don't know who else is using the BO in that case, and thus shouldn't
re-use it for something else.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:52 +02:00
Boris Brezillon
90b8934547 panfrost: Add panfrost_bo_{alloc,free}()
Thanks to that we avoid the recursive call into panfrost_bo_create()
and we can get rid of panfrost_bo_release() by inlining the code in
panfrost_bo_unreference().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:29 +02:00
Boris Brezillon
cb71ae5572 panfrost: Stop using panfrost_bo_release() outside of pan_bo.c
panfrost_bo_unreference() should be used instead.

The only difference caused by this change is that the scratchpad,
tiler_heap and tiler_dummy BOs are now returned to the cache instead
of being freed when a context is destroyed. This is only a problem if
we care about context isolation, which apparently is not the case since
transient BOs are already returned to the per-FD cache (and all contexts
share the same address space anyway, so enforcing context isolation
is almost impossible).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:35:06 +02:00
Boris Brezillon
e15ab939fd panfrost: Stop passing screen around for BO operations
Store a screen pointer in panfrost_bo so we don't have to pass a screen
object to all functions manipulating the BO.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:34:27 +02:00
Boris Brezillon
10ce751726 panfrost: Don't check if BO is mmaped before calling panfrost_bo_mmap()
panfrost_bo_mmap() already takes care of that.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:34:08 +02:00
Boris Brezillon
a06e08def9 panfrost: Stop exposing panfrost_bo_cache_{fetch,put}()
They are not expected to be called directly, users should use
panfrost_bo_{create,release}() instead.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:33:51 +02:00
Boris Brezillon
154cb725d4 panfrost: Move the BO API to its own header
Right now, the BO API is spread over pan_{allocate,resource,screen}.h.
Let's move all BO related definitions to a separate header file.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:29:13 +02:00
Boris Brezillon
34efaafc93 panfrost: s/PAN_ALLOCATE_/PAN_BO_/
Change the prefix for BO allocation flags to make it consistent with
the rest of the BO API.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:55 +02:00
Boris Brezillon
29d0e5c177 panfrost: Move panfrost_bo_{reference,unreference}() to pan_bo.c
This way we have all BO related functions placed in the same source
file.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:39 +02:00
Boris Brezillon
0500c9e514 panfrost: Get rid of pan_drm.c
pan_drm.c was only meaningful when we were supporting 2 kernel drivers
(mali_kbase, and the drm one). Now that there's no kernel-driver
abstraction we're better off moving those functions where they belong:

* BO related functions in pan_bo.c
* fence related functions + query_gpu_version() in pan_screen.c
* submit related functions in pan_job.c

While at it, we rename the functions according to the place they're
being moved to.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:22 +02:00
Boris Brezillon
1e47c3ee7b panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch()
has_draws can be inferred directly from the batch->last_job value, no
need to pass it around.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:03 +02:00
Boris Brezillon
07085fe8a4 panfrost: Kill a useless memset(0) in panfrost_create_context()
ctx is allocated with rzalloc() which takes care of zero-ing the memory
region. No need to call memset(0) on top.
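
A small sketch of the same idea using standard C only: calloc() stands in for rzalloc() here, since both hand back zero-filled memory, so a following memset(0) adds nothing. The sketch_context struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context with a couple of integer fields. */
struct sketch_context {
    int num_batches;
    unsigned dirty;
};

static struct sketch_context *sketch_create_context(void)
{
    /* calloc() already returns zero-filled memory, like rzalloc(). */
    struct sketch_context *ctx = calloc(1, sizeof(*ctx));

    /* A memset(ctx, 0, sizeof(*ctx)) here would be redundant. */
    return ctx;
}

/* Usage: every field reads as zero straight after allocation. */
void check_context_zeroed(void)
{
    struct sketch_context *ctx = sketch_create_context();
    if (ctx) {
        assert(ctx->num_batches == 0);
        assert(ctx->dirty == 0);
        free(ctx);
    }
}
```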

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:47 +02:00
Boris Brezillon
4eac1b2008 panfrost: Add polygon_list to the batch BO set at allocation time
That's what we do for other per-batch BOs, and we'll soon add a helper
to automate this create_bo()+add_bo()+bo_unreference() sequence, so
let's prepare the code to ease this transition.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:30 +02:00
Boris Brezillon
c16fb1f48d panfrost: Add missing panfrost_batch_add_bo() calls
Some BOs are used by batches but never explicitly added to the BO set.
This is currently not a problem because we wait for the execution of
a batch to be finished before releasing a BO, but we will soon relax
this rule.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:09 +02:00
Boris Brezillon
a94d028065 panfrost: Use the correct type for the bo_handle array
The DRM driver expects an array of u32, let's use the correct type, even
if using an int works in practice because it's still a 32-bit integer.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:49 +02:00
Boris Brezillon
2b771b8424 panfrost: Stop exposing internal panfrost_*_batch() functions
panfrost_{create,free,get}_batch() are only called inside pan_job.c.
Let's make them static.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:21 +02:00
Christian Gmeiner
8d5f905faa etnaviv: disable ARB_shadow
Looks like only HALT2 GPUs have support for it but that is not yet
implemented so disable ARB_shadow for now.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:26 +02:00
Christian Gmeiner
dcc0e23438 Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
There are GPUs that do not support this feature.

This reverts commit e871abe452

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:21 +02:00
Lepton Wu
417d602fda virgl: Remove wrong EAGAIN handling for drmIoctl
drmIoctl handles EAGAIN itself and actually it always returns -1 on errors.
Remove the wrong handling of its return value. Also, print a warning when
it fails.
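
For context, a retry wrapper of the kind drmIoctl provides can be sketched as below: it loops while the call fails with EINTR or EAGAIN, so a caller only ever sees 0 or -1 with errno set to some other error. The function-pointer parameter and the two test doubles are hypothetical additions that keep the sketch self-contained; libdrm calls ioctl() directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Signature of the underlying ioctl-like call (hypothetical indirection). */
typedef int (*sketch_ioctl_fn)(int fd, unsigned long request, void *arg);

/* Retries on EINTR/EAGAIN; any other failure is returned as -1. */
static int sketch_retrying_ioctl(sketch_ioctl_fn do_ioctl, int fd,
                                 unsigned long request, void *arg)
{
    int ret;

    assert(do_ioctl != NULL);
    do {
        ret = do_ioctl(fd, request, arg);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));
    return ret;
}

/* Test double: fails with EAGAIN *arg times, then succeeds. */
static int sketch_flaky_ioctl(int fd, unsigned long request, void *arg)
{
    int *remaining = arg;
    (void)fd;
    (void)request;
    if (*remaining > 0) {
        (*remaining)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}

/* Test double: always fails with a non-retryable error. */
static int sketch_failing_ioctl(int fd, unsigned long request, void *arg)
{
    (void)fd;
    (void)request;
    (void)arg;
    errno = EINVAL;
    return -1;
}
```

With a wrapper like this in place, checking the return value against -EAGAIN in the caller can never match, which is why the old handling was dead code.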

v2: - use _debug_printf instead of fprintf (Gurchetan Singh)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2019-09-18 03:36:10 +00:00
Kenneth Graunke
f8c44e4ed7 iris: Skip allocating a null surface when there are 0 color regions.
The compiler now sets the "Null Render Target" bit in the RT write
extended message descriptor, causing it to write to an implicit null
surface without us needing to set one up in the binding table.

Together with the last patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700MHz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Kenneth Graunke
f76a724e06 intel/compiler: Set "Null Render Target" ex_desc bit on Gen11
When there are no color regions (i.e. a depth only pass), we can set
the "Null Render Target" bit in the Gen11 RT write extended message
descriptor to indicate that it should behave as if it's writing to a
null render target, without the need for a binding table entry.

This lets drivers avoid setting up that null RT binding table entry,
but more importantly means the HW doesn't actually have to bother
looking up the surface state.

Together with the next patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700MHz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Samuel Iglesias Gonsálvez
f5dd6dfe01 anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls
This adds support for
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and
enables the Vulkan and SPIR-V extensions.

Also, notice that this includes the updates applied to the
VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension
VK_KHR_shader_float_controls v4 and Vulkan 1.1.116.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
9b07020a4f i965/fs: add support for shader float control to remove_extra_rounding_modes()
The remove_extra_rounding_modes() optimization will remove duplicated
rounding mode changes.
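
The idea of dropping duplicated rounding-mode changes can be sketched on a flat instruction array, as below. This is a simplification under stated assumptions: the real pass works on the i965 backend IR, while sketch_inst, the opcode enum and the initial_mode parameter (standing in for the mode the shader's float controls already request) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes: a rounding-mode change or anything else. */
enum sketch_op { SKETCH_OP_RND_MODE, SKETCH_OP_OTHER };

struct sketch_inst {
    enum sketch_op op;
    int mode; /* only meaningful for SKETCH_OP_RND_MODE */
};

/* Compacts `insts` in place, dropping every rounding-mode change that
 * requests the mode already in effect. Returns the new length. */
static size_t sketch_remove_extra_rnd_modes(struct sketch_inst *insts,
                                            size_t n, int initial_mode)
{
    int current = initial_mode;
    size_t out = 0;

    assert(insts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (insts[i].op == SKETCH_OP_RND_MODE) {
            if (insts[i].mode == current)
                continue; /* redundant: mode is already set */
            current = insts[i].mode;
        }
        insts[out++] = insts[i];
    }
    return out;
}
```

Seeding the tracked mode from the shader's requested default is what lets the very first change be dropped when it merely restates that default.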

v2:
- Fix bug in the rounding mode change (Alejandro).

v3:
- Fix rounding modes.

v4:
- Updated to renamed shader info member and enum values (Andres).

v5:
- Simplify flags logic operations (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
9bd88d10d8 i965/fs: set rounding mode when emitting nir_op_f2f32 or nir_op_f2f16
v2:
- Consider nir_op_f2f16 case too (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez
ba1e25e1aa i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions
v2:
- Updated to renamed shader info member (Andres).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00