111950 commits

Author SHA1 Message Date
Christian Gmeiner
8dd26fa2f0 etnaviv: support GL_ARB_seamless_cubemap_per_texture
Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
2019-06-19 00:39:50 +02:00
Christian Gmeiner
a13efb3cdb etnaviv: update headers from rnndb
Update to etna_viv commit a3bf0da.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-19 00:39:50 +02:00
Dave Airlie
378ea92bf6 radeonsi: fix undefined shift in macro definition
Pointed out by coverity

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-19 08:32:36 +10:00
Dave Airlie
93ba356544 nouveau: fix frees in unsupported IR error paths.
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.

Fixes: f014ae3c7c ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-06-19 08:32:19 +10:00
Rohan Garg
ad284f794c panfrost: Move clearing logic into pan_job
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 12:32:43 -07:00
Chia-I Wu
98eda99ab8 virgl: fix sync issue regarding discard/unsync transfers
GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as
GL_MAP_INVALIDATE_RANGE_BIT naively.  When we run into

  ptr = glMapBufferRange(buf, 0, size,
          GL_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT);
  memcpy(ptr, data1, size);
  glUnmapBuffer(buf);
  ptr = glMapBufferRange(buf, size, size,
          GL_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT);
  memcpy(ptr, data2, size);
  glUnmapBuffer(buf);

we never want data1 to be copy_transfer'ed, because that would mean
that data2 might overwrite valid data.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Fixes: a22c5df079 ("virgl: Use buffer copy transfers to avoid waiting when mapping")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-18 10:38:21 -07:00
Alyssa Rosenzweig
2a717f300b panfrost: Enable sRGB
Now that sRGB formats are supported for both rendering and sampling,
advertise support.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
5aa51ba97f panfrost: Disable AFBC on sRGB buffers
The performance impact is slightly mitigated by tiling the render
target, but it's undeniably still slow compared to AFBC. Unfortunately,
it doesn't look like AFBC and sRGB play nice...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
6585bb9f52 panfrost: Enable sRGB fixed-function blending
For fixed-function, we have hardware to handle sRGB so we just set a
flag. For blend shaders, it's rather more involved; this is currently
unimplemented. Assert it out for now; we don't need it quite yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
4b137da409 panfrost: Specify sRGB in the render target
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
58c34e4a6c panfrost: Implement sRGB texturing
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
31a4ef847c panfrost: Add sRGB render target flag
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
01e1eecb95 panfrost: Implement tiled rendering
We already can sample from Mali's linear/tiled encoding (the one from
Utgard -- AFBC is mostly unrelated); let's be able to render to it as
well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig
d50795109b panfrost: Decode rendering block type
A mode for rendering tiled/uncompressed was noticed, so we reshuffle the
MFBD render target definitions to explicitly include block type.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:28 -07:00
Alyssa Rosenzweig
83c02a5ea9 panfrost: Refactor texture targets
This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a
single 2-bit texture target selection, noticing it's the same as the
2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we
share this definition and add the missing entry for 1D/buffer textures.

This requires a nontrivial (but functionally similar) refactor of all
parts of the driver to use the new definitions appropriately.
Theoretically, this should add support for buffer textures, but that's
obviously not tested and probably wouldn't work.

While doing so, we notice the sRGB enable bit, which we document and
decode as well here so we don't forget about it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:59:28 -07:00
Rohan Garg
bfca21b622 panfrost: Figure out job requirements in pan_job.c
Requirements for a job should be figured out in pan_job.c

v2: [Alyssa] Fix early return

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:52:20 -07:00
Rohan Garg
debb85d1ec panfrost: Reset job counters once the job is submitted
Move the reset out of frame invalidation into job submission

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:52:20 -07:00
Rohan Garg
0f43a2ae8a panfrost: Initial implementation of panfrost_job_submit
Start fleshing out panfrost_job

v2: [Alyssa: Remove unused variable, warning introduced]

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 09:52:01 -07:00
Gurchetan Singh
2daf3d8215 virgl_hw: add YUV support
Add corresponding entries from p_format.h

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-18 09:18:58 -07:00
Gurchetan Singh
2480ce802a virgl: sync to virglrenderer virgl_hw.h
It's nice to keep these two files in sync, as they define
guest userspace <---> host userspace communication.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-18 09:18:48 -07:00
Jason Ekstrand
58cb865313 anv: Make border colors the right size and alignment on HSW
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-18 16:07:08 +00:00
Lionel Landwerlin
51076eb87c imgui: bump imgui memory editor copy
Getting rid of a compiler warning:

In file included from ../src/intel/tools/aubinator_viewer.cpp:225:
../src/imgui/imgui_memory_editor.h: In member function ‘void MemoryEditor::DisplayPreviewData(size_t, const u8*, size_t, MemoryEditor::DataType, MemoryEditor::DataFormat, char*, size_t) const’:
../src/imgui/imgui_memory_editor.h:637:16: warning: enumeration value ‘DataType_COUNT’ not handled in switch [-Wswitch]
         switch (data_type)
                ^

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-18 15:34:13 +00:00
Alyssa Rosenzweig
9402970751 panfrost/midgard: Enable autovectorization
Enable nir_opt_vectorize.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:44:13 -07:00
Connor Abbott
47e7c6961a nir: add a vectorization pass
This effectively does the opposite of nir_lower_alus_to_scalar, trying
to combine per-component ALU operations with the same sources but
different swizzles into one larger ALU operation. It uses a model
similar to CSE, where we take a depth-first approach and keep around a
hash set of instructions to be combined, but there are a few major
differences:

1. For now, we only support entirely per-component ALU operations.
2. Since it's not always guaranteed that we'll be able to combine
equivalent instructions, we keep a stack of equivalent instructions
around, trying to combine new instructions with instructions on the
stack.

The pass is far from comprehensive; it can't handle operations where
some of the sources are per-component and others aren't, and it can't
handle phi nodes. But it should handle the more common cases, and it
should be reasonably efficient.
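
The combining step described above can be sketched on a toy IR. This is only an illustration under loud assumptions: `scalar_op`, `vec_op`, `can_combine` and `vectorize` are hypothetical names, not NIR types or functions; a linear search stands in for the hash set; and the sketch assumes all inputs are independent, so it ignores the dominance and dependency checks a real pass would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COMPS 4

/* Toy scalar ALU op: dest = opcode(src0.comp0, src1.comp1). */
struct scalar_op {
   int opcode;
   int src0, src1;   /* ids of the vector registers being read */
   int comp0, comp1; /* component selected from each source */
};

/* Toy vector ALU op built from up to MAX_COMPS scalar ops. */
struct vec_op {
   int opcode;
   int src0, src1;
   unsigned num_comps;
   int swizzle0[MAX_COMPS];
   int swizzle1[MAX_COMPS];
};

/* Same opcode and same sources: only the swizzles may differ. */
static bool
can_combine(const struct vec_op *v, const struct scalar_op *s)
{
   return v->num_comps < MAX_COMPS && v->opcode == s->opcode &&
          v->src0 == s->src0 && v->src1 == s->src1;
}

/* Combine scalar ops into vector ops. `out` must have room for n entries.
 * Returns the number of vector ops written. */
static unsigned
vectorize(const struct scalar_op *in, unsigned n, struct vec_op *out)
{
   unsigned num_out = 0;

   for (unsigned i = 0; i < n; i++) {
      struct vec_op *v = NULL;

      /* Linear search stands in for the hash-set lookup. */
      for (unsigned j = 0; j < num_out && v == NULL; j++) {
         if (can_combine(&out[j], &in[i]))
            v = &out[j];
      }

      /* Nothing to merge with: start a new vector op. */
      if (v == NULL) {
         v = &out[num_out++];
         v->opcode = in[i].opcode;
         v->src0 = in[i].src0;
         v->src1 = in[i].src1;
         v->num_comps = 0;
      }

      /* Append this scalar op's swizzle as one more component. */
      assert(v->num_comps < MAX_COMPS);
      v->swizzle0[v->num_comps] = in[i].comp0;
      v->swizzle1[v->num_comps] = in[i].comp1;
      v->num_comps++;
   }

   return num_out;
}
```

Two scalar ops with the same opcode and sources collapse into one two-component op whose swizzles record which components each original op read; an op with a different opcode stays separate.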

[Alyssa: Rebase on latest master, updating with respect to typeless
moves]

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-18 06:43:30 -07:00
Boris Brezillon
c3558868da panfrost: Add support for TXS instructions
This patch adds support for nir_texop_txs instructions which are needed
to support the OpenGL textureSize() function. This is also needed to
support RECT texture sampling which is currently lowered to 2D sampling +
a TXS() instruction by the nir_lower_tex() helper.

Changes in v2:
* Split options for the 1st and 2nd tex lowering passes

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
5c17f84ae2 panfrost: Prepare things to support non-native texture ops
We are about to add support for the TXS (texture size) op which is not
implemented using a midgard texture instruction. Let's rename emit_tex()
into emit_texop_native() and repurpose emit_tex() as a dispatcher.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
c57f7d0f15 panfrost: Move sysval upload logic out of panfrost_emit_for_draw()
We're about to add more sysval types, and panfrost_emit_for_draw()
is big enough, so let's move the sysval upload logic into a separate
function.

We also add one sub-function per sysval type to keep
panfrost_upload_sysvals() small and readable.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
bd49c8f0eb panfrost: Make the sysval logic more generic
We are about to add support for nir_texop_txs which requires adding a
sysval/uniform containing the texture size. Let's change the
emit_sysval_read() prototype to take a nir_instr object instead of
a nir_intrinsic_instr one so we can re-use this function when emitting
a sysval for a txs instruction.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
296c5fd25d nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions
The V3D driver has an open-coded solution for this, and we need the
same thing for Panfrost, so let's add a generic way to lower TXS(LOD)
into max(TXS(0) >> LOD, 1).
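
The arithmetic behind that formula can be sketched in plain C. This shows only the minification math, not the NIR builder code the pass emits; `minify_dim` and `lower_txs_lod` are hypothetical names, and the sketch ignores array textures, whose layer count would presumably have to be left unminified.

```c
#include <assert.h>

/* Hypothetical helper: size of one dimension at mip level `lod`, given
 * its size at level 0. Mirrors max(TXS(0) >> LOD, 1). */
static unsigned
minify_dim(unsigned base_size, unsigned lod)
{
   /* Shifting by the bit width or more is undefined in C, so reject it. */
   assert(lod < sizeof(unsigned) * 8);

   unsigned size = base_size >> lod;
   return size > 1 ? size : 1;
}

/* Apply the same formula to each component of a level-0 size vector. */
static void
lower_txs_lod(const unsigned *size_lod0, unsigned num_comps, unsigned lod,
              unsigned *out)
{
   for (unsigned i = 0; i < num_comps; i++)
      out[i] = minify_dim(size_lod0[i], lod);
}
```

The clamp to 1 matters for non-square textures: a 256x4 texture at LOD 3 is 32x1, not 32x0.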

Changes in v2:
* Use == 0 instead of !
* Rework the minification logic as suggested by Jason
* Assign cursor pos at the beginning of the function
* Patch the LOD just after retrieving the old value

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
0e489fd360 nir/lower_tex: Update ->sampler_dim value before calling get_texture_size()
get_texture_size() will create a txs instruction with ->sampler_dim set
to the original tex->sampler_dim. The condition to call lower_rect()
only checks the value of ->sampler_dim and whether lower_rect is
requested or not. This leads to an infinite loop when calling
nir_lower_tex() with the same options until it returns false.

In order to avoid that, let's move the tex->sampler_dim patching before
get_texture_size() is called. This way the txs instruction will have
->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try
to lower it on the subsequent passes.
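
The ordering issue can be reproduced on a toy model. Everything below is a hypothetical stand-in (`struct instr`, `lower_rect_pass`, `lower_until_stable`), not the NIR data structures, and the coordinate scaling that the real lowering performs is omitted. The sketch shows the fixed order: the dimension is patched before the txs is created, so the new instruction is never picked up as a RECT instruction by a later pass.

```c
#include <assert.h>
#include <stdbool.h>

enum dim { DIM_2D, DIM_RECT };
enum op { OP_TEX, OP_TXS };

struct instr {
   enum op op;
   enum dim dim;
};

#define MAX_INSTRS 16

struct shader {
   struct instr instrs[MAX_INSTRS];
   unsigned count;
};

/* One lowering pass; returns true if anything changed. As in the commit
 * description, the only condition checked is the sampler dimension. */
static bool
lower_rect_pass(struct shader *s)
{
   bool progress = false;
   unsigned n = s->count; /* only visit instrs present at pass start */

   for (unsigned i = 0; i < n; i++) {
      struct instr *tex = &s->instrs[i];
      if (tex->dim != DIM_RECT)
         continue;

      /* Patch the dimension first, so the txs created below inherits 2D.
       * Creating the txs before this line would give it DIM_RECT, and the
       * next pass would lower it again, forever. */
      tex->dim = DIM_2D;

      assert(s->count < MAX_INSTRS);
      s->instrs[s->count++] = (struct instr){ OP_TXS, tex->dim };
      progress = true;
   }

   return progress;
}

/* Run the pass until it reports no progress; returns the number of
 * passes that changed something. */
static unsigned
lower_until_stable(struct shader *s)
{
   unsigned passes = 0;
   while (lower_rect_pass(s))
      passes++;
   return passes;
}
```

With a single RECT tex instruction, the loop makes progress once and then stops, leaving one tex and one txs, both 2D.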

Changes in v2:
* Add Jason R-b
* Add a comment explaining why we patch ->sampler_dim at the beginning
  of the lower_rect() func

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Boris Brezillon
352b1d9c31 nir/lower_tex: Actually report when projector lowering happened
The code considers that projector lowering was done even if it's not
really the case. Change the project_src() prototype to return a bool
encoding whether projector lowering happened or not and update the
progress var accordingly in nir_lower_tex_block().

---
Changes in v2:
* Add Jason R-b
* Drop the part suggesting that nir_lower_rect() could be called in
  a do-while(progress) loop.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 06:36:07 -07:00
Tomeu Vizoso
6f60fec48f panfrost: Adapt to constant name change in UABI
We hadn't updated the kernel header after the driver got into mainline.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 15:26:08 +02:00
Tomeu Vizoso
5ad5777f89 panfrost: ci: Update results
Alyssa fixed some failing tests last night.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-18 15:25:01 +02:00
Samuel Pitoiset
c16bf48bfc radv: adjust the DCC base VA for mipmapped color attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-18 12:24:26 +02:00
Samuel Pitoiset
6ee40efd02 radv: fix color decompressions for FMASK/CMASK
Only skip levels without DCC when it's a DCC decompression.
Whoops.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-18 12:09:04 +02:00
Samuel Pitoiset
42a41a9e4a radv: do not decompress levels without DCC with the graphics path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-18 11:24:50 +02:00
Samuel Pitoiset
e8917dcadb radv: do not decompress levels without DCC with the compute path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-18 11:24:41 +02:00
Samuel Pitoiset
864ddda8a3 radv: check if DCC is enabled per mip not for the whole image
In other words, make use of radv_dcc_enabled() instead of
radv_image_has_dcc() all over the place.
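
The difference between a per-image and a per-mip check can be sketched as follows. The struct and both helpers are hypothetical and simplified, not the radv definitions; in particular, the assumption that only the first `num_dcc_levels` mip levels carry DCC is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified image description. */
struct image {
   unsigned level_count;
   unsigned num_dcc_levels; /* assumed: leading mip levels that carry DCC */
};

/* Per-image question: does any level have DCC? */
static bool
image_has_dcc(const struct image *img)
{
   return img->num_dcc_levels > 0;
}

/* Per-mip question: does this particular level have DCC? */
static bool
dcc_enabled(const struct image *img, unsigned level)
{
   assert(level < img->level_count);
   return image_has_dcc(img) && level < img->num_dcc_levels;
}
```

For an image with four levels of which two carry DCC, the per-image check is true everywhere, while the per-mip check is false for levels 2 and 3.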

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-18 11:24:36 +02:00
Iago Toral Quiroga
79a30543ee v3d: implement simultaneous peripheral access exceptions for V3D 4.1+
Shader-db results:

total instructions in shared programs: 9117550 -> 9102719 (-0.16%)
instructions in affected programs: 1752873 -> 1738042 (-0.85%)
helped: 7076
HURT: 478
helped stats (abs) min: 1 max: 22 x̄: 2.19 x̃: 2
helped stats (rel) min: 0.07% max: 13.89% x̄: 1.70% x̃: 1.07%
HURT stats (abs)   min: 1 max: 7 x̄: 1.41 x̃: 1
HURT stats (rel)   min: 0.09% max: 10.17% x̄: 0.86% x̃: 0.54%
95% mean confidence interval for instructions value: -2.00 -1.92
95% mean confidence interval for instructions %-change: -1.58% -1.50%
Instructions are helped.

total max-temps in shared programs: 1327774 -> 1327728 (<.01%)
max-temps in affected programs: 1025 -> 979 (-4.49%)
helped: 47
HURT: 2
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 2.63% max: 20.00% x̄: 7.67% x̃: 5.26%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 4.17% max: 4.17% x̄: 4.17% x̃: 4.17%
95% mean confidence interval for max-temps value: -1.06 -0.82
95% mean confidence interval for max-temps %-change: -8.89% -5.49%
Max-temps are helped.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-18 08:09:03 +02:00
Iago Toral Quiroga
6d97c8fac1 v3d: only flush jobs accessing the query BO when reading query results
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-18 08:09:03 +02:00
Iago Toral Quiroga
5491883a9a v3d: add a helper function to flush jobs using a BO
v2: use _mesa_set_search() (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-18 08:09:03 +02:00
Kenneth Graunke
e8cd7a30d5 iris: Support more RGBX pipe formats.
Without them, the state tracker falls back to an RGBA format, but it
doesn't always manage to override the swizzle for us.  So we lose the
information that the API expects an X channel, where alpha is garbage
and reads back as 1.  We have no equivalent ISL RGBX format for these,
so we just use RGBA directly and override the swizzle in all cases.
2019-06-17 21:52:38 -05:00
Kenneth Graunke
3c10a2726b glsl: Fix out of bounds read in shader_cache_read_program_metadata
The VaryingNames array has NumVaryings entries.  But BufferStride is
a small array of MAX_FEEDBACK_BUFFERS (4) entries.  Programs with
more than 4 varyings would read out of bounds.
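
The bounds mismatch can be shown with a small sketch. `struct xfb_info` and `fold_xfb_metadata` are hypothetical stand-ins, not the Mesa structures or the shader cache code, and the byte-folding loop is only a placeholder for feeding data to a hash. The actual commit drops BufferStride from this path entirely; the sketch merely shows that each array has to be walked with its own bound.

```c
#include <assert.h>

#define MAX_FEEDBACK_BUFFERS 4

/* Hypothetical, simplified transform feedback metadata. */
struct xfb_info {
   unsigned NumVaryings;
   const char **VaryingNames;                   /* NumVaryings entries */
   unsigned BufferStride[MAX_FEEDBACK_BUFFERS]; /* always 4 entries */
};

/* Placeholder for hashing: folds names and strides into one value. */
static unsigned
fold_xfb_metadata(const struct xfb_info *xfb)
{
   unsigned acc = 0;

   assert(xfb->NumVaryings == 0 || xfb->VaryingNames);

   for (unsigned i = 0; i < xfb->NumVaryings; i++) {
      for (const char *c = xfb->VaryingNames[i]; *c; c++)
         acc = acc * 31 + (unsigned char)*c;

      /* The out-of-bounds pattern would be reading
       * xfb->BufferStride[i] here: i runs up to NumVaryings - 1,
       * which can exceed MAX_FEEDBACK_BUFFERS - 1. */
   }

   /* If strides are folded in at all, walk them with their own bound. */
   for (unsigned b = 0; b < MAX_FEEDBACK_BUFFERS; b++)
      acc = acc * 31 + xfb->BufferStride[b];

   return acc;
}
```

With six varyings, the name loop runs six times while the stride loop still touches exactly four entries.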

Also, BufferStride is set based on the shader itself, which means that
it's inherently already included in the hash, and doesn't need to be
included again.  At the point when shader_cache_read_program_metadata
is called, the linker hasn't even set those fields yet.  So, just drop
it entirely.

Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test.

Fixes: 6d830940f7 ("glsl/shader_cache: Allow shader cache usage with transform feedback")

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-06-17 21:22:19 -05:00
Jason Ekstrand
9672b7044c anv: Set STATE_BASE_ADDRESS upper bounds on gen7
This should fix floating-point border color on all gen7 HW.  Integer is
still thoroughly busted on gen7 because it doesn't exist on IVB and it's
crazy on HSW.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-17 18:53:07 -05:00
Bas Nieuwenhuizen
925c04b4c7 radv: Disable linear tiled compressed textures.
Support got removed in the new addrlib update.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-18 01:00:49 +02:00
Jason Ekstrand
1be38f9178 anv: Use VK_EXT_separate_stencil_usage to avoid stencil shadows on gen7
Whenever stencil texturing is not required (most of the time), we can
use VK_EXT_separate_stencil_usage to only create the shadow image when
VK_IMAGE_USAGE_SAMPLED_BIT is required for stencil.  Of course, this
depends on applications to use the extension but hopefully DXVK and
similar translators are doing so and that covers most of the apps.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-17 22:32:26 +00:00
Jason Ekstrand
f3ea0cf828 anv: Add stencil texturing support for gen7
Intel hardware didn't get support for sampling from W-tiled (required
for stencil) images until Broadwell so we can't directly sample from
stencil.  Instead, if we want to support stencil texturing on gen7
hardware, we have to keep a texture-capable shadow copy around and use
BLORP to update when stencil changes.  The one thing this commit does
not implement is self-dependencies with stencil input attachments.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99493
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-17 22:32:26 +00:00
Jason Ekstrand
4faa3145b1 anv/blorp: Update shadow images when clearing or uploading
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-17 22:32:26 +00:00
Jason Ekstrand
2b736d9e6c anv/cmd_buffer: Add a stencil transition helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-17 22:32:26 +00:00
Jason Ekstrand
86fc268142 anv/blorp: Take an aspect in anv_image_copy_to_shadow
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-17 22:32:26 +00:00