Commit graph

106151 commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
47cc660d9c glsl: replace 'x + (-x)' with constant 0
This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

with value being undefined.
In this case LLVM replace the operation with an assignment to NaN.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-29 17:48:49 -04:00
Thong Thai
8d03a6b700 radeonsi: add JPEG decode support for VCN 2.0 devices
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-08-29 17:27:35 -04:00
Thong Thai
2a3a560407 Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"
This reverts commit 5a2e65be89.

Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-29 17:27:15 -04:00
Ian Romanick
9ad4a2eac5 nir/range-analysis: Add a lot more assertions about the contents of tables
v2: Update several of the comments.  Drop some redundant uses of
ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source.  Suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:53 -07:00
Ian Romanick
636da12433 nir/range-analysis: Range tracking for fpow
One shader from Metro Last Light and the rest from Rochard.  In the
Rochard cases, something like:

    min(1.0, max(pow(saturate(x), y), z))

was transformed to

    saturate(max(pow(saturate(x), y), z))

because the result of the pow must be >= 0.

The Metro Last Light case was similar.  An instance of

    min(pow(abs(x), y), 1.0)

became

    saturate(pow(abs(x), y))

v2: Fix some comments.  Suggested by Caio.

v3: Fix setting is_intgral when the exponent might be negative.  See
also Mesa MR !1778.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280670 -> 16280659 (<.01%)
instructions in affected programs: 1130 -> 1119 (-0.97%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.72% max: 1.43% x̄: 1.03% x̃: 0.97%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.19% -0.86%
Instructions are helped.

total cycles in shared programs: 367168430 -> 367168270 (<.01%)
cycles in affected programs: 10281 -> 10121 (-1.56%)
helped: 10
HURT: 1
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.31% max: 2.43% x̄: 1.79% x̃: 1.70%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 3.10% max: 3.10% x̄: 3.10% x̃: 3.10%
95% mean confidence interval for cycles value: -20.06 -9.04
95% mean confidence interval for cycles %-change: -2.36% -0.32%
Cycles are helped.
2019-08-29 13:15:53 -07:00
Ian Romanick
7dba7df5e5 nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel
I discovered this while looking at a shader that was hurt by some other
work I'm doing.  When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated.  I had to use NIR_PRINT=true to see exactly where things
when wrong.

A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280659 -> 16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs)   min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel)   min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.

total cycles in shared programs: 367168270 -> 367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs)   min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel)   min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.
2019-08-29 13:15:53 -07:00
Ian Romanick
0b4782fccd nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)
Found by inspection.  I tried really, really hard to make a test case
that would trigger this problem, but I was unsuccesful.  It's very hard
to get an instruction to produce a ne_zero result without ne_zero
sources.  The most plausible way is using bcsel.  That proves
problematic because bcsel interprets its sources as integers, so it
cannot currently be used to "clean" values for floating point
instructions.

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
2019-08-29 13:15:53 -07:00
Ian Romanick
ef2e235252 nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-fma-compare-zero.shader_test
    - fs-underflow-mul-compare-zero.shader_test

v2: Add back part of comment accidentally deleted.  Noticed by
Caio. Remove is_not_zero function as it is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: fa116ce357 ("nir/range-analysis: Range tracking for ffma and flrp")
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms** had similar results. (Ice Lake shown)
total instructions in shared programs: 16278465 -> 16279492 (<.01%)
instructions in affected programs: 16765 -> 17792 (6.13%)
helped: 0
HURT: 23
HURT stats (abs)   min: 7 max: 275 x̄: 44.65 x̃: 8
HURT stats (rel)   min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
95% mean confidence interval for instructions value: 9.57 79.74
95% mean confidence interval for instructions %-change: 1.85% 6.61%
Instructions are HURT.

total cycles in shared programs: 367135159 -> 367154270 (<.01%)
cycles in affected programs: 279306 -> 298417 (6.84%)
helped: 0
HURT: 23
HURT stats (abs)   min: 13 max: 6029 x̄: 830.91 x̃: 54
HURT stats (rel)   min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
95% mean confidence interval for cycles value: 100.89 1560.94
95% mean confidence interval for cycles %-change: 0.94% 13.71%
Cycles are HURT.

total spills in shared programs: 8870 -> 8869 (-0.01%)
spills in affected programs: 19 -> 18 (-5.26%)
helped: 1
HURT: 0

total fills in shared programs: 21904 -> 21901 (-0.01%)
fills in affected programs: 81 -> 78 (-3.70%)
helped: 1
HURT: 0

LOST:   0
GAINED: 1

** On Broadwell, a shader was hurt for spills / fills instead of
   helped.

No changes on any earlier platforms.
2019-08-29 13:15:53 -07:00
Ian Romanick
33ad2bab4b nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-exp2-compare-zero.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

Most of the shaders affected are, unsurprisingly, in Unigine Heaven.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16278207 -> 16278465 (<.01%)
instructions in affected programs: 11374 -> 11632 (2.27%)
helped: 0
HURT: 58
HURT stats (abs)   min: 2 max: 13 x̄: 4.45 x̃: 4
HURT stats (rel)   min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
95% mean confidence interval for instructions value: 3.77 5.13
95% mean confidence interval for instructions %-change: 2.19% 2.64%
Instructions are HURT.

total cycles in shared programs: 367134284 -> 367135159 (<.01%)
cycles in affected programs: 81207 -> 82082 (1.08%)
helped: 17
HURT: 36
helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
HURT stats (abs)   min: 4 max: 235 x̄: 66.97 x̃: 16
HURT stats (rel)   min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
95% mean confidence interval for cycles value: -20.36 53.38
95% mean confidence interval for cycles %-change: -1.08% 4.67%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier platforms.
2019-08-29 13:15:53 -07:00
Ian Romanick
e07248d2a8 nir/algebraic: Clean up value range analysis-based optimizations
Fix the a / b ordering in some compares.  Delete duplicate patterns.
Add a table explaining things.  While I was cleaning this up, I managed
to confuse myself.  The table helped sort that out.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:52 -07:00
Ian Romanick
ccb236d1bc nir/algebraic: Mark some value range analysis-based optimizations imprecise
This didn't fix bug #111308, but it was found will trying to find the
actual cause of that bug.

Fixes piglit tests (new in piglit!110):

    - fs-fract-of-NaN.shader_test
    - fs-lt-nan-tautology.shader_test
    - fs-ge-nan-tautology.shader_test

No shader-db changes on any Intel platform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: b77070e293 ("nir/algebraic: Use value range analysis to eliminate tautological compares")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:52 -07:00
Kenneth Graunke
30b9ed92ea iris: Fix partial fast clear checks to account for miplevel.
We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.

This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.

Thanks to Dan Walsh for noticing we had too many slow clears.

Fixes: 393f659ed8 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-08-29 11:27:16 -07:00
Rohan Garg
394192fcee panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job()
is_scanout is not used anywhere and can be inferred within
panfrost_drm_submit_vs_fs_job() if required.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-29 19:03:17 +02:00
Kenneth Graunke
fda9fb8dcd iris: Actually describe bo_reuse driconf option
Otherwise it doesn't exist and can't be parsed, so everything dies at
screen init time.

Fixes: 6dc4ddc5f8 ("iris: use driconf for 'bo_reuse' parameter")
2019-08-29 09:40:34 -07:00
Tomeu Vizoso
aace7d3500 panfrost/ci: Print only regressions
Some functionality has been added to deqp-volt to only print
regressions, so update our version of it and use the new options.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-29 17:12:04 +02:00
Roland Scheidegger
332b21db55 gallivm: use fallback code for mul_hi with llvm >= 7.0
LLVM 7.0 ditched the pmulu intrinsics.
This is only a trivial patch to use the fallback code instead.
It'll likely produce atrocious code since the pattern doesn't match what
llvm itself uses in its autoupgrade paths, hence the pattern won't be
recognized.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-29 16:55:49 +02:00
Samuel Pitoiset
b650ecfe31 radv/gfx10: compute the LDS size for exporting PrimID for VS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-29 16:08:37 +02:00
Jan Zielinski
e64091ebd4 swr/rasterizer: Enable ARB_fragment_layer_viewport
Added loading gl_Layer and gl_ViewportIndex variables
to Pixel Shader context.

Reviewed-by: Alok Hota <alok.hota@intel.com>
2019-08-29 12:09:05 +02:00
Tapani Pälli
6dc4ddc5f8 iris: use driconf for 'bo_reuse' parameter
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-29 09:33:52 +03:00
Tapani Pälli
b65de51dcf i965: initialize bo_reuse when creating brw_bufmgr
Fixes a possible data race spotted while debugging on other EGL
related failures where glFinish and eglCreateContext are going on at
the same time:

  ==11558== Possible data race during read of size 1 at 0x5E78CD0 by thread #23
  ==11558== Locks held: 1, at address 0x5E77CA8
  ==11558==    at 0x61B71D4: bo_alloc_internal (brw_bufmgr.c:639)
  ==11558==    by 0x61B7328: brw_bo_alloc (brw_bufmgr.c:669)
  ==11558==    by 0x61EF975: recreate_growing_buffer (intel_batchbuffer.c:231)
  ==11558==    by 0x61EFAAE: intel_batchbuffer_reset (intel_batchbuffer.c:255)
  ==11558==    by 0x61EFB85: intel_batchbuffer_reset_and_clear_render_cache (intel_batchbuffer.c:280)
  ==11558==    by 0x61F0507: brw_new_batch (intel_batchbuffer.c:551)
  ==11558==    by 0x61F12C1: _intel_batchbuffer_flush_fence (intel_batchbuffer.c:888)
  ==11558==    by 0x61BDD6B: intel_glFlush (brw_context.c:296)
  ==11558==    by 0x61BDDB9: intel_finish (brw_context.c:307)
  ==11558==    by 0x623831B: _mesa_Finish (context.c:1906)
  ==11558==    by 0x46D556: deqp::egl::GLES2ThreadTest::Operation::execute(tcu::ThreadUtil::Thread&)
  ==11558==    by 0x721502: tcu::ThreadUtil::Thread::run()
  ==11558==
  ==11558== This conflicts with a previous write of size 1 by thread #26
  ==11558== Locks held: 1, at address 0x5D09878
  ==11558==    at 0x61B98A9: brw_bufmgr_enable_reuse (brw_bufmgr.c:1541)
  ==11558==    by 0x61BF09D: brw_process_driconf_options (brw_context.c:854)
  ==11558==    by 0x61BF6CA: brwCreateContext (brw_context.c:993)
  ==11558==    by 0x621181F: driCreateContextAttribs (dri_util.c:473)
  ==11558==    by 0x53FE87B: dri2_create_context (egl_dri2.c:1388)
  ==11558==    by 0x53EE7BE: eglCreateContext (eglapi.c:807)
  ==11558==    by 0x5C8AB9: eglw::FuncPtrLibrary::createContext(void*, void*, void*, int const*) const
  ==11558==    by 0x46E027: deqp::egl::GLES2ThreadTest::CreateContext::exec(tcu::ThreadUtil::Thread&)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-29 09:33:13 +03:00
Kenneth Graunke
90ca709f6d iris: Don't auto-flush/dirty on transfer unmap for coherent buffers
When u_upload_mgr fills up a buffer, it unmaps and destroys it.  Our
unmap function was automatically performing the equivalent of a
FlushMappedBufferRange call in this case.  Because the buffer mapping
is persistent and coherent, we don't actually do any flushing when we
do the rest of the writes to the buffer - we were just doing one final
one at the end.  But we would be using the uploaded contents on the
GPU the whole time.

This certainly shouldn't be necessary for streaming buffers, and if
such flushing and dirtying is necessary for coherent buffers, this is
wildly insufficient.

Drops a small number of constant packets and PIPE_CONTROL flushes from
most benchmarks that I've looked at.  Doesn't seem to make much of an
impact on performance, however.

Thanks to Felix Degrood for noticing that we were emitting more
3DSTATE_CONSTANT_* packets than we needed to.
2019-08-28 22:11:05 -07:00
Timur Kristóf
5f3eb6ef29 st/nine: Properly initialize GLSL types for NIR shaders.
NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
2019-08-28 23:31:34 +00:00
Rob Clark
6167a63839 freedreno/ir3: do better job of marking convergence points
Fixes:
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_vertex
dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_fragment

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-28 15:25:27 -07:00
Rob Clark
6af70aa2b4 freedreno/ir3: maintain predecessors/successors
While resolving jumps to skip intermediate jumps from the structured
CFG, maintain the successors and predecessors correctly.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-28 15:25:25 -07:00
Rob Clark
06bc4875ff freedreno/ir3: convert block->predecessors to set
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-28 15:25:19 -07:00
Jordan Justen
cfbde3282d
pci_id_driver_map: Support preferring iris over i965
This adds the ability for intel devices that:

 * Only load on i965
 * Only load on iris
 * First attempt i965, and try iris next
 * First attempt iris, and try i965 next

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Jordan Justen
107c22945f
i965: Exit with error if gen12+ is detected
For OpenGL support on gen12, the iris driver should be used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Tapani Pälli
d8dd9a245e
anv: build libanv for gen12 in android build
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:34 -07:00
Jordan Justen
181be14d43
anv: Build for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:34 -07:00
Tapani Pälli
da603c066e
iris: build android libmesa_iris for gen12
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:34 -07:00
Jordan Justen
44ab7c265f
iris: Build for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:33 -07:00
Jordan Justen
4d2e390a65
intel/l3: Don't assert on gen12 (use gen11 config temporarily)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
bdeb498070
intel/compiler: Disable compaction on gen12 for now
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-28 13:38:33 -07:00
Tapani Pälli
d7a1140c45
intel/isl: build android libmesa_isl for gen12
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
6d63fd8a69
intel/isl: Build gen12 using gen11 code paths
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Tapani Pälli
7319003a74
intel/genxml: generate pack files for gen12 on android builds
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
b42a05b436
intel/genxml: Build gen12 genxml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:33 -07:00
Jordan Justen
531563b64b
intel/genxml: Add gen12.xml as a copy of gen11.xml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:32 -07:00
Jordan Justen
2323536ee7
intel/genxml: Run sort_xml.sh to tidy gen9.xml and gen11.xml
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:32 -07:00
Jordan Justen
70566a87eb
intel/genxml/gen11: Add spaces in EnableUnormPathInColorPipe
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:32 -07:00
Jordan Justen
acce7d3460
intel/genxml: Handle field names with different spacing/hyphen
If a field name differs slightly between two generations then this
change will still add the fields into the same group.

For example, these will be treated as equal:
* "Software Exception" and "Software  Exception"
* "Per Thread" and "Per-Thread"

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-08-28 13:38:28 -07:00
Eric Anholt
973b49386c freedreno/a6xx: Fix non-mipmap filtering selection.
We were clamping the LOD to force non-mipmap filtering, but that means
that the HW doesn't get to select between the min and mag filters.
Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.

Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*

Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-08-28 13:14:41 -07:00
Ian Romanick
b418269d7d intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware
See the previous commit for the explanation of the Fixes tag.

Hurts 21 shaders in shader-db.  All of the hurt shaders are in Unreal
Engine 4 tech demos.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
2019-08-28 11:39:29 -07:00
Ian Romanick
d3fd1c761a nir/algrbraic: Don't optimize open-coded bitfield reverse when lowering is enabled
This caused a problem on Sandybridge where an open-coded
bitfieldReverse() function could be optimized to a
nir_op_bitfield_reverse that would generate an unsupported BFREV
instruction in the backend.  This was encountered in some Unreal4 tech
demos in shader-db.  The bug was not previously noticed because we don't
actually try to run those demos on Sandybridge.

The fixes tag is a bit a lie.  The actual bug was introduced about
26,000 commits earlier in 371c4b3c48 ("nir: Recognize open-coded
bitfield_reverse.").  Without the NIR lowering pass, the flag needed to
avoid the optimization does not exist.  Hopefully nobody will care to
fix this on an earlier Mesa release.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
2019-08-28 11:38:51 -07:00
Eric Anholt
4662b70d23 gallium: Don't emit identical endian-dependent pack/unpack code.
Reduces the size of the u_format_table.c file by 140k (out of 1.64M)
and makes me less confused about endianness in gallium.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
d17ff2f7f1 gallium: Fix big-endian addressing of non-bitmask array formats.
The formats affected are:

- LA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)
- R8G8B8 x (UNORM, SNORM, SRGB, USCALED, SSCALED, UINT, SINT)
- RG/RGB/RGBA x (64_FLOAT, 32_FLOAT, 16_FLOAT, 32_UNORM, 32_SNORM,
                 32_USCALED, 32_SSCALED, 32_FIXED, 32_UINT, 32_SINT)
- RGB/RGBA x (16_UNORM, 16_SNORM, 16_USCALED, 16_SSCALED,
              16_UINT, 16_SINT)
- RGBx16 x (UNORM, SNORM, FLOAT, UINT, SINT)
- RGBx32 x (FLOAT, UINT, SINT)
- RA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)

The updated st_formats.c unit test checks that the formats affected by
this change are all array formats in the equivalent Mesa format (if
any).  Mesa's array format definition is clear: the value stored is an
array (increasing memory address) of values of the channel's type.
It's also the only thing that makes sense for the RGB types, or very
large types like RGBA64_FLOAT (A should not move to the low address
because the cpu is BE).

Acked-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Tested-by: Matt Turner <mattst88@gmail.com> (unit tests on BE)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
0547fdd7ee gallium: Drop a bit of dead code from the pack/unpack python.
Nothing used this var.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
309ef968cd gallium: Drop the useless union wrapper on pack/unpack.
Nothing accessed the .value field, just the .chan.  Unwrap all the
code from the union, for clarity (and 13k less generated code).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
174240c5e4 gallium: Skip generating the pack/unpack union if we don't use it.
Shaves 30k off of the 1.6M .c file, and makes for less noise for me
trying to understand how gallium formats actually work.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00
Eric Anholt
7c8cdee0b2 gallium: Fix mesa format name in unit test failure path.
We clearly wanted the mesa format here.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-28 10:39:36 -07:00