Commit graph

35430 commits

Author SHA1 Message Date
Axel Davy
2594b2efdc st/nine: Capture also default matrices for D3DSBT_ALL
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
bbeddb801e st/nine: Mark transform matrices dirty for D3DSBT_ALL
D3DSBT_ALL stateblocks capture the transform matrices.

Fixes some d3d test programs not displaying properly.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
a4e9bbb8f8 st/nine: Don't update unused world matrices
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.

Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
2e51c4c7cc st/nine: Remove two unused states.
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Axel Davy
cb8ea21e1c st/nine: Remove commented nine_context_apply_stateblock
At some point the project was to adapt the
commented version to csmt.

The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
2018-10-26 22:16:16 +02:00
Brian Paul
d6be0b5556 scons/svga: remove opt from the list of valid build types
This reverts commit a5fd54f8bf.

The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2018-10-26 12:09:00 -06:00
Boyuan Zhang
f4126cfaab radeon/vcn: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang
55cf565698 radeon/vce: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
Boyuan Zhang
b15d0200a9 vl: get h264 profile idc
Adding a function for converting h264 pipe video profile to profile idc

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
2018-10-26 13:23:06 -04:00
David McFarland
07a00a8729 util: Change remaining uint32 cache ids to sha1
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id.  Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id.  Instead of writing a uint32_t it now writes to a
mesa_sha1.  All drivers using disk_cache_get_function_identifier are
updated accordingly.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 83ea8dd99b ("util: add disk_cache_get_function_identifier()")
2018-10-26 14:49:22 +11:00
Hyunjun Ko
3d198926a4 freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Following the commit 2385d7b066 and 8e798e28f7, for resource dependancy
tracking.

Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
with FD_MESA_DEBUG=inorder

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:46:19 -04:00
Hyunjun Ko
703271c22a freedreno/ir3: take reg->num out of union in ir3_register
To avoid wrong result when identifying the type of register.
Ie. If the reg is an array, it might be identified as address or
predicate register.

Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:45:45 -04:00
Rob Clark
3c402d0dc2 freedreno/a6xx: disable unused groups
Don't leave vsconst/fsconst group enabled if we switch to shader with no
uniforms.

Fixes: abcdf5627a freedreno/a6xx: move const emit to state group
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:38:53 -04:00
Rob Clark
d53074d3f1 freedreno: add useful assert
Would have been useful to catch the problem fixed in
8e798e28f7

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-25 18:38:53 -04:00
Alok Hota
edf38019a0 swr/rast: ignore CreateElementUnorderedAtomicMemCpy
This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball

CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107865
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-25 11:05:59 -05:00
Alok Hota
8c872ac2e3 swr/rast: fix intrinsic/function for LLVM 7 compatibility
Converted from x86 VFMADDPS intrinsic to generic LLVM intrinsic, and
removed createInstructionSimplifierPass, which were both removed in LLVM
7.0.0

These changes combine patches we received from the community and our own
internal patches

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
2018-10-25 10:32:27 -05:00
Rhys Perry
26ed0f0234 nvc0: increase NOUVEAU_TRANSFER_PUSHBUF_THRESHOLD to 1024 on Kepler+
Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82%
FPS improvement with Dirt Rally on my GTX 1060.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-25 15:25:10 +01:00
Eric Engestrom
e27902a261 util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Eric Engestrom
bb84fa146f util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Eduardo Lima Mitev
b0c427043b ir3_compiler/nir: fix imageSize() for buffer-backed images
GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.

Because how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.

This patch adds a division of imageSize result by the bytes-per-pixel
of the image format, when image is buffer-backed.

Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*

v2: Pre-compute and submit the log2 of the image format's bpp as shader
    constant instead of emitting the LOG2 instruction in code. (Rob Clark)

v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-24 18:18:35 +02:00
Boyuan Zhang
55e7de7b19 radeonsi: enable vcn jpeg decode for raven
Enable vcn jpeg decode for raven.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
97c473bb29 winsys/amdgpu: add vcn jpeg cs support
Add vcn jpeg cs support, align cs by no-op.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
6d2d910653 radeon/vcn: implement jpeg target buffer cmd
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
0ee5630cfc radeon/vcn: implement jpeg bitstream buffer cmd
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
9b478b0c7a radeon/uvd: remove get mjpeg slice header
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
4fc2368e3b st/va: get mjpeg slice header
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
c7a5ef26ad radeon/vcn: add jpeg decode implementation
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
40fceb55f3 radeon/vcn: separate send cmd call from end frame
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logics for Jpeg decode later.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
4f1f128f8e radeon/vcn: create cs based on ring type
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
f7116e4ff8 radeon/winsys: add vcn jpeg ring type
Add a new ring type for vcn jpeg.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
e7e68d15b5 radeon/vcn: add vcn jpeg decode interface
Add VCN Jpeg decode interfaces and register defines.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Boyuan Zhang
6bc0a3a834 radeon/vcn: move radeon decoder define to header file
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Jason Ekstrand
8b626a22b2 st/mesa: Record shader access qualifiers for images
They're not required to be the same as the access flag on the image
unit.  For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.

v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
 - Reduce both access and shader_access to uint16_t to avoid making
   the pipe_image_view structure larger.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-23 02:36:24 -07:00
Rob Herring
2bb05d70af android: Build kms_swrast for the Android platform
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-22 13:08:17 +01:00
Eduardo Lima Mitev
fdd926d5b2 ir3/nir: Set up image_dims consts for image_deref_size intrinsic too
`nir_intrinsic_image_deref_size` is not being considered during scan for
driver constants, so image constants are not emitted if a shader
only ever query the size of an image (no load, store, atomic op, etc).
This is unlikely, but possible.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-10-21 21:29:18 +02:00
Karol Herbst
2d235d69c8 nv50/ir: fix ConstantFolding::createMul for 64 bit muls
Fixes: 2f52925f5c
       "nv50/ir: move a * b -> a << log2(b) code into createMul()"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-10-20 03:00:04 +02:00
Sonny Jiang
bfb2b90246 radeonsi: Disable clear_state with radeon kernel driver
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2018-10-19 16:16:57 -04:00
Jose Fonseca
45bacc4b63 scons: Remove gles option.
It's broken, and WGL state tracker is always built with GLES support
noawadays.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-19 16:50:26 +01:00
Marek Olšák
69a87b5d47 radeonsi: fix a typo in a comment in emit_guardband 2018-10-18 18:01:22 -04:00
Marek Olšák
2a26b1c045 radeonsi: fix gnome-shell crash
I wasn't expecting to get viewports with the center having
negative coordinates.

Broken by: 6cc79e4411
2018-10-18 17:55:44 -04:00
Marek Olšák
77bcbe712e radeonsi: clamp point size to the limit
This fixes dEQP-GLES2.functional.rasterization.limits.points.
Broken by: ea039f789d

Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
eae8f49fc6 radeonsi: fix a VGT hang with primitive restart on Polaris10 and later
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
2018-10-18 16:08:56 -04:00
Marek Olšák
165817d47f radeonsi: fix a deadlock due to partially-initialized context on CI 2018-10-18 16:08:56 -04:00
Jan Vesely
06bf56725d radeonsi: Bump number of allowed global buffers to 32
Fixes assertion failure/crash when running luxmark/luxball on clover.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108272
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-10-18 16:02:42 -04:00
Marek Olšák
6cc79e4411 radeonsi: fix incorrect hw screen offset and guardband computation
It resulted in assertion failures or incorrect rendering.

Broken by: 9e182b8313
2018-10-18 14:42:42 -04:00
Michał Janiszewski
0ef50ecc69 st/xlib: Use more appropriate include guard
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com
2018-10-18 11:03:04 +01:00
Michał Janiszewski
bcc613acc1 gallium: Fix mismatched ifdef-guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-10-18 11:03:03 +01:00
Gert Wollny
74adc624b6 softpipe: dynamically allocate space for immediate constants
The number of immediate constants was fixed and the size check was
only done by means of an assertion. Given this a shader that emits
more immediate constants would result in a memory corruption when
mesa is build in release mode.

Instead of using this fixed limit allocate the space dynamically, let it
grow as needed, and also remove the unused ImmArray.

Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-10-18 10:59:51 +02:00
Neil Roberts
a9475d9337 Fix setting indent-tabs-mode in the Emacs .dir-locals.el files
Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2018-10-17 19:03:08 +02:00
Rob Clark
d27b1c83b9 freedreno/a6xx: don't allocate binning rb
Now that a single cmdstream is used for both binning and draw passes, we
can skip allocation of cmdstream buffer for binning.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-10-17 12:44:49 -04:00