We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
D3DSBT_ALL stateblocks capture the transform matrices.
Fixes some d3d test programs not displaying properly.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.
Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
At some point the project was to adapt the
commented version to csmt.
The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
This reverts commit a5fd54f8bf.
The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Adding a function for converting h264 pipe video profile to profile idc
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id. Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id. Instead of writing a uint32_t it now writes to a
mesa_sha1. All drivers using disk_cache_get_function_identifier are
updated accordingly.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 83ea8dd99b ("util: add disk_cache_get_function_identifier()")
Following the commit 2385d7b066 and 8e798e28f7, for resource dependancy
tracking.
Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
with FD_MESA_DEBUG=inorder
Signed-off-by: Rob Clark <robdclark@gmail.com>
To avoid wrong result when identifying the type of register.
Ie. If the reg is an array, it might be identified as address or
predicate register.
Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6
Signed-off-by: Rob Clark <robdclark@gmail.com>
Don't leave vsconst/fsconst group enabled if we switch to shader with no
uniforms.
Fixes: abcdf5627a freedreno/a6xx: move const emit to state group
Signed-off-by: Rob Clark <robdclark@gmail.com>
This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107865
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Converted from x86 VFMADDPS intrinsic to generic LLVM intrinsic, and
removed createInstructionSimplifierPass, which were both removed in LLVM
7.0.0
These changes combine patches we received from the community and our own
internal patches
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82%
FPS improvement with Dirt Rally on my GTX 1060.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.
Because how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.
This patch adds a division of imageSize result by the bytes-per-pixel
of the image format, when image is buffer-backed.
Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*
v2: Pre-compute and submit the log2 of the image format's bpp as shader
constant instead of emitting the LOG2 instruction in code. (Rob Clark)
v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)
Reviewed-by: Rob Clark <robdclark@gmail.com>
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logics for Jpeg decode later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
They're not required to be the same as the access flag on the image
unit. For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.
v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
- Reduce both access and shader_access to uint16_t to avoid making
the pipe_image_view structure larger.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
`nir_intrinsic_image_deref_size` is not being considered during scan for
driver constants, so image constants are not emitted if a shader
only ever query the size of an image (no load, store, atomic op, etc).
This is unlikely, but possible.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Fixes: 2f52925f5c
"nv50/ir: move a * b -> a << log2(b) code into createMul()"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
It's broken, and WGL state tracker is always built with GLES support
noawadays.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The number of immediate constants was fixed and the size check was
only done by means of an assertion. Given this a shader that emits
more immediate constants would result in a memory corruption when
mesa is build in release mode.
Instead of using this fixed limit allocate the space dynamically, let it
grow as needed, and also remove the unused ImmArray.
Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Now that a single cmdstream is used for both binning and draw passes, we
can skip allocation of cmdstream buffer for binning.
Signed-off-by: Rob Clark <robdclark@gmail.com>