GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.
Because how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.
This patch adds a division of imageSize result by the bytes-per-pixel
of the image format, when image is buffer-backed.
Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*
v2: Pre-compute and submit the log2 of the image format's bpp as shader
constant instead of emitting the LOG2 instruction in code. (Rob Clark)
v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)
Reviewed-by: Rob Clark <robdclark@gmail.com>
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logics for Jpeg decode later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
`nir_intrinsic_image_deref_size` is not being considered during scan for
driver constants, so image constants are not emitted if a shader
only ever query the size of an image (no load, store, atomic op, etc).
This is unlikely, but possible.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Fixes: 2f52925f5c
"nv50/ir: move a * b -> a << log2(b) code into createMul()"
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Now that a single cmdstream is used for both binning and draw passes, we
can skip allocation of cmdstream buffer for binning.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Now that state which is different for draw vs binning pass is split out
into different state-groups with appropriate enable_mask (so the
appropriate one is chosen for draw vs binning), switch over to using a
single cmdstream for both passes.
This should significantly lower draw overhead for CPU bound benchmarks.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Blob seems to manage to use same input registers for BS (binning pass)
vs VS (draw pass) shaders, so it can use the same VBO state for both.
We can't quite do that yet, so split them.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We don't need to keep this IGNORE_VISIBILITY in binning pass. Prep work
for using single cmdstream for both draw and binning passes.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Move this to after ir3_cp (which can add lowered immediates to the const
state) for a6xx+, to ensure the uniform state matches between binning
and vertex shaders. This way we can emit just a single VS_CONST state-
group when we re-use single cmdstream for both binning and draw passes.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Use the in-memory cache to construct shader program state and re-use it
on subsequent draws, to lower driver overhead.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cache that maps gallium hwcso (in this case, 'struct ir3_shader') plus
shader variant key to a generation specific state object.
This could eventually replace the linked list of shader variants, but
for now it lets us re-use the work currently done in fdN_program_emit()
Signed-off-by: Rob Clark <robdclark@gmail.com>
Prep work for a following patch, that introduces a cache to map from
program state (all shader stages) plus variant key to pre-baked hw
state (which could be emit'd via CP_SET_DRAW_STATE, for example).
To do that, we really want the variant key to be immutable, and to
treat the binning pass shader as an extra shader stage, rather than
as a VS variant.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Unfortunately gallium doesn't match what the hw wants perfectly here, in
using a separate CSO for each texture/sampler. So we have to use a hash
table to map the collection of texture/samplers to hw state object.
We probably could use separate hw state objects for texture and sampler
state, but mesa/st tends to update the tex and samp state together.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Intended to be something more compact than a 64b pointer, which could be
used as a key into hashtables. Prep work for texture state objects.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Eventually we want to move nearly everything, but no other state depends
on const state, so this is the easiest one to move first.
For webgl aquarium, this reduces GPU load by about 10%, since for each
fish it does a uniform upload plus draw.. fish frequently are visible in
only a single tile, so this skips the uniform uploads for other tiles.
The additional step of avoiding WFI's when using CP_SET_DRAW_STATE seems
to be work an additional 10% gain for aquarium.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add helper to add state-groups to emit, and code to emit CP_DRAW_STATE
packet if we have any state-groups.
Signed-off-by: Rob Clark <robdclark@gmail.com>
These are not necessary because the corresponding settings are set via
the .dir-locals.el file anyway. Most of them were missing a ‘:’ after
“tab-width” which was making Emacs display an annoying warning
whenever you open the file.
This patch was made with:
sed -ri '/-\*- mode:/,/^$/d' \
$(find src/gallium/{drivers,winsys} -name \*.\[ch\] \
-exec grep -l -- '-\*- mode:' {} \+)
Signed-off-by: Rob Clark <robdclark@gmail.com>
Needs to allocate batches from the cache so that it could
get a valid index and make resource dependancy tracking right.
In addition this fixes assertion on debug build since the commit
1a40faa8 landed.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Needs to specify nondraw when creating a batch through
fd_bc_alloc_batch since it'd better create a batch through
it rather than fd_batch_create.
Signed-off-by: Rob Clark <robdclark@gmail.com>