CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: b3c17330e6
("mesa: expose AMD_gpu_shader_int64")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Which is also required to put it in the tarball, a requirement for
building with meson from the tarball.
CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: 263c962cfd
("mesa: expose EXT_vertex_attrib_64bit")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
snprintf() guarantees that it will not write more chars than allowed,
and that the string will be null-terminated, without the need to fill
the whole thing with zeroes to begin with.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
There are two problems with the fixed patch. First, it fails to create a
dependency on the sourced .c file, so changes to intel_tiled_memcpy.c
won't trigger a rebuild. It also doesn't get included in the dist
tarball.
Fixes: 11b1afdc92
("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
extra_files is just a nice way to to tell certain IDEs (and those
reading the file) that this file is also a dependency. Meson will use
the .d file generated by the compiler to figure out what the target
actually depends on.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.
Because how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.
This patch adds a division of imageSize result by the bytes-per-pixel
of the image format, when image is buffer-backed.
Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*
v2: Pre-compute and submit the log2 of the image format's bpp as shader
constant instead of emitting the LOG2 instruction in code. (Rob Clark)
v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)
Reviewed-by: Rob Clark <robdclark@gmail.com>
I found a remnant of texture_float build option that wasn't removed in
commit 66673bef94
This patch removes it.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for
this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not.
Apps which check for presentation support via the latter (all Feral
Vulkan games at least) will therefore fail.
This allows me to render on an Intel GPU and present to a display
connected to an AMD card (tested HD 530 + Vega 64).
v2: Rebase on current master.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Use nir_src_comp_as_uint() to read the proper second component, as
nir_src_as_uint() returns the first one.
v2: Use nir_src_comp_as_uint() [Jason]
Fixes: 16870de8a0 ("nir: Use nir_src_is_const and nir_src_as_* in core
code")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
nir_lower_io_to_scalar_early() is really part of the link time
optimisations. Moving it here allows the code to be simplified
and also keeps the code easy to follow in the next patch.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The linking opts shouldn't try removing or compacting XFB varyings
in the consumer. To avoid this we copy the always_active_io flag
from the producer.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
To have uniform behavior while disassembling send(c) instruction use
register type of unsigned doubleword for src1 when message descriptor is
immediate value. Bspec does not specifiy anything for src1 immediate
default type.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
There exists _mesa_components_in_format() which already includes
all cases handled in _mesa_base_format_component_count().
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
The conon_bit_class and canon_var_class variables got switched.
Fixes: 932c650e0b "nir/algebraic: Loosen a restriction on variables"
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
JPEG was added after DRM version 3.26
Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
I'll take care of 18.2 releases series on Andres behalf.
CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Pointer arithmetic...
v2: s/4/sizeof(uint32_t)/ (Eric)
v3: Give bytes to print_batch() in error_decode (Lionel)
Make clear what values we're dealing with in error_decode (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logics for Jpeg decode later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.
Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.
v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
v3: Add units to parameter names of tile_extents (Nanley Chery)
Use _mesa_align_malloc for the shadow copy (Nanley)
Continue using gtt maps on gen4 (Nanley)
v4: Use streaming_load_memcpy when detiling
v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
takes precedence. Add intel_miptree_access_raw, needed after
rebasing on commit b499b85b0f.
v6: refactor to changes done for sse41 separation (Tapani)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
The reference for MOVNTDQA says:
For WC memory type, the nontemporal hint may be implemented by
loading a temporary internal buffer with the equivalent of an
aligned cache line without filling this data to the cache.
[...] Subsequent MOVNTDQA reads to unread portions of the WC
cache line will receive data from the temporary internal
buffer if data is available.
This hidden cache line sized temporary buffer can improve the
read performance from wc maps.
v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
v3: add Android build support (Tapani)
v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy'
separate sse41 to own static library (Tapani)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
There is currently no use of returned memcpy functions outside
intel_tiled_memcpy. Patch changes intel_get_memcpy to return memcpy
type instead of actual function. This makes it easier later to separate
streaming load copy in to own static library.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes errors thrown by GCC's Undefined Behaviour sanitizer (ubsan) every
time this macro is used.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: a4c4efad89 "radv: Rework guard band calculation"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
`platforms` is no longer a comma-separated string, and some of our
option descriptions are way too long already. Just drop the incorrect
bit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
They're not required to be the same as the access flag on the image
unit. For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.
v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
- Reduce both access and shader_access to uint16_t to avoid making
the pipe_image_view structure larger.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Previously, we would fail if a variable had an assigned but unknown bit
size X and we tried to assign it an actual bit size. However, this is
ok because, at the time we do the search, the variable does have an
actual bit size and it will match X because of the NIR rules.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
We rename some local variables in validate() to be more readable and
plumb the var through to get/set_var_bit_class instead of the var index.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>