Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not
PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER.
Drivers for such hardware would like to advertise support for
ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.
This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit,
changes the extension enable to be based on that, and enables it
in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP
(so they continue supporting this mode).
With earlier rework the user and provider of the symbol are within the
same binary. Thus there's no point in exporting the function.
Spotted while reviewing patch from Chuck, that nearly added another
unneeded PUBLIC function.
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Fixes: f50aa21456 "(swr: build driver proper separate from rasterizer")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com<mailto:george.kyriazis@intel.com>>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com<mailto:chuck.atkins@kitware.com>>
Zeroing memory after calloc is not necessary. This also allows to avoid
possible crash when allocation fails, because memset is called before
checking screen for NULL.
Fixes: a29d63ecf7 "swr: refactor swr_create_screen to allow
for proper cleanup on error"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v2)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
The new name make the zero-input behavior more obvious. The next
patch adds a new function with different zero-input behavior.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Nobody queries these and nobody sets them to anything useful,
the docs say TODO.
Drop them until a use appears.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This makes the following changes to address cleanup issues:
- Error conditions now return NULL instead of calling exit()
- swr_creen is now freed upon error, rather than leak.
- Library handle from dlopen is now closed upon swr_screen destruction
v2: Added additional context in commit msg and remove unnecessary "PUBLIC"
v3: Fix typo in commit message.
Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
cc: mesa-stable@lists.freedesktop.org
Supporting simd16 vertex shaders involves packing the output of the
fetch shader appropriately, especially the vertexID buffers that have to
be formatted in one simd16 register, needed by the VS.
As part of this support, we needed to remove the 2nd JitManager, since it
was not accounting for vector width correctly.
USE_SIMD16_SHADERS is also split into two defines. The additional
one (USE_SIMD16_VS) controls the width of the vertex shader (VS), while
the original one (USE_SIMD16_SHADERS) controls overall front end width.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
This patch fixes piglit tex3d-maxsize by correcting 4 things:
The total_size calculation was using 32-bit math, therefore a >4GB
allocation request overflowed and was not returning false (unsupported).
Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle
>4GB allocations.
Added error checking on texture allocations to fail gracefully.
Finally, temporarily decreased supported max texture size from 4GB to 2GB.
The gallivm texture-sampler needs some additional work to correctly handle
larger than 2GB textures (offsets to LLVMBuildGEP are signed).
I'm working on a follow-on patch to allow up to 4GB textures, as this is
useful in HPC visualization applications.
Fixes piglit tex3d-maxsize.
v2: Updated patch description to clarify ">4GB".
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Some hw (evergreen) has a limit on how many combined (images/buffers/mrts)
a fragment shader can access.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Remove allocation of > 2kbyte buffers into context memory in
swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers
and shader constants to a scratch space to be used by the upcoming draw.)
Large shader constant allocations need to be done in the circular scratch
buffer instead of context memory, because their values persist across
render calls.
Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
buffers will get too large for the circular scratch space.
Fixes render issues with CEI Ensight.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Start building vertex shaders as simd16.
Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.
This commit introduces the core gallium support, vc4 changes will follow.
v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Drop vc4 changes from this commit, for clarity.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
Needed to compensate for change to fetch jit requiring
alignment.
Fixes regressions in piglit: vertex-buffer-offsets and about
another hundred of the vs-input*byte* tests.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.
This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.
Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.
This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
This can be used to guard support for EXT_memory_object and related
extensions.
v2: update gallium docs
v3 (Timothy Arceri):
- add cap to nv50
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
v2: rename cap to PIPE_CAP_QUERY_SO_OVERFLOW and be a bit more explicit
in the documentation
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The last user of the function was removed with earlier commit.
Fixes: 50842e8a93 ("swr: replace gallium->swr format enum conversion")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Fixes performance regression from f50aa21456 - was forcing internal
code generation to target AVX (no gather, etc).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
If size of client memory copy is too large, don't copy. The draw will
access user-buffer directly and then block. This is faster and more
efficient than queuing many large client draws.
Applications that still use large client arrays benefit from this. VMD
is an example.
The threshold for this path defaults to 32KB. This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.
v2: Use #define for default value, rather than hard-coded constant.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Moved reading of environment config options out of
swr_create_screen_internal, into a separate swr_validate_env_options.
This is to keep from cluttering create_screen.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Use the SWR rasterizer API through the table returned from
SwrGetInterface rather than referencing the functions directly.
This will allow us to move to a model of having the driver dynamically
load the appropriate swr architecture library.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
This patch limits the number of items on the fence work queue (the
deferred deletion list) by submitting a sync fence when the queue size
exceeds a threshold. This initiates deferred deletion of all resources
on the list and decreases the total amount of memory held waiting for
"deferred deletion".
This resolves bug 101467 filed against swr for the piglit
streaming-texture-leak test. For those running on smaller memory
(16GB?) systems, this will prevent oom-killer.
Thus far, we have not seen any real world applications that exhibit
behavior like the streaming-texture-leak test; as any form of pipeline
flush will trigger the defer queue and properly free any retained
allocations. But, this addresses those as well.
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Whether bindless texture operations are supported by the
underlying driver.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The next patch will use it. This is really for svga and GL2-level drivers.
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
v3: list piglit tests fixed by this patch. Fixed typo Tim pointed out.
v2: Reword commit message to more closely adhere to community
guidelines.
This patch moves msaa resolve down into core/StoreTiles where the
surface format conversion routines are available. The previous
"experimental" resolve was limited to 8-bit unsigned render targets.
This fixes a number of piglit msaa tests by adding resolve support for
all the render target formats we support.
Specifically:
layered-rendering/gl-layer-render: fail->pass
layered-rendering/gl-layer-render-storage: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg: crash->pass
multisample-formats *[2,4,8,16] gl_ext_texture_snorm: crash->pass
multisample-formats *[2,4,8,16] gl_arb_texture_float: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg-float: fail->pass
MSAA is still disabled by default, but can be enabled with
"export SWR_MSAA_MAX_COUNT=4" (1,2,4,8,16 are options)
The default is 0, which is disabled.
This patch improves the number of multisample-formats supported by swr,
and fixes several crashes currently in the 17.1 branch. Therefore, it
should be considered for inclusion in the 17.1 stable release. Being
disabled by default, it poses no risk to most users of swr.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
cc: mesa-stable@lists.freedesktop.org