In a fragment shader, r0 is written out with a special branch sequence.
r0 is not a real register here, but essentially a pipeline register --
as such, it needs to be written out in full and on time, with hanging
dependencies in the bundle. Otherwise, we break up the bundle, which
costs an extra ALU cycle and adds a move.
When the scheduler ran last thing, we could do this analysis within the
scheduler. Now that RA can run after scheduling, that's no longer valid,
so we remove the analysis and always break it up (at a performance
penalty). Future work can add a post-RA/post-schedule pass to merge
writeout blocks if possible. It's a bit of a low-priority next to fixing
conformance regressions, of course.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In this particular instance, struct member were used outside of the
block where it was defined. Fix this by moving the definition outside of
block.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Fixes: 569f838987 ("winsys/svga: Add support for new surface ioctl, multisample pattern")
Reviewed-by: Brian Paul <brianp@vmware.com>
We use the shared nir_lower_idiv pass to lower integer division, fixing
144 dEQP tests. This pass was not applied in the past due to breakage
from iabs fixed earlier in the series.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
Certain ops that only take one argument have an imaginary "zero"
constant for their second argument. For instance, conversions:
i2f [dest], [source], #0
Memory corruption meant that #0 was instead random noise. For some ops,
that doesn't matter (manifested as abnormally large code size and poor
scheduling due to extra constants in random places). But for others,
where a 1-op is emulated by a 2-op with an implicit 0 second argument,
that broke things.
Fixes iabs (emulated by iabsdiff).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
These ops are used to accelerate various functions exposed in OpenCL.
This commit only includes the routine additions to the table. They are
not wired through the compiler; rather, they are just here to keep a
reference for the disassembler.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
This new 'platform' is added by default with no guards.
It is effectively a copy of the surfaceless one, with updated function
names and brand new probe function.
Due to the reuse, some of the ifdef HAVE_SURFACELESS_PLATFORM guards
have been dropped.
A worthy mention are the changes in _egFindDisplay, since the original
and dup'd fd are required, we make use of the plat_opt argument.
Note that no hacks for eglGetDisplay are added - the API works only with
the eglGetPlatformDisplay* API.
v2:
- s/_eglCompareDeviceDisplay/_eglSameDeviceDisplay/ (Eric)
- let ^^ return bool (Eric)
- fixup meson build, move files() further up (Eric)
- copy from plat. surfaceless w/o the visual cleanups
- close and free when destroying the dpy
- sprinkle a few _eglDeviceSupports
- split fd handling into separate function
- use directly the render node if no FD is given (Mathias)
v3:
- s/dpy/disp/g
- drop swap_buffers* callbacks
- drop loader_set_logger()
- drop local define
- re-introduce _eglGetDRMDeviceRenderNode()
- EGL_WARN on ForceSoftware with HW device - continue using the HW device
- bail out for "EGL_MESA_device_software" until it's fixed
- wire-up the Android build
v4:
- use new style _eglFindDisplay()
- split hw vs sw code paths
- don't close the internal fd (already handled in FiniDisplay())
- make swrast work (bit hacky bit will do for now)
- Android for real, drop autotools
- Correct HW + LIBGL_ALWAYS_SOFTWARE check
- use the dri2_create_drawable() helper
v5:
- enhance comment around fd checks (Mathias)
- rebase for dri2_init_surface() changes
Cc: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Acked-by: Marek Olšák <marek.olsak@amd.com> (v4)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
By default, the user is likely to pick the first device so it should
not be the least performant (aka software) one.
v2: Drop odd comment (Marek)
Suggested-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Wrap the loader->createNewDrawable() dance into a helper and use it
throughout the codebase.
This addresses a cases like surfaceless (SL) on swrast (SL on kms_swrast
is fine) where we'd attempt using the wrong driver and crash out.
v2: fixup quirky GBM (Mathias)
v3: fixup GBM for real (Marek)
Cc: mesa-stable@lists.freedesktop.org
Cc: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (v2)
Signed-off-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Since we no longer need special handling for X11, refactor the code to
follow the style used by all other platforms.
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
The full set of attributes is already handled with previous patches.
Thus all this is not dead code.
v2 (Emil) - split from a larger patch.
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
At the moment the user will pass the screen number via attribs, yet we
would throw that away. Reason being that the int *screen passed to
xcb_connect() is output only.
v2 (Emil):
- split from a larger patch
- use xcb_connect() returned screen, as fallback
- use helper function only as needed
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Earlier spec is vague, although EGL 1.5 makes it clear:
Multiple calls made to eglGetPlatformDisplay with the same
parameters will return the same EGLDisplay handle.
With this commit we store and compare the full attrib list.
v2 (Emil):
- Split into separate patches
- Use EGLBoolean over int masked as such
- Don't return free'd pointed on calloc failure
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
This commit fix support and adjusts the capabilities
returned by the SWR driver and the documentation
to correctly report the GL_ARB_copy_image extension.
Reviewed-by: Alok Hota <alok.hota@intel.com>
This avoids a conflict with freedreno's bo_del().
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
This avoids a conflict with freedreno's table_lock
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Two driver files had tabs mixed with spaces. Remove the tabs.
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
The compiler configuration was hardened to fail on format warnings and
things stopped building.
Fixes: c9c1e26106 ("mesa: prevent common string formatting security issues")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
We do support TF now, so it's no longer valid. Besides, if we want this
optimization, we should probably have mesa/st doing it right for everyone.
Reviewed-by: Rob Clark <robdclark@gmail.com>
We have the GLSL type, so we can just ask it how many coordinates there
are. The GLSL function already has Vulkan cases that we'd probably want
eventually.
Reviewed-by: Rob Clark <robdclark@gmail.com>
When comparing our sin/cos behavior to the closed source driver, I
noticed that we were off by a bit (or, in the case of 1/2pi, 3 bits).
Fixes:
dEQP-GLES3.functional.shaders.random.trigonometric.vertex.52
dEQP-GLES3.functional.shaders.random.all_features.vertex.0
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Just provide a "(null)" literal in case driverName is NULL.
In file included from ../src/glx/dri3_glx.c:76:
../src/glx/dri3_glx.c: In function ‘dri3_create_screen’:
../src/glx/dri_common.h:70:36: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
70 | #define CriticalErrorMessageF(...) dri_message(_LOADER_FATAL, __VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/glx/dri3_glx.c:1002:4: note: in expansion of macro ‘CriticalErrorMessageF’
1002 | CriticalErrorMessageF("failed to load driver: %s\n", driverName);
| ^~~~~~~~~~~~~~~~~~~~~
../src/glx/dri3_glx.c:1002:50: note: format string is defined here
1002 | CriticalErrorMessageF("failed to load driver: %s\n", driverName);
| ^~
cc1: some warnings being treated as errors
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When PIPE_TRANSFER_READ requires a resolve, we blit from the host
storage to a temporary storage, and do a format conversion from the
temporary storage to the guest storage. This change makes sure we
convert to the correct level of the guest storage.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
util_format_translate_3d expects the source box to be aligned to the
block size. When resolving, make sure the size of the staging
buffer is aligned to the block size.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Some new flag setting disallows it due to being a security risk.
Fixes: c9c1e26106 "mesa: prevent common string formatting security issues"
Reviewed-by: Rob Clark <robdclark@gmail.com>
A previous optimization converts fmax(x, 0.0) instructions to fmov.pos.
This pass then propagates the .pos from the move up to the source
instruction (when possible). From there, copy propagation will eliminate
the move.
In the future, we might prefer to do this in common NIR code like we do
for saturate, as Bifrost can also benefit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>
This prepass, run after scheduling but before RA, specializes to
pipeline registers where possible. It walks the IR, checking whether
sources are ever used outside of the immediate bundle in which they are
written. If they are not, they are rewritten to a pipeline register (r24
or r25), valid only within the bundle itself. This has theoretical
benefits for power consumption and register pressure (and performance by
extension). While this is tested to work, it's not clear how much of a
win it really is, especially without an out-of-order scheduler (yet!).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>