EGL_CONTEXT_RELEASE_BEHAVIOR_KHR determines what happends with implicit
flushes when context changes, on multigpu scenario we change context
frequently when blitting content. while we still rely on explicit sync
fences, the flush is destroying driver optimisations.
setting it to EGL_CONTEXT_RELEASE_BEHAVIOR_NONE_KHR essentially mean
just swap context and continue processing on the next context.
* shader: split CM rgba/rgbx into discard ones
make it branchless if we have no discards.
* shader: ensure we dont stall on vbo uv buffer
if we render a new texture before the previous was done gpu wise its
going to stall until done, call glBufferData to orphan the data.
this allows the driver to return a new memory block immediately
if the GPU is still reading from the previous one
* protocols: ensure we reset GL_PACK_ALIGNMENT
reset GL_PACK_ALIGNMENT back to the default initial value of 4
* shader: use unsigned short in VAO
loose a tiny bit of precision but gain massive bandwidth reductions.
use GL_UNSIGNED_SHORT and set it as normalized. clamp and round the UV
for uint16_t in customUv.
* shader: interleave vertex buffers
use std::array for fullverts, use a single interleaved buffer for
position and uv, should in theory improve cache locality. and also remove
the need to have two buffers around.
* shader: revert precision drop
we need the float precision because we might have 1.01 or similiar
floats entering CM shader maths, and rounding/clamping those means the
maths turns out wrong. so revert back to float, sadly higher bandwidth
usage.
* update doColorManagement api
* convert primaries to XYZ on cpu
* remove unused primaries uniform
---------
Co-authored-by: UjinT34 <ujint34@mail.ru>
* shader: begin the shader refactor
make SShader a class and rename it to CShader, move createprogram,
compileshader, logshadererror to CShader.
* shader: move uniform creation to CShader
move uniform creation to CShader, reduces tons of duplicated effort,
however forcing uniform names to be same in all shaders.
* shader: move to array based frag handling
use an array with an enum so it gets easier dealing with multiple
shaders, move creating program to a for loop and array, reduces line of
code a lot.
* shader: use shared ptr for frags
with smart pointers we can now rename useProgram to useShader and return
the shader directly, means only place we have to decide the shader frag
is when calling useShader. easier for future shader splitting to reduce
branching.
* shader: move unneded public members to private
move structs and uniforms to private add a get/set for initialtime
and add a getUniformLocation to make the code tell what its doing,
instead of direct array getting when all we wanted to get was its value,
also limits the setting of uniformLocations to the createProgram as it should
be.
* shader: fix style nits
set first enum member to 0 , remove extra {}
* shader: dont show a failed notif on success
the logic got inverted in the refactor here.
* shader: split CM shader to rgba/rgbx variants
split shader to rgba/rgbx variants, use bool, and reduce branching.
* shader: split up blurprepare CM and non CM
split up blurprepare, remove skipcm, move gain to gain.glsl.
remove ternary operator and reduce branching by using step() and mix()
use vec3 for gain, make brightness a cheap mulitplication with max.
* shader: split up border to CM/noncm variants
splitup border shader to CM/noncm variant, move common used things to
border.glsl , there is room for optimisations here but its a complex
shader im putting it for future PR.
* shader: touchup blurfinish
make brightness a cheap multiplication instead of branching.
mod is redundant, fract in hash already returns a value in [0.0, 1.0]
* fix named primaries
* default to gamma22
* mark mastering primaries as supported
* remove xx-cm and frog support
* immutable primaries and image descriptions
* clang-format
ubsan reports under wonky situation a runtime error of uninitialized
value lookups because of m_capStatus isnt initialized. so default
initialize it.
OpenGL.cpp:3331:26: runtime error: load of value 190, which is not a valid value for type 'bool'
These are: pointer_shape from the cursor-shape-v1 protocol prepared for v2, along with left_ptr...bottom_right_corner and killing (Hyprland specific)
pointer_shape_previous with
pointer_switch_time to blend between shapes
pointer_size scaled size as used by the normal cursor
pointer_pressed_positions[32] with
pointer_pressed_times[32] and
pointer_pressed_killed(32 bits) for click/touch animations and if they killed something
pointer_inactive_timeout with
pointer_last_active to smoothly fade the pointer out
pointer_hidden to hide it when the cursor is hidden (excluding by cursor:invisible as this config value can be used to turn off the normal cursor, which is useful when drawing it with the screen shader)
setCapStatus is a a heavy used function in hot rendering paths, it
shows up in profiling as using a bit of cpu just because of
unordered_set hashing etc, move to a enum and array and cache only the
heavily used ones.
* opengl: cache viewport state
according to nvidia docs calling glViewPort unnecessarily on the same
already set viewport is wasteful and can cause state changes when not
needed. cache it in a struct and only call it when the viewport is
actually changing.
* opengl: cache glenable/gldisable state
avoid making multiple glenable/gldisable calls on already set caps, can
cause state changes and incur driver overhead.
* opengl: cache glscissor box
only call glscissor if the box actually has changed, try to avoid state
changes.
* opengl: cache gluniform calls
cache the gluniform calls, the uniform values are cached in driver per
program only the drawcalls setting the uniform yet again with the same
value on same location is causing more overhead then caching it ourself
and just no oping on it if no changes.
* shader: rewrite handling of uniforms and state
this is way faster as we don't need to mess with maps (hashing, etc) and instead can just use an array
* opengl: stuff and 300 shaders
* opengl: typo
* opengl: get the uniform locations properly
now that the legacy shaders are gone get the uniformlocations for
SKIP_CM etc, so they can be properly set and used depending on if
cm_enabled is set to false or true, before it was falling back to a
legacy shader that didnt even have those uniforms.
* opengl: check epsilon on float and remove extra glcall
seems an extra unset glcall was added, remove it. and check the float
epsilon on the glfloat.
* opengl: remove instanced shader draw
remove the instanced boolean from the vertex shader, might be neglible
differences, needs more benchmark/work to see if its even worth it.
* texture: cache texture paramaters
parameters where occasionally set twice or more on same texture, short
version wrap it and cache it. and move gpu churn to cpu churn.
add a bind/unbind to texture aswell.
* texture: use fast std::array caching
cache the texparameter values in fast array lookups
and incase we dont want it cached, apply it anyways.
* shader: fix typo and hdr typo
actually use Matrix4x2fv in the 4x2fv cache function, and send the
proper float array for hdr.
* texture: make caching not linear lookup
make caching of texture params not linear.
* minor style changes
* opengl: revert drawarrays
revert the mostly code style reduce loc change of drawarrays, and focus
on the caching. its a if else case going wrong here breaking
blur/contrast amongst others drawing.
---------
Co-authored-by: Vaxry <vaxry@vaxry.net>
* opengl: remove unnecessery glflush calls
glflushing forces the driver to break batching and issue commands
prematurely and prevents optimisations like command reordering and
merging.
many glFunctions already internally glflushes and eglsync creation still
has a glflush at end render. so lets reduce the overhead of these calls.
* opengl: reduce glUseProgram calls
apitrace shows cases where the same program gets called multiple times,
add a helper function that keeps track of current program and only call
it once on same program. reduces slight overhead.
* opengl: use more efficient vertex array object
use a more modern vertex array object approach with the shaders, makes
it a onetime setup on shader creation instead of once per drawcall, also
should make the driver not have to revalidate the vertex format on each
call.
* syncobj: cleanup and use uniqueptrs
cleanup a bit missing removals if resource not good, erasing from
containers etc. make use of unique ptrs instead. and add default
destructors.
* syncobj: rework syncobj entirerly
remove early buffer release that was breaking explicit sync, the buffer
needs to exist until the surface commit event has been emitted and draw
calls added egl sync points, move to eventfd signaling instead of
stalling sync point checks, and recommit pending commits if waiting on a
signal. add a CDRMSyncPointState helper class. move a few weak pointers
to shared pointers so they dont destruct before we need to use them.
* syncobj: queue pending states for eventfd
eventfd requires us to queue pending stats until ready and then apply to
current, and also when no ready state exist commit the client commit on
the current existing buffer, if there is one.
* syncobj: clear current buffer damage
clear current buffer damage on current buffer commits.
* syncobj: cleanup code and fix hyprlock
remove unused code, and ensure we dont commit a empty texture causing
locksession protocol and gtk4-layer-shell misbehaving.
* syncobj: ensure buffers are cleaned up
ensure the containers having the various buffers actually gets cleaned
up from their containers, incase the CSignal isnt signaled because of
expired smart pointers or just wrong order destruction because mishaps.
also move the acquire/point setting to buffer attaching. instead of on
precommit.
* syncobj: remove unused code, optimize
remove unused code and merge sync fds if fence is valid, remove manual
directscanout buffer dropping that signals release point on pageflip, it
can cause us to signal the release point while still keeping the current
buffer and rendering it yet again causing wrong things.
* syncobj: delay buffer release on non syncobj
delay buffer releases on non syncobj surfaces until next commit, and
check on async buffers if syncobj and drop and signal the release point
on backend buffer release.
* syncobj: ensure we follow protocol
ensure we follow protocol by replacing acquire/release points if they
arrive late and replace already existing ones. also remove unneded
brackets, and dont try to manual lock/release buffers when it comes to
explicit protocol. it doesnt care about buffer releases only about
acquire and release points and signaling them.
* syncobj: lets not complicate things
set points in precommit, before checking protocol errors and we catch
any pending acquire/release points arriving late.
* syncobj: move SSurfaceState to types
remove destructor resource destroying, let resources destroys them on
their events, and move SSurfaceStates to types/SurfaceState.hpp
* syncobj: actually store the merged fd
have to actually store the mergedfd to use it.
* syncobj: cleanup a bit around fences
ensure the current asynchronous buffer is actually released on pageflip
not the previous. cleanup a bit FD handling in
commitPendingAndDoExplicitSync, and reuse the in fence when syncing
surfaces.
* syncobjs: ensure fence FD doesnt leak
calling resetexplicitfence without properly ensuring the FD is closed
before will leak it, store it per monitor and let it close itself with
the CFileDescriptor class.
* syncobj: ensure buffers are actually released
buffers were never being sent released properly.
* types: Defer buffer sync releaser until unlock
* syncobj: store directscanout fence in monitor
ensure the infence fd survives the scope of attemptdirectscanout so it
doesnt close before it should have.
* syncobj: check if if acquire is expired
we might hit a race to finish on exit where the timeline just has
destructed but the buffer waiter is still pending. and such we
removeAllWaiters null dereferences.
* syncobj: code style changes
remove quack comment, change to m_foo and use a std::vector and
weakpointer in the waiter for removal instead of a std::list.
* syncobj: remove unused async buffer drop
remove unused async buffer drop, only related to directscanout and is
handled elsewhere.
---------
Co-authored-by: Lee Bousfield <ljbousfield@gmail.com>
ensure framebuffer textures are detached and deleted, avoid leaving framebuffers bound when not needed
* render: avoid calling glDeleteProgram on no program
its safe to do so but it adds a bunch of unnecessery lines in apitrace
when tracing. if guard it and return early.
* opengl: ensure texture and buffers are unbound
ensure bound buffers are unbound after use, also detach textures from
framebuffer before deleting it otherwise it will become dangling and
essentially leak.
* config: make fd use CFileDescriptor
make use of the new hyprutils CFileDescriptor instead of manual FD
handling.
* hyprctl: make fd use CFileDescriptor
make use of the new hyprutils CFileDescriptor instead of manual FD
handling.
* ikeyboard: make fd use CFileDescriptor
make use of the new CFileDescriptor instead of manual FD handling, also
in sendKeymap remove dead code, it already early returns if keyboard
isnt valid, and dont try to close the FD that ikeyboard owns.
* core: make SHMFile functions use CFileDescriptor
make SHMFile misc functions use CFileDescriptor and its associated usage
in dmabuf and keyboard.
* core: make explicit sync use CFileDescriptor
begin using CFileDescriptor in explicit sync and its timelines and
eglsync usage in opengl, there is still a bit left with manual handling
that requires future aquamarine change aswell.
* eventmgr: make fd and sockets use CFileDescriptor
make use of the hyprutils CFileDescriptor instead of manual FD and
socket handling and closing.
* eventloopmgr: make timerfd use CFileDescriptor
make the timerfd use CFileDescriptor instead of manual fd handling
* opengl: make gbm fd use CFileDescriptor
make the gbm rendernode fd use CFileDescriptor instead of manual fd
handling
* core: make selection source/offer use CFileDescriptor
make data selection source and offers use CFileDescriptor and its
associated use in xwm and protocols
* protocols: convert protocols fd to CFileDescriptor
make most fd handling use CFileDescriptor in protocols
* shm: make SHMPool use CfileDescriptor
make SHMPool use CFileDescriptor instead of manual fd handling.
* opengl: duplicate fd with CFileDescriptor
duplicate fenceFD with CFileDescriptor duplicate instead.
* xwayland: make sockets and fds use CFileDescriptor
instead of manual opening/closing make sockets and fds use
CFileDescriptor
* keybindmgr: make sockets and fds use CFileDescriptor
make sockets and fds use CFileDescriptor instead of manual handling.