This can deal with all the 15 32-bit untyped atomic operations the
hardware supports, but only INC and PREDEC are going to be exposed
through the API for now.
v2: Represent atomics as GLSL intrinsics. Add support for variably
indexed atomic counter arrays. Fix interaction with fragment
discard.
v3: Add comment on why we don't need to assign uniform storage for
atomic counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This patch fixes the three dead code elimination passes and the
VEC4/FS instruction scheduling passes so they leave instructions with
side effects alone.
At some point it might be interesting to have the instruction
scheduler calculate the exact memory dependencies between atomic ops,
but they're rare enough that it seems unlikely that it will make any
practical difference.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Inspired by a patch sent to the mailing list by Tom Stellard, but
using a different algorithm to calculate the optimal block size that
has been found to be considerably more effective.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
xserver 1.14.99.2 simplified the DamageUnregister API, by
dropping the drawable argument.
Follow xf86-video-intel and xf86-video-vmware approach and
handle the new API by checking XORG_VERSION_CURRENT.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71110
Reported-by: Michał Górny <mgorny@gentoo.org>
Reported-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
dso_list was added as an argument for createInternalizePass in 3.4, and then
it was removed again in the same llvm version.
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
The new option clamps GL_MAX_SAMPLES to a hardware-supported MSAA mode.
If negative, then no clamping occurs.
v2: (for Paul)
- Add option to i965 only, not to all DRI drivers.
- Do not realy on int->uint cast to convert negative
values to large positive values. Explicitly check for
clamp_max_samples < 0.
v3: (for Ken)
- Don't allow clamp_max_samples to alter context version.
- Use clearer for-loop and correct comment.
- Rename variables.
v4: (for Ken)
- Merge identical if-branches.
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Fixes "Macro compares unsigned to 0" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes "Macro compares unsigned to 0" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Actually link VS out / FS in based on semantic info, keeping in mind
that position/pointsize can also be an input to the FS. This fixes a
few fragment shaders which were using gl_Position.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Fixes use of full-precision in fragment shader (ie. don't clobber r0.x
since that can be used by future bary instructions for varying fetch).
And makes use of full-precision the default in fragment shader (but can
be overriden via FD_MESA_DEBUG=fraghalf).
Seems like half precision is often not enough for texture coordinates.
The blob compiler is clever enough to keep texture coords in full
precision registers while using half precision for everything else. But
we aren't quite that clever yet, so better to default to full precision.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Handle some relative addressing constraints: cannot handle const or
relative in cat5 and src2 of cat3.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
- Enable GEN7_WM_MSDISPMODE_PERSAMPLE, GEN7_WM_POSOFFSET_SAMPLE,
GEN7_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
- Only enable one of GEN7_WM_8_DISPATCH_ENABLE or GEN7_WM_16_DISPATCH_ENABLE
when GEN7_WM_MSDISPMODE_PERSAMPLE is enabled. Refer IVB PRM Vol. 2, Part 1,
Page 288 for details.
V2:
- Use shared function _mesa_get_min_invocations_per_fragment().
- Use brw_wm_prog_data variables: uses_pos_offset, uses_omask.
V3:
- Enable simd16 dispatch with per sample shading.
- Make changes to give preference to 'simd16 only' mode over
'simd8 only' mode in case of non 1x per sample shading.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
- Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE,
GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification.
- Only enable one of GEN6_WM_8_DISPATCH_ENABLE or GEN6_WM_16_DISPATCH_ENABLE
when GEN6_WM_MSDISPMODE_PERSAMPLE is enabled.
Refer SNB PRM Vol. 2, Part 1, Page 279 for details.
V2:
- Use shared function _mesa_get_min_invocations_per_fragment().
- Use brw_wm_prog_data variables: uses_pos_offset, uses_omask.
V3:
- Enable simd16 dispatch with per sample shading.
- Make changes to give preference to 'simd16 only' mode over
'simd8 only' mode in case of non 1x per sample shading.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
V2:
- Update comments
- Add a special backend instructions to compute sample_mask.
- Add a new variable uses_omask in brw_wm_prog_data.
V3:
- Make changes to support simd16 mode.
- Delete redundant AND instruction and handle the register
stride in FS backend instruction.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
V2:
- Update comments
- Add compute_sample_id variables in brw_wm_prog_key
- Add a special backend instruction to compute sample_id.
V3:
- Make changes to support simd16 mode.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This is required while adding builtin system value vec{2, 3, 4}
variables. For example:
(declare (sys) vec2 gl_SamplePosition)
Without this patch above glsl ir splits in to:
(declare (temporary) float gl_SamplePosition_x)
(declare (temporary) float gl_SamplePosition_y)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This function is used to test if we need to do per sample shading or
per fragment shading.
V2: Use MAX2() to make sure the function returns a number >= 1.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
New builtins added by GL_ARB_sample_shading:
in vec2 gl_SamplePosition
in int gl_SampleID
in int gl_NumSamples
out int gl_SampleMask[]
V2: - Use SWIZZLE_XXXX for STATE_NUM_SAMPLES.
- Use "result.samplemask" in arb_output_attrib_string.
- Add comment to explain the size of gl_SampleMask[] array.
- Make gl_SampleID and gl_SamplePosition system values.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Number of samples will be required in fragment shader program by new
GLSL builtin uniform "gl_NumSamples".
V2: Use "state.numsamples" in place of "state.num.samples"
Use _NEW_BUFFERS flag in place of _NEW_MULTISAMPLE
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Ken Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>