This reduces the overhead of using the fixed function internally
in the driver.
V2: Use setup_glsl_generate_mipmap() and setup_ff_generate_mipmap()
functions to avoid code duplication.
Use glsl version when ARB_{vertex, fragmet}_shader are present.
Remove redundant code.
V3: Remove redundant border related code leaving the assertion.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
the progs/util directory is now in mesa demos
replace glean with piglit
add ApiTrace
markup: replace the unordered list <ul> with a definition list <dl>
Signed-off-by: Brian Paul <brianp@vmware.com>
I've reviewed the code, and the swrast callsites remaining are all in
drawpixels/copypixels/bitmap/accum, or _swrast_BlitFramebuffer that shouldn't
be hit. A piglit run with the context setup disabled on legacy GL and GLES2
showed regressions only in the copypixels and drawpixels tests.
If the context type is forced, this reduces the shader_runner maximum heap
size for glsl-algebraic-add-add-1.shader_test from 15,137,496b to 4,165,376b.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
There were no other cases that set it any more.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The Fallback field of the context struct doesn't work that way on i965, and
it's the only caller of FALLBACK() in the driver.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This code has been in the driver since the first commit. I think it was
trying to stop rendering from happening with a disabled position array. Core
mesa has since had changes to deal with disabled position arrays correctly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
It turns out it hasn't worked since at least 8.0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
swrast uses MapRenderbuffer, which leads to intel_miptree_map, which does the
depth resolve.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes piglit fp-kil and glBitmap() with radeonsi.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
It should be initialized by the kernel as necessary.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
But cap the size in bytes, to avoid depleting the whole system memory,
with humongus textures.
Tested with max-texture-size piglit test.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
We want to check whether there are bits set outside of the valid flags.
Fixes piglit test egl-create-context-invalid-flag-gl
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Now that it's on by default, we may as well make it obey the flag,
for consistency's sake if nothing else.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Precompiling the shader at link time often allows us to avoid compiling
it at the first use. This moves the expensive compilation and
optimization process to game or level load time, rather than at draw
time, where we really can't avoid any cycles and don't want to risk
stalling the GPU.
The downside is that we have to guess the non-orthagonal state the
program will have set when it draws with the shader. Previously, we
guessed wrong for nearly every shader, so it wasn't useful. With the
recent SamplerUnits rework and this series, we've either eliminated
state or made smarter guesses, and usually get it right now.
In the L4D2 time demo, I now have 39 fragment shader recompiles and no
vertex shader recompiles. Before this series and the SamplerUnits
rework, I had 206 fragment shader recompiles and 192 vertex shader
recompiles.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This fixes a regression since 76d1301e8e:
I began setting SWIZZLE_XYZW for unused sampler units in the actual
program keys, since this matched the FS precompile behavior. However,
the VS precompile was expecting zero, so that commit made essentially
every vertex shader (even those not using texturing) mismatch and need
to be recompiled.
Setting them in the VS precompile key solves the issue. It also is an
improvement over our old behavior: previously we guessed that vertex
shaders didn't use any textures at all. Now we actually look to see if
the VS had any sampler uniforms and guess based on that.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Eric added support for WM key debugging. This adds it for the VS.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Our previous assumption, SWIZZLE_XYZW, was completely bogus for depth
textures. There are no Y, Z, or W components.
DEPTH_TEXTURE_MODE has three options:
- GL_LUMINANCE: <X, X, X, 1>
- GL_INTENSITY: <X, X, X, X>
- GL_ALPHA: <0, 0, 0, X>
The default value is GL_LUMINANCE, and most applications don't seem to
alter DEPTH_TEXTURE_MODE. Make that our precompile guess.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Now that most things are based on the linker-assigned index, it makes
sense to convert the arrays in the VS/WM program key as well. It seems
silly to leave them indexed by texture unit.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
brw_wm_prog_key's proj_attrib_mask field is designed to enable an
optimization for fixed-function programs, letting us avoid projecting
attributes where the divisor is 1.0.
However, for shaders, this is not useful, and is pretty much impossible
to guess when building the FS precompile key. Turning it off for
shaders should allow the precompile to work and not lose much.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
We probably want to do something more sophisticated here, but this at
least makes it through L4D2 without dumping the program cache.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
We were stomping on the caller's buffer by ignoring their alignment
requests and other pixel store modes. This patch makes the USE_XCB path match
the older one more closely.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52059
Signed-off-by: Julien Cristau <julien.cristau@logilab.fr>
Signed-off-by: Brian Paul <brianp@vmware.com>
Do all pre-draw hiz resolves *after* the renderbuffers are resized by
intel_prepare_render. Otherwise, we may resolve buffers that are
immediately discarded afterwards.
Fixes the assertion failure below when resizing windows in KDE and under
some unknown circumstance in Chrome OS:
intel_resolve_map.c:46: intel_resolve_map_set: Assertion
`(*tail)->need == need' failed.
Also, remove the comment that "resolves must occur [...] before setting up
any hardware state". That was true when resolves were implemented with
meta-ops, but no longer with blorp.
v2:
- Keep brw_predraw_resolve_buffers in its current position, which is
before any brw_context bits are modified. Instead, move the call to
intel_prepare_render.
Note: This is a candiate for the 8.0 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52252
Reported-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
intel_renderbuffer_resolve_hiz checks if rb->mt is null, so there is no
need for the caller to do so.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Fixes piglit fbo-blending-formats.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
In preparation for extending this code, which would make it rather unwieldy in
its current place.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Mostly inspired by r600g commit 4acf71f01e
('r600g: cache shader variants instead of rebuilding v3').
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Could cause build failures if trying to use the macros in certain constructs.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This adds the FMASK and CMASK buffers. They share the same resource
with color data.
COMPRESSION and FAST_CLEAR are always enabled if both FMASK and CMASK are
allocated. We initialize the CMASK to a "compressed" state (not "fast cleared"),
so that we can keep FAST_CLEAR enabled all the time.
Both FMASK and CMASK must be present at the moment. If either one is missing,
the other one is not used.
v2: add cayman regs in the list
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
The original samples positions took samples outside of the pixel boundary,
leading to dark pixels on the edge of the colorbuffer, among other things.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>