These enumerations are simply log2 of the number of multisamples shifted
by a bit, so we can calculate them using ffs() in a lot less code.
Suggested by Eric Anholt.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The const in
const unsigned foo(void);
is meaningless. Removing it silences this warning:
src/glsl/ast_to_hir.cpp:1802:56: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:
"In addition, all identifiers containing two consecutive underscores
(__) are reserved as possible future keywords."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Names simply containing __ are dangerous to use, but should
be allowed.
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:
"All macro names containing two consecutive underscores ( __ ) are
reserved for future use as predefined macro names. All macro names
prefixed with "GL_" ("GL" followed by a single underscore) are also
reserved."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error. Names simply
containing __ are dangerous to use, but should be allowed. In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
Linking with LLVM static libraries is easily broken by changes to
the llvm-config program or when LLVM adds, removes, or changes library
components. Keeping up with these changes requires a lot of maintanence
effort to keep the build working on the master and stable branches.
Also, because of issues in the past LLVM static libraries, the release
manager is currently configuring with --with-llvm-shared-libs when
checking the build before release. Enabling shared libraries by
default would allow the release manager to run ./configure with
no arguments, and be reasonably confident that the build would succeed.
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Useful because the total number of uniform components might exceed
MAX_UNIFORMS * 4 in some cases because of the image metadata we'll be
passing as push constants.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Like the VEC4 back-end does. It will make dynamic allocation of the
param_size array easier in a future commit.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Since we are now consuming two ringbuffers at a time, we probably want a
pool larger than 4.. but we don't need each individual ringbuffer to be
so large, so offset the pool size increase by reducing rb size.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
It seems the write-after-read hazard that applies to texture fetch
instructions, also applies to sfu instructions.
Also, cat5/cat6 instructions do not have a (ss) bit, so in these
cases we need to insert a dummy nop instruction with (ss) bit set.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
There doesn't seem to be any reason for it to be a method, and it's
surprising that the expression 'reg.retype(t)' doesn't retype its
object but rather it creates a temporary with the new type. Use
'retype(reg, t)' instead.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Add assertion that the register is not in the HW_REG or IMM file,
calculate the conjunction of the old and new mask instead of replacing
the old [consistent with the behavior of brw_writemask(), causes no
functional changes right now], make it static inline to let the
compiler do a slightly better job at optimizing things, and shorten
its name.
v2: Assert that the new writemask is not zero to avoid undefined
hardware behaviour.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
And define non-mutating helper functions to retype fixed and normal
regs with a common interface. At some point we may want to get rid of
::fixed_hw_reg completely and have fixed regs use the normal register
data members (e.g. backend_reg::reg to select a fixed GRF number,
src_reg::swizzle to store the swizzle, etc.), I have the feeling that
this is not the last headache we're going to get because of the
multiple ways to represent the same thing and the different register
interface depending on the file a register is stored in...
Reviewed-by: Paul Berry <stereotype441@gmail.com>
There doesn't seem to be any reason for nr_params, nr_pull_params,
param, and pull_param to be duplicated in the stage-specific
subclasses of brw_stage_prog_data. Moving their definition to the
common base class will allow some code sharing in a future commit, the
removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and
the simplification of the stage-specific brw_*_prog_data_compare.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Broadwell's 3DSTATE_WM_HZ_OP packet makes this much easier.
Instead of programming the whole pipeline, we simply have to emit the
depth/stencil packets, a state override, and a pipe control. Then
arrange for the state to be put back. This is easily done from a single
function.
v2: Use minify(mt->logical_{width,height}0, level) in 3DSTATE_WM_HZ_OP
instead of intel_mipmap_level's width/height fields. Those were
based on the physical width/height, and thus wrong for MSAA buffers.
Eric also deleted those fields.
v3: Use 0xFFFF as the sample mask regardless of what the user set (as
this operation is unrelated); set the drawing rectangle to the
miplevel being operated on, rather than the whole surface; remove
unnecessary MAX2(..., 1) around mt->logical_depth0 (all suggested
by Eric Anholt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The existing code followed the vtable function signature, which is not a
great fit: many of the parameters are unused, and the function still
inspects global state, making it less reusable.
This patch refactors the depth buffer packet emission code into a new
function which takes exactly the parameters it needs, and which uses no
global state. It then makes the existing vtable function call the new
one.
Ideally, we would remove the vtable function, and clean up that
interface. But that can happen once HiZ is working.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Broadwell's "HiZ Resolve" operation still has the restriction that the
rectangle primitive must be 8x4 aligned. So I believe we still need
this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
HiZ buffers still don't exist, but when they do, we'll set them up.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
brw_depthbuffer_format is not very reusable at the moment, since it
uses global state (ctx->DrawBuffer) to access a particular depth buffer.
For HiZ on Broadwell, I need a function which simply converts the
formats. However, at least one existing user of brw_depthbuffer_format
really wants the existing interface. So, I've created a new function.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This reverts commit 1456ed85f0.
_eglInitResource can and is supposed to be called on subclass objects.
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
This reverts commit 498d10e230.
_eglInitResource can and is supposed to be called on subclass objects.
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Even with the other limits raised, TestProxyTexImage would still reject
textures > 1GB in size. This is an artificial limit; nothing prevents
us from having a larger texture. I stayed shy of 2GB to avoid the
larger-than-aperture situation.
For 3D textures, this raises the effective limit:
- RGBA8: 645 -> 738
- RGBA16: 512 -> 586
- RGBA32F: 406 -> 465
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Gen4+ supports 8192x8192 cube maps. Ivybridge and later can actually
support 16384, but that would place GL_MAX_CUBE_MAP_TEXTURE_SIZE above
GL_MAX_TEXTURE_SIZE, which seems like a bad idea.
(Unfortunately, we can't bump GL_MAX_TEXTURE_SIZE to 16384 without
causing regressions due to awful W-tiled stencil buffer interactions.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
It's highly unlikely that there will be enough memory in the system to
allocate enough space for this, but we should still expose the hardware
limit. It's what the Intel Windows driver does, and it seems most other
vendors do likewise.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment shader) is absent. So,
the extension shouldn't change the behavior specified in GLSL
specification.
Tested the behavior on proprietary linux drivers of NVIDIA and AMD.
Both of them allow linking a version 100 shader program in OpenGL
context, when one of the shaders is absent.
Makes following Khronos CTS tests to pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This drops the MOVs for header setup, which are totally mis-scheduled.
total instructions in shared programs: 1590047 -> 1589331 (-0.05%)
instructions in affected programs: 43729 -> 43013 (-1.64%)
GAINED: 0
LOST: 0
glb27-trex:
x before
+ after
+-----------------------------------------------------------------------------+
| + x xx + + + |
| ++ + xxx ++x xx + ** *x+ + + + x * |
|+x xx x* x+++xx*x*xx+++*+*xx++** *x* x+***x*+xx+* + * + + *|
| |__|__________MA___A___________|___| |
+-----------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 49 62.33 65.41 63.49 63.53449 0.62757822
+ 50 62.28 65.4 63.7 63.6982 0.656564
No difference proven at 95.0% confidence
Reviewed-by: Matt Turner <mattst88@gmail.com>
It often confused people because it was unclear on whether it was the
physical or logical, and people needed the other one as well. We can
recompute it trivially using the minify() macro, clarifying which value is
being used and making getting the other value obvious.
v2: Fix a pasteo in intel_blit.c's dst flip.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Since only window system renderbuffers can have a singlesample_mt, this
lets us drop a bunch of sanity checking to make sure that we're just a
renderbuffer-like thing.
v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial
helper function for set_needs_downsample.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The only DRI2 vs DRI3 delta was just how to decide about frontbuffer-ness
for doing the upsample.
v2: Fix missing singlesample_mt->region->name update in the merged code,
which would have broken the DRI2 don't-recreate-the-miptree
optimization.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Pretty silly to pass in values dereferenced out of one of the arguments.
v2: Get the destination size from the dst, even though the callers are
always dealing with src size == dst size cases.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
So far it's happened to be that we're only ever calling
intel_miptree_blit() (up/downsampling) from the ReadBuffer, but I stumbled
over a null ReadBuffer case when debugging later parts of the series.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Mostly mt->target == 2D_MS just results in a few checks that we don't try
to allocate multiple LODs and don't try to do slice copies with them. But
with the introduction of binding renderbuffers to textures, we need more
consistency.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This lets us simplify our shaders, and rely on GLES-prohibited
functionality (like ARB_texture_multisample) when writing these
driver-internal functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Compare this VS to the one for the post-130 case. Fixes piglit
glsl-lod-bias, and presumably tons of other code (I haven't done a full
piglit run on swrast).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>