This will allow use to use ralloc_parent() on the info field and fix
a regression in nir_sweep() caused by e1af20f18a.
This is intended to be a temporary requirement that will be removed
when we finish separating shader_info from nir_shader.
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows us to use ralloc_parent() to see which data structure owns
shader_info which allows us to fix a regression in nir_sweep().
This will also allow us to move some fields from gl_linked_shader to
gl_program, which will allow us to do some clean-ups like storing
gl_program directly in the CurrentProgram array in gl_pipeline_object
enabling some small validation optimisations at draw time.
Also it is error prone to depend on the gl_linked_shader for
programs in current use because a failed linking attempt will free
infomation about the current program. In i965 we could be trying
to recompile a shader variant but may have lost some required fields.
Reviewed-by: Eric Anholt <eric@anholt.net>
This limitation was initially here because AMD_performance_monitor
doesn't allow to expose the real number of hardware counters. But
this actually really annoying when profiling with qapitrace.
Anyways, performance counters are mostly for developers and
failures are expected if you try to monitor more queries than
supported.
This breaks amd_performance_monitor_measure but it's expected.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Event not_predicated_off_thread_inst_executed is SM35+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Dolphin tried to use this, but we hadn't had any tests for it properly.
All that is required is the shader output format needs to be set
for 0 and 1 exports.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
As this array was not actually sorted, FindGLXFunction's binary search
would only sometimes work.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Before we were caching the prog data but we weren't doing anything with
brw_stage_prog_data::param so anything with push constants wasn't getting
cached properly. This commit fixes that.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
While we can simply calculate offsets to get to things such as the
prog_data and the key, it's much more user-friendly if there are just
pointers. Also, it's a bit more fool-proof.
While we're at it, we rework the pipeline cache API to use the
brw_stage_prog_data type directly.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
The case where we just want the loop to continue is INCOMPATIBLE_DRIVER
because that simply means that whatever FD we opened isn't a supported
Intel chip. Other error codes such as OUT_OF_HOST_MEMORY are actual errors
and we should be returning early in that case.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Without this fix, the function would still end up returning NULL but it
would put that NULL connection in the hash table which would be bad.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Note LOCAL_CFLAGS and LOCAL_SHARED_LIBRARIES in Android.common.mk
are used by both host and target modules. However, commit 112e988
moved libdrm related flags to common. It causes the errors like:
error: 'out/host/linux-x86/obj32/SHARED_LIBRARIES/libdrm_intermediates/export_includes',
needed by 'out/host/linux-x86/obj32/EXECUTABLES/mesa_gen_matypes_intermediates/import_includes',
missing and no known rule to make it
No reason to use libdrm with host modules.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Fixes: 112e988329 ("Android: move libdrm settings to top-level
Android.common.mk")
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This is required when an out argument involves an array index that is either
a global variable modified by the function or another out argument in the
same function call.
Fixes the shaders/out-parameter-indexing/vs-inout-index-inout-* tests.
v2:
- modify the ir_dereference_array nodes in place
- use ir_hierarchical_visitor
v3: use base_ir (Ian Romanick)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
As previously written, these opcodes use the SM5 semantics which is
incompatible with GLSL when bits == 0, offset == 32.
At some point we may want to add BFI_SM5 etc. opcodes, but all users
currently either want (and expect!) the GLSL semantics or don't care.
Bitfield inserts are generated by the GLSL lower_instructions and
lower_packing_builtins passes with constant bits and offset arguments,
so any workaround code that drivers may have to emit to follow GLSL
semantics should be optimized away easily for those uses.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
I missed this when I added the xlib code, this allows
dolphin emu to start and crash later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This makes more sense than OUT_OF_HOST_MEMORY. Technically, you can
recover from a failed execbuf2 but the batch you just submitted didn't
fully execute so things are in an ill-defined state. The app doesn't want
to continue from that point anyway.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.
This was because we were using the vertex array size when emitting vertices
to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits)
for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems
when mapping the input variables to registers because, for deciding which
registers contain the values uploaded for a certain variable, we use the size
and type given to the variable in the shader, so we will be assigning 256-bits
to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened
when the vertex array size was <= 2.
The patch uses the shader information to only emit as 128-bits those 64-bit floating
point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and
dvec4 variables will be always uploaded as 256-bits, independently of the <size> given
to the VertexAttribL* command.
From the ARB_vertex_attrib_64bit specification:
"For the 64-bit double precision types listed in Table X.1, no default
attribute values are provided if the values of the vertex attribute variable
are specified with fewer components than required for the attribute
variable. For example, the fourth component of a variable of type dvec4
will be undefined if specified using VertexAttribL3dv or using a vertex
array specified with VertexAttribLPointer and a size of three."
We are filling these unspecified components with zeros, which coincidentally is
also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects.
v2: Do not use bitcount (Kenneth Graunke)
Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We require 12 bytes of headers but in some cases we just need 4.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>