All of these (bug titles, patch titles, features, and people's names)
can contain characters that are not valid html. Just escape everything
for safety.
Fixes: 86079447da
("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
oops.
Fixes: 3226b12a09
("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Fixes: 3226b12a09
("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
I made a bad assumption; I assumed this would be run in the release
branch. But we don't do that, we run in the master branch. As a result
we need to pass the version as an argument.
Fixes: 3226b12a09
("release: Add an update_release_calendar.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Which is very likely .Z > 0 releases.
Fixes: 86079447da
("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
If they use the `Fixes: #1` form.
Fixes: 86079447da
("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Previously this would result in the .0 warning be generated for .z > 0
and the .z == 0 would get the other message.
Fixes: 86079447da
("scripts: Add a gen_release_notes.py script")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
After the discussion in
https://github.com/KhronosGroup/OpenGL-API/issues/45
the section 8.17 (texture completeness) of the OpenGL 4.6 core profile
was changed to explicitly say that multisample texture completeness
ignores filter state of the texture.
"Using the preceding definitions, a texture is complete unless any of the
following conditions hold true:
...
- The minification filter requires a mipmap (is neither NEAREST nor LINEAR),
the texture is not multisample, and the texture is not mipmap complete.
- The texture is not multisample; either the magnification filter is not
NEAREST, or the minification filter is neither NEAREST nor NEAREST_-
MIPMAP_NEAREST; and any of
– The internal format of the texture is integer (see table 8.12).
– The internal format is STENCIL_INDEX.
– The internal format is DEPTH_STENCIL, and the value of DEPTH_-
STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX."
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
MAX_VARYINGS_INCL_PATCH subtracts VARYING_SLOT_VAR0 giving us a size
that's too small, so BITSET_SET writes words out of bounds, corrupting
the stack and causing all kinds of chaos. VARYING_SLOT_TESS_MAX is
the right value to use here, as it's the largest location.
Closes: 2002
Fixes: ee2050b111 ("nir: Use BITSET for tracking varyings in lower_io_arrays")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
[12/60] Compiling C object 'src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o'.
FAILED: src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o
[...]
../src/gallium/auxiliary/rbug/rbug_texture.c: In function 'rbug_send_texture_info_reply':
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: error: implicit declaration of function 'alloca'; did you mean 'malloc'? [-Werror=implicit-function-declaration]
uint32_t *height = alloca(sizeof(uint32_t) * height_len);
^~~~~~
malloc
../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
../src/gallium/auxiliary/rbug/rbug_texture.c:303:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
uint32_t *depth = alloca(sizeof(uint32_t) * height_len);
^~~~~~
cc1: some warnings being treated as errors
Include c99_alloca.h to portably make the alloca() prototype available.
See also: 498d9d0f, adfb9c5c, fc8139b1
Fixes: 6174cba7 ("rbug: fix transmitted texture sizes")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Rather than supplying a mask/swizzle to compose with the original, just
supply the offset of the allocated register so we can directly offset
the mask/swizzle, without resorting to composition.
This is simpler, cleaner, and will generalize to non-32-bit.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
GFX10 hazards require a different approach compared to previous
generations, for example it doesn't need s_nop, and most hazards
can't be solved by adding NOPs at all. Also, they are not
resolved by branch instructions.
This commit reorganizes aco_insert_NOPs so that there is now a
separate pass for GFX10. The new GFX10 pass also respects the
control flow of the shader.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
This commit refines the VMEMtoScalarWriteHazard mitigation, based
upon a closer look at what LLVM does. Also changes the code to
match the structure of the other hazard mitigations.
* The hazard is not only triggered by VMEM, FLAT and GLOBAL
but also SCRATCH and DS instructions.
* The SMEM/SALU instructions only cause a hazard when they
write a register that the VMEM/etc. are reading.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
There is a hazard caused by there is a branch between a
VMEM/GLOBAL/SCRATCH instruction and a DS instruction.
This commit adds a workaround that avoids the problem.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
There is a hazard that happens when an SMEM instruction
reads an SGPR and then a VALU instruction writes that same SGPR.
This commit adds a workaround that avoids the problem.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
There is a hazard when a non-VALU instruction reads the EXEC mask
and then a VALU instruction writes the EXEC mask.
This commit adds a workaround that avoids the problem.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Any permlane instruction that follows any VOPC instruction can cause a hazard,
this commit implements a workaround that avoids this causing a problem.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
ACO currently mitigates VMEMtoScalarWriteHazard and Offset3fBug
(names from LLVM). There are some bugs that ACO needn't care about.
Just to be on the safe side, add an assertion that makes sure
that we aren't hit by FlatSegmentOffsetBug.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
From the Vulkan spec 1.1.126 :
"VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY_KHR specifies
that shader float controls for 32-bit floating point can be set
independently; other bit widths must be set identically to each
other."
Forgot to update this when I enabled that extension recently.
Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls.independence_settings.independence_setting
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This saves one return and a simple benchmark which calls glGetString
repeatedly on my desktop shows it improves calls per second from 118M
to 128M.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Most GPU require the sample count is power of 2. Just remove those
formats with unusual sample count. This decreases dEQP EGL tests run
time a lot.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
MAX_VARYINGS_INCL_PATCH is greater than 64, so we'll need more that 64
bits (per component) to track which vars have indirects. This pass was
trying to track patch varyings (which start at bit 63) in a separate
64 bit word, but failed to subtract VARYING_SLOT_PATCH0 and accessed
out of bounds.
Do away with the ad-hoc bit mask tracking and just use a BITSET.
Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.per_patch_block.vertex_io_array_size_implicit.triangles
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
In some cases, in particular when you have things that can be src
modifiers ((abs)/(neg)), once eliminating one mov, there is a
possibility to remove another. Handle this by re-visiting an
instruction after eliminating a copy on one of it's srcs.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
These date back to relatively early days of ir3, when a lot was still
not well understood. But according to CI (and what I've seen blob
driver do), these are not actually real restrictions.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Now that we fixed the sharp edges that this was papering over, we can
relax the restriction about eliminating a mov coming out of a fanout
(for example from result of texture fetch).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This avoids copy-propagating a high register into an instruction which
cannot consume it.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
We did this properly already for split/fanout. But collect was missed.
Extract out a helper to share.
This way we avoid copy propagating a mov from high or half reg into an
instruction which cannot consume a high/half reg.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
1) deduplicate IR3_SHADER_DEBUG=disasm versus fs/vs/etc handling
2) standardize shader stage name prints, in particular VERT vs BVERT
3) don't mix stderr and stdout
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Avoid keeping track of the idx and all possible image operands for
each operation. Note for convenience we split up the handling of
ImageOperandsOffsetMask and ImageOperandsConstOffsetMask.
Suggested by Jason Ekstrand.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Change the information to also include the category, so that the
particulars of BitEnum enumeration can be handled in the template.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Emit barriers with semantics matching the access operand and the
storage class of the pointer.
v2: Fix order of visible / available emission relative to the
operations. (Bas)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Set the memory semantics and scope for later emitting the barrier.
Note the barrier emission code already exist in vtn_handle_image for
the Image atomics.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>