We might want to resolve something to be in a particular register,
so we can access it outside of the gen_mi framework...but it may already
be in that register, at which point there's no work to do.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Make sure that a <group> tag within another <group> tag work just fine.
v2: rename 'halfbyte' to 'byte' to match the size (Lionel).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now we can decode a <group> tag inside another <group> tag, and properly
print its indices and content.
v2: Use push/pop stack to fields, groups and iters (Lionel).
v3: Add assert(iter->level < DECODE_MAX_ARRAY_DEPTH) (Lionel).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We currently use the group->next pointer to iterate through the <group>
tags. This change them to be a type of field, so we can descend into
them while iterating, and then go back to the original position. Will be
useful when we want to decode <group>'s inside <group>'s, and when there
are more <field>'s after a <group> tag.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
A gen_group (group in most of the code) can be of several types:
- instruction
- struct
- register
- group (?!?)
The <group> tag actually represents an array of elements. So at least
in our code, lets call it an array to avoid confusion with gen_group.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Refactor the code from gen_spec_load_from_path() into a separate
function, that can be used with a xml file that doesn't fit the genX.xml
filename format.
Will be used soon for implementing unit tests for gen_decoder.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
When using gen_spec_load_from path, only abort decoding if the read
length is 0. Previously, we were aborting if finding an EOF, even if
something was read from the file.
Also only kill the decoded file if no commands or structs were found,
and print a message in such case.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This makes the following packets use actual driver provided sizes rather
than guessing an arbitrary number:
- CC_VIEWPORT
- SF_CLIP_VIEWPORT
- BLEND_STATE
- COLOR_CALC_STATE
- SCISSOR_RECT
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
libintel_common depends on libintel_compiler, but it contains debug
functionality that is needed by libintel_compiler. Break the circular
dependency by moving gen_debug files to libintel_dev.
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Drivers using genxml will start compilation before generated files are
created, so add a dependency to it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
IGT has a test to hang the GPU that works by having a batch buffer
jump back into itself, trigger an infinite loop on the command stream.
As our implementation of the decoding is "perfectly" mimicking the
hardware, our decoder also "hangs". This change limits the number of
batch buffer we'll decode before we bail to 100.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
An MI_BATCH_BUFFER_START in the ring buffer acts as a second level
batchbuffer (aka jump back to ring buffer when running into a
MI_BATCH_BUFFER_END).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Some commands like MI_BATCH_BUFFER_START have this indicator.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
According to the loop implementation (in 'ctx_print_buffer' function),
which advances dword by dword over vertex buffer(vb),
the vb size should be aligned by 4 bytes too.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
It should be incremented by one according to
how it is calculated by 'emit_vertex_buffer_state':
"\#if GEN_GEN < 8
.BufferAccessType = step_rate ? INSTANCEDATA : VERTEXDATA,
.InstanceDataStepRate = step_rate,
\#if GEN_GEN >= 5
.EndAddress = ro_bo(bo, end_offset - 1),
\#endif
\#endif"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Use L3 configuration specified in h/w specification.
V2: Drop configs which do under allocation of l3 cache.
Bump up the comment above table.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
The engine to which the batch was sent to is now set to the decoder context when
decoding the batch. This is needed so that we can distinguish between
instructions as the render and video pipe share some of the instruction opcodes.
v2: The engine is now in the decoder context and the batch decoder uses a local
function for finding the instruction for an engine.
v3: Spec uses engine_mask now instead of engine, replaced engine class enums
with the definitions from UAPI.
v4: Fix up aubinator_viewer (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Preliminary work for adding handling of different pipes to gen_decoder. Each
instruction needs to have a definition describing which engine it is meant for.
If left undefined, by default, the instruction is defined for all engines.
v2: Changed to use the engine class definitions from UAPI
v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to
engine_mask, added check for incorrect engine and added the possibility to
define an instruction to multiple engines using the "|" as a delimiter in the
engine attribute.
v4: Fixed the memory leak.
v5: Removed an unnecessary ralloc_free().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
There are some cases where the VS is the only stage enabled, it uses the
entire URB, and the URB is large enough that placing later stages after
the VS exceeds the number of bits for "URB Starting Address".
For example, on Icelake GT2, "varying-packing-simple mat2x4 array" from
Piglit is getting a starting offset of 128 for the GS/HS/DS. But the
field is only large enough to hold an offset of 127.
i965 doesn't hit any genxml assertions because it's still using the old
OUT_BATCH mechanism. 128 << GEN7_URB_STARTING_ADDRESS_SHIFT (57) == 0,
with the extra bit falling off the end. So we place the disabled stage
at the beginning of the URB (overlapping with push constants). This is
likely okay since it's a zero size region (0 entries).
It seems like the Vulkan driver might hit this assertion, however, and
the situation seems harmless. To work around this, always place
disabled stages at the start of the URB, so the last enabled stage can
fill the remaining space without overflowing the field.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Use the 'DWord Length' and 'bias' fields from the instruction definition to
parse the packet length from the command stream when possible. The hardcoded
mechanism is used whenever an instruction doesn't have this field.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This function was there when the file was introduced in commit
38f10d5a03 "intel: tools: add aubinator viewer", but was
never actually used.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Pointer arithmetic...
v2: s/4/sizeof(uint32_t)/ (Eric)
v3: Give bytes to print_batch() in error_decode (Lionel)
Make clear what values we're dealing with in error_decode (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
STATE_BASE_ADDRESS only modifies various bases if the "modify" bit is
set. Otherwise, we want to keep the existing base address.
Iris uses this for updating Surface State Base Address while leaving the
others as-is.
v2: Also update aubinator_viewer_decoder (caught by Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
construct correct gen xml filename when we try to load hardware xml
description from a given path
v2: remove temporary variable (Francesco Ansanelli)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The "gen_group_get_length" function can return a negative value
and it can lead to the out of bounds group_iter.
v2: printing of "unknown command type" was added
v3: just the asserts are added
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gen7.5 has a BLEND_STATE of size 0 which includes a variable length
group. We did not deal with that very well, leading to an endless
loop.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Instead of printing addresses like everyone else, we were accidentally
printing the offset from state base address. Also, state_map is a void
pointer so we were incrementing in bytes instead of dwords and every
state other than the first was wrong.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
First of all, setting iter->name in advance_field is unnecessary because
it gets set by gen_decode_field which gets called immediately after
gen_decode_field in the one call-site. Second, we weren't properly
initializing start_bit and end_bit in the initial condition of
gen_field_iterator_next so the first field of a struct would get printed
wrong if it doesn't start on the first bit. This is fixed by adding a
iter_start_field helper which sets the field and also sets up the other
bits we need. This fixes decoding of 3DSTATE_SBE_SWIZ.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>