Previously when considering whether to rematerialize or spill/fill
ssa_1954, we would go for a spill/fill :
vec4 32 ssa_388 = (float32)txf ssa_387 (texture_handle), ssa_86 (coord), ssa_23 (lod), 0 (texture), 0 (sampler)
...
vec1 32 ssa_1953 = load_const (0xbd23d70a = -0.040000)
vec1 32 ssa_1954 = fadd ssa_388.x, ssa_1953
vec1 32 ssa_1955 = fneg ssa_1954
This is because when looking at ssa_1955 the first time, we would
consider ssa_388 unrematerialiable, and therefore all values built on
top of it would be considered unrematerialiable as well.
The missing piece when considering whether to rematerialize ssa_1954
is that we should look at filled values. Now that ssa_388 has been
spilled/filled, we can rebuild ssa_1955 on top of the filled value and
avoid spilling/filling ssa_1955 at all.
This requires a bit more work though. We can't just look at an
instruction in isolation, we need to go through the ssa chains until
we find values we can rematerialize or not.
In this change we build a list of all ssa values involved in building
a given value, up to the point there we find a filled or a
rematerializable value.
In this particular case, looking at ssa_1955 :
* We can rematerialize ssa_388 from its filled value
* We can rematerialize ssa_1953 trivially
* We can rematerialize ssa_1954 because its 2 inputs are rematerializable
* We can rematerialize ssa_1955 because ssa_1954 is rematerializable
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556>
Currently we do something like this :
ssa_0 = ...
ssa_1 = ...
* spill ssa_0, ssa_1
call1()
* fill ssa_0, ssa_1
ssa_2 = ...
ssa_3 = ...
* spill ssa_0, ssa_1, ssa_2, ssa_3
call2()
* fill ssa_0, ssa_1, ssa_2, ssa_3
If we assign the same possition to ssa_0 & ssa_1 in the spilling
stack, then on call2(), we know that those values are already present
in memory at the right location and we can avoid respilling them.
The result would be something like this :
ssa_0 = ...
ssa_1 = ...
* spill ssa_0, ssa_1
call1()
* fill ssa_0, ssa_1
ssa_2 = ...
ssa_3 = ...
* spill ssa_2, ssa_3
call2()
* fill ssa_0, ssa_1, ssa_2, ssa_3
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556>
For a follow up optimization, we would like to track scratch loads.
This isn't possible with global load/store intrinsics. So use a couple
of special intrinsic in the pass and only lower it to global
intrinsics at the end.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556>
When moving code into the main block or loop blocks, put the code into
its own :
if(true) { ... }
block so that we avoid break/continue/return issues.
v2: Also take care of the main block with return instructions
v3: Make deletion more obvious with dummy if blocks (Jason)
v4: Fixup assert for loops (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dfb240b1f ("nir: Add raytracing shader call lowering pass.")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16036>
When moving code from below to the insertion cursor point, if the
cursor points to a jump instruction, don't bother inserting the code.
It would break the break/continue assumptions of NIR and would not be
executed anyway.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dfb240b1f ("nir: Add raytracing shader call lowering pass.")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16036>
Stop using nop instructions which are causing issues with
break/continue, instead use a nir_cursor (which brings its share of
pain).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dfb240b1f ("nir: Add raytracing shader call lowering pass.")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16036>
Helpful when lost in a sea of NIR :)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15887>
v2: Add helper for acceleration->root_node computation (Caio)
v3: Update comment on "done" bit (Caio)
Remove progress bool value for impl function (Caio)
Don't use nir_shader_instructions_pass to search the shader (Caio)
v4: Rename variable for if/else block (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
This will allow to reuse the same intrinsic for various topology based
ID.
v2: fix intrinsic comment (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13719>
Replace load_mesh_global_arg_addr_intel with a more general intrinsic
load_mesh_inline_data_intel, since inline data now hold both
a pointer descriptor information and the first few push constants.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14788>
Static struct bitset was renamed to brw_bitset as a struct bitset
is defined in sys/_bitset.h included by pthread_np.h on FreeBSD that
is indirectly included by src/intel/compiler/brw_nir_lower_shader_calls.c
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11203>
Really copying Jason's pass.
Changes:
- Instead of all the intel lowering introduce rt_{execute_callable,trace_ray,resume}
- Add the ability to use scratch intrinsics directly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10339>