The code for growing the memory pool (which is used for storing all of
the global buffers) wasn't working. There seem to be two separate issues
with the memory pool code. The first was the way it was growing the pool.
When the memory pool needed more space, it would:
1. Copy the data from the memory pool's backing texture to system memory.
2. Delete the memory pool's texture
3. Create a bigger backing texture for the memory pool.
4. Copy the data from system memory into the bigger texture.
The copy operations didn't seem to be working, and I suspect that since
they were using fragment shaders to do the copy, that there might have
been a problem with the mixing of compute and 3D state.
The other issue is that the size of 1D textures is limited, and I was
having trouble getting 2D textures to work.
I think these problems will be easier to solve once more code is shared
between 3D and compute, which is why I decided to disable it for now
rather than continue searching for a fix.
The original strategy for handling floating point loads, which was to
lower (f32 load) to (f32 bitcast (i32 load)) wasn't really working. The
main problem was that the DAG legalizer couldn't handle replacing a node
with two results (load) with a node with only one result (bitcast).
Use r600_resource_texture::flished_depth_texture for GPU access, and
allocate it in the VRAM. For transfers we'll allocate texture in the GTT
and store it in the r600_transfer::staging.
Improves performance when flushed depth texture is frequently used by the
GPU, e.g. in Lightsmark (~30%)
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
This function is used when dispatching compute shader in order to avoid
mixing compute and 3D registers in the context's dirty list. This
allows the compute code to resuse 3D functions like evergreen_cb, which
return a struct r600_pipe_state and still have control over when and how
the register writes are emitted.
The start_compute_cs atom initializes some config and context registers
to the values needed for running compute shaders. When a compute shader
is dispatched, this atom is emitted after the start_cs_cmd atom, which
initializes registers that are common to both 3D and compute.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Some packets require the shader type bit (bit 1) to be set when
used for compute shaders. The pkt_flag will be initialized to
RADEON_CP_PACKET3_COMPUTE_MODE for any struct r600_command_buffer used
for dispatching compute shaders and it will be or'd against the result of
the PKT3 macro when adding a new packet to a struct r600_command buffer.
Reviewed-by: Marek Olšák <maraeo@gmail.com>