fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 04:58:05 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	1c89e098e8	i965/fs: Make null_reg_* const members of fs_visitor instead of globals We also set the register width equal to the dispatch width. Right now, this is effectively a no-op since we don't do anything with it. However, it will be important once we add an actual width field to fs_reg. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	ab7234c852	i965/fs: Use the var_from_vgrf helper function instead of doing it manually Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	c24dd54f97	i965/fs: Fix a bug with dead_code_eliminate on large writes Previously, if an instruction wrote to more than one register, we implicitly assumed that it filled the entire register. We never hit this before because the only time we did multi-register writes was things like texturing which always wrote to all of the registers. However, with the upcoming ability to do 16-wide instructions in SIMD8 and things of that nature, we can have multi-register writes at offsets and we'll hit this. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	1385a4b706	i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions Using a floating-point type doesn't usually cause hangs on my HSW, but the simulator complains about it quite a bit. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	f0d43c09b2	i965/fs: Use offset a lot more places We have this wonderful offset() function for advancing registers, but we're not using it. Using offset() allows us to do some sanity checking and avoid manually touching fs_reg::reg_offset. In a few commits, we will make offset do even more nifty things for us. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	0089d025aa	i965/fs: fix a comment in compact_virtual_grfs Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	3dc3fccb75	i965/fs: Rewrite fs_visitor::split_virtual_grfs The original vgrf splitting code was written with the assumption that vgrfs came in two types: those that can be split into single registers and those that can't be split at all. It was very conservative and bailed as soon as more than one element of a register was read or written. This won't work once we start allowing a regular MOV or ADD operation to operate on multiple registers. This rewrite allows for the case where a vgrf of size 5 may appropriately be split into one register of size 1 and two registers of size 2. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	f9da0740e2	i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	75afe17b79	i965/fs: Manually generate the meta fast-clear shader Previously, we were generating the fast-clear shader from GLSL. The problem is that fast clears require that we use a replicated write rather than a regular write instruction. In order to get this we had a complicated and somewhat fragile optimization pass that looked for places where we can use a replicated write and used it. Since replicated writes have a lot of restrictions, we only ever use them for fast-clear operations. This commit replaces the optimization pass with a function that just generates the shader we want. This is a) less code, b) less fragile than the optimization pass, and c) generates a more efficient shader. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-30 10:29:13 -07:00
Michel Dänzer	61128d7507	radeonsi: Pass the slice size to si_dma_copy_buffer Otherwise some parts of tiled slices can be missed. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Michel Dänzer	74aeccd701	radeonsi: Catch more cases that can't be handled by si_dma_copy_buffer/tile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Michel Dänzer	d17b85524d	radeonsi: Fix si_dma_copy(_tile) for compressed formats Fixes GPUVM faults when running the piglit test "getteximage-formats init-by-rendering" with R600_DEBUG=forcedma on SI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Michel Dänzer	761d80ddab	radeonsi: Fix tiling mode index for stencil resources We are currently only dealing with depth-only or stencil-only resources here, not with resources having both depth and stencil[0]. In both cases, the tiling mode index is in the tile_mode field, not in the stencil_tile_mode field. [0] Add an assertion for that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Chia-I Wu	594e1a2f4b	ilo: fix format of edge flag pointer The VE format of edge flag pointers was changed in `780ce576bb`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:32 +08:00
Chia-I Wu	2d13b5ac81	ilo: add a pass to finalize ilo_ve_state Add finalize_vertex_elements() to finalize ilo_ve_state. This fixes a potential issue with URB entry allocation for VS and moves the complexity of gen6_3DSTATE_VERTEX_ELEMENTS() to the new function. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:32 +08:00
Chia-I Wu	2b4c8ffc30	ilo: precalculate aligned depth buffer size To replace the hacky zs_align_surface(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:31 +08:00
Chia-I Wu	343b014b57	ilo: use dynamic bo for rectlist vertices The size is always 24 bytes. We can upload them to the dynamic buffer. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:31 +08:00
Thomas Hellstrom	46537f1d03	st/xa: Fix regression in xa_yuv_planar_blit() Commit "st/xa: scissor to help tilers" broke xa_yuv_planar_blit() and vmwgfx textured video. Fix this by implementing scissors also in the yuv draw path. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Cc: Rob Clark <robclark@freedesktop.org> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-30 08:31:33 +02:00
Kenneth Graunke	68627235f2	i965: Delete intel_chipset.h. Unused; it was replaced by include/pci_ids/i965_pci_ids.h long ago. Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-29 20:10:00 -07:00
Alex Henrie	3bea907797	driconf: Correct and update Catalan translation Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-09-29 17:45:41 -07:00
Alex Henrie	33a7d0d040	driconf: Update Spanish translation Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-09-29 17:45:26 -07:00
Alex Henrie	3b34b876f4	driconf: Synchronize po files Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-09-29 17:45:10 -07:00
Eric Anholt	4ceaad14ff	vc4: Don't try to do stores to buffers that aren't bound. The code was kind of mixed up what buffers were getting stored in the case that a resolve bit was unset (which are set based on the GL state at draw time) and the buffer wasn't actually bound. In particular, depth-only rendering would store the color buffer contents, which happen to be pointing at the depth buffer. Thanks to clearing out the resolve bits for things we really can't resolve, now I can drop the safety checks for buffer presence around the actual stores. Fixes 42 piglit tests.	2014-09-29 17:44:15 -07:00
Eric Anholt	1d42aa8358	vc4: Shove some depth comparison bits down to where they're used.	2014-09-29 17:44:15 -07:00
Matt Turner	66ab9c22fe	i965: Use BRW_MATH_DATA_SCALAR when source regioning is scalar. Notice the mistaken (but harmless) argument swapping in brw_math_invert(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-29 15:59:19 -07:00
Matt Turner	a0df258f89	i965/compaction: Move variable declarations to their uses. Tested-by: Mark Janes <mark.a.janes@intel.com>	2014-09-29 15:59:16 -07:00
Matt Turner	a36631b74c	i965/compaction: Simplify jump target code. My attempts to clarify the code with _compacted/_uncompacted prefixed variables apparently failed. Hopefully this is clearer. In any case, the previous code wasn't clear enough to gcc to let it optimize division by a power of two into a shift. No problems now. Also, the previous code (in the ADD case) didn't work on 32-bit x86, due to a complicated set of interactions best summed up as unsigned division and compiler optimizations. Tested-by: Mark Janes <mark.a.janes@intel.com>	2014-09-29 15:58:57 -07:00
Rob Clark	dce96f6da2	freedreno/a3xx: re-emit shaders on variant change We need to keep track if a state change other than frag/vert shader state will trigger us to need a different shader variant, and if necessary mark the appropriate shader state as dirty. Otherwise we will forget to re-emit the shader state. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	3aaab87563	freedreno/ir3: add some cmdline args Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	7cdd467994	freedreno/a3xx: add support to emulate GL_CLAMP Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	3541705816	freedreno: add texcoord clamp support to lowering This is for hw that needs to emulate some texture wrap modes (like CLAMP) with some help from the shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	a6746d1124	freedreno: move bind_sampler_states to per-generation Keep the existing function as a common helper. But this lets us move an a2xx specific hack out of common code. And the PIPE_TEX_WRAP_CLAMP emulation will require an a3xx specific hack. So rather than piling on hacks, split this out. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:42 -04:00
Rob Clark	7e20c09d4a	freedreno/a3xx: fix border color order Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:42 -04:00
Rob Clark	c61133046e	freedreno/a3xx: add 32bit integer vtx formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:42 -04:00
Eric Anholt	fc4b5b85ce	vc4: Add support for GL 1.1's stupid CLAMP mode. We just clamp the incoming texture coordinates. This breaks the lambda calculation, but it gets the piglit tests to pass. This is the same behavior as in i965.	2014-09-29 14:12:33 -07:00
Eric Anholt	ae22f5aa14	vc4: Add support for texture border color. One spot in the docs says that it's stored at a miplevel just beyond the last miplevel, which was scary. But really, you just load it as the R coordinate (which conflicts with cubemaps, but you don't do border clamping on cubes).	2014-09-29 13:48:08 -07:00
Eric Anholt	b65761f764	vc4: Add the necessary stubs for occlusion queries. We have to expose them for GL 2.0, but we just always return a value of 0. We should be advertising 0 query bits instead of 64, but gallium doesn't have plumbing for that yet. At least this stops the segfaults.	2014-09-29 11:51:09 -07:00
Eric Anholt	76cd9955d9	vc4: Optimize out silly SUBs of 0. Drops instructions on vs-temp-array-mat4-index-col-row-wr.shader_test, which I was looking at because it's failing to register allocate.	2014-09-29 11:33:34 -07:00
Eric Anholt	64122b16ce	vc4: Dump constant uniform values in VC4_DEBUG=qir. Definitely helps when trying to understand and optimize a program.	2014-09-29 11:33:34 -07:00
Eric Anholt	3311513041	vc4: Turn a SEL_X_Y(x, 0) into SEL_X_0(x). This may reduce register pressure and uniform counts. Drops a bunch of 0 uniform loads on vs-temp-array-mat4-index-col-row-wr.shader_test, which is failing to register allocate.	2014-09-29 11:33:34 -07:00
Eric Anholt	730267eb23	vc4: Add support for texture cube maps. It's not passing some of the piglit tests, because it looks like at small miplevels some contents from surrounding faces are getting filtered in at the corners. It does get 7 new tests passing.	2014-09-29 11:29:28 -07:00
Eric Anholt	c4245d8b2e	vc4: Rename the slice's size0. In the other related fields, "0" refers to the size of the first miplevel, while this is a field in a slice. The other implicit slices we have (cubemap layers) don't vary in size compared to the first one.	2014-09-29 11:26:43 -07:00
Eric Anholt	7a85ebf6e2	vc4: Stop trying to reuse temporaries that store uniform values. Almost always, the MOV will get copy propagated out. Even if it doesn't, it's probably better to just reload the uniform at next use (to reduce register pressure) rather than try to save instruction count. I was looking at this because in the presence of texturing (which calls add_uniform() directly to get the uniform load forced into the instruction) the c->uniform_contents indices don't match 1:1 with the temporary qregs.	2014-09-29 10:07:24 -07:00
Tapani Pälli	3386e95994	egl: setup screen iterator before using it commit `4ed23fd` broke creation of pbuffer surfaces, patch fixes the failure, noticed when running chrome with '--use-gl=egl'. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-09-29 15:12:11 +03:00
Chia-I Wu	8c7c0f7114	ilo: fix a missing 'else' An 'else' is missing in the disassembler. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-29 16:58:36 +08:00
Kalyan Kondapally	66a2fe4cf9	glsl: Allow texture2DProjLod and textureCubeLod in GL ES According to GLES (i.e. 1.0 and above) spec textureCubeLod and texture2DProjLod are built in functions. We seem to disable support for these functions with GLES. This patch enables the support. Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84355	2014-09-29 11:10:38 +03:00
Rob Clark	40aabc0e80	configure.ac: bump libdrm_freedreno requirement We need 2.4.57 for fd_bo_dmabuf() / fd_bo_from_dmabuf(). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-28 12:46:17 -04:00
Matt Turner	5ccdc23a86	glsl: Recognize open-coded pow(x, y). pow(x, y) is equivalent to exp(log(x) * y). instructions in affected programs: 578 -> 458 (-20.76%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-27 12:18:37 -07:00
Matt Turner	e9aee2572a	i965/fs: Don't invalidate live intervals in saturate propagation. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-27 12:18:37 -07:00
Matt Turner	b9689c6bda	i965/fs: Ignore mov.sat instructions in interference check in sat prop. When an instruction's result was consumed by multiple mov.sat instructions, we would decide that we couldn't move the saturate modifier because something else was using the result, even though it was just another mov.sat! total instructions in shared programs: 4275598 -> 4274842 (-0.02%) instructions in affected programs: 75634 -> 74878 (-1.00%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-27 12:18:37 -07:00

65748 commits