fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 20:08:06 +02:00

Author	SHA1	Message	Date
Topi Pohjolainen	8b2332e3d1	i965: Allow texture surface state setup to be used by blorp Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:42:10 +03:00
Topi Pohjolainen	0ad83d222b	i965/blorp: Prepare sampling for gen9 v2 (Ken): Added switch cases for gen8/9 in texel_fetch(). These were wrongly introduced in blit-enabling patch. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:41:40 +03:00
Topi Pohjolainen	328ab6c268	i965/blorp: Prepare render target write for gen8 v2 (Ken): Use payload directly instead of retyping it into vec8. Drop the implied header, it isn't used for gen6+ anyway. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:40:33 +03:00
Topi Pohjolainen	135f00e666	i965/blorp/gen6: Prepare vertex buffer setup logic for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:37:06 +03:00
Topi Pohjolainen	395abb9c3b	i965/blorp/gen7: Expose state setup applicable to gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:53 +03:00
Topi Pohjolainen	ede09e672a	i965/blorp: Use 8k chunk size for urb allocation Previously, we hardcoded "VS URB Starting Address" to 2 (in 8kB chunks), which meant VS URB data would start at an offset of 16kB. However, on Haswell GT3 and Gen8+, we allocate the first 32kB for the push constant region. This means that the PS push constant and VS URB data regions overlap, which can lead to corruption. v2 (Ken): Better description of the change, and do not change vs_size from 2 to 1. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:26 +03:00
Topi Pohjolainen	e04b3cdf33	i965/blorp/gen7: Prepare re-using for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:36:14 +03:00
Topi Pohjolainen	f1ddfa8512	i965/blorp: Let compiler calculate the vertex buffer size Currently the size is sizeof(float) times too large. One reserves GEN6_BLORP_VBO_SIZE many floats whereas GEN6_BLORP_VBO_SIZE stands for the size of vertex buffer in bytes. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:58 +03:00
Topi Pohjolainen	4c526370ca	i965/gen8: Expose state base address setup Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:45 +03:00
Topi Pohjolainen	9949103756	i965/gen8: Expose surface state helpers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:35:34 +03:00
Topi Pohjolainen	4f1d9f2879	i965/gen9: Use correct size for DS_STATE Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-04-21 08:32:12 +03:00
Roland Scheidegger	0295db2a8b	glsl: add forgotten textureOffset function for sampler2DArrayShadow This was part of EXT_gpu_shader4 - as such it should have been supported by glsl 130. It was however forgotten, and not added until glsl 430 - with the wrong syntax no less (glsl 430 mentions it was overlooked). glsl 440 (but revision 8 only) fixed this finally for good. At least nvidia supports this with just glsl version 1.30 as well (the spec doesn't explicitly say it should be supported retroactively), so just add this to the other glsl 130 textureOffset functions. Passes a (hacked) piglit tex-miplevel-selection test (2DArrayShadow textureOffset -auto) with llvmpipe. v2: fix up comment (by Ian), add testing to commit message. Reviewed-by: Dave Airlie <airlied@gmail.com>	2016-04-21 02:38:46 +02:00
Kenneth Graunke	d8c8f4203f	i965: Fix interpolateAtSample() on single sampled buffers. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests: - interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo - interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	447d3eec6a	i965: Fix gl_SampleMaskIn[] in per-sample shading mode. The coverage mask is not sufficient - in per-sample mode, we also need to AND with a mask representing the samples being processed by the current fragment shader invocation. Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8} sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8} sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8} Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	66a725570c	i965: Only enable oMask output when there's a multisample FBO. The ARB_sample_shading specification says that setting gl_SampleMask bits to 0 means that the corresponding sample "should be considered uncovered for the purposes of multisample fragment operations (Section 4.1.3)." The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment Operations") specifies: "No changes to the fragment alpha or coverage values are made at this step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS is not one." oMask output alters coverage masks and can kill pixels. We need to disable it in the above case, which conveniently corresponds to key->multisample_fbo being false. Khronos bug #12188 also spells this out clearly: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188 Fixes two Piglit tests: tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0 tests/spec/arb_sample_shading/builtin-gl-sample-mask 0 Fixes 21 ES3 conformance tests: ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4 ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5 ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7 Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests: sample_mask.discard_half_per_pixel.default_framebuffer sample_mask.discard_half_per_pixel.singlesample_rbo sample_mask.discard_half_per_pixel.singlesample_texture sample_mask.discard_half_per_sample.default_framebuffer sample_mask.discard_half_per_sample.singlesample_rbo sample_mask.discard_half_per_sample.singlesample_texture sample_mask.discard_half_per_two_samples.default_framebuffer sample_mask.discard_half_per_two_samples.singlesample_rbo sample_mask.discard_half_per_two_samples.singlesample_texture Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	81407531e0	i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo. I'm going to need a key entry meaning "we have a multisample FBO, and multisampling is enabled" in an upcoming patch. This is basically wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID system value is read. The only use of wm_key->compute_sample_id is in emit_sampleid_setup(), which is only called when handling the SAMPLE_ID system value. So we can just eliminate the check and generalize the field. v2: Also update the Vulkan driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	de0a46a040	i965: Delete now dead persample_2x FS program key flag. This was only used by the old gl_SampleID calculations. The new code doesn't need to handle 2x specially. v2: Delete it from the Vulkan driver, too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	57118a19da	i965: Simplify gl_SampleID setup on Gen8+. On Gen7+, the thread payload provides the sample ID - we can read it in two instructions, without any elaborate calculations. We don't even need a state dependency - this will properly produce zero in the non-MSAA case. Unfortunately, we need the state flag anyway, so we may as well continue to use it to produce a single MOV 0 instead of SHR/AND. For some reason, the sample ID field is always zero on Gen7/7.5, so we can't use this yet. However, it works fine on Gen8+. So, land the code and use it where it's working, and leave a TODO for later. v2: Fix register types in the comment (caught by Matt Turner!). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Kenneth Graunke	528255b0b1	i965: Flip key->compute_sample_id check. This just moves the simple case first. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 16:18:47 -07:00
Bas Nieuwenhuizen	43ed1f73f8	st/mesa: Use correct size for compute CAPs. Some CAPs are stored as 64-bit value while Mesa stores the related constant as 32-bit value. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-04-21 00:27:01 +02:00
Kenneth Graunke	60a17d0718	i965: Properly handle integer types in opt_vector_float(). Previously, opt_vector_float() always interpreted MOV sources as floating point, and always created a MOV with a F-type destination. This meant that we could mess up sequences of integer loads, such as: mov vgrf6.0.x:D, 0D mov vgrf6.0.y:D, 1D mov vgrf6.0.z:D, 2D mov vgrf6.0.w:D, 3D Here, integer 0/1/2/3 become approximately 0.0f, so we generated: mov vgrf6.0:F, [0F, 0F, 0F, 0F] which is clearly wrong. We can properly handle this by converting integer values to float (rather than bitcasting), and emitting a type converting MOV: mov vgrf6.0:D, [0F, 1F, 2F, 3F] To do this, first see if the integer values (converted to float) are representable. If so, we use a D-type MOV. If not, we then try the floating point values and an F-type MOV. We make zero not impose type restrictions. This is important because 0D would imply a D-type MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D, where we want to use an F-type MOV. Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This recently became visible due to changes in opt_vector_float() which made it optimize more cases, but it was a pre-existing bug. Apparently it also manages to turn more integer loads into VFs, producing the following shader-db statistics on Haswell: total instructions in shared programs: 7084195 -> 7082191 (-0.03%) instructions in affected programs: 246027 -> 244023 (-0.81%) helped: 1937 total cycles in shared programs: 65669642 -> 65651968 (-0.03%) cycles in affected programs: 531064 -> 513390 (-3.33%) helped: 1177 v2: Handle the type of zero better. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	1aa28f3509	i965: Make opt_vector_float() only handle non-type-conversion MOVs. We don't handle this properly - we'd have to perform the type conversion before trying to convert the value to a VF. While we could do that, it doesn't seem particularly useful - most vector loads should be consistently typed (all float or all integer). As a special case, we do allow type-converting MOVs of integer 0, as it's represented the same regardless of the type. I believe this case does actually come up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	2a25a5142b	i965: Fold vectorize_mov() back into the one caller. After the previous patch, this helper is only called in one place. So, just fold it back in - there are a lot of parameters here and not much code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Kenneth Graunke	9967561158	i965: Rework opt_vector_float() control flow. This reworks opt_vector_float() so that there's only one place that flushes out any accumulated state and emits a VF. v2: Don't break the sequence for non-representable numbers - just skip recording their values. Only break it for non-MOVs or register changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-04-20 15:05:13 -07:00
Jason Ekstrand	50018522d2	anv: s/anv_batch_emit_blk/anv_batch_emit/ Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	0a45395902	anv: Remove the old emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	86c52bc757	anv/gen7_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	744e133431	anv/gen7_cmd_buffer: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	cae2f14947	anv/device: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	932c353592	anv/state: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	9e9f3f4e71	anv/gen8_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	dba3727bea	anv/genX_pipeline: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	a48f8340d9	anv/gen8_cmd_buffer: Use the new emit macro Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	8a6ced83e9	anv/cmd_buffer: Use the new emit macro for queries Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	db25e1eec5	anv/cmd_buffer: Use the new emit macro for DRAWING_RECTANGLE Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	deb13870d8	anv/cmd_buffer: Use the new emit macro for compute shader dispatch Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	06fc7fa684	anv/cmd_buffer: Use the new emit macro for 3DSTATE_CONSTANT Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	a71ded0e18	anv/cmd_buffer: Use the new emit macro for DEPTH/STENCIL_BUFFER Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	56453eeaff	anv/cmd_buffer: Use the new emit macro for PIPE_CONTROL and STATE_BASE_ADDRESS Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	1d4d6852b4	anv/cmd_buffer: Use the new emit macro for 3DPRIMITIVE commands Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Jason Ekstrand	64ad2d3bcd	anv: Add a new block-based batch emit macro This new macro uses a for loop to create an actual code block in which to place the macro setup code. One advantage of this is that you syntactically use braces instead of parentheses. Another is that the code in the block doesn't even get executed if anv_batch_emit_dwords fails. Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2016-04-20 14:54:09 -07:00
Samuel Pitoiset	d30768025a	gk110/ir: make use of IMUL32I for all immediates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-20 22:55:36 +02:00
Samuel Pitoiset	17a37c78fc	gk110/ir: do not overwrite def value with zero for EXCH ops This is only valid for other atomic operations (including CAS). This fixes an invalid opcode error from dmesg. While we are at it, make sure to initialize global addr to 0 for other atomic operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-20 22:55:33 +02:00
Marcin Ślusarz	3caf2e89aa	anv: fix build without Wayland platform Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 11:12:10 -07:00
Laurent Carlier	6c952d8ac7	anv: fix building on i686 with -mcpu=generic mcpu=generic doesn't enable sse2, and anvil definitely needs it Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 10:48:11 -07:00
Jason Ekstrand	2ef7aef322	spirv: Trivially handle the NonWriteable decoration Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 10:33:23 -07:00
Connor Abbott	b6dc940ec2	nir: rename nir_foreach_block() to nir_foreach_block_call() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-04-20 09:47:05 -07:00
Samuel Pitoiset	7143068296	nvc0: avoid tex read fault from compute shaders on GK110 After some investigation, it seems that disabling the UNK02C4 command avoids a read fault with texelFetch() from a compute shader. I have no clue what this method actually does, but this keeps the GPU from hanging with basic-texelFetch.shader_test without introducing any compute-related regressions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-04-20 18:28:47 +02:00
Jason Ekstrand	87a4fb516e	i965/vec4: Always split uniforms in array_access_to_pull_constants Normally, we split uniforms at the end but in Vulkan, we bail because we don't want pull constants. However, we still need them split because pack_uniforms relies on it. I really don't like this patch not because it doesn't work (it does) but because now that we're using MOV_INDIRECT, uniform numbers and sizes don't really matter anymore. In the FS backend, uniform splitting and packing is handled all at once (actual re-assignment of locations happens later) and we really should do it that way in vec4 eventually as well. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
Jason Ekstrand	b3f43822c7	i965/vec4: Use the correct offset for the swizzle shift in push constants This was actually caught by Ken in review the first time around but somehow didn't get fixed before the patches were pushed. :-( Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001	2016-04-20 09:15:01 -07:00
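
Commit 64ad2d3bcd describes a batch emit macro built on a for loop, so that the setup code sits in a real brace-delimited block and is skipped when the dword reservation fails. The following is a minimal sketch of that pattern, not the anv implementation: `example_batch`, `example_cmd`, `example_emit_dwords`, `example_pack`, `EXAMPLE_BATCH_EMIT`, and the header value 0x7a are all hypothetical names and values made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_BATCH_DWORDS 4

/* Hypothetical stand-in for a command batch: a fixed array of dwords. */
struct example_batch {
    uint32_t dwords[EXAMPLE_BATCH_DWORDS];
    size_t next;
};

/* Hypothetical two-dword command: a header and one payload field. */
struct example_cmd {
    uint32_t header;
    uint32_t value;
};

/* Reserve n dwords, or return NULL when the batch is full. */
static void *example_emit_dwords(struct example_batch *b, size_t n)
{
    assert(b != NULL);
    if (n > EXAMPLE_BATCH_DWORDS - b->next)
        return NULL;
    void *p = &b->dwords[b->next];
    b->next += n;
    return p;
}

/* Serialize the unpacked command into the reserved dwords. */
static void example_pack(void *dst, const struct example_cmd *cmd)
{
    uint32_t *dw = dst;
    dw[0] = cmd->header;
    dw[1] = cmd->value;
}

/* The body runs at most once. The condition fails immediately if the
 * reservation returned NULL; otherwise the increment expression packs
 * the command and clears dst_ so the loop terminates. */
#define EXAMPLE_BATCH_EMIT(batch, name)                        \
    for (struct example_cmd name = { .header = 0x7a },         \
         *dst_ = example_emit_dwords((batch), 2);              \
         dst_ != NULL;                                         \
         (example_pack(dst_, &name), dst_ = NULL))

/* Try to emit three commands into a batch with room for two and
 * return how many bodies actually executed. */
static int example_demo(struct example_batch *b)
{
    int bodies_run = 0;
    for (uint32_t i = 0; i < 3; i++) {
        EXAMPLE_BATCH_EMIT(b, cmd) {
            cmd.value = i + 10;
            bodies_run++;
        }
    }
    return bodies_run;
}
```

Because packing happens in the loop's increment expression, leaving the body early with `break` would skip it in this sketch; callers are expected to let the block run to its closing brace.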
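
Commit 60a17d0718 turns on the difference between bitcasting an integer immediate to float and converting its value to float. The sketch below only illustrates that distinction on the host CPU, assuming `float` is IEEE-754 binary32; it does not model the hardware VF encoding or the opt_vector_float() pass, and the `example_*` function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of an integer as a float (no value conversion).
 * Small integers such as 1, 2, 3 land in the denormal range, which is why
 * they look like "approximately 0.0f". */
static float example_bitcast_to_float(int32_t i)
{
    float f;
    assert(sizeof f == sizeof i);
    memcpy(&f, &i, sizeof f);
    return f;
}

/* Convert the integer's value to float: 3 becomes 3.0f. */
static float example_convert_to_float(int32_t i)
{
    return (float)i;
}
```

The same bit pattern can therefore mean very different things: 0x3f800000 is the integer 1065353216 when read as D-type, but exactly 1.0f when bitcast, which matches the MOV 0D, MOV 0x3f800000D sequence mentioned in the commit message.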
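
The overlap described in commit ede09e672a is plain arithmetic: a starting address of 2 in 8kB chunks places VS URB data at 16kB, inside a 32kB push constant region. A small sketch of that calculation follows, assuming the push constant region starts at offset 0; the `example_*` helpers are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_URB_CHUNK_BYTES (8u * 1024u)

/* Byte offset at which VS URB data begins, given a starting address
 * expressed in 8kB chunks. */
static uint32_t example_vs_urb_offset(uint32_t start_chunks)
{
    assert(start_chunks <= UINT32_MAX / EXAMPLE_URB_CHUNK_BYTES);
    return start_chunks * EXAMPLE_URB_CHUNK_BYTES;
}

/* VS URB data overlaps the push constant region when it starts before
 * that region ends (the region is assumed to begin at offset 0). */
static int example_overlaps_push_constants(uint32_t start_chunks,
                                           uint32_t push_constant_bytes)
{
    return example_vs_urb_offset(start_chunks) < push_constant_bytes;
}

/* Smallest starting address, in chunks, that clears the push constants. */
static uint32_t example_first_free_chunk(uint32_t push_constant_bytes)
{
    return (push_constant_bytes + EXAMPLE_URB_CHUNK_BYTES - 1)
           / EXAMPLE_URB_CHUNK_BYTES;
}
```

With a 32kB push constant region, the first chunk that avoids the overlap is 4 rather than the previously hardcoded 2.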
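
Commit f1ddfa8512 fixes a buffer that was sizeof(float) times too large because a byte count was used as an element count. The sketch below reproduces that kind of mistake and the "let the compiler calculate it" alternative; the value of `EXAMPLE_VBO_SIZE_BYTES` and the vertex data are made up, since the log does not give the real GEN6_BLORP_VBO_SIZE or the buffer contents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size in bytes; not the real GEN6_BLORP_VBO_SIZE value. */
#define EXAMPLE_VBO_SIZE_BYTES 24

/* The mistake: the byte count is used as the number of floats. */
static size_t example_oversized_bytes(void)
{
    float data[EXAMPLE_VBO_SIZE_BYTES];
    return sizeof data;
}

/* The alternative: the array is sized from its initializer, so sizeof
 * always matches the data actually stored. */
static size_t example_compiler_sized_bytes(void)
{
    const float data[] = { 0.0f, 0.0f, 1.0f, 0.0f, 0.0f, 1.0f };
    return sizeof data;
}

/* How many times too large the first buffer is. */
static size_t example_overallocation_factor(void)
{
    assert(example_oversized_bytes() % EXAMPLE_VBO_SIZE_BYTES == 0);
    return example_oversized_bytes() / EXAMPLE_VBO_SIZE_BYTES;
}
```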


80405 commits