fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 11:28:05 +02:00

Author	SHA1	Message	Date
Ilia Mirkin	89f00f749f	a4xx: make sure to actually clamp depth as requested We were previously ... not clamping. I guess this meant that everything got clamped to 1/0, which was enough to pass the existing tests. Or perhaps the clamping would only happen to the rasterized depth value and not the frag shader's output depth value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-19 19:40:04 -04:00
Ilia Mirkin	cd8e30452f	a4xx: only disable depth clipping, not all clipping, when requested The previous bit disables the whole clipper, including the regular viewport-related clipping that would go on. The two new bits disable near and far clipping (separately, as verified with the depth-clamp-range piglit). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-08-19 19:40:04 -04:00
Eric Anholt	5adee83806	vc4: Switch store_output to using nir_lower_io_to_scalar / component.	2016-08-19 13:11:36 -07:00
Eric Anholt	f8fecc396a	vc4: Use the intrinsic's first_component for vattr VPM index. Avoids another multiplication by 4 of the base in the NIR.	2016-08-19 13:11:36 -07:00
Eric Anholt	cbf8c19410	vc4: Convert to using nir_lower_io_scalar for FS inputs. The scalarizing of FS inputs can be done in a non-driver-dependent manner, so extract it out of the driver.	2016-08-19 13:11:36 -07:00
Eric Anholt	c30b22c421	vc4: Switch to using the intrinsic accessors. The const_index[] values have always felt magic, and this documents them a bit better.	2016-08-19 13:11:36 -07:00
Eric Anholt	9f1411d1ec	nir: Add an IO scalarizing pass using the intrinsic's first_component. vc4 wants to have per-scalar IO load/stores so that dead code elimination can happen on a more granular basis, which it has been doing in the backend using a multiplication by 4 of the intrinsic's driver_location. We can represent it properly in the NIR using the first_component field, though. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	c35f979220	nir: Add nir_builder support for individual system value loads. The previous nir_load_system_value(b, nir_intrinsic_load_whatever, 0) was rather verbose, when system values should be easy to generate. The index is left out because only one system value had an index included in it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	24728637e2	nir: Move the undef of nir_intrinsics.h macros to the .h. I wanted to include this from nir_builder as well, so it also needed the undefs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	c078c41520	ttn: Use nir_load_front_face instead of the TGSI-style input. This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR more optimization to work on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	3f607f9e4f	nir: Use the system-value front face for twoside lowering. GLSL-to-NIR generates system value usage, and vc4/freedreno would both like the system value instead of the varying, so switch this pass over to it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 13:11:36 -07:00
Eric Anholt	ed92241d78	ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn. This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all their source paths. Reviewed-by: Rob Clark <robdclark@gmail.com>	2016-08-19 13:11:36 -07:00
Eric Anholt	d80d03b830	vc4: Dump the TGSI before trying to convert it to NIR. In the case of debugging a crash in TTN, this is nice to have.	2016-08-19 13:11:36 -07:00
Boyuan Zhang	c0be51f270	radeon/vce: set flag based on dual instance enablement Set the flag on when dual instance encoding is supported, otherwise set it to off. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>	2016-08-19 10:36:44 -04:00
Boyuan Zhang	c59628d11b	st/va: enable dual instances encode by sync surface This patch improves the performance of Vaapi Encode by enabling dual instances encoding. flush function is not called after each end_frame call. radeon/vce will do flush whenever 2 frames are submitted for encoding. Implement sync surface function to flush only if the frame hasn't been flushed yet. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-08-19 10:36:44 -04:00
Jason Ekstrand	93d2b5c576	i965/blorp: Remove no longer used state setup helpers Now that we're using genxml for everything, we no longer need the hand-rolled state emit helpers. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	16a9fcbbb6	i965/blorp: Use genxml for gen8-9 state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e198983c61	i965/blorp: Use genxml for gen7 state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	344841fcba	i965/blorp: Add genxml-based vertex setup helpers Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	7b035fd0c9	i965/blorp: Add a helper for emitting surface states The new helper emits surface states and the binding table in one go. It's nice to have it pulled out of the main blorp_exec function. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	48f13545dd	i965/blorp: Add genxml-based sampler state emit function Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eb655c4fc2	i965/blorp: Add genxml-based dynamic state emit functions Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	c8bc1ae96a	i965: Move gen6_blorp.c to a file that gets recompiled per-gen At the moment, it's only used for gen6 but that will change soon. We use the genX prefix for recompiled things in the Vulkan driver. It isn't great, but it seems to have worked ok. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eea6a66222	i965/blorp/gen6: Use genxml packing structs for state setup Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	b5c20a98c1	i965/blorp: Stop setting point and line rasterization rules Blorp never uses points or lines and the default values of 0 are perfectly fine. Explicitly setting them is just noise. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	5e2dd7a381	i965/blorp/gen8: Move viewport setup to after wm state This matches gen6 and gen7. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	802f0f8596	i965/blorp/gen6-7: Move multisample setup to right after samplers This mimics gen8 blorp. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	75304fdbd8	i965/blorp/gen6-7: Move surfaces and samplers closer together This mimics what we do on gen8. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	8b0426ddd4	i965/blorp/gen7-8: Emit depth stencil state with CC and BLEND All three go together on SNB so let's keep them together for gen7+ as well. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	38c1909c0a	i965/blorp/gen6: Move constant disables higher up This is what gen7-8 do and it's a bit cleaner. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e0bc2cb145	i965/blorp: Don't clear an empty region Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	e4d6ffbbf6	i965/blorp: Move the non-static blorp state setup helpers to another file We're about to start replacing blorp state setup code with packing structs and we want to feel free to delete files as we go. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	50768a3879	i965/blorp: Make gen6 VS and GS disable helpers static Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	949a892026	i965: Roll intel_reg.h into brw_defines.h More than half of the stuff in intel_reg.h had nothing whatsoever to do with registers and really belongs in brw_defines.h anyway. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	8455f9430f	i965: Stop including brw_defines.h in brw_state.h Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	4c3acf94da	i965/state: Move is_drawing_lines/points to gen6_clip_state.c Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	04f3594cd5	genxml/gen9: Make 3DSTATE_SBE::AttributeActiveComponentFormat an array Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	bfdff28d68	genxml: Add a uint MOCS field to VERTEX_BUFFER_STATE Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	373613fa4b	genxml: Make a couple of VERTEX_BUFFER_STATE fields boolean Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	29f1f945a6	genxml: Make VERTEX_ELEMENT_STATE::Valid a bool Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	eb2589cba6	genxml/gen6: Make SAMPLER_STATE look a bit more like gen7 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	2a84e40dae	genxml: Add a uint MOCS field to DEPTH_BUFFER packets This is easier than dealing with structs all the time Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	3f1022b029	genxml/gen6: Make "Depth Clear Value" a uint The actual data stored is in float, UNORM24, or UNORM16 depending on the actual depth format. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	be62e7645e	genxml/gen6: Add the 3D_Prim_Topo_Type enum Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	cca95a7bd6	genxml/gen6: Fix the length of 3DSTATE_WM Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	3ddb6f6e2a	genxml/gen6: Add a Surface Base Address field to HIER_DEPTH_BUFFER Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Jason Ekstrand	be52e16dbc	genxml/gen6: Add uint MOCS fields for most things Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-19 03:11:29 -07:00
Kenneth Graunke	7d0554f341	nir: Rely on the fact that bcsel takes a well formed boolean. According to Connor, it's safe to assume that the first operand of bcsel, as well as the operand of b2f and b2i, must be well formed booleans. https://lists.freedesktop.org/archives/mesa-dev/2016-August/125658.html With the previous improvements to a@bool handling, this now has no change in shader-db instruction counts on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-19 02:05:23 -07:00
Francisco Jerez	7ceb42ccc5	i965/sched: Change the scheduling heuristics to favor early program termination. This uses the unblocked time of the exit assigned to each available node to attempt to unblock exit nodes as early as possible, potentially reducing the runtime of the shader when an exit branch is taken. There is a natural trade-off between terminating the program as early as possible and reducing the worst-case latency of the program as a whole (since this will typically move exit-unblocking nodes closer to its dependencies potentially causing additional stalls of the execution pipeline), but in practice the bandwidth and ALU cycle savings from terminating the program earlier tend to outweigh the slight increase in worst-case program execution latency, so it makes sense to prefer nodes likely to unblock an earlier exit regardless of the latency benefits of other available nodes. I haven't observed any benchmark regressions from this change after testing on VLV, HSW, BDW, BSW and SKL. The FPS of the GfxBench Manhattan benchmark increases by 10%-20% and the FPS of Unigine Valley improves by roughly 5% depending on the platform and settings. The change to the register pressure-sensitive heuristic is rather conservative and gives precedence to the existing heuristic in order to avoid increasing register pressure and causing spill count and SIMD width regressions in shader-db. It may make sense to revisit this with additional performance data. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00
Francisco Jerez	4147ca75d5	i965/sched: Assign a preferred exit node to each node of the dependency graph. This adds a bit of metadata to schedule_node that will be used to compare available nodes in the scheduling heuristic code based on which of them unblocks the earliest successor exit node. Note that assigning exit nodes wouldn't be necessary in a bottom-up scheduler because we could achieve the same effect by scheduling the exit nodes themselves appropriately. No shader-db changes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-18 20:05:00 -07:00

84092 commits