fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-08 09:08:10 +02:00

Author	SHA1	Message	Date
Jordan Justen	de65d4dcaf	anv: Fix build without VALGRIND Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-06 15:54:51 -08:00
Jason Ekstrand	5bbf060ece	i965/compiler: Enable more lowering in NIR We don't need these for GLSL or ARB, but we need them for SPIR-V	2016-01-06 15:30:53 -08:00
Jason Ekstrand	573351cb0f	nir/algebraic: Add more lowering This commit adds lowering options for the following opcodes: - nir_op_fmod - nir_op_bitfield_insert - nir_op_uadd_carry - nir_op_usub_borrow	2016-01-06 15:30:53 -08:00
Jason Ekstrand	1f503603d3	nir/opcodes: Fix the folding expression for usub_borrow	2016-01-06 15:30:53 -08:00
Jason Ekstrand	22804de110	nir/spirv: Properly implement Modf	2016-01-06 15:30:53 -08:00
Jason Ekstrand	1f3593d8a1	nir/builder: Add a helper for storing to a deref	2016-01-06 15:30:53 -08:00
Sarah Sharp	39c41be50d	mesa: Add KBL PCI IDs and platform information. Add PCI IDs for the Intel Kabylake platforms. The IDs are taken directly from the Linux kernel patches, which are under review: http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2 The Kabylake PCI IDs taken from the kernel are rearranged to be in order of GT type, then PCI ID. Please note that if this patch is backported, the following fixes will need to be added before this patch: commit `28ed1e08e8` "i965/skl: Remove early platform support" commit `c1e38ad370` "i965/skl: Use larger URB size where available." Thanks to Ben for fixing a bug around setting urb.size, and being patient with my questions about what the various fields mean. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2) Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2016-01-06 15:11:00 -08:00
Sinclair Yeh	0819287f56	svga: Rename SVGA_HINT_FLAG_DRAW_EMITTED Rename SVGA_HINT_FLAG_DRAW_EMITTED to SVGA_HINT_FLAG_CAN_PRE_FLUSH because preemptive flush can be unblocked by more commands than draw. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:04:45 -07:00
Sinclair Yeh	9ccc716534	svga: allow preemptive flushing on DMA, update, and readback commands The existing code effectively turns off preemptive flushing for all but the regions used for draws. This turns out to be overly restrictive as some memory regions, e.g. GMR, may never get a draw when used as a DMA upload staging area, causing problems for apps that upload a large amount of textures, e.g. Unigine Heaven. This patch fixes the Unigine Heaven memory allocation error and has been verified to not cause a regression in the previous extended retina display issue. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:03:33 -07:00
Charmaine Lee	b074a5b02d	svga: skip vertex attribute instruction with zero usage_mask In emit_input_declarations(), we are skipping declarations for those registers that are not being used. But in emit_vertex_attrib_instructions(), we are still emitting instructions to tweak the vertex attributes even if they are not being used. This causes an assert in the backend because an input register is not declared in the shader. This patch fixes the problem by skipping the instruction if the vertex attribute is not being used. The changes in this patch originate from the code snippet from Jose, as suggested in bug 1530161. Tested with piglit, Heaven, Turbine, glretrace. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:01:38 -07:00
Brian Paul	b59fad8478	st/mesa: minor clean-ups in st_atom.c Remove useless comment. Reformat code.	2016-01-06 15:53:47 -07:00
Brian Paul	85444ab08b	st/mesa: replace bitmap size checks with assertion The _mesa_Bitmap() caller already checks for zero-sized bitmaps.	2016-01-06 15:53:47 -07:00
Brian Paul	18038b9fd6	st/mesa: check texture target in allocate_full_mipmap() Some kinds of textures never have mipmaps. 3D textures seldom have mipmaps. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:47 -07:00
Brian Paul	c032ae85ee	st/mesa: move mipmap allocation check logic into a function Better readability and easier to extend. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	0d39b5fc3b	main: s/GLuint/GLbitfield for state bitmasks Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c81ddc2092	vbo: s/GLuint/GLbitfield/ for state bitmasks Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	3c0521cd0f	st/mesa: use GLbitfield in st_state_flags, add comments Use GLbitfield instead of GLuint to be consistent with other variables. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	4cd1bd46ed	s/GLuint/GLbitfield/ for st_invalidate_state() parameter To match dd_function_table::UpdateState(). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	2cc52801c0	st/mesa: be more careful about state validation in st_Bitmap() If the only dirty state is mesa's _NEW_PROGRAM_CONSTANTS flag, we can skip state validation before drawing a bitmap since that state doesn't affect bitmap rendering. This further increases the performance of the ipers demo on llvmpipe to about what it was before commit `36c93a6fae`. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	b6bcf08641	st/mesa: move bitmap cache flushing out of state validation Just do it where needed (before drawing, clearing, etc). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c28d72a347	st/mesa: check state->mesa in early return check in st_validate_state() We were checking the dirty->st flags but not the dirty->mesa flags. When we took the early return, we didn't clear the dirty->mesa flags so the next time we called st_validate_state() we'd often flush the glBitmap cache. And since st_validate_state() is called from st_Bitmap(), it meant we flushed the bitmap cache for every glBitmap() call. This change seems to recover most of the performance loss observed with the ipers demo on llvmpipe since commit `36c93a6fae`. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c75d00e054	st/mesa: protect debug printf() with a conditional instead of comment	2016-01-06 15:53:46 -07:00
Brian Paul	72d6bbca5b	st/mesa: fix comment indentation in st_flush_bitmap_cache()	2016-01-06 15:53:46 -07:00
Timothy Arceri	e58be8ac0e	glsl: fix varying slot allocation for blocks and structs with explicit locations Previously each member was being counted as using a single slot; count_attribute_slots() fixes the count for array and struct members. Also don't assign a negative value to the unsigned expl_location variable. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 09:44:32 +11:00
Timothy Arceri	47dde2bd45	glsl: don't try adding built-ins to explicit locations bitmask Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:26 +11:00
Timothy Arceri	ac6e2c2056	glsl: fix overlapping of varying locations for arrays and structs Previously we were only reserving a single location for arrays and structs. We also didn't take into account implicit locations clashing with explicit locations when assigning locations for their arrays or structs. This patch fixes both issues. V5: fix regression for patch inputs/outputs in tessellation shaders V4: just use count_attribute_slots() to get the number of slots, also calculate the correct number of slots to reserve for gs and tess stages by making use of the new get_varying_type() helper. V3: handle arrays of structs V2: also fix for arrays of arrays and structs. Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:20 +11:00
Timothy Arceri	5907a02ab6	glsl: create helper to remove outer vertex index array used by some stages This will be used in the following patch for calculating array sizes correctly when reserving explicit varying locations. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:16 +11:00
Timothy Arceri	30991d7389	glsl: remove unused varyings before packing them Previously we would pack varyings before trying to remove them, this relied on the packing pass not packing varyings with a location of -1 to avoid packing varyings that should be removed. However this meant unused varyings with an explicit location would be packed before they could be removed when we enable packing of them in a later patch. V2: fix regression in V1 removing unused varyings in multi-stage SSO, fix regression with single stage programs. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:12 +11:00
Krzysztof Sobiecki	0d7477a289	gallium/r600: Replace ALIGN_DIVUP with DIV_ROUND_UP ALIGN_DIVUP is a driver-specific (r600g) macro that duplicates DIV_ROUND_UP functionality. Replacing it with DIV_ROUND_UP eliminates this problem. Signed-off-by: Krzysztof A. Sobiecki <sobkas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-06 16:09:12 -05:00
Eric Anholt	bbd29f1375	vc4: Fix driver build from last minute rebase fix. I had the driver all tested for the last series, and in my last build I noticed that get_swizzled_channel was unused now, and removed it... apparently without testing to find that I removed the wrong channel swizzle function.	2016-01-06 12:49:45 -08:00
Eric Anholt	25aa436e86	vc4: Optimize out a comparison for bcsel based on an ALU comparison We routinely have code like: vec1 ssa_220 = fge ssa_104, ssa_61 vec1 ssa_199 = bcsel ssa_220, ssa_106, ssa_105 and we would compare fge's args and choose between ~0 and 0 to generate ssa_220, then compare ssa_220 to 0 and choose between bcsel's args. Instead, try to notice the pattern and compare between fge's args to select between bcsel's args. total instructions in shared programs: 88019 -> 87574 (-0.51%) instructions in affected programs: 9985 -> 9540 (-4.46%) total estimated cycles in shared programs: 245752 -> 245237 (-0.21%) estimated cycles in affected programs: 17232 -> 16717 (-2.99%)	2016-01-06 12:43:09 -08:00
Eric Anholt	7a9eb76786	vc4: Add missing sRGB decode to texel fetches. We only see txf on MSAA textures, currently, and apparently this didn't impact any of our piglit tests.	2016-01-06 12:43:09 -08:00
Eric Anholt	f01ca9eeda	vc4: Add support for GL_ARB_texture_swizzle. We already had the code supporting it, since it's needed for the depth mode when doing shadow comparisons.	2016-01-06 12:43:09 -08:00
Eric Anholt	12519a972f	vc4: Use NIR texture lowering for texture swizzling. We can't use its other features currently (mostly because we don't want Newton-Raphson on rcps for texture coordinates), but it gets us started. This eliminates some comparisons with constants in GLB2.7 and ETQW traces at the QIR level by moving the comparisons into NIR, where they get constant-folded out. instructions in affected programs: 165 -> 156 (-5.45%) total uniforms in shared programs: 32087 -> 32085 (-0.01%) total estimated cycles in shared programs: 245762 -> 245752 (-0.00%) estimated cycles in affected programs: 461 -> 451 (-2.17%)	2016-01-06 12:43:08 -08:00
Eric Anholt	71db7d3dc5	vc4: Replace the SSA-style SEL operators with conditional MOVs. I'm moving away from QIR being SSA (since NIR is doing lots of SSA optimization for us now) and instead having QIR just be QPU operations with virtual registers. By making our SELs be composed of two MOVs, we could potentially coalesce the registers for the MOV's src and dst and eliminate the MOV. total instructions in shared programs: 88448 -> 88028 (-0.47%) instructions in affected programs: 39845 -> 39425 (-1.05%) total estimated cycles in shared programs: 246306 -> 245762 (-0.22%) estimated cycles in affected programs: 162887 -> 162343 (-0.33%)	2016-01-06 12:39:51 -08:00
Eric Anholt	0a89f307f9	vc4: Don't try the SF coalescing unless it's on a def. If you want the SF of the value of a register produced from a series of packing MOVs or conditional MOVs, we can't just SF on the last MOV into the register.	2016-01-06 12:39:27 -08:00
Chad Versace	8284786c5d	anv/gen9: Teach gen9_image_view_init() about 1D surface qpitch QPitch is usually expressed as rows of surface elements (where a surface element is a compression block or a single surface sample). Skylake 1D is an outlier; there QPitch is expressed as individual surface elements.	2016-01-06 09:38:57 -08:00
Chad Versace	e05b307942	isl: Add isl_surf_get_array_pitch_el() Will be needed to program SurfaceQPitch for Skylake 1D arrays.	2016-01-06 09:38:57 -08:00
Chad Versace	c1e890541e	isl/gen9: Support ISL_DIM_LAYOUT_GEN9_1D	2016-01-06 09:38:57 -08:00
Chad Versace	eea2d4d059	isl: Don't align phys_slice0_sa.width twice It's already aligned to the format's block width. Don't align it again in isl_calc_row_pitch().	2016-01-06 09:38:57 -08:00
Chad Versace	39d043f94a	isl: Fix the documented units of isl_surf::row_pitch It's the pitch between surface elements, not between surface samples.	2016-01-06 09:38:57 -08:00
Chad Versace	dcb9c11dc7	anv/gen9: Fix oob lookup of surface halign, valign For 1D surfaces and for surfaces with Yf or Ys tiling, the hardware ignores SurfaceVerticalAlignment and SurfaceHorizontalAlignment. Moreover, the anv_halign[] and anv_valign[] lookup tables may not even contain the surface's actual alignment values. So don't do the lookup for those surfaces.	2016-01-06 09:38:57 -08:00
Chad Versace	94566d9b68	anv/meta: Teach meta how to blit from a 1D image Meta needed a VkShader with a 1D sampler type.	2016-01-06 09:38:57 -08:00
Edward O'Callaghan	1953cee6d7	gallium/drivers/svga: Use unsigned for loop index Fix a 's/unsigned int/unsigned/' consistency case while here. Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	8e2a8ec731	gallium/drivers/r600: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	76a7d6f412	gallium/drivers/ilo: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	5071c192cc	gallium: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	bfabd5e74a	gallium/drivers: Remove unnecessary semicolons Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	67d4b4b28c	gallium: Remove unnecessary semicolons Fix silly issue with MSVC case fall-through support to need an extra 'break;' Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Oded Gabbay	9d59b9d00c	llvmpipe: Optimize lp_rast_triangle_32_3_16 for POWER8 This patch converts the SSE-optimized lp_rast_triangle_32_3_16() to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. Results (FPS/score, before -> after, delta): openarena 16.35 -> 16.7 (2.14%); xonotic 4.707 -> 4.97 (5.57%). glmark2 didn't show a significant (more than 1%) difference. v2: Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
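
The NIR commits in this log (573351cb0f, 1f503603d3) concern lowering and constant-folding nir_op_uadd_carry and nir_op_usub_borrow. The log does not show the expressions Mesa uses, so the following is only a sketch of the usual way carry and borrow can be expressed with plain unsigned comparisons, checked against a 64-bit reference. All function names here are hypothetical and are not Mesa APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: carry-out of a 32-bit unsigned add.
 * The wrapped sum is smaller than an operand exactly when the add overflowed. */
static uint32_t lowered_uadd_carry(uint32_t a, uint32_t b)
{
   uint32_t sum = a + b;   /* unsigned arithmetic wraps modulo 2^32 */
   return sum < a ? 1u : 0u;
}

/* Hypothetical helper: borrow-out of a 32-bit unsigned subtract.
 * A borrow is needed exactly when the subtrahend exceeds the minuend. */
static uint32_t lowered_usub_borrow(uint32_t a, uint32_t b)
{
   return a < b ? 1u : 0u;
}

/* 64-bit reference: bit 32 of the wide sum is the carry. */
static uint32_t ref_uadd_carry(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a + (uint64_t)b) >> 32);
}

/* 64-bit reference: the wide difference wraps when a < b, setting bit 32. */
static uint32_t ref_usub_borrow(uint32_t a, uint32_t b)
{
   return (uint32_t)((((uint64_t)a - (uint64_t)b) >> 32) & 1u);
}

/* Compare the lowered forms with the references on a few edge values. */
static void lowering_self_check(void)
{
   static const uint32_t samples[] = {
      0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFEu, 0xFFFFFFFFu
   };
   const unsigned count = sizeof(samples) / sizeof(samples[0]);

   for (unsigned i = 0; i < count; i++) {
      for (unsigned j = 0; j < count; j++) {
         assert(lowered_uadd_carry(samples[i], samples[j]) ==
                ref_uadd_carry(samples[i], samples[j]));
         assert(lowered_usub_borrow(samples[i], samples[j]) ==
                ref_usub_borrow(samples[i], samples[j]));
      }
   }
}
```

Calling lowering_self_check() from a test program exercises both identities on the boundary values where wrap-around occurs.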


77476 commits