fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 15:48:36 +02:00

Author	SHA1	Message	Date
Matt Turner	a3b51a22f7	glsl: Correctly validate fma()'s types. lrp() can take a scalar as a third argument, and fma() cannot. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 17:02:06 -07:00
Matt Turner	d56bbd0441	glsl: Add frexp signatures and implementation. I initially implemented frexp() as an IR opcode with a lowering pass, but since it returns a value and has an out-parameter, it would break assumptions our optimization passes make about ir_expressions being pure (i.e., having no side effects). For example, if opt_tree_grafting encounters this code: uniform float u; void main() { int exp; float f = frexp(u, out exp); float g = float(exp)/256.0; float h = float(exp) + 1.0; gl_FragColor = vec4(f, g, h, g + h); } it may try to optimize it to this: uniform float u; void main() { int exp; float g = float(exp)/256.0; float h = float(exp) + 1.0; gl_FragColor = vec4(frexp(u, out exp), g, h, g + h); } Some hardware has an instruction which performs frexp(), but we would need some other compiler infrastructure to be able to generate it, such as an intrinsics system that would allow backends to emit specific code for particular bits of IR. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 17:01:58 -07:00
Matt Turner	c43d6060b1	i965: Lower ldexp. v2: Drop frexp lowering. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:26 -07:00
Matt Turner	d0b8ea60b7	glsl: Add ldexp_to_arith lowering pass. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:23 -07:00
Matt Turner	5561251b58	glsl: Allow vectors to be created from ir_constant(). Note the parameter name change in the int version of ir_constant, to avoid the conflict with the loop iterator. v2: Make analogous change to builtin_builder::imm(). Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:14 -07:00
Matt Turner	b2ab840130	glsl: Add support for ldexp. v2: Drop frexp. Rebase on builtins rewrite. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:05 -07:00
Paul Berry	4b0488ef4e	i965: Add some missing bits to {mesa,brw,cache}_bits[]. These data structures are used for debug output, so it wasn't hurting anything that there were missing bits. But it's good to keep things up to date. This patch also adds static asserts so that the {brw,cache}_bits[] arrays are the proper size, so that we don't forget to add to them in the future. Unfortunately there's no convenient way to assert that mesa_bits[] is the proper size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:18 -07:00
Paul Berry	3374dabce7	i965/gs: Implement basic gl_PrimitiveIDIn functionality. If the geometry shader refers to the built-in variable gl_PrimitiveIDIn, we need to set a bit in 3DSTATE_GS to tell the hardware to dispatch primitive ID to r1, and we need to leave room for it when allocating registers. Note: this feature doesn't yet work properly when software primitive restart is in use (the primitive ID counter will incorrectly reset with each primitive restart, since software primitive restart works by performing multiple draw calls). I plan to address that in a future patch series. Fixes piglit test "spec/glsl-1.50/execution/geometry/primitive-id-in". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:14 -07:00
Paul Berry	f67fa8f3c8	i965/gs: New gs primitive types are supported by HW primitive restart. When we previously implemented primitive restart, we didn't add cases to brw_primitive_restart.c's can_cut_index_handle_prims() for the primitive types that are introduced with geometry shaders. It turns out that all of the new primitive types are supported by hardware primitive restart. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:11 -07:00
Paul Berry	9791af90e3	i965/gs: Add new primitive types. As part of its support for geometry shaders, GL 3.2 introduces four new primitive types: GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY, GL_TRIANGLES_ADJACENCY, and GL_TRIANGLE_STRIP_ADJACENCY. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:07 -07:00
Roland Scheidegger	93b5f71179	gallivm: some bits of seamless cube filtering implementation Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a correct implementation for nearest filtering, and it's way better than using repeat wrap for instance for linear filtering (though obviously this doesn't actually do seamless filtering). v2: fix s/t wrap not r/s... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-09-18 00:00:37 +02:00
Kenneth Graunke	b8244b0056	i965: Remove MIPLAYOUT_BELOW from Gen4-6 constant buffer surface state. Specifying a miptree layout makes no sense for constant buffers. This has no functional change since BRW_SURFACE_MIPMAPLAYOUT_BELOW is just a #define for 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 13:17:07 -07:00
Kristian Høgsberg	a1b6e69e45	egl: Also add EGL_TEXTURE_FORMAT as a valid eglQueryWaylandBufferWL attribute Now that we have a table of accepted eglQueryWaylandBufferWL() attributes, we should also list EGL_TEXTURE_FORMAT.	2013-09-16 22:22:49 -07:00
Stanislav Vorobiov	1281a90532	egl: add EGL_WAYLAND_Y_INVERTED_WL attribute This enables querying of wl_buffer's orientation	2013-09-16 22:20:27 -07:00
Kenneth Graunke	9ad6dda21e	i965: Use gen7_upload_constant_state for 3DSTATE_CONSTANT_PS as well. Now we use gen7_upload_constant_state() for all three shader stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-16 18:25:14 -07:00
Kenneth Graunke	e776c18afb	i965: Set brw_stage_state::push_const_size for PS constants. This paves the way for using gen7_upload_constant_state for PS data. The formula is copied from gen7_wm_state.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-16 18:25:11 -07:00
Kenneth Graunke	d385edf4c3	i965: Introduce a prog_data temporary in gen6_upload_wm_push_constants. This saves a bit of typing and shortens a few lines. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-16 18:25:07 -07:00
Paul Berry	24765c58bd	i965/gen6+: Support 128 varying components. GL 3.2 requires us to support 128 varying components for geometry shader outputs and fragment shader inputs, and 64 varying components otherwise. But there's no hardware limitation that restricts us to 64 varying components, and core Mesa doesn't currently allow different stages to have different maximum values, so just go ahead and enable 128 varying components for all stages. This gets us better test coverage anyway. Even though we are only working on GL 3.2 support for gen7 right now, gen6 also supports 128 varying components, so go ahead and switch it on there too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:58 -07:00
Paul Berry	f5d38c58ee	i965/ff_gs: Generate URB writes using a loop. Previously we only ever did 1 URB write, since the maximum number of varyings we support is small enough to fit in 1 URB write (when using BRW_URB_SWIZZLE_NONE, which is what the pre-Gen7 GS always uses). But we're about to increase the number of varying components we support from 64 to 128. With 128 varyings, the most URB writes we'll have to do is 2, but it's just as easy to write a general-purpose loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:55 -07:00
Paul Berry	57b8cff33c	i965/gen6: Fix assertions on VS/GS URB size. The "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB allow values in the range 0-4, but they are U8-1 fields, so the range of possible allocation sizes is 1-5. We were erroneously prohibiting a size of 5. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:52 -07:00
Paul Berry	784044c206	i965/vec4: Generate URB writes using a loop. Previously we only ever did 1 or 2 URB writes, since the maximum number of varyings we support is small enough to fit in 2 URB writes. But GL 3.2 requires the geometry shader to support 128 output varying components, and this could require up to 3 URB writes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:49 -07:00
Paul Berry	875972029e	i965/fs: When >64 input components, order them to match prev pipeline stage. Since the SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots, we can't arrange the fragment shader inputs in an arbitrary order if there are more than 16 input varying slots in use. We need to make sure that slots 16-31 match the corresponding outputs of the previous pipeline stage. The easiest way to accomplish this is to just make all varying slots match up with the previous pipeline stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:46 -07:00
Paul Berry	a4546ec114	i965/fs: Simplify computation of key.input_slots_valid during precompile. The for loop was rather silly. In addition to checking brw->gen < 6 on each loop iteration, it took pains to exclude bits from fp->Base.InputsRead that don't correspond to fragment shader inputs. But those bits would never have been set in the first place, since the only bits that are ever set in fp->Base.InputsRead are fragment shader inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:43 -07:00
Paul Berry	8a36f4382b	i965/gs: Stop storing an input VUE map in the GS program key. Now that the vertex shader output VUE map is determined solely by a 64-bit bitfield, we don't have to store it in its entirety in the geometry shader program key; instead, we can just store the bitfield, and let the geometry shader infer the VUE map at compile time. This dramatically reduces the size of the geometry shader program key, which we want to keep small since it gets recomputed whenever the active program changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:40 -07:00
Paul Berry	d1ad447f01	i965/gen6+: Remove VUE map dependency on userclip_active. Previously, on Gen6+, we laid out the vertex (or geometry) shader VUE map differently depending on whether user clipping was active. If it was active, we put the clip distances in slots 2 and 3 (where the clipper expects them); if it was inactive, we assigned them in the order of the gl_varying_slot enum. This made for unnecessary recompiles, since turning clipping on/off for a shader that used gl_ClipDistance might rearrange the varyings. It also required extra bookkeeping, since it required the user clipping flag to be provided to brw_compute_vue_map() as a parameter. With this patch, we always put clip distances in slots 2 and 3 if they are written to. do_vs_prog() and do_gs_prog() are responsible for ensuring that clip distances are written to when user clipping is enabled (as do_vs_prog() previously did for gen4-5). This makes the only input to brw_compute_vue_map() a bitfield of which varyings the shader writes to, a fact that we'll take advantage of in forthcoming patches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:36 -07:00
Paul Berry	3a83b20dcc	i965/fs: Stop wasting input attribute space on gl_FragCoord and gl_FrontFacing. Previously, if a fragment shader accessed gl_FragCoord or gl_FrontFacing, we would assign them their own slots in the fragment shader input attribute array, using up space that could be made available to real varyings. This was not strictly necessary (since these values are not true varyings, and are instead computed from other data available in the FS payload). But we had to do it anyway because the SF/SBE setup code assumed that every 1 bit in the gl_program::InputsRead bitfield corresponded to a genuine varying variable. Now that the SF/SBE code consults brw_wm_prog_data and only sets up the attributes that the fragment shader actually needs, we don't have to do this anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:32 -07:00
Paul Berry	0af1252ae4	i965/sf: Consult brw_wm_prog_data when setting up SF/SBE state. Previously, the SF/SBE setup code delivered varying inputs to the FS in the order in which they appear in the gl_program::InputsRead bitfield, since that's what the FS expects. When we add support for more than 64 varying components, this will no longer always be the case, because the Gen6+ SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots. So, when there are more than 16 vec4's worth of varying inputs, the FS will have to adjust the order of its input varyings in order to partially match the order of outputs from the geometry or vertex shader. To allow extra flexibility in the ordering of FS varyings, this patch causes the SF/SBE to deliver varying inputs to the FS in exactly the order that the FS requests, by consulting brw_wm_prog_data::urb_setup and brw_wm_prog_data::num_varying_inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:29 -07:00
Paul Berry	af84bbd2ca	i965/sf: Consolidate common code for setting up gen6-7 attribute overrides. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:25 -07:00
Paul Berry	d5b4095356	i965/sf: Use BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoded values. We always program the SF unit to start reading the vertex URB entry at offset 1. In upcoming patches, we'll be adding FS code that relies on this. So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoding a 1. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:21 -07:00
Paul Berry	8c2b9bd1df	i965/fs: Consult brw_wm_prog_data::num_varying_inputs when setting up WM state. Previously, we assumed that the number of varying inputs consumed by the fragment shader was equal to the number of bits set in gl_program::InputsRead. However, we'll soon be making two changes that will cause that not to be true: - We'll stop wasting varying input space for gl_FragCoord and gl_FrontFacing, which aren't varyings. - For fragment shaders that have more than 16 varying inputs, we'll adjust the layout of the inputs to account for the fact that the SF/SBE pipeline stage can't reorder inputs beyond the first 16; if there are GS outputs that the FS doesn't use (or vice versa) this may cause the number of FS varying inputs to change. So, instead of trying to guess the number of FS inputs from gl_program::InputsRead, simply read it from brw_wm_prog_data::num_varying_inputs, which is guaranteed to be correct since it's populated by fs_visitor::calculate_urb_setup(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:18 -07:00
Paul Berry	8c69eaba1a	i965/fs: Change brw_wm_prog_data::urb_read_length to num_varying_inputs. On gen4-5, the FS stage reads varying inputs from URB entries that were output by the SF thread, where each register stores the interpolation setup for two components of a vec4, therefore the FS urb_read_length is twice the number of FS input varyings. On gen6+, varying inputs are directly deposited in the FS payload by the SF/SBE fixed function logic, so urb_read_length is irrelevant. However, in future patches, it will be nice to be able to consult brw_wm_prog_data to determine how many varying inputs the FS expects (rather than inferring it from gl_program::InputsRead). So instead of storing urb_read_length, we simply store num_varying_inputs in brw_wm_prog_data. On gen4-5, we multiply this by 2 to recover the URB read length. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:14 -07:00
Paul Berry	58f01bd17d	i965/fs: Expose "urb_setup" as part of brw_wm_prog_data. At the moment, for Gen6+, the FS assumes that all varying inputs are delivered to it in the order in which they appear in the gl_program::InputsRead bitfield, and the SF/SBE setup code ensures that they are delivered in this order. When we add support for more than 64 varying components, this will no longer always be possible, because the Gen6+ SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots. To allow extra flexibility in the ordering of FS varyings, this patch causes the FS to advertise exactly what ordering it expects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:05 -07:00
Chia-I Wu	4a6939edae	ilo: make ilo_bind_sampler_states return void So that it can be hooked up pipe_context::bind_sampler_states that is currently living on another branch.	2013-09-17 00:20:50 +08:00
Kenneth Graunke	120d100627	glsl/tests: Update .gitignore for new unit test. I rarely run 'git status', so I failed to notice this was missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 08:26:09 -07:00
Kenneth Graunke	1da3ff1b1c	glsl/tests: Add a test for properties of sampler types. For each sampler type, this tests that: - The base type is GLSL_TYPE_SAMPLER. - The dimensionality is set correctly. - The returned data type is correct. - The sampler_array and sampler_shadow flags are set correctly. - sampler_coordinate_components() returns the correct value. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2013-09-15 21:48:20 -07:00
Dave Airlie	2f508f244e	st/mesa: don't dereference stObj->pt if NULL It seems a user app can get us into this state, I trigger the fail running fbo-maxsize inside virgl, it fails to create the backing storage for the texture object, but then segfaults here when it should fail the completeness test. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-09-16 08:33:02 +10:00
Dave Airlie	bbe3d6dc29	nouveau: fix regression since float comparison instructions (v2) Fix the return type and allow src and dst types for comparison to be separate, this at least fixes the two test cases I've written. v2: drop the u32->s32 change Acked-by: Christoph Bumiller <christoph.bumiller@speed.at> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-09-16 08:32:42 +10:00
Rico Schüller	6f52295129	vdpau/decode: Check max width and max height. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-09-15 16:18:08 +02:00
Rob Clark	ffa3244534	freedreno: PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE When the old contents do not need to be preserved, it is faster to create a new backing bo rather than stall. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	d7be322410	freedreno/a3xx: fix VFD_INDEX_MAX overflow max_index may be 0xffffffff. The hardware does not need 1 + max_index (although it does not hurt unless max_index wraps around to zero). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	c756a3ef70	freedreno: add debug option to disable GMEM bypass Useful for debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	cdec879e38	freedreno/a3xx: handle front_ccw Used by supertuxkart. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	cda75253f7	freedreno/a3xx: stencil fixes For mem->gmem we don't sample depth/stencil as its native type. So we need to set up the swizzle state for the sampler based on the format used for sampling. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	65ae4392ce	freedreno/a3xx: alpha-test Needed by some games, like etuxracer and supertuxkart which use alpha test rather than blending, to handle texture transparency. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	dbf041e61f	freedreno/a3xx/compiler: implement SUB Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	1a42d4ee34	freedreno/a3xx: use INDIRECT state load for shaders With a debug option to force DIRECT (mainly to make it easier for capturing cmdstream dumps). Using INDIRECT for large shaders at least makes a noticeable reduction in CPU load, which helps for CPU limited games. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	6e9c386d16	freedreno: avoid stalling at ringbuffer wraparound Because of how the tiling works, we can't really flush at arbitrary points very easily. So wraparound is handled by resetting to top of ringbuffer. Previously this would stall until current rendering is complete. Instead cycle through multiple ringbuffers to avoid a stall. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	ca505303a7	freedreno: emit markers to scratch registers Emit markers by writing to scratch registers in order to "triangulate" gpu lockup position from post-mortem register dump. By comparing register values in post-mortem dump to command-stream, it is possible to narrow down which DRAW_INDX caused the lockup. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	1e6d290f21	freedreno: split out WFI helper Mostly just to give an easy debug/instrumentation point. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	74052347f3	freedreno: fd_draw helper Have a single helper that all draws come through.. mainly for a convenient debug and instrumentation point. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
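
Commit 57b8cff33c in the list above notes that the "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB are U8-1 fields: the stored value ranges over 0-4, so the allocation sizes it can express are 1-5. The sketch below illustrates that kind of "minus one" encoding and where the range check belongs. It is not the driver's code; the macro and function names are hypothetical, and only the 0-4 and 1-5 ranges come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration. The limits follow the commit
 * message: field values 0-4 encode allocation sizes 1-5. */
#define URB_ENTRY_SIZE_MIN 1u
#define URB_ENTRY_SIZE_MAX 5u

/* Encode an allocation size as a "minus one" field value. */
static inline uint32_t
encode_urb_entry_size(uint32_t size)
{
   /* Check the size, not the field value. Asserting size <= 4 here
    * would wrongly reject the legal size of 5. */
   assert(size >= URB_ENTRY_SIZE_MIN && size <= URB_ENTRY_SIZE_MAX);
   return size - 1;
}

/* Decode a field value back into the allocation size it stands for. */
static inline uint32_t
decode_urb_entry_size(uint32_t field)
{
   assert(field <= URB_ENTRY_SIZE_MAX - 1);
   return field + 1;
}
```

With these helpers, a size of 5 encodes to the field value 4 and a size of 1 encodes to 0, so the largest legal size is accepted rather than tripping the assertion.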

58558 commits