linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-15 07:33:56 +00:00

Author	SHA1	Message	Date
Trond Myklebust	a1940805d0	NFS: Get rid of the unused nfs_read_data->flags field Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-19 13:37:34 -07:00
Andy Fleming	3d153a7c8b	net: Allow skb_recycle_check to be done in stages skb_recycle_check resets the skb if it's eligible for recycling. However, there are times when a driver might want to optionally manipulate the skb data with the skb before resetting the skb, but after it has determined eligibility. We do this by splitting the eligibility check from the skb reset, creating two inline functions to accomplish that task. Signed-off-by: Andy Fleming <afleming@freescale.com> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 15:59:45 -04:00
J. Bruce Fields	8b289b2c23	nfsd4: implement new 4.1 open reclaim types Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-19 11:52:12 -04:00
Yevgeny Petrilin	f3a9d1f25d	mlx4_en: Controlling FCS header removal Canceling FCS removal where FW allows for better alignment of incoming data. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:42:26 -04:00
Daniel Martensson	5ea2ef5f8b	caif-hsi: Added recovery check of CA wake status. Added recovery check of CA wake status in case of wake up timeout. Added check of CA wake status in case of wake down timeout. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:25:43 -04:00
Daniel Martensson	5bbed92d3d	caif-hsi: Added sanity check for length of CAIF frames Added sanity check for length of CAIF frames, and tear down of CAIF link-layer device upon protocol error. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:25:42 -04:00
Dmitry Tarnyagin	28bd204942	caif-hsi: Make inactivity timeout configurable. CAIF HSI uses a timer for inactivity. Upon timeout HSI-wake signaling is initiated to allow power-down of the HSI block. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:25:42 -04:00
Daniel Martensson	687b13e98a	caif-hsi: Making read and writes asynchronous. Some platforms do not allow to put HSI block into low-power mode when FIFO is not empty. The patch flushes (by reading) FIFO at wake down sequence. Asynchronous read and write is implemented for that. As a side effect this will also greatly improve performance. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:25:41 -04:00
Eric Dumazet	9e903e0852	net: add skb frag size accessors To ease skb->truesize sanitization, its better to be able to localize all references to skb frags size. Define accessors : skb_frag_size() to fetch frag size, and skb_frag_size_{set\|add\|sub}() to manipulate it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:10:46 -04:00
Eric Dumazet	bc416d9768	macvlan: handle fragmented multicast frames Fragmented multicast frames are delivered to a single macvlan port, because ip defrag logic considers other samples are redundant. Implement a defrag step before trying to send the multicast frame. Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-18 23:22:07 -04:00
Trond Myklebust	0c2e53f11a	NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup() ...and also remove the associated nfs_v4_clientops entry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 16:13:51 -07:00
Jiri Slaby	fa90e1c935	TTY: make tty_add_file non-failing If tty_add_file fails at the point it is now, we have to revert all the changes we did to the tty. It means either decrease all refcounts if this was a tty reopen or delete the tty if it was newly allocated. There was a try to fix this in v3.0-rc2 using tty_release in `0259894c7` (TTY: fix fail path in tty_open). But instead it introduced a NULL dereference. It's because tty_release dereferences filp->private_data, but that one is set even in our tty_add_file. And when tty_add_file fails, it's still NULL/garbage. Hence tty_release cannot be called there. To circumvent the original leak (and the current NULL deref) we split tty_add_file into two functions, making the latter non-failing. In that case we may do the former early in open, where handling failures is easy. The latter stays as it is now. So there is no change in functionality. The original bug (leak) was introduced by `f573bd176` (tty: Remove __GFP_NOFAIL from tty_add_file()). Thanks Dan for reporting this. Later, we may split tty_release into more functions and call only some of them in this fail path instead. (If at all possible.) Introduced-in: v2.6.37-rc2 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable <stable@vger.kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-18 14:22:37 -07:00
Thomas Meyer	8193c42906	tty: Support compat_ioctl get/set termios_locked When running a Fedora 15 (x86) on an x86_64 kernel, in the boot process plymouthd complains about those two missing ioctls: [ 2.581783] ioctl32(plymouthd:186): Unknown cmd fd(10) cmd(00005457){t:'T';sz:0} arg(ffb6a5d0) on /dev/tty1 [ 2.581803] ioctl32(plymouthd:186): Unknown cmd fd(10) cmd(00005456){t:'T';sz:0} arg(ffb6a680) on /dev/tty1 both ioctl functions work on the 'struct termios' resp. 'struct termios2', which has the same size (36 bytes resp. 44 bytes) on x86 and x86_64, so it's just a matter of converting the pointer from userland. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-18 14:17:11 -07:00
Jason Baron	07613b0b5e	dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions Replace the repetitive struct _ddebug descriptor definitions with a new DECLARE_DYNAMIC_DEBUG_META_DATA(name, fmt) macro. [akpm@linux-foundation.org: s/DECLARE/DEFINE/] Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-18 11:22:00 -07:00
Kai Jiang	27a90700a4	uio: Support physical addresses >32 bits on 32-bit systems To support >32-bit physical addresses for UIO_MEM_PHYS type we need to extend the width of 'addr' in struct uio_mem. Numerous platforms like embedded PPC, ARM, and X86 have support for systems with larger physical address than logical. Since 'addr' may contain a physical, logical, or virtual address the easiest solution is to just change the type to 'phys_addr_t' which should always be greater than or equal to the sizeof(void *) such that it can properly hold any of the address types. For physical address we can support up to a 44-bit physical address on a typical 32-bit system as we utilize remap_pfn_range() for the mapping of the memory region and pfn's are represnted by shifting the address by the page size (typically 4k). Signed-off-by: Kai Jiang <Kai.Jiang@freescale.com> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Hans J. Koch <hjk@hansjkoch.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-18 11:18:57 -07:00
Trond Myklebust	a9a4a87a59	NFS: Use the inode->i_version to cache NFSv4 change attribute information Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:14:34 -07:00
Trond Myklebust	d77385f238	SUNRPC: Fix rpc_sockaddr2uaddr rpc_sockaddr2uaddr is only used by net/sunrpc/rpcb_clnt.c, where it is used in a non-blockable context in at least one case. Add non-blocking capability by adding a gfp_t argument Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:13:32 -07:00
Peng Tao	c1225158a8	SUNRPC/NFS: make rpc pipe upcall generic The same function is used by idmap, gss and blocklayout code. Make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-18 09:08:12 -07:00
David S. Miller	ae2a458315	Merge branch 'nf' of git://1984.lsi.us.es/net	2011-10-17 19:38:03 -04:00
Marc Kleine-Budde	f861c2b80c	can: remove references to berlios mailinglist The BerliOS project, which currently hosts our mailinglist, will close with the end of the year. Now take the chance and remove all occurrences of the mailinglist address from the source files. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-17 19:22:46 -04:00
Gerrit Renker	f36c23bb9f	udplite: fast-path computation of checksum coverage Commit `903ab86d19` of 1 March this year ("udp: Add lockless transmit path") introduced a new fast TX path that broke the checksum coverage computation of UDP-lite, which so far depended on up->len (only set if the socket is locked and 0 in the fast path). Fixed by providing both fast- and slow-path computation of checksum coverage. The latter can be removed when UDP(-lite)v6 also uses a lockless transmit path. Reported-by: Thomas Volkert <thomas@homer-conferencing.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-17 19:07:30 -04:00
John W. Linville	41ebe9cde7	Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem	2011-10-17 15:05:26 -04:00
Ian Campbell	9bab0b7fba	genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier This adds a mechanism to resume selected IRQs during syscore_resume instead of dpm_resume_noirq. Under Xen we need to resume IRQs associated with IPIs early enough that the resched IPI is unmasked and we can therefore schedule ourselves out of the stop_machine where the suspend/resume takes place. This issue was introduced by `676dc3cf5b` "xen: Use IRQF_FORCE_RESUME". Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk Cc: stable@kernel.org (at least to 2.6.32.y) Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-10-17 11:42:49 +02:00
Rafael J. Wysocki	2aede851dd	PM / Hibernate: Freeze kernel threads after preallocating memory There is a problem with the current ordering of hibernate code which leads to deadlocks in some filesystems' memory shrinkers. Namely, some filesystems use freezable kernel threads that are inactive when the hibernate memory preallocation is carried out. Those same filesystems use memory shrinkers that may be triggered by the hibernate memory preallocation. If those memory shrinkers wait for the frozen kernel threads, the hibernate process deadlocks (this happens with XFS, for one example). Apparently, it is not technically viable to redesign the filesystems in question to avoid the situation described above, so the only possible solution of this issue is to defer the freezing of kernel threads until the hibernate memory preallocation is done, which is implemented by this change. Unfortunately, this requires the memory preallocation to be done before the "prepare" stage of device freeze, so after this change the only way drivers can allocate additional memory for their freeze routines in a clean way is to use PM notifiers. Reported-by: Christoph <cr2005@u-club.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-16 23:28:52 +02:00
H Hartley Sweeten	37cce26b32	PM / VT: Cleanup #if defined uglyness and fix compile error Introduce the config option CONFIG_VT_CONSOLE_SLEEP in order to cleanup the #if defined ugliness for the vt suspend support functions. Note that CONFIG_VT_CONSOLE is already dependant on CONFIG_VT. The function pm_set_vt_switch is actually dependant on CONFIG_VT and not CONFIG_PM_SLEEP. This fixes a compile error when CONFIG_PM_SLEEP is not set: drivers/tty/vt/vt_ioctl.c:1794: error: redefinition of 'pm_set_vt_switch' include/linux/suspend.h:17: error: previous definition of 'pm_set_vt_switch' was here Also, remove the incorrect path from the comment in console.c. [rjw: Replaced #if defined() with #ifdef in suspend.h.] Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-16 23:28:51 +02:00
Martin Schwidefsky	85055dd805	PM / Hibernate: Include storage keys in hibernation image on s390 For s390 there is one additional byte associated with each page, the storage key. This byte contains the referenced and changed bits and needs to be included into the hibernation image. If the storage keys are not restored to their previous state all original pages would appear to be dirty. This can cause inconsistencies e.g. with read-only filesystems. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-16 23:27:46 +02:00
ShuoX Liu	2a77c46de1	PM / Suspend: Add statistics debugfs file for suspend to RAM Record S3 failure time about each reason and the latest two failed devices' names in S3 progress. We can check it through 'suspend_stats' entry in debugfs. The motivation of the patch: We are enabling power features on Medfield. Comparing with PC/notebook, a mobile enters/exits suspend-2-ram (we call it s3 on Medfield) far more frequently. If it can't enter suspend-2-ram in time, the power might be used up soon. We often find sometimes, a device suspend fails. Then, system retries s3 over and over again. As display is off, testers and developers don't know what happens. Some testers and developers complain they don't know if system tries suspend-2-ram, and what device fails to suspend. They need such info for a quick check. The patch adds suspend_stats under debugfs for users to check suspend to RAM statistics quickly. If not using this patch, we have other methods to get info about what device fails. One is to turn on CONFIG_PM_DEBUG, but users would get too much info and testers need recompile the system. In addition, dynamic debug is another good tool to dump debug info. But it still doesn't match our utilization scenario closely. 1) user need write a user space parser to process the syslog output; 2) Our testing scenario is we leave the mobile for at least hours. Then, check its status. No serial console available during the testing. One is because console would be suspended, and the other is serial console connecting with spi or HSU devices would consume power. These devices are powered off at suspend-2-ram. Signed-off-by: ShuoX Liu <shuox.liu@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-16 23:27:45 +02:00
Greg Rose	5f8444a3fa	if_link: Add additional parameter to IFLA_VF_INFO for spoof checking Add configuration setting for drivers to turn spoof checking on or off for discrete VFs. v2 - Fix indentation problem, wrap the ifla_vf_info structure in #ifdef __KERNEL__ to prevent user space from accessing and change function paramater for the spoof check setting netdev op from u8 to bool. v3 - Preset spoof check setting to -1 so that user space tools such as ip can detect that the driver didn't report a spoofcheck setting. Prevents incorrect display of spoof check settings for drivers that don't report it. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2011-10-16 13:15:38 -07:00
Helmut Schaa	bb6e753e95	nl80211: Add sta_flags to the station info Reuse the already existing struct nl80211_sta_flag_update to specify both, a flag mask and the flag set itself. This means nl80211_sta_flag_update is now used for setting station flags and also for getting station flags. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-14 14:48:23 -04:00
Helmut Schaa	7f2a5e214d	mac80211: Populate radiotap header with MCS info for TX frames mac80211 already filled in the MCS rate info for rx'ed frames but tx'ed frames that are sent to a monitor interface during the status callback lack this information. Add the radiotap fields for MCS info to ieee80211_tx_status_rtap_hdr and populate them when sending tx'ed frames to the monitors. The needed headroom is only extended by one byte since we don't include legacy rate information in the rtap header for HT frames. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-14 14:48:14 -04:00
Szymon Janc	88149db494	Bluetooth: rfcomm: Fix sleep in invalid context in rfcomm_security_cfm This was triggered by turning off encryption on ACL link when rfcomm was using high security. rfcomm_security_cfm (which is called from rx task) was closing DLC and this involves sending disconnect message (and locking socket). Move closing DLC to rfcomm_process_dlcs and only flag DLC for closure in rfcomm_security_cfm. BUG: sleeping function called from invalid context at net/core/sock.c:2032 in_atomic(): 1, irqs_disabled(): 0, pid: 1788, name: kworker/0:3 [<c0068a08>] (unwind_backtrace+0x0/0x108) from [<c05e25dc>] (dump_stack+0x20/0x24) [<c05e25dc>] (dump_stack+0x20/0x24) from [<c0087ba8>] (__might_sleep+0x110/0x12c) [<c0087ba8>] (__might_sleep+0x110/0x12c) from [<c04801d8>] (lock_sock_nested+0x2c/0x64) [<c04801d8>] (lock_sock_nested+0x2c/0x64) from [<c05670c8>] (l2cap_sock_sendmsg+0x58/0xcc) [<c05670c8>] (l2cap_sock_sendmsg+0x58/0xcc) from [<c047cf6c>] (sock_sendmsg+0xb0/0xd0) [<c047cf6c>] (sock_sendmsg+0xb0/0xd0) from [<c047cfc8>] (kernel_sendmsg+0x3c/0x44) [<c047cfc8>] (kernel_sendmsg+0x3c/0x44) from [<c056b0e8>] (rfcomm_send_frame+0x50/0x58) [<c056b0e8>] (rfcomm_send_frame+0x50/0x58) from [<c056b168>] (rfcomm_send_disc+0x78/0x80) [<c056b168>] (rfcomm_send_disc+0x78/0x80) from [<c056b9f4>] (__rfcomm_dlc_close+0x2d0/0x2fc) [<c056b9f4>] (__rfcomm_dlc_close+0x2d0/0x2fc) from [<c056bbac>] (rfcomm_security_cfm+0x140/0x1e0) [<c056bbac>] (rfcomm_security_cfm+0x140/0x1e0) from [<c0555ec0>] (hci_event_packet+0x1ce8/0x4d84) [<c0555ec0>] (hci_event_packet+0x1ce8/0x4d84) from [<c0550380>] (hci_rx_task+0x1d0/0x2d0) [<c0550380>] (hci_rx_task+0x1d0/0x2d0) from [<c009ee04>] (tasklet_action+0x138/0x1e4) [<c009ee04>] (tasklet_action+0x138/0x1e4) from [<c009f21c>] (__do_softirq+0xcc/0x274) [<c009f21c>] (__do_softirq+0xcc/0x274) from [<c009f6c0>] (do_softirq+0x60/0x6c) [<c009f6c0>] (do_softirq+0x60/0x6c) from [<c009f794>] (local_bh_enable_ip+0xc8/0xd4) [<c009f794>] (local_bh_enable_ip+0xc8/0xd4) from [<c05e5804>] (_raw_spin_unlock_bh+0x48/0x4c) [<c05e5804>] (_raw_spin_unlock_bh+0x48/0x4c) from [<c040d470>] (data_from_chip+0xf4/0xaec) [<c040d470>] (data_from_chip+0xf4/0xaec) from [<c04136c0>] (send_skb_to_core+0x40/0x178) [<c04136c0>] (send_skb_to_core+0x40/0x178) from [<c04139f4>] (cg2900_hu_receive+0x15c/0x2d0) [<c04139f4>] (cg2900_hu_receive+0x15c/0x2d0) from [<c0414cb8>] (hci_uart_tty_receive+0x74/0xa0) [<c0414cb8>] (hci_uart_tty_receive+0x74/0xa0) from [<c02cbd9c>] (flush_to_ldisc+0x188/0x198) [<c02cbd9c>] (flush_to_ldisc+0x188/0x198) from [<c00b2774>] (process_one_work+0x144/0x4b8) [<c00b2774>] (process_one_work+0x144/0x4b8) from [<c00b2e8c>] (worker_thread+0x198/0x468) [<c00b2e8c>] (worker_thread+0x198/0x468) from [<c00b9bc8>] (kthread+0x98/0xa0) [<c00b9bc8>] (kthread+0x98/0xa0) from [<c0061744>] (kernel_thread_exit+0x0/0x8) Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2011-10-14 15:04:54 -03:00
Boaz Harrosh	4b46c9f5cf	ore/exofs: Change ore_check_io API Current ore_check_io API receives a residual pointer, to report partial IO. But it is actually not used, because in a multiple devices IO there is never a linearity in the IO failure. On the other hand if every failing device is reported through a received callback measures can be taken to handle only failed devices. One at a time. This will also be needed by the objects-layout-driver for it's error reporting facility. Exofs is not currently using the new information and keeps the old behaviour of failing the complete IO in case of an error. (No partial completion) TODO: Use an ore_check_io callback to set_page_error only the failing pages. And re-dirty write pages. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-14 18:54:42 +02:00
Boaz Harrosh	5a51c0c7e9	ore/exofs: Define new ore_verify_layout All users of the ore will need to check if current code supports the given layout. For example RAID5/6 is not currently supported. So move all the checks from exofs/super.c to a new ore_verify_layout() to be used by ore users. Note that any new layout should be passed through the ore_verify_layout() because the ore engine will prepare and verify some internal members of ore_layout, and assumes it's called. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-14 18:54:41 +02:00
Boaz Harrosh	3bd9856857	ore: Support for partial component table Users like the objlayout-driver would like to only pass a partial device table that covers the IO in question. For example exofs divides the file into raid-group-sized chunks and only serves group_width number of devices at a time. The partiality is communicated by setting ore_componets->first_dev and the array covers all logical devices from oc->first_dev upto (oc->first_dev + oc->numdevs) The ore_comp_dev() API receives a logical device index and returns the actual present device in the table. An out-of-range dev_index will BUG. Logical device index is the theoretical device index as if all the devices of a file are present. .i.e: total_devs = group_width * mirror_p1 * group_count 0 <= dev_index < total_devs Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-14 18:54:41 +02:00
Boaz Harrosh	9826075404	ore: cleanup: Embed an ore_striping_info inside ore_io_state Now that each ore_io_state covers only a single raid group. A single striping_info math is needed. Embed one inside ore_io_state to cache the calculation results and eliminate an extra call. Also the outer _prepare_for_striping is removed since it does nothing. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-14 18:53:54 +02:00
Eric Dumazet	87fb4b7b53	net: more accurate skb truesize skb truesize currently accounts for sk_buff struct and part of skb head. kmalloc() roundings are also ignored. Considering that skb_shared_info is larger than sk_buff, its time to take it into account for better memory accounting. This patch introduces SKB_TRUESIZE(X) macro to centralize various assumptions into a single place. At skb alloc phase, we put skb_shared_info struct at the exact end of skb head, to allow a better use of memory (lowering number of reallocations), since kmalloc() gives us power-of-two memory blocks. Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are aligned to cache lines, as before. Note: This patch might trigger performance regressions because of misconfigured protocol stacks, hitting per socket or global memory limits that were previously not reached. But its a necessary step for a more accurate memory accounting. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Andi Kleen <ak@linux.intel.com> CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-13 16:05:07 -04:00
Neil Zhang	dde34cc501	usb: gadget: mv_udc: refine the driver structure This patch do the following things: 1. Add header and Copyright for marvell usb driver. 2. Add mv_usb.h in include/linux/platform_data, make the driver fits all the marvell platform using the same ChipIdea usb ip. 3. Some SOC may has mutiple clock sources, so let me define it in mv_usb_platform_data and give two helper functions named udc_clock_enable/udc_clock_disable to deal with the clocks. 4. Different SOCs will have some difference in PHY initialization, so we will remove file mv_udc_phy.c and add two funtions in mv_usb_platform_data, let the platform relative driver to realize it. 5. Rewrite probe function according to the modification list above. Find it will kernel panic when probe failed. The root cause is as follows: When probe failed, the error handle may call device_unregister() which in return will call gadget_release.In current code, gadget_release have two issues: 1: the_controller is a NULL pointer. 2: if we free udc here, then the following code in probe will access NULL pointer. Signed-off-by: Neil Zhang <zhangwm@marvell.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2011-10-13 20:41:56 +03:00
Kuninori Morimoto	f427eb64f4	usb: gadget: renesas_usbhs: support otg pin control some renesas_usbhs device is supporting OTG external device interface. In that device, it is necessary to control PWEN/EXTLP on DVSTCTR. This patch support it. But renesas_usbhs driver doesn't have OTG support for now. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2011-10-13 20:41:47 +03:00
Kuninori Morimoto	258485d990	usb: gadget: renesas_usbhs: add bus control functions this patch add DVSTCTR control function for HOST support Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2011-10-13 20:41:38 +03:00
Kuninori Morimoto	11935de557	usb: gadget: renesas_usbhs: change usbhsc_bus_ctrl() to usbsc_set_buswait() renesas_usbhs will have register DVSTCTR control function for HOST support. This patch changes usbhsc_bus_ctrl() to usbsc_set_buswait(), to remove DVSTCTR access from it, Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2011-10-13 20:41:37 +03:00
Felipe Balbi	089b837a39	usb: gadget: fix typo for default U1/U2 exit latencies s/DEFULT/DEFAULT/, no functional changes. Signed-off-by: Felipe Balbi <balbi@ti.com>	2011-10-13 20:39:59 +03:00
Yoshihiro Shimoda	b8a56e17e1	usb: gadget: r8a66597-udc: add support for SUDMAC SH7757 has a USB function with internal DMA controller (SUDMAC). This patch supports the SUDMAC. The SUDMAC is incompatible with general-purpose DMAC. So, it doesn't use dmaengine. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <balbi@ti.com>	2011-10-13 20:38:39 +03:00
Linus Walleij	2744e8afb3	drivers: create a pin control subsystem This creates a subsystem for handling of pin control devices. These are devices that control different aspects of package pins. Currently it handles pinmuxing, i.e. assigning electronic functions to groups of pins on primarily PGA and BGA type of chip packages which are common in embedded systems. The plan is to also handle other I/O pin control aspects such as biasing, driving, input properties such as schmitt-triggering, load capacitance etc within this subsystem, to remove a lot of ARM arch code as well as feature-creepy GPIO drivers which are implementing the same thing over and over again. This is being done to depopulate the arch/arm/* directory of such custom drivers and try to abstract the infrastructure they all need. See the Documentation/pinctrl.txt file that is part of this patch for more details. ChangeLog v1->v2: - Various minor fixes from Joe's and Stephens review comments - Added a pinmux_config() that can invoke custom configuration with arbitrary data passed in or out to/from the pinmux driver ChangeLog v2->v3: - Renamed subsystem folder to "pinctrl" since we will likely want to keep other pin control such as biasing in this subsystem too, so let us keep to something generic even though we're mainly doing pinmux now. - As a consequence, register pins as an abstract entity separate from the pinmux. The muxing functions will claim pins out of the pin pool and make sure they do not collide. Pins can now be named by the pinctrl core. - Converted the pin lookup from a static array into a radix tree, I agreed with Grant Likely to try to avoid any static allocation (which is crap for device tree stuff) so I just rewrote this to be dynamic, just like irq number descriptors. The platform-wide definition of number of pins goes away - this is now just the sum total of the pins registered to the subsystem. - Make sure mappings with only a function name and no device works properly. ChangeLog v3->v4: - Define a number space per controller instead of globally, Stephen and Grant requested the same thing so now maps need to define target controller, and the radix tree of pin descriptors is a property on each pin controller device. - Add a compulsory pinctrl device entry to the pinctrl mapping table. This must match the pinctrl device, like "pinctrl.0" - Split the file core.c in two: core.c and pinmux.c where the latter carry all pinmux stuff, the core is for generic pin control, and use local headers to access functionality between files. It is now possible to implement a "blank" pin controller without pinmux capabilities. This split will make new additions like pindrive.c, pinbias.c etc possible for combined drivers and chunks of functionality which is a GoodThing(TM). - Rewrite the interaction with the GPIO subsystem - the pin controller descriptor now handles this by defining an offset into the GPIO numberspace for its handled pin range. This is used to look up the apropriate pin controller for a GPIO pin. Then that specific GPIO range is matched 1-1 for the target controller instance. - Fixed a number of review comments from Joe Perches. - Broke out a header file pinctrl.h for the core pin handling stuff that will be reused by other stuff than pinmux. - Fixed some erroneous EXPORT() stuff. - Remove mispatched U300 Kconfig and Makefile entries - Fixed a number of review comments from Stephen Warren, not all of them - still WIP. But I think the new mapping that will specify which function goes to which pin mux controller address 50% of your concerns (else beat me up). ChangeLog v4->v5: - Defined a "position" for each function, so the pin controller now tracks a function in a certain position, and the pinmux maps define what position you want the function in. (Feedback from Stephen Warren and Sascha Hauer). - Since we now need to request a combined function+position from the machine mapping table that connect mux settings to drivers, it was extended with a position field and a name field. The name field is now used if you e.g. need to switch between two mux map settings at runtime. - Switched from a class device to using struct bus_type for this subsystem. Verified sysfs functionality: seems to work fine. (Feedback from Arnd Bergmann and Greg Kroah-Hartman) - Define a per pincontroller list of GPIO ranges from the GPIO pin space that can be handled by the pin controller. These can be added one by one at runtime. (Feedback from Barry Song) - Expanded documentation of regulator_[get\|enable\|disable\|put] semantics. - Fixed a number of review comments from Barry Song. (Thanks!) ChangeLog v5->v6: - Create an abstract pin group concept that can sort pins into named and enumerated groups no matter what the use of these groups may be, one possible usecase is a group of pins being muxed in or so. The intention is however to also use these groups for other pin control activities. - Make it compulsory for pinmux functions to associate with at least one group, so the abstract pin group concept is used to define the groups of pins affected by a pinmux function. The pinmux driver interface has been altered so as to enforce a function to list applicable groups per function. - Provide an optional .group entry in the pinmux machine map so the map can select beteween different available groups to be used with a certain function. - Consequent changes all over the place so that e.g. debugfs present reasonable information about the world. - Drop the per-pin mux (config) function in the pinmux_ops struct - I was afraid that this would start to be used for things totally unrelated to muxing, we can introduce that to the generic struct pinctrl_ops if needed. I want to keep muxing orthogonal to other pin control subjects and not mix these things up. ChangeLog v6->v7: - Make it possible to have several map entries matching the same device, pin controller and function, but using a different group, and alter the semantics so that pinmux_get() will pick all matching map entries, and store the associated groups in a list. The list will then be iterated over at pinmux_enable()/pinmux_disable() and corresponding driver functions called for each defined group. Notice that you're only allowed to map multiple groups* to the same { device, pin controller, function } triplet, attempts to map the same device to multiple pin controllers will for example fail. This is hopefully the crucial feature requested by Stephen Warren. - Add a pinmux hogging field to the pinmux mapping entries, and enable the pinmux core to hog pinmux map entries. This currently only works for pinmuxes without assigned devices as it looks now, but with device trees we can look up the corresponding struct device * entries when we register the pinmux driver, and have it hog each pinmux map in turn, for a simple approach to non-dynamic pin muxing. This addresses an issue from Grant Likely that the machine should take care of as much of the pinmux setup as possible, not the devices. By supplying a list of hogs, it can now instruct the core to take care of any static mappings. - Switch pinmux group retrieveal function to grab an array of strings representing the groups rather than an array of unsigned and rewrite accordingly. - Alter debugfs to show the grouplist handled by each pinmux. Also add a list of hogs. - Dynamically allocate a struct pinmux at pinmux_get() and free it at pinmux_put(), then add these to the global list of pinmuxes active as we go along. - Go over the list of pinmux maps at pinmux_get() time and repeatedly apply matches. - Retrieve applicable groups per function from the driver as a string array rather than a unsigned array, then lookup the enumerators. - Make the device to pinmux map a singleton - only allow the mapping table to be registered once and even tag the registration function with __init so it surely won't be abused. - Create a separate debugfs file to view the pinmux map at runtime. - Introduce a spin lock to the pin descriptor struct, lock it when modifying pin status entries. Reported by Stijn Devriendt. - Fix up the documentation after review from Stephen Warren. - Let the GPIO ranges give names as const char * instead of some fixed-length string. - add a function to unregister GPIO ranges to mirror the registration function. - Privatized the struct pinctrl_device and removed it from the <linux/pinctrl/pinctrl.h> API, the drivers do not need to know the members of this struct. It is now in the local header "core.h". - Rename the concept of "anonymous" mux maps to "system" muxes and add convenience macros and documentation. ChangeLog v7->v8: - Delete the leftover pinmux_config() function from the <linux/pinctrl/pinmux.h> header. - Fix a race condition found by Stijn Devriendt in pin_request() ChangeLog v8->v9: - Drop the bus_type and the sysfs attributes and all, we're not on the clear about how this should be used for e.g. userspace interfaces so let us save this for the future. - Use the right name in MAINTAINERS, PIN CONTROL rather than PINMUX - Don't kfree() the device state holder, let the .remove() callback handle this. - Fix up numerous kerneldoc headers to have one line for the function description and more verbose documentation below the parameters ChangeLog v9->v10: - pinctrl: EXPORT_SYMBOL needs export.h, folded in a patch from Steven Rothwell - fix pinctrl_register error handling, folded in a patch from Axel Lin - Various fixes to documentation text so that it's consistent. - Removed pointless comment from drivers/Kconfig - Removed dependency on SYSFS since we removed the bus in v9. - Renamed hopelessly abbreviated pctldev_* functions to the more verbose pinctrl_dev_* - Drop mutex properly when looking up GPIO ranges - Return NULL instead of ERR_PTR() errors on registration of pin controllers, using cast pointers is fragile. We can live without the detailed error codes for sure. Cc: Stijn Devriendt <highguy@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Stephen Warren <swarren@nvidia.com> Tested-by: Barry Song <21cnbao@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2011-10-13 12:49:17 +02:00
Murali Raja	3ceca74966	net-netlink: Add a new attribute to expose TOS values via netlink This patch exposes the tos value for the TCP sockets when the TOS flag is requested in the ext_flags for the inet_diag request. This would mainly be used to expose TOS values for both for TCP and UDP sockets. Currently it is supported for TCP. When netlink support for UDP would be added the support to expose the TOS values would alse be done. For IPV4 tos value is exposed and for IPV6 tclass value is exposed. Signed-off-by: Murali Raja <muralira@google.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-12 19:09:18 -04:00
Hans Schillstrom	ae1d48b23d	IPVS netns shutdown/startup dead-lock ip_vs_mutext is used by both netns shutdown code and startup and both implicit uses sk_lock-AF_INET mutex. cleanup CPU-1 startup CPU-2 ip_vs_dst_event() ip_vs_genl_set_cmd() sk_lock-AF_INET __ip_vs_mutex sk_lock-AF_INET __ip_vs_mutex * DEAD LOCK * A new mutex placed in ip_vs netns struct called sync_mutex is added. Comments from Julian and Simon added. This patch has been running for more than 3 month now and it seems to work. Ver. 3 IP_VS_SO_GET_DAEMON in do_ip_vs_get_ctl protected by sync_mutex instead of __ip_vs_mutex as sugested by Julian. Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2011-10-12 18:32:15 +02:00
Ingo Molnar	910e94dd0c	Merge branch 'tip/perf/core' of git://github.com/rostedt/linux into perf/core	2011-10-12 17:14:47 +02:00
Johannes Berg	73b9f03a81	mac80211: parse radiotap header earlier We can now move the radiotap header parsing into ieee80211_monitor_start_xmit(). This moves it out of the hotpath, and also helps the code since now the radiotap header will no longer be present in ieee80211_xmit() etc. which is easier to understand. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-11 16:41:19 -04:00
Johannes Berg	a26eb27ab4	mac80211: move fragment flag to info flag as dont-fragment The purpose of this is two-fold: 1) by moving it out of tx_data.flags, we can in another patch move the radiotap parsing so it no longer is in the hotpath 2) if a device implements fragmentation but can optionally skip it, the radiotap request for not doing fragmentation may be honoured Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-11 16:41:19 -04:00
John W. Linville	094daf7db7	Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem Conflicts: Documentation/feature-removal-schedule.txt	2011-10-11 15:35:42 -04:00
Greg Kroah-Hartman	15b80d6417	hv: remove struct hv_device_info from hyperv.h This is only used/needed by the vmbus core code, so move it out of the hyperv.h file and into the .c file that uses it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman	9f3e28e375	hv: remove free_channel() from hyperv.h This function is only used in the file it is declared in (channel_mgmt.c) so make it static and remove it from the hyperv.h file. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman	2726f95e0b	hv: hyperv.h: remove unneeded forward declarations of structures This file was created by mushing different .h files together and it shows. This change removes some unneeded forward declarations. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman	7a4ba88cc1	hv: hyperv.h: remove unused module macros I have no idea what these were ever for, but they aren't used, so delete them. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman	5557e8a605	hv: remove unused LOWORD and HIWORD macros from hyperv.h They aren't used anywhere anymore now that the debugging macros are gone, so remove it from hyperv.h as well. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:22 -06:00
Greg Kroah-Hartman	815166b95d	Staging: hv: remove vmbus_loglevel as it is not used at all anymore As there is no user of this variable, it's time to delete it. For dynamic debugging of the hyperv code, use the standard dynamic debug kernel interface. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:21 -06:00
Greg Kroah-Hartman	1a2643012f	Staging: hv: remove last user of DPRINT() macro This also removed the unused function hv_dump_ring_info(). Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:21 -06:00
Greg Kroah-Hartman	d181daa06d	Staging: hv: storvsc: remove last usage of DPRINT_WARN Used the correct dev_warn() call instead. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 09:51:21 -06:00
Greg Kroah-Hartman	a832a1eba9	hv: remove a bunch of unused debug macros from hyperv.h These aren't used by anyone anymore, so remove them before someone tries to use them again. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 08:49:19 -06:00
Greg Kroah-Hartman	da0e96315c	hv: rename prep_negotiate_resp() to vmbus_prep_negotiate_resp() It's a global symbol, so properly prefix it and use the proper EXPORT value as well. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 08:49:19 -06:00
Greg Kroah-Hartman	407dd16443	Staging: hv: remove unneeded asm include file in hyperv.h No one outside of the hyperv core needs to include the asm/hyperv.h file, so don't put it in the "global" include/linux/hyperv.h file. Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-11 08:49:19 -06:00
Stephen Rothwell	540f41edc1	llist: Add back llist_add_batch() and llist_del_first() prototypes Commit `1230db8e15` ("llist: Make some llist functions inline") has deleted the definitions, causing problems for (not upstream yet) code that tries to make use of them. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Huang Ying <ying.huang@intel.com> Cc: David Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/20111005172528.0d0a8afc65acef7ace22a24e@canb.auug.org.au Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-11 12:51:22 +02:00
Greg Kroah-Hartman	46a9719136	Staging: hv: move hyperv code out of staging directory After many years wandering the desert, it is finally time for the Microsoft HyperV code to move out of the staging directory. Or at least the core hyperv bus code, and the utility driver, the rest still have some review to get through by the various subsystem maintainers. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>	2011-10-10 22:52:55 -06:00
Harro Haan	276532ba96	USB: fix ehci alignment error The Kirkwood gave an unaligned memory access error on line 742 of drivers/usb/host/echi-hcd.c: "ehci->last_periodic_enable = ktime_get_real();" Signed-off-by: Harro Haan <hrhaan@gmail.com> Cc: stable <stable@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-10-10 16:43:53 -07:00
Rafael J. Wysocki	7811ac276b	Merge branch 'pm-devfreq' into pm-for-linus * pm-devfreq: PM / devfreq: Add basic governors PM / devfreq: Add common sysfs interfaces PM: Introduce devfreq: generic DVFS framework with device-specific OPPs PM / OPP: Add OPP availability change notifier.	2011-10-07 23:17:18 +02:00
Rafael J. Wysocki	9696cc9007	Merge branch 'pm-qos' into pm-for-linus * pm-qos: PM / QoS: Update Documentation for the pm_qos and dev_pm_qos frameworks PM / QoS: Add function dev_pm_qos_read_value() (v3) PM QoS: Add global notification mechanism for device constraints PM QoS: Implement per-device PM QoS constraints PM QoS: Generalize and export constraints management code PM QoS: Reorganize data structs PM QoS: Code reorganization PM QoS: Minor clean-ups PM QoS: Move and rename the implementation files	2011-10-07 23:17:07 +02:00
Rafael J. Wysocki	c28b56b1d4	Merge branch 'pm-domains' into pm-for-linus * pm-domains: PM / Domains: Split device PM domain data into base and need_restore ARM: mach-shmobile: sh7372 sleep warning fixes ARM: mach-shmobile: sh7372 A3SM support ARM: mach-shmobile: sh7372 generic suspend/resume support PM / Domains: Preliminary support for devices with power.irq_safe set PM: Move clock-related definitions and headers to separate file PM / Domains: Use power.sybsys_data to reduce overhead PM: Reference counting of power.subsys_data PM: Introduce struct pm_subsys_data ARM / shmobile: Make A3RV be a subdomain of A4LC on SH7372 PM / Domains: Rename argument of pm_genpd_add_subdomain() PM / Domains: Rename GPD_STATE_WAIT_PARENT to GPD_STATE_WAIT_MASTER PM / Domains: Allow generic PM domains to have multiple masters PM / Domains: Add "wait for parent" status for generic PM domains PM / Domains: Make pm_genpd_poweron() always survive parent removal PM / Domains: Do not take parent locks to modify subdomain counters PM / Domains: Implement subdomain counters as atomic fields	2011-10-07 23:17:02 +02:00
Rafael J. Wysocki	d727b60659	Merge branch 'pm-runtime' into pm-for-linus * pm-runtime: PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set PM / Runtime: Replace dev_dbg() with trace_rpm_() PM / Runtime: Introduce trace points for tracing rpm_ functions PM / Runtime: Don't run callbacks under lock for power.irq_safe set USB: Add wakeup info to debugging messages PM / Runtime: pm_runtime_idle() can be called in atomic context PM / Runtime: Add macro to test for runtime PM events PM / Runtime: Add might_sleep() to runtime PM functions	2011-10-07 23:16:55 +02:00
David S. Miller	88c5100c28	Merge branch 'master' of github.com:davem330/net Conflicts: net/batman-adv/soft-interface.c	2011-10-07 13:38:43 -04:00
John Fastabend	27737aa3a9	dcb: Add stub routines for !CONFIG_DCB To avoid ifdefs in the other code that supports DCB notifiers add stub routines. This method seems popular in other net code for example 8021Q. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-06 15:49:51 -04:00
John Fastabend	6bd0e1cb10	dcb: add DCBX mode to event notifier attributes Add DCBX mode to event notifiers so listeners can learn currently enabled mode. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-06 15:49:51 -04:00
Mark Rustad	e290ed8130	dcb: Use ifindex instead of ifname Use ifindex instead of ifname in the DCB app ring. This makes for a smaller data structure and faster comparisons. It also avoids possible issues when a net device is renamed. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-06 15:49:51 -04:00
Linus Torvalds	6367f1775e	Merge branch 'for-linus' of http://people.redhat.com/agk/git/linux-dm * 'for-linus' of http://people.redhat.com/agk/git/linux-dm: dm crypt: always disable discard_zeroes_data dm: raid fix write_mostly arg validation dm table: avoid crash if integrity profile changes dm: flakey fix corrupt_bio_byte error path	2011-10-06 08:31:47 -07:00
Joerg Roedel	a240f76165	perf, core: Introduce attrs to count in either host or guest mode The two new attributes exclude_guest and exclude_host can bes used by user-space to tell the kernel to setup performance counter to either only count while the CPU is in guest or in host mode. An additional check is also introduced to make sure user-space does not try to exclude guest and host mode from counting. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317816084-18026-2-git-send-email-gleb@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-06 13:00:28 +02:00
Ingo Molnar	9d01402023	Merge commit 'v3.1-rc9' into perf/core Merge reason: pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-06 12:49:21 +02:00
Ingo Molnar	9243a169ac	Merge commit 'v3.1-rc9' into sched/core Merge reason: pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-06 12:43:35 +02:00
Jamie Iles	a1330228f9	dw_apb_timer: constify clocksource name The clocksource name should be const for correctness. Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2011-10-04 13:08:18 -07:00
Rafael J. Wysocki	1a9a91525d	PM / QoS: Add function dev_pm_qos_read_value() (v3) To read the current PM QoS value for a given device we need to make sure that the device's power.constraints object won't be removed while we're doing that. For this reason, put the operation under dev->power.lock and acquire the lock around the initialization and removal of power.constraints. Moreover, since we're using the value of power.constraints to determine whether or not the object is present, the power.constraints_state field isn't necessary any more and may be removed. However, dev_pm_qos_add_request() needs to check if the device is being removed from the system before allocating a new PM QoS constraints object for it, so make it use the power.power_state field of struct device for this purpose. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-04 21:54:26 +02:00
John W. Linville	d6222fb0d6	Merge branch 'master' of git://github.com/padovan/bluetooth-next	2011-10-04 14:06:47 -04:00
Linus Torvalds	8a04b45367	Merge git://github.com/davem330/net * git://github.com/davem330/net: pch_gbe: Fixed the issue on which a network freezes pch_gbe: Fixed the issue on which PC was frozen when link was downed. make PACKET_STATISTICS getsockopt report consistently between ring and non-ring net: xen-netback: correctly restart Tx after a VM restore/migrate bonding: properly stop queuing work when requested can bcm: fix incomplete tx_setup fix RDSRDMA: Fix cleanup of rds_iw_mr_pool net: Documentation: Fix type of variables ibmveth: Fix oops on request_irq failure ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket cxgb4: Fix EEH on IBM P7IOC can bcm: fix tx_setup off-by-one errors MAINTAINERS: tehuti: Alexander Indenbaum's address bounces dp83640: reduce driver noise ptp: fix L2 event message recognition	2011-10-04 10:37:06 -07:00
Jon Mason	5f39e6705f	PCI: Disable MPS configuration by default Add the ability to disable PCI-E MPS turning and using the BIOS configured MPS defaults. Due to the number of issues recently discovered on some x86 chipsets, make this the default behavior. Also, add the option for peer to peer DMA MPS configuration. Peer to peer DMA is outside the scope of this patch, but MPS configuration could prevent it from working by having the MPS on one root port different than the MPS on another. To work around this, simply make the system wide MPS the smallest possible value (128B). Signed-off-by: Jon Mason <mason@myri.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-10-04 09:52:28 -07:00
Peter Zijlstra	f0f1d32f93	llist: Remove cpu_relax() usage in cmpxchg loops Initial benchmarks show they're a net loss: $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done $ echo 4096 32000 64 128 > /proc/sys/kernel/sem $ ./sembench -t 2048 -w 1900 -o 0 Pre: run time 30 seconds 778936 worker burns per second run time 30 seconds 912190 worker burns per second run time 30 seconds 817506 worker burns per second run time 30 seconds 830870 worker burns per second run time 30 seconds 845056 worker burns per second Post: run time 30 seconds 905920 worker burns per second run time 30 seconds 849046 worker burns per second run time 30 seconds 886286 worker burns per second run time 30 seconds 822320 worker burns per second run time 30 seconds 900283 worker burns per second So about 4% faster. (!) cpu_relax() stalls the pipeline, therefore, when used in a tight loop it has the following benefits: - allows SMT siblings to have a go; - reduces pressure on the CPU interconnect. However, cmpxchg loops are unfair and thus have unbounded completion time, therefore we should avoid getting in such heavily contended situations where the above benefits make any difference. A typical cmpxchg loop should not go round more than a handfull of times at worst, therefore adding extra delays just slows things down. Since the llist primitives are new, there aren't any bad users yet, and we should avoid growing them. Heavily contended sites should generally be better off using the ticket locks for serialization since they provide bounded completion times (fifo-fair over the cpus). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Huang Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1315836358.26517.43.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:44:03 +02:00
Peter Zijlstra	fa14ff4acc	sched: Convert to struct llist Use the generic llist primitives. We had a private lockless list implementation in the scheduler in the wake-list code, now that we have a generic llist implementation that provides all required operations, switch to it. This patch is not expected to change any behavior. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Huang Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1315836353.26517.42.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:43:58 +02:00
Peter Zijlstra	924f8f5af3	llist: Add llist_next() So we don't have to expose the struct list_node member. Cc: Huang Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315836348.26517.41.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:43:53 +02:00
Huang Ying	38aaf8090d	irq_work: Use llist in the struct irq_work logic Use llist in irq_work instead of the lock-less linked list implementation in irq_work to avoid the code duplication. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315461646-1379-6-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:43:49 +02:00
Huang Ying	781f7fd916	llist: Return whether list is empty before adding in llist_add() Extend the llist_add*() functions to return a success indicator, this allows us in the scheduler code to send an IPI if the queue was empty. ( There's no effect on existing users, because the list_add_xxx() functions are inline, thus this will be optimized out by the compiler if not used by callers. ) Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315461646-1379-5-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:43:44 +02:00
Huang Ying	a3127336b7	llist: Move cpu_relax() to after the cmpxchg() If in llist_add()/etc. functions the first cmpxchg() call succeeds, it is not necessary to use cpu_relax() before the cmpxchg(). So cpu_relax() in a busy loop involving cmpxchg() should go after cmpxchg() instead of before that. This patch fixes this for all involved llist functions. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315461646-1379-4-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:43:39 +02:00
Ingo Molnar	2c30245c65	llist: Remove the platform-dependent NMI checks Remove the nmi() checks spread around the code. in_nmi() is not available on every architecture and it's a pretty obscure and ugly check in any case. Cc: Huang Ying <ying.huang@intel.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315461646-1379-3-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 12:43:11 +02:00
Boaz Harrosh	d866d875f6	ore/exofs: Change the type of the devices array (API change) In the pNFS obj-LD the device table at the layout level needs to point to a device_cache node, where it is possible and likely that many layouts will point to the same device-nodes. In Exofs we have a more orderly structure where we have a single array of devices that repeats twice for a round-robin view of the device table This patch moves to a model that can be used by the pNFS obj-LD where struct ore_components holds an array of ore_dev-pointers. (ore_dev is newly defined and contains a struct osd_dev *od member) Each pointer in the array of pointers will point to a bigger user-defined dev_struct. That can be accessed by use of the container_of macro. In Exofs an __alloc_dev_table() function allocates the ore_dev-pointers array as well as an exofs_dev array, in one allocation and does the addresses dance to set everything pointing correctly. It still keeps the double allocation trick for the inodes round-robin view of the table. The device table is always allocated dynamically, also for the single device case. So it is unconditionally freed at umount. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-04 12:13:59 +02:00
Huang Ying	1230db8e15	llist: Make some llist functions inline Because llist code will be used in performance critical scheduler code path, make llist_add() and llist_del_all() inline to avoid function calling overhead and related 'glue' overhead. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1315461646-1379-2-git-send-email-ying.huang@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 11:30:53 +02:00
Ingo Molnar	22f92bacbe	Merge branch 'linus' into sched/core Merge reason: pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 11:09:08 +02:00
Andrew Vagin	92e51938f5	perf: Fix counter of ftrace events Each event adds some points to its counters. By default it adds 1, and a number of points may be transmited in event's parameters. E.g. sched:sched_stat_runtime adds how long process has been running. But this functionality was broken by v2.6.31-rc5-392-gf413cdb and now the event's parameters doesn't affect on a number of points. TP_perf_assign isn't defined, so __perf_count(c) isn't executed and __count is always equal to 1. Signed-off-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1317052535-1765247-2-git-send-email-avagin@openvz.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 11:07:54 +02:00
Eliad Peller	8a3a3c85e4	mac80211: pass vif param to conf_tx() callback tx params should be configured per interface. add ieee80211_vif param to the conf_tx callback, and change all the drivers that use this callback. The following spatch was used: @rule1@ struct ieee80211_ops ops; identifier conf_tx_op; @@ ops.conf_tx = conf_tx_op; @rule2@ identifier rule1.conf_tx_op; identifier hw, queue, params; @@ conf_tx_op ( - struct ieee80211_hw hw, + struct ieee80211_hw hw, struct ieee80211_vif vif, u16 queue, const struct ieee80211_tx_queue_params params) {...} Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-03 15:22:41 -04:00
Rajkumar Manoharan	b6f35301ef	mac80211: Send nullfunc frames at lower rate during connection monitor Recently mac80211 was changed to use nullfunc instead of probe request for connection monitoring for tx ack status reporting hardwares. Sometimes in congested network, STA got disconnected quickly after the association. It was observered that the rate control was not adopted to environment due to minimal transmission. As the nullfunc are used for monitoring purpose, these frames should not be sacrificed for rate control updation. So it is better to send the monitoring null func frames at minimum rate that could help to retain the connection. Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-03 15:22:32 -04:00
Sangwook Lee	e209c5a7ed	net:rfkill: add a gpio setup function into GPIO rfkill Add a gpio setup function which gives a chance to set up platform specific configuration such as pin multiplexing, input/output direction at the runtime or booting time. Signed-off-by: Sangwook Lee <sangwook.lee@linaro.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-03 15:19:19 -04:00
Helmut Schaa	893d73f4a1	mac80211: Allow noack flag overwrite for injected frames Allow injected unicast frames to be sent without having to wait for an ACK. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-10-03 15:19:17 -04:00
Vasily Averin	349d2895cc	ipv4: NET_IPV4_ROUTE_GC_INTERVAL removal removing obsoleted sysctl, ip_rt_gc_interval variable no longer used since 2.6.38 Signed-off-by: Vasily Averin <vvs@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-03 14:13:01 -04:00
Jiří Župka	96c131842a	Repair wrong named definition aligned_u64 This repairs problem with compile library in userspace (libnl). Signed-off-by: Jiří Župka <jzupka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-03 14:03:48 -04:00
Eric Dumazet	b5c5693bb7	tcp: report ECN_SEEN in tcp_info Allows ss command (iproute2) to display "ecnseen" if at least one packet with ECT(0) or ECT(1) or ECN was received by this socket. "ecn" means ECN was negotiated at session establishment (TCP level) "ecnseen" means we received at least one packet with ECT fields set (IP level) ss -i ... ESTAB 0 0 192.168.20.110:22 192.168.20.144:38016 ino:5950 sk:f178e400 mem:(r0,w0,f0,t0) ts sack ecn ecnseen bic wscale:7,8 rto:210 rtt:12.5/7.5 cwnd:10 send 9.3Mbps rcv_space:14480 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-03 14:01:21 -04:00
Boaz Harrosh	eb507bc189	ore: Make ore_striping_info and ore_calc_stripe_info public The struct ore_striping_info will be used later in other structures. And ore_calc_stripe_info as well. Rename them make struct ore_striping_info public. ore_calc_stripe_info is still static, will be made public on first use. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-03 17:07:51 +02:00
Boaz Harrosh	8d2d83a835	exofs: Remove unused data_map member from exofs_sb_info The struct pnfs_osd_data_map data_map member of exofs_sb_info was never used after mount. In fact all it's members were duplicated by the ore_layout structure. So just remove the duplicated information. Also removed some stupid, but perfectly supported, restrictions on layout parameters. The case where num_devices is not divisible by mirror_count+1 is perfectly fine since the rotating device view will eventually use all the devices it can get. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>	2011-10-03 17:07:51 +02:00
Boaz Harrosh	5bf696dad4	exofs: Rename struct ore_components comps => oc ore_components already has a comps member so this leads to things like comps->comps which is annoying. the name oc was already used in new code. So rename all old usage of ore_components comps => ore_components oc. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2011-10-03 17:07:50 +02:00
Marc Zyngier	1e7c5fd294	genirq: percpu: allow interrupt type to be set at enable time As request_percpu_irq() doesn't allow for a percpu interrupt to have its type configured (it is generally impossible to configure it on all CPUs at once), add a 'type' argument to enable_percpu_irq(). This allows some low-level, board specific init code to be switched to a generic API. [ tglx: Added WARN_ON argument ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-10-03 15:35:27 +02:00
Marc Zyngier	31d9d9b6d8	genirq: Add support for per-cpu dev_id interrupts The ARM GIC interrupt controller offers per CPU interrupts (PPIs), which are usually used to connect local timers to each core. Each CPU has its own private interface to the GIC, and only sees the PPIs that are directly connect to it. While these timers are separate devices and have a separate interrupt line to a core, they all use the same IRQ number. For these devices, request_irq() is not the right API as it assumes that an IRQ number is visible by a number of CPUs (through the affinity setting), but makes it very awkward to express that an IRQ number can be handled by all CPUs, and yet be a different interrupt line on each CPU, requiring a different dev_id cookie to be passed back to the handler. The _percpu_irq() functions is designed to overcome these limitations, by providing a per-cpu dev_id vector: int request_percpu_irq(unsigned int irq, irq_handler_t handler, const char devname, void __percpu percpu_dev_id); void free_percpu_irq(unsigned int, void __percpu ); int setup_percpu_irq(unsigned int irq, struct irqaction new); void remove_percpu_irq(unsigned int irq, struct irqaction act); void enable_percpu_irq(unsigned int irq); void disable_percpu_irq(unsigned int irq); The API has a number of limitations: - no interrupt sharing - no threading - common handler across all the CPUs Once the interrupt is requested using setup_percpu_irq() or request_percpu_irq(), it must be enabled by each core that wishes its local interrupt to be delivered. Based on an initial patch by Thomas Gleixner. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1316793788-14500-2-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-10-03 15:35:26 +02:00
MyungJoo Ham	ce26c5bb95	PM / devfreq: Add basic governors Four cpufreq-like governors are provided as examples. powersave: use the lowest frequency possible. The user (device) should set the polling_ms as 0 because polling is useless for this governor. performance: use the highest freqeuncy possible. The user (device) should set the polling_ms as 0 because polling is useless for this governor. userspace: use the user specified frequency stored at devfreq.user_set_freq. With sysfs support in the following patch, a user may set the value with the sysfs interface. simple_ondemand: simplified version of cpufreq's ondemand governor. When a user updates OPP entries (enable/disable/add), OPP framework automatically notifies devfreq to update operating frequency accordingly. Thus, devfreq users (device drivers) do not need to update devfreq manually with OPP entry updates or set polling_ms for powersave , performance, userspace, or any other "static" governors. Note that these are given only as basic examples for governors and any devices with devfreq may implement their own governors with the drivers and use them. Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Mike Turquette <mturquette@ti.com> Acked-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-02 00:19:34 +02:00
MyungJoo Ham	a3c98b8b2e	PM: Introduce devfreq: generic DVFS framework with device-specific OPPs With OPPs, a device may have multiple operable frequency and voltage sets. However, there can be multiple possible operable sets and a system will need to choose one from them. In order to reduce the power consumption (by reducing frequency and voltage) without affecting the performance too much, a Dynamic Voltage and Frequency Scaling (DVFS) scheme may be used. This patch introduces the DVFS capability to non-CPU devices with OPPs. DVFS is a techique whereby the frequency and supplied voltage of a device is adjusted on-the-fly. DVFS usually sets the frequency as low as possible with given conditions (such as QoS assurance) and adjusts voltage according to the chosen frequency in order to reduce power consumption and heat dissipation. The generic DVFS for devices, devfreq, may appear quite similar with /drivers/cpufreq. However, cpufreq does not allow to have multiple devices registered and is not suitable to have multiple heterogenous devices with different (but simple) governors. Normally, DVFS mechanism controls frequency based on the demand for the device, and then, chooses voltage based on the chosen frequency. devfreq also controls the frequency based on the governor's frequency recommendation and let OPP pick up the pair of frequency and voltage based on the recommended frequency. Then, the chosen OPP is passed to device driver's "target" callback. When PM QoS is going to be used with the devfreq device, the device driver should enable OPPs that are appropriate with the current PM QoS requests. In order to do so, the device driver may call opp_enable and opp_disable at the notifier callback of PM QoS so that PM QoS's update_target() call enables the appropriate OPPs. Note that at least one of OPPs should be enabled at any time; be careful when there is a transition. Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Mike Turquette <mturquette@ti.com> Acked-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-10-02 00:19:15 +02:00
Linus Torvalds	f72a209a3e	Merge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip * 'irq-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: irq: Fix check for already initialized irq_domain in irq_domain_add irq: Add declaration of irq_domain_simple_ops to irqdomain.h * 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: x86/rtc: Don't recursively acquire rtc_lock * 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: posix-cpu-timers: Cure SMP wobbles sched: Fix up wchan borkage sched/rt: Migrate equal priority tasks to available CPUs	2011-10-01 08:37:25 -07:00
Ingo Molnar	048b718029	Merge branch 'rcu/next' of git://github.com/paulmckrcu/linux into core/rcu	2011-10-01 14:21:36 +02:00
MyungJoo Ham	03ca370fbf	PM / OPP: Add OPP availability change notifier. The patch enables to register notifier_block for an OPP-device in order to get notified for any changes in the availability of OPPs of the device. For example, if a new OPP is inserted or enable/disable status of an OPP is changed, the notifier is executed. This enables the usage of opp_add, opp_enable, and opp_disable to directly take effect with any connected entities such as cpufreq or devfreq. Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Reviewed-by: Mike Turquette <mturquette@ti.com> Reviewed-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-09-30 22:35:12 +02:00
Johannes Berg	4b801bc969	mac80211: document client powersave With the addition of uAPSD and driver buffering the powersave handling has gotten quite complex. Add a section to the documentation to explain it for anyone wanting to implement it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:24 -04:00
Johannes Berg	37fbd90800	mac80211: allow out-of-band EOSP notification iwlwifi has a separate EOSP notification from the device, and to make use of that properly it needs to be passed to mac80211. To be able to mix with tx_status_irqsafe and rx_irqsafe it also needs to be an "_irqsafe" version in the sense that it goes through the tasklet, the actual flag clearing would be IRQ-safe but doing it directly would cause reordering issues. This is needed in the case of a P2P GO going into an absence period without transmitting any frames that should be driver-released as in this case there's no other way to inform mac80211 that the service period ended. Note that for drivers that don't use the _irqsafe functions another version of this function will be required. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:23 -04:00
Johannes Berg	40b9640883	mac80211: explicitly notify drivers of frame release iwlwifi needs to know the number of frames that are going to be sent to a station while it is asleep so it can properly handle the uCode blocking of that station. Before uAPSD, we got by by telling the device that a single frame was going to be released whenever we encountered IEEE80211_TX_CTL_POLL_RESPONSE. With uAPSD, however, that is no longer possible since there could be more than a single frame. To support this model, add a new callback to notify drivers when frames are going to be released. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:21 -04:00
Johannes Berg	deeaee197b	mac80211: reply only once to each PS-poll If a PS-poll frame is retried (but was received) there is no way to detect that since it has no sequence number. As a consequence, the standard asks us to not react to PS-poll frames until the response to one made it out (was ACKed or lost). Implement this by using the WLAN_STA_SP flags to also indicate a PS-Poll "service period" and the IEEE80211_TX_STATUS_EOSP flag for the response packet to indicate the end of the "SP" as usual. We could use separate flags, but that will most likely completely confuse drivers, and while the standard doesn't exclude simultaneously polling using uAPSD and PS-Poll, doing that seems quite problematic. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:18 -04:00
Johannes Berg	47086fc51a	mac80211: implement uAPSD Add uAPSD support to mac80211. This is probably not possible with all devices, so advertising it with the cfg80211 flag will be left up to drivers that want it. Due to my previous patches it is now a fairly straight-forward extension. Drivers need to have accurate TX status reporting for the EOSP frame. For drivers that buffer themselves, the provided APIs allow releasing the right number of frames, but then drivers need to set EOSP and more-data themselves. This is documented in more detail in the new code itself. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:15 -04:00
Johannes Berg	4049e09acd	mac80211: allow releasing driver-buffered frames If there are frames for a station buffered in the driver, mac80211 announces those in the TIM IE but there's no way to release them. Add new API to release such frames and use it when the station polls for a frame. Since the API will soon also be used for uAPSD it is easily extensible. Note that before this change drivers announcing driver-buffered frames in the TIM bit actually will respond to a PS-Poll with a potentially lower priority frame (if there are any frames buffered in mac80211), after this patch a driver that hasn't been changed will no longer respond at all. This only affects ath9k, which will need to be fixed to implement the new API. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:15 -04:00
Johannes Berg	948d887dec	mac80211: split PS buffers into ACs For uAPSD support we'll need to have per-AC PS buffers. As this is a major undertaking, split the buffers before really adding support for uAPSD. This already makes some reference to the uapsd_queues variable, but for now that will never be non-zero. Since book-keeping is complicated, also change the logic for keeping a maximum of frames only and allow 64 frames per AC (up from 128 for a station). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:12 -04:00
Johannes Berg	042ec45337	mac80211: let drivers inform it about per TID buffered frames For uAPSD implementation, it is necessary to know on which ACs frames are buffered. mac80211 obviously knows about the frames it has buffered itself, but with aggregation many drivers buffer frames. Thus, mac80211 needs to be informed about this. For now, since we don't have APSD in any form, this will unconditionally set the TIM bit for the station but later with uAPSD only some ACs might cause the TIM bit to be set. ath9k is the only driver using this API and I only modify it in the most basic way, it won't be able to implement uAPSD with this yet. But it can't do that anyway since there's no way to selectively release frames to the peer yet. Since drivers will buffer frames per TID, let them inform mac80211 on a per TID basis, mac80211 will then sort out the AC mapping itself. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:10 -04:00
Arik Nemtsov	07ba55d7f1	nl80211/mac80211: allow adding TDLS peers as stations When adding a TDLS peer STA, mark it with a new flag in both nl80211 and mac80211. Before adding a peer, make sure the wiphy supports TDLS and our operating mode is appropriate (managed). In addition, make sure all peers are removed on disassociation. A TDLS peer is first added just before link setup is initiated. In later setup stages we have more info about peer supported rates, capabilities, etc. This info is reported via nl80211_set_station(). Signed-off-by: Arik Nemtsov <arik@wizery.com> Cc: Kalyan C Gaddam <chakkal@iit.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:08 -04:00
Arik Nemtsov	dfe018bf99	mac80211: handle TDLS high-level commands and frames Register and implement the TDLS cfg80211 callback functions. Internally prepare and send TDLS management frames. We incorporate local STA capabilities and supported rates with extra IEs given by usermode. The resulting packet is either encapsulated in a data frame, or assembled as an action frame. It is transmitted either directly or through the AP, as mandated by the TDLS specification. Declare support for the TDLS external setup wiphy capability. This tells usermode to handle link setup and discovery on its own, and use the kernel driver for sending TDLS mgmt packets. Signed-off-by: Arik Nemtsov <arik@wizery.com> Cc: Kalyan C Gaddam <chakkal@iit.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:07 -04:00
Arik Nemtsov	768db3438b	mac80211: standardize adding supported rates IEs Relocate the mesh implementation of adding the (extended) supported rates IE to util.c, anticipating its use by other parts of mac80211. Signed-off-by: Arik Nemtsov <arik@wizery.com> Cc: Kalyan C Gaddam <chakkal@iit.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:06 -04:00
Arik Nemtsov	109086ce0b	nl80211: support sending TDLS commands/frames Add support for sending high-level TDLS commands and TDLS frames via NL80211_CMD_TDLS_OPER and NL80211_CMD_TDLS_MGMT, respectively. Add appropriate cfg80211 callbacks for lower level drivers. Add wiphy capability flags for TDLS support and advertise them via nl80211. Signed-off-by: Arik Nemtsov <arik@wizery.com> Cc: Kalyan C Gaddam <chakkal@iit.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:05 -04:00
Johannes Berg	3b9ce80ce9	cfg80211/mac80211: apply station uAPSD parameters selectively Currently, when hostapd sets the station as authorized we also overwrite its uAPSD parameter. This obviously leads to buggy behaviour (later, with my patches that actually add uAPSD support). To fix this, only apply those parameters if they were actually set in nl80211, and to achieve that add a bitmap of things to apply. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2011-09-30 15:57:03 -04:00
John W. Linville	8e00f5fbb4	Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/iwlwifi/iwl-pci.c drivers/net/wireless/wl12xx/main.c	2011-09-30 14:52:29 -04:00
Dimitris Papastamos	6eb0f5e015	regmap: Implement regcache_cache_bypass helper function Ensure we've got a function so users can enable/disable the cache bypass option. Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>	2011-09-30 13:57:47 +01:00
Peter Zijlstra	d670ec1317	posix-cpu-timers: Cure SMP wobbles David reported: Attached below is a watered-down version of rt/tst-cpuclock2.c from GLIBC. Just build it with "gcc -o test test.c -lpthread -lrt" or similar. Run it several times, and you will see cases where the main thread will measure a process clock difference before and after the nanosleep which is smaller than the cpu-burner thread's individual thread clock difference. This doesn't make any sense since the cpu-burner thread is part of the top-level process's thread group. I've reproduced this on both x86-64 and sparc64 (using both 32-bit and 64-bit binaries). For example: [davem@boricha build-x86_64-linux]$ ./test process: before(0.001221967) after(0.498624371) diff(497402404) thread: before(0.000081692) after(0.498316431) diff(498234739) self: before(0.001223521) after(0.001240219) diff(16698) [davem@boricha build-x86_64-linux]$ The diff of 'process' should always be >= the diff of 'thread'. I make sure to wrap the 'thread' clock measurements the most tightly around the nanosleep() call, and that the 'process' clock measurements are the outer-most ones. --- #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <time.h> #include <fcntl.h> #include <string.h> #include <errno.h> #include <pthread.h> static pthread_barrier_t barrier; static void chew_cpu(void arg) { pthread_barrier_wait(&barrier); while (1) __asm__ __volatile__("" : : : "memory"); return NULL; } int main(void) { clockid_t process_clock, my_thread_clock, th_clock; struct timespec process_before, process_after; struct timespec me_before, me_after; struct timespec th_before, th_after; struct timespec sleeptime; unsigned long diff; pthread_t th; int err; err = clock_getcpuclockid(0, &process_clock); if (err) return 1; err = pthread_getcpuclockid(pthread_self(), &my_thread_clock); if (err) return 1; pthread_barrier_init(&barrier, NULL, 2); err = pthread_create(&th, NULL, chew_cpu, NULL); if (err) return 1; err = pthread_getcpuclockid(th, &th_clock); if (err) return 1; pthread_barrier_wait(&barrier); err = clock_gettime(process_clock, &process_before); if (err) return 1; err = clock_gettime(my_thread_clock, &me_before); if (err) return 1; err = clock_gettime(th_clock, &th_before); if (err) return 1; sleeptime.tv_sec = 0; sleeptime.tv_nsec = 500000000; nanosleep(&sleeptime, NULL); err = clock_gettime(th_clock, &th_after); if (err) return 1; err = clock_gettime(my_thread_clock, &me_after); if (err) return 1; err = clock_gettime(process_clock, &process_after); if (err) return 1; diff = process_after.tv_nsec - process_before.tv_nsec; printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n", process_before.tv_sec, process_before.tv_nsec, process_after.tv_sec, process_after.tv_nsec, diff); diff = th_after.tv_nsec - th_before.tv_nsec; printf("thread: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n", th_before.tv_sec, th_before.tv_nsec, th_after.tv_sec, th_after.tv_nsec, diff); diff = me_after.tv_nsec - me_before.tv_nsec; printf("self: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n", me_before.tv_sec, me_before.tv_nsec, me_after.tv_sec, me_after.tv_nsec, diff); return 0; } This is due to us using p->se.sum_exec_runtime in thread_group_cputime() where we iterate the thread group and sum all data. This does not take time since the last schedule operation (tick or otherwise) into account. We can cure this by using task_sched_runtime() at the cost of having to take locks. This also means we can (and must) do away with thread_group_sched_runtime() since the modified thread_group_cputime() is now more accurate and would deadlock when called from thread_group_sched_runtime(). Aside of that it makes the function safe on 32 bit systems. The old code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a 64bit value and could be changed on another cpu at the same time. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins Tested-by: David Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-30 14:07:06 +02:00
Serge Hallyn	d178bc3a70	user namespace: usb: make usb urbs user namespace aware (v2) Add to the dev_state and alloc_async structures the user namespace corresponding to the uid and euid. Pass these to kill_pid_info_as_uid(), which can then implement a proper, user-namespace-aware uid check. Changelog: Sep 20: Per Oleg's suggestion: Instead of caching and passing user namespace, uid, and euid each separately, pass a struct cred. Sep 26: Address Alan Stern's comments: don't define a struct cred at usbdev_open(), and take and put a cred at async_completed() to ensure it lasts for the duration of kill_pid_info_as_cred(). Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-09-29 13:13:08 -07:00
David Vrabel	8b5d44a5ac	xen: allow balloon driver to use more than one memory region Allow the xen balloon driver to populate its list of extra pages from more than one region of memory. This will allow platforms to provide (for example) a region of low memory and a region of high memory. The maximum possible number of extra regions is 128 (== E820MAX) which is quite large so xen_extra_mem is placed in __initdata. This is safe as both xen_memory_setup() and balloon_init() are in __init. The balloon regions themselves are not altered (i.e., there is still only the one region). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-29 11:12:10 -04:00
David Vrabel	aa24411b67	xen/balloon: account for pages released during memory setup In xen_memory_setup() pages that occur in gaps in the memory map are released back to Xen. This reduces the domain's current page count in the hypervisor. The Xen balloon driver does not correctly decrease its initial current_pages count to reflect this. If 'delta' pages are released and the target is adjusted the resulting reservation is always 'delta' less than the requested target. This affects dom0 if the initial allocation of pages overlaps the PCI memory region but won't affect most domU guests that have been setup with pseudo-physical memory maps that don't have gaps. Fix this by accouting for the released pages when starting the balloon driver. If the domain's targets are managed by xapi, the domain may eventually run out of memory and die because xapi currently gets its target calculations wrong and whenever it is restarted it always reduces the target by 'delta'. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-29 11:12:09 -04:00
Stefano Stabellini	0930bba674	xen: modify kernel mappings corresponding to granted pages If we want to use granted pages for AIO, changing the mappings of a user vma and the corresponding p2m is not enough, we also need to update the kernel mappings accordingly. Currently this is only needed for pages that are created for user usages through /dev/xen/gntdev. As in, pages that have been in use by the kernel and use the P2M will not need this special mapping. However there are no guarantees that in the future the kernel won't start accessing pages through the 1:1 even for internal usage. In order to avoid the complexity of dealing with highmem, we allocated the pages lowmem. We issue a HYPERVISOR_grant_table_op right away in m2p_add_override and we remove the mappings using another HYPERVISOR_grant_table_op in m2p_remove_override. Considering that m2p_add_override and m2p_remove_override are called once per page we use multicalls and hypercall batching. Use the kmap_op pointer directly as argument to do the mapping as it is guaranteed to be present up until the unmapping is done. Before issuing any unmapping multicalls, we need to make sure that the mapping has already being done, because we need the kmap->handle to be set correctly. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v1: Removed GRANT_FRAME_BIT usage] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-29 10:32:58 -04:00
Stefano Stabellini	693394b8c3	xen: add an "highmem" parameter to alloc_xenballooned_pages Add an highmem parameter to alloc_xenballooned_pages, to allow callers to request lowmem or highmem pages. Fix the code style of free_xenballooned_pages' prototype. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-09-29 09:56:52 -04:00
Paul E. McKenney	82e78d80fc	rcu: Simplify unboosting checks Commit 7765be (Fix RCU_BOOST race handling current->rcu_read_unlock_special) introduced a new ->rcu_boosted field in the task structure. This is redundant because the existing ->rcu_boost_mutex will be non-NULL at any time that ->rcu_boosted is nonzero. Therefore, this commit removes ->rcu_boosted and tests ->rcu_boost_mutex instead. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:39 -07:00
Paul E. McKenney	6206ab9bab	rcu: Move __rcu_read_unlock()'s barrier() within if-statement We only need to constrain the compiler if we are actually exiting the top-level RCU read-side critical section. This commit therefore moves the first barrier() cal in __rcu_read_unlock() to inside the "if" statement, thus avoiding needless register flushes for inner rcu_read_unlock() calls. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:35 -07:00
Paul E. McKenney	6846c0c540	rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation The differences between rcu_assign_pointer() and RCU_INIT_POINTER() are subtle, and it is easy to use the the cheaper RCU_INIT_POINTER() when the more-expensive rcu_assign_pointer() should have been used instead. The consequences of this mistake are quite severe. This commit therefore carefully lays out the situations in which it it permissible to use RCU_INIT_POINTER(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:34 -07:00
Eric Dumazet	d322f45cee	rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier Recent changes to gcc give warning messages on rcu_assign_pointers()'s checks that allow it to determine when it is OK to omit the memory barrier. Stephen Hemminger tried a number of gcc tricks to silence this warning, but #pragmas and CPP macros do not work together in the way that would be required to make this work. However, we now have RCU_INIT_POINTER(), which already omits this memory barrier, and which therefore may be used when assigning NULL to an RCU-protected pointer that is accessible to readers. This commit therefore makes rcu_assign_pointer() unconditionally emit the memory barrier. Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:33 -07:00
Shi, Alex	fc0763f53e	nohz: Remove nohz_cpu_mask RCU no longer uses this global variable, nor does anyone else. This commit therefore removes this variable. This reduces memory footprint and also removes some atomic instructions and memory barriers from the dyntick-idle path. Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:29 -07:00
Paul E. McKenney	22507ed9b9	rcu: Remove unused and redundant interfaces The rcu_dereference_bh_protected() and rcu_dereference_sched_protected() macros are synonyms for rcu_dereference_protected() and are not used anywhere in mainline. This commit therefore removes them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:26 -07:00
Paul E. McKenney	d4c08f2ac3	rcu: Add grace-period, quiescent-state, and call_rcu trace events Add trace events to record grace-period start and end, quiescent states, CPUs noticing grace-period start and end, grace-period initialization, call_rcu() invocation, tasks blocking in RCU read-side critical sections, tasks exiting those same critical sections, force_quiescent_state() detection of dyntick-idle and offline CPUs, CPUs entering and leaving dyntick-idle mode (except from NMIs), CPUs coming online and going offline, and CPUs being kicked for staying in dyntick-idle mode for too long (as in many weeks, even on 32-bit systems). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> rcu: Add the rcu flavor to callback trace events The earlier trace events for registering RCU callbacks and for invoking them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched). This commit adds the RCU flavor to those trace events. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:21 -07:00
Paul E. McKenney	965a002b4f	rcu: Make TINY_RCU also use softirq for RCU_BOOST=n This patch #ifdefs TINY_RCU kthreads out of the kernel unless RCU_BOOST=y, thus eliminating context-switch overhead if RCU priority boosting has not been configured. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:20 -07:00
Paul E. McKenney	385680a948	rcu: Add event-trace markers to TREE_RCU kthreads Add event-trace markers to TREE_RCU kthreads to allow including these kthread's CPU time in the utilization calculations. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:19 -07:00
Paul E. McKenney	72fe701b70	rcu: Add RCU type to callback-invocation tracing Add a string to the rcu_batch_start() and rcu_batch_end() trace messages that indicates the RCU type ("rcu_sched", "rcu_bh", or "rcu_preempt"). The trace messages for the actual invocations themselves are not marked, as it should be clear from the rcu_batch_start() and rcu_batch_end() events before and after. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:15 -07:00
Paul E. McKenney	300df91ca9	rcu: Event-trace markers for computing RCU CPU utilization This commit adds the trace_rcu_utilization() marker that is to be used to allow postprocessing scripts compute RCU's CPU utilization, give or take event-trace overhead. Note that we do not include RCU's dyntick-idle interface because event tracing requires RCU protection, which is not available in dyntick-idle mode. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:13 -07:00
Paul E. McKenney	29c00b4a1d	rcu: Add event-tracing for RCU callback invocation There was recently some controversy about the overhead of invoking RCU callbacks. Add TRACE_EVENT()s to obtain fine-grained timings for the start and stop of a batch of callbacks and also for each callback invoked. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:12 -07:00
Paul E. McKenney	2c42818e96	rcu: Abstract common code for RCU grace-period-wait primitives Pull the code that waits for an RCU grace period into a single function, which is then called by synchronize_rcu() and friends in the case of TREE_RCU and TREE_PREEMPT_RCU, and from rcu_barrier() and friends in the case of TINY_RCU and TINY_PREEMPT_RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:42 -07:00
Paul E. McKenney	990987511c	rcu: Move rcu_head definition to types.h Take a first step towards untangling Linux kernel header files by placing the struct rcu_head definition into include/linux/types.h and including include/linux/types.h in include/linux/rcupdate.h where struct rcu_head used to be defined. The actual inclusion point for include/linux/types.h is with the rest of the #include directives rather than at the point where struct rcu_head used to be defined, as suggested by Mathieu Desnoyers. Once this is in place, then header files that need only rcu_head can include types.h rather than rcupdate.h. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>	2011-09-28 21:36:38 -07:00
Paul E. McKenney	b3fbab0571	rcu: Restore checks for blocking in RCU read-side critical sections Long ago, using TREE_RCU with PREEMPT would result in "scheduling while atomic" diagnostics if you blocked in an RCU read-side critical section. However, PREEMPT now implies TREE_PREEMPT_RCU, which defeats this diagnostic. This commit therefore adds a replacement diagnostic based on PROVE_RCU. Because rcu_lockdep_assert() and lockdep_rcu_dereference() are now being used for things that have nothing to do with rcu_dereference(), rename lockdep_rcu_dereference() to lockdep_rcu_suspicious() and add a third argument that is a string indicating what is suspicious. This third argument is passed in from a new third argument to rcu_lockdep_assert(). Update all calls to rcu_lockdep_assert() to add an informative third argument. Also, add a pair of rcu_lockdep_assert() calls from within rcu_note_context_switch(), one complaining if a context switch occurs in an RCU-bh read-side critical section and another complaining if a context switch occurs in an RCU-sched read-side critical section. These are present only if the PROVE_RCU kernel parameter is enabled. Finally, fix some checkpatch whitespace complaints in lockdep.c. Again, you must enable PROVE_RCU to see these new diagnostics. But you are enabling PROVE_RCU to check out new RCU uses in any case, aren't you? Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:37 -07:00
Richard Cochran	f75159e993	ptp: fix L2 event message recognition The IEEE 1588 standard defines two kinds of messages, event and general messages. Event messages require time stamping, and general do not. When using UDP transport, two separate ports are used for the two message types. The BPF designed to recognize event messages incorrectly classifies L2 general messages as event messages. This commit fixes the issue by extending the filter to check the message type field for L2 PTP packets. Event messages are be distinguished from general messages by testing the "general" bit. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Cc: <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-09-29 00:32:03 -04:00
Yoshihiro Shimoda	d4fa0e35fd	net: sh_eth: move the asm/sh_eth.h to include/linux/ Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-09-28 13:41:51 -04:00
Vladimir Zapolskiy	f786ecba41	connector: add comm change event report to proc connector Add an event to monitor comm value changes of tasks. Such an event becomes vital, if someone desires to control threads of a process in different manner. A natural characteristic of threads is its comm value, and helpfully application developers have an opportunity to change it in runtime. Reporting about such events via proc connector allows to fine-grain monitoring and control potentials, for instance a process control daemon listening to proc connector and following comm value policies can place specific threads to assigned cgroup partitions. It might be possible to achieve a pale partial one-shot likeness without this update, if an application changes comm value of a thread generator task beforehand, then a new thread is cloned, and after that proc connector listener gets the fork event and reads new thread's comm value from procfs stat file, but this change visibly simplifies and extends the matter. Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-09-28 13:41:50 -04:00
Eric Dumazet	16e5726269	af_unix: dont send SCM_CREDENTIALS by default Since commit `7361c36c52` (af_unix: Allow credentials to work across user and pid namespaces) af_unix performance dropped a lot. This is because we now take a reference on pid and cred in each write(), and release them in read(), usually done from another process, eventually from another cpu. This triggers false sharing. # Events: 154K cycles # # Overhead Command Shared Object Symbol # ........ ....... .................. ......................... # 10.40% hackbench [kernel.kallsyms] [k] put_pid 8.60% hackbench [kernel.kallsyms] [k] unix_stream_recvmsg 7.87% hackbench [kernel.kallsyms] [k] unix_stream_sendmsg 6.11% hackbench [kernel.kallsyms] [k] do_raw_spin_lock 4.95% hackbench [kernel.kallsyms] [k] unix_scm_to_skb 4.87% hackbench [kernel.kallsyms] [k] pid_nr_ns 4.34% hackbench [kernel.kallsyms] [k] cred_to_ucred 2.39% hackbench [kernel.kallsyms] [k] unix_destruct_scm 2.24% hackbench [kernel.kallsyms] [k] sub_preempt_count 1.75% hackbench [kernel.kallsyms] [k] fget_light 1.51% hackbench [kernel.kallsyms] [k] __mutex_lock_interruptible_slowpath 1.42% hackbench [kernel.kallsyms] [k] sock_alloc_send_pskb This patch includes SCM_CREDENTIALS information in a af_unix message/skb only if requested by the sender, [man 7 unix for details how to include ancillary data using sendmsg() system call] Note: This might break buggy applications that expected SCM_CREDENTIAL from an unaware write() system call, and receiver not using SO_PASSCRED socket option. If SOCK_PASSCRED is set on source or destination socket, we still include credentials for mere write() syscalls. Performance boost in hackbench : more than 50% gain on a 16 thread machine (2 quad-core cpus, 2 threads per core) hackbench 20 thread 2000 4.228 sec instead of 9.102 sec Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-09-28 13:29:50 -04:00
Linus Torvalds	e689ec8057	Merge branch 'writeback-for-linus' of git://github.com/fengguang/linux * 'writeback-for-linus' of git://github.com/fengguang/linux: writeback: show raw dirtied_when in trace writeback_single_inode	2011-09-28 08:01:05 -07:00
Mat Martineau	84084a3197	Bluetooth: Perform L2CAP SDU reassembly without copying data Use sk_buff fragment capabilities to link together incoming skbs instead of allocating a new skb for reassembly and copying. The new reassembly code works equally well for ERTM and streaming mode, so there is now one reassembly function instead of two. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2011-09-27 18:16:18 -03:00

1 2 3 4 5 ...

45749 Commits