linux

Author	SHA1	Message	Date
Linus Torvalds	c34c15b02e	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Malta: Enable tickless and highres timers. [MIPS] Bigsur: Enable tickless and and highres timers. qemu: do not enable IP7 blindly [MIPS] Alchemy: Fix Au1x SD controller IRQ [MIPS] Don't byteswap writes to display when running bigendian	2007-12-10 19:45:17 -08:00
Maciej W. Rozycki	522939d45c	esp_scsi: fix reset cleanup spinlock recursion The esp_reset_cleanup() function is called with the host lock held and invokes starget_for_each_device() which wants to take it too. Here is a fix along the lines of shost_for_each_device()/__shost_for_each_device() adding a __starget_for_each_device() counterpart which assumes the lock has already been taken. Eventually, I think the driver should get modified so that more work is done as a softirq rather than in the interrupt context, but for now it fixes a bug that causes the spinlock debugger to fire. While at it, it fixes a small number of cosmetic problems with starget_for_each_device() too. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-10 19:43:55 -08:00
Roel Kluin	da7ce6e2fe	asm-h8300: parentheses around definition CLOCK_TICK_RATE Some places where CLOCK_TICK_RATE may be used incorrectly: arch/arm/mach-mx3/time.c:125: __raw_writel((v / CLOCK_TICK_RATE) - 1, MXC_GPT_GPTPR); drivers/watchdog/davinci_wdt.c:103: timer_margin = (((u64)heartbeat * CLOCK_TICK_RATE) & 0xffffffff); drivers/watchdog/davinci_wdt.c:105: timer_margin = (((u64)heartbeat * CLOCK_TICK_RATE) >> 32); drivers/watchdog/ks8695_wdt.c:64: unsigned long tval = wdt_time * CLOCK_TICK_RATE; I'm not sure whether this definition is used there, but adding parentheses should be good anyway. Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-10 19:43:54 -08:00
Manuel Lauss	0f5e49a2e2	[MIPS] Alchemy: Fix Au1x SD controller IRQ With the introduction of MIPS_CPU_IRQ_BASE, the hardcoded IRQ number of the au1100/au1200 SD controller(s) is no longer valid. Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-12-09 04:51:10 +00:00
Simon Horman	9e004ebd2d	[IA64] iosapic cleanup Make some IOSAPIC functions static and remove one that is unused. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Tony Luck <tony.luck@intel.com>	2007-12-07 16:11:12 -08:00
Jay Vosburgh	6f6652be18	bonding: Add new layer2+3 hash for xor/802.3ad modes Add new hash for balance-xor and 802.3ad modes. Originally submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by Jay Vosburgh to move setting of hash policy out of line, tweak the documentation update and add version update to 3.2.2. Glenn's original comment follows: Included is a patch for a new xmit_hash_policy for the bonding driver that selects slaves based on MAC and IP information. This is a middle ground between what currently exists in the layer2 only policy and the layer3+4 policy. This policy strives to be fully 802.3ad compliant by transmitting every packet of any particular flow over the same link. As documented the layer3+4 policy is not fully compliant for extreme cases such as ip fragmentation, so this policy is a nice compromise for environments that require full compliance but desire more than the layer2 only policy. Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-07 15:00:32 -05:00
Linus Torvalds	e17587b5b9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: [AVR32] Fix wrong pt_regs in critical exception handler [AVR32] Fix copy_to_user_page() breakage [AVR32] Follow the rules when dealing with the OCD system [AVR32] Clean up OCD register usage [AVR32] Implement irqflags trace and lockdep support [AVR32] Implement stacktrace support [AVR32] Kconfig: Use def_bool instead of bool + default [AVR32] Fix invalid status register bit definitions in asm/ptrace.h [AVR32] Add TIF_RESTORE_SIGMASK to the work masks	2007-12-07 11:00:31 -08:00
Linus Torvalds	29ac0052ea	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [AF_RXRPC]: Add a missing goto [VLAN]: Lost rtnl_unlock() in vlan_ioctl() [SCTP]: Fix the bind_addr info during migration. [SCTP]: Add bind hash locking to the migrate code [IPV4]: Remove prototype of ip_rt_advice [IPv4]: Reply net unreachable ICMP message [IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable network [BRIDGE]: Section fix. [NIU]: Fix link LED handling.	2007-12-07 10:59:48 -08:00
Linus Torvalds	bbce0b5ca2	Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: Fix led trigger locking bugs	2007-12-07 10:58:19 -08:00
Haavard Skinnemoen	68ca3e537f	[AVR32] Fix copy_to_user_page() breakage The current implementation of copy_to_user_page() gives "vaddr" to the cache instruction when trying to sync the icache with the dcache. If vaddr does not exist in the TLB, the CPU will silently abort the operation, which may result in the caches staying out of sync. To fix this, pass the "dst" parameter to flush_icache_range() instead -- we know this is valid because we just wrote to it. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>	2007-12-07 14:54:47 +01:00
Haavard Skinnemoen	2507bc1338	[AVR32] Follow the rules when dealing with the OCD system The current debug trap handling code does a number of things that are illegal according to the AVR32 Architecture manual. Most importantly, it may try to schedule from Debug Mode, thus clearing the D bit, which can lead to "undefined behaviour". It seems like this works in most cases, but several people have observed somewhat unstable behaviour when debugging programs, including soft lockups. So there's definitely something which is not right with the existing code. The new code will never schedule from Debug mode, it will always exit Debug mode with a "retd" instruction, and if something not running in Debug mode needs to do something debug-related (like doing a single step), it will enter debug mode through a "breakpoint" instruction. The monitor code will then return directly to user space, bypassing its own saved registers if necessary (since we don't actually care about the trapped context, only the one that came before.) This adds three instructions to the common exception handling code, including one branch. It does not touch super-hot paths like the TLB miss handler. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>	2007-12-07 14:54:46 +01:00
Haavard Skinnemoen	8dfe8f29cd	[AVR32] Clean up OCD register usage Generate a new set of OCD register definitions in asm/ocd.h and rename __mfdr() and __mtdr() to ocd_read() and ocd_write() respectively. The bitfield definitions are a lot more complete now, and they are entirely based on bit numbers, not masks. This is because OCD registers are frequently accessed from assembly code, where bit numbers are a lot more useful (can be fed directly to sbr, bfins, etc.) Bitfields that consist of more than one bit have two definitions: _START, which indicates the number of the first bit, and _SIZE, which indicates the number of bits. These directly correspond to the parameters taken by the bfextu, bfexts and bfins instructions. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>	2007-12-07 14:54:40 +01:00
Haavard Skinnemoen	320516b78b	[AVR32] Implement irqflags trace and lockdep support Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>	2007-12-07 14:52:37 +01:00
Haavard Skinnemoen	df679771ce	[AVR32] Fix invalid status register bit definitions in asm/ptrace.h The 'H' bit is bit 29, while the 'R' bit doesn't exist. Luckily, we don't actually use any of the bits in question. Also update show_regs() to show the Debug Mask and Debug state bits. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>	2007-12-07 14:52:33 +01:00
Haavard Skinnemoen	702f22b306	[AVR32] Add TIF_RESTORE_SIGMASK to the work masks We really need to check TIF_RESTORE_SIGMASK before returning to userspace. The existing code does not necessarily do this. Define the work masks as a bitwise OR of the respective flags instead of a hardcoded hex value to make it easier to spot errors like this in the future. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>	2007-12-07 14:52:32 +01:00
Vlad Yasevich	8e71a11c9f	[SCTP]: Fix the bind_addr info during migration. During accept/migrate the code attempts to copy the addresses from the parent endpoint to the new endpoint. However, if the parent was bound to a wildcard address, then we end up pointlessly copying all of the current addresses on the system. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-07 01:07:49 -08:00
Denis V. Lunev	56c99d0415	[IPV4]: Remove prototype of ip_rt_advice ip_rt_advice has been gone, so no need to keep prototype and debug message. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-07 01:07:38 -08:00
Richard Purdie	dc47206e55	leds: Fix led trigger locking bugs Convert part of the led trigger core from rw spinlocks to rw semaphores. We're calling functions which can sleep from invalid contexts otherwise. Fixes bug #9264. Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2007-12-07 09:06:53 +00:00
Linus Torvalds	5fa2e15913	Merge branch 'merge' of master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] virtex bug fix: Use canonical value for AC97 interrupt xparams [POWERPC] Update defconfigs [POWERPC] PS3: Update ps3_defconfig [POWERPC] Update iseries_defconfig [POWERPC] Fix hardware IRQ time accounting problem.	2007-12-06 17:50:07 -08:00
Linus Torvalds	09f3eca2b7	Merge branch 'for-2.6.24' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc * 'for-2.6.24' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc: [POWERPC] Fix swapper_pg_dir size when CONFIG_PTE_64BIT=y on FSL_BOOKE	2007-12-06 12:27:09 -08:00
Linus Torvalds	e1b7361f32	Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6: [PARISC] lba_pci: pci_claim_resources disabled expansion roms [PARISC] print more than one character at a time for pdc console [PARISC] Update parisc-linux MAINTAINERS entries [PARISC] timer interrupt should not be IRQ_DISABLED Revert "[PARISC] import necessary bits of libgcc.a"	2007-12-06 12:26:17 -08:00
Kumar Gala	bee86f14d5	[POWERPC] Fix swapper_pg_dir size when CONFIG_PTE_64BIT=y on FSL_BOOKE The size of swapper_pg_dir is 8k instead of 4k when using 64-bit PTEs (CONFIG_PTE_64BIT). This was reported by Cedric Hombourger <chombourger@gmail.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-12-06 13:11:04 -06:00
Kyle McMartin	721fdf3416	[PARISC] print more than one character at a time for pdc console There's really no reason not to print more than one character at a time to the PDC console... Booting is measurably speedier, and now I don't have to watch individual characters get drawn. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-12-06 09:32:15 -08:00
Sergei Shtylyov	0e8120e094	[MIPS] Alchemy: fix IRQ bases Do what the commits commits `f3e8d1da38` and `9d360ab4a7` failed to achieve -- actually convert the Alchemy code to irq_cpu. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-12-06 17:15:58 +00:00
Tony Breeds	81a3843f97	[POWERPC] Fix hardware IRQ time accounting problem. The commit `fa13a5a1f2` (sched: restore deterministic CPU accounting on powerpc), unconditionally calls update_process_tick() in system context. In the deterministic accounting case this is the correct thing to do. However, in the non-deterministic accounting case we need to not do this, since doing this results in the time accounted as hardware irq time being artificially elevated. Also this collapses 2 consecutive '#ifdef CONFIG_VIRT_CPU_ACCOUNTING' checks in time.h into one for neatness. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-06 16:08:59 +11:00
Linus Torvalds	7e1fb765c6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: futex: correctly return -EFAULT not -EINVAL lockdep: in_range() fix lockdep: fix debug_show_all_locks() sched: style cleanups futex: fix for futex_wait signal stack corruption	2007-12-05 09:27:46 -08:00
Linus Torvalds	ad658cec23	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: VM/Security: add security hook to do_brk Security: round mmap hint address above mmap_min_addr security: protect from stack expantion into low vm addresses Security: allow capable check to permit mmap or low vm space SELinux: detect dead booleans SELinux: do not clear f_op when removing entries	2007-12-05 09:26:52 -08:00
Linus Torvalds	2a1292b36b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [LRO]: fix lro_gen_skb() alignment [TCP]: NAGLE_PUSH seems to be a wrong way around [TCP]: Move prior_in_flight collect to more robust place [TCP] FRTO: Use of existing funcs make code more obvious & robust [IRDA]: Move ircomm_tty_line_info() under #ifdef CONFIG_PROC_FS [ROSE]: Trivial compilation CONFIG_INET=n case [IPVS]: Fix sched registration race when checking for name collision. [IPVS]: Don't leak sysctl tables if the scheduler registration fails.	2007-12-05 09:26:13 -08:00
Alexey Dobriyan	5a622f2d0f	proc: fix proc_dir_entry refcounting Creating PDEs with refcount 0 and "deleted" flag has problems (see below). Switch to usual scheme: * PDE is created with refcount 1 * every de_get does +1 * every de_put() and remove_proc_entry() do -1 * once refcount reaches 0, PDE is freed. This elegantly fixes at least two following races (both observed) without introducing new locks, without abusing old locks, without spreading lock_kernel(): 1) PDE leak remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_read(&de->count) == 0) if (atomic_dec_and_test(&de->count)) if (de->deleted) /* also not taken! / free_proc_entry(de); else de->deleted = 1; [refcount=0, deleted=1] 2) use after free remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_dec_and_test(&de->count)) if (atomic_read(&de->count) == 0) free_proc_entry(de); / boom! / if (de->deleted) free_proc_entry(de); BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c10acdda pdpt = 00000000338f8001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4) EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1 EIP is at strnlen+0x6/0x18 EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000) Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400 c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400 f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34 Call Trace: [<c10ac4f0>] vsnprintf+0x2ad/0x49b [<c10ac779>] vscnprintf+0x14/0x1f [<c1018e6b>] vprintk+0xc5/0x2f9 [<c10379f1>] handle_fasteoi_irq+0x0/0xab [<c1004f44>] do_IRQ+0x9f/0xb7 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b [<c100264e>] need_resched+0x1f/0x21 [<c10190ba>] printk+0x1b/0x1f [<c107c8ad>] de_put+0x3d/0x50 [<c107c8f8>] proc_delete_inode+0x38/0x41 [<c107c8c0>] proc_delete_inode+0x0/0x41 [<c1066298>] generic_delete_inode+0x5e/0xc6 [<c1065aa9>] iput+0x60/0x62 [<c1063c8e>] d_kill+0x2d/0x46 [<c1063fa9>] dput+0xdc/0xe4 [<c10571a1>] __fput+0xb0/0xcd [<c1054e49>] filp_close+0x48/0x4f [<c1055ee9>] sys_close+0x67/0xa5 [<c10026b6>] sysenter_past_esp+0x5f/0x85 ======================= Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9 EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44 Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds, module is already pinned and remove_proc_entry() can't happen => nobody can mark PDE deleted. Dummy proc root in netns code is not marked with refcount 1. AFAICS, we never get it, it's just for proper /proc/net removal. I double checked CLONE_NETNS continues to work. Patch survives many hours of modprobe/rmmod/cat loops without new bugs which can be attributed to refcounting. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Jan Kara	d4beaf4ab5	jbd: Fix assertion failure in fs/jbd/checkpoint.c Before we start committing a transaction, we call __journal_clean_checkpoint_list() to cleanup transaction's written-back buffers. If this call happens to remove all of them (and there were already some buffers), __journal_remove_checkpoint() will decide to free the transaction because it isn't (yet) a committing transaction and soon we fail some assertion - the transaction really isn't ready to be freed :). We change the check in __journal_remove_checkpoint() to free only a transaction in T_FINISHED state. The locking there is subtle though (as everywhere in JBD ;(). We use j_list_lock to protect the check and a subsequent call to __journal_drop_transaction() and do the same in the end of journal_commit_transaction() which is the only place where a transaction can get to T_FINISHED state. Probably I'm too paranoid here and such locking is not really necessary - checkpoint lists are processed only from log_do_checkpoint() where a transaction must be already committed to be processed or from __journal_clean_checkpoint_list() where kjournald itself calls it and thus transaction cannot change state either. Better be safe if something changes in future... Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Bryan Wu	003d922618	Blackfin SPI driver: move hard coded pin_req to board file Remove some sort of bloaty code, try to get these pin_req arrays built at compile-time - move this static things to the blackfin board file - add pin_req array to struct bfin5xx_spi_master - tested on BF537/BF548 with SPI flash Signed-off-by: Bryan Wu <bryan.wu@analog.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Bryan Wu	62310e51ac	spi: spi_bfin: update handling of delay-after-deselect Move cs_chg_udelay handling (specific to this driver) to cs_deactive(), fixing a bug when some SPI LCD driver needs delay after cs_deactive. Fix bug reported by Cameron Barfield <cbarfield@cyberdata.net> https://blackfin.uclinux.org/gf/project/uclinux-dist/forum/?action=ForumBrowse&forum_id=39&_forum_action=ForumMessageBrowse&thread_id=23630&feedback=Message%20replied. Cc: Cameron Barfield <cbarfield@cyberdata.net> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Michael Hennerich	cc2f81a695	spi: bfin spi uses portmux calls Use new Blackfin portmux interface, add error handling. Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:19 -08:00
Bryan Wu	131b17d42d	spi: initial BF54x SPI support Initial BF54x SPI support - support BF54x SPI0 - clean up some code (whitespace etc) - will support multiports in the future - start using portmux calls Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:19 -08:00
Steven Rostedt	ce6bd420f4	futex: fix for futex_wait signal stack corruption David Holmes found a bug in the -rt tree with respect to pthread_cond_timedwait. After trying his test program on the latest git from mainline, I found the bug was there too. The bug he was seeing that his test program showed, was that if one were to do a "Ctrl-Z" on a process that was in the pthread_cond_timedwait, and then did a "bg" on that process, it would return with a "-ETIMEDOUT" but early. That is, the timer would go off early. Looking into this, I found the source of the problem. And it is a rather nasty bug at that. Here's the relevant code from kernel/futex.c: (not in order in the file) [...] smlinkage long sys_futex(u32 __user uaddr, int op, u32 val, struct timespec __user utime, u32 __user uaddr2, u32 val3) { struct timespec ts; ktime_t t, tp = NULL; u32 val2 = 0; int cmd = op & FUTEX_CMD_MASK; if (utime && (cmd == FUTEX_WAIT \|\| cmd == FUTEX_LOCK_PI)) { if (copy_from_user(&ts, utime, sizeof(ts)) != 0) return -EFAULT; if (!timespec_valid(&ts)) return -EINVAL; t = timespec_to_ktime(ts); if (cmd == FUTEX_WAIT) t = ktime_add(ktime_get(), t); tp = &t; } [...] return do_futex(uaddr, op, val, tp, uaddr2, val2, val3); } [...] long do_futex(u32 __user uaddr, int op, u32 val, ktime_t timeout, u32 __user uaddr2, u32 val2, u32 val3) { int ret; int cmd = op & FUTEX_CMD_MASK; struct rw_semaphore fshared = NULL; if (!(op & FUTEX_PRIVATE_FLAG)) fshared = &current->mm->mmap_sem; switch (cmd) { case FUTEX_WAIT: ret = futex_wait(uaddr, fshared, val, timeout); [...] static int futex_wait(u32 __user uaddr, struct rw_semaphore fshared, u32 val, ktime_t abs_time) { [...] struct restart_block restart; restart = &current_thread_info()->restart_block; restart->fn = futex_wait_restart; restart->arg0 = (unsigned long)uaddr; restart->arg1 = (unsigned long)val; restart->arg2 = (unsigned long)abs_time; restart->arg3 = 0; if (fshared) restart->arg3 \|= ARG3_SHARED; return -ERESTART_RESTARTBLOCK; [...] static long futex_wait_restart(struct restart_block restart) { u32 __user uaddr = (u32 __user )restart->arg0; u32 val = (u32)restart->arg1; ktime_t abs_time = (ktime_t )restart->arg2; struct rw_semaphore fshared = NULL; restart->fn = do_no_restart_syscall; if (restart->arg3 & ARG3_SHARED) fshared = &current->mm->mmap_sem; return (long)futex_wait(uaddr, fshared, val, abs_time); } So when the futex_wait is interrupt by a signal we break out of the hrtimer code and set up or return from signal. This code does not return back to userspace, so we set up a RESTARTBLOCK. The bug here is that we save the "abs_time" which is a pointer to the stack variable "ktime_t t" from sys_futex. This returns and unwinds the stack before we get to call our signal. On return from the signal we go to futex_wait_restart, where we update all the parameters for futex_wait and call it. But here we have a problem where abs_time is no longer valid. I verified this with print statements, and sure enough, what abs_time was set to ends up being garbage when we get to futex_wait_restart. The solution I did to solve this (with input from Linus Torvalds) was to add unions to the restart_block to allow system calls to use the restart with specific parameters. This way the futex code now saves the time in a 64bit value in the restart block instead of storing it on the stack. Note: I'm a bit nervious to add "linux/types.h" and use u32 and u64 in thread_info.h, when there's a #ifdef __KERNEL__ just below that. Not sure what that is there for. If this turns out to be a problem, I've tested this with using "unsigned int" for u32 and "unsigned long long" for u64 and it worked just the same. I'm using u32 and u64 just to be consistent with what the futex code uses. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 15:46:09 +01:00
Andrew Gallatin	621544eb8c	[LRO]: fix lro_gen_skb() alignment Add a field to the lro_mgr struct so that drivers can specify how much padding is required to align layer 3 headers when a packet is copied into a freshly allocated skb by inet_lro.c:lro_gen_skb(). Without padding, skbs generated by LRO will cause alignment warnings on architectures which require strict alignment (seen on sparc64). Myri10GE is updated to use this field. Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-05 05:37:32 -08:00
Eric Paris	7cd94146cd	Security: round mmap hint address above mmap_min_addr If mmap_min_addr is set and a process attempts to mmap (not fixed) with a non-null hint address less than mmap_min_addr the mapping will fail the security checks. Since this is just a hint address this patch will round such a hint address above mmap_min_addr. gcj was found to try to be very frugal with vm usage and give hint addresses in the 8k-32k range. Without this patch all such programs failed and with the patch they happily get a higher address. This patch is wrappad in CONFIG_SECURITY since mmap_min_addr doesn't exist without it and there would be no security check possible no matter what. So we should not bother compiling in this rounding if it is just a waste of time. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2007-12-06 00:25:10 +11:00
Anton Vorontsov	6f4a7f4183	PHY: Add the phy_device_release device method. Lately I've got this nice badness on mdio bus removal: Device 'e0103120:06' does not have a release() function, it is broken and must be fixed. ------------[ cut here ]------------ Badness at drivers/base/core.c:107 NIP: c015c1a8 LR: c015c1a8 CTR: c0157488 REGS: c34bdcf0 TRAP: 0700 Not tainted (2.6.23-rc5-g9ebadfbb-dirty) MSR: 00029032 <EE,ME,IR,DR> CR: 24088422 XER: 00000000 ... [c34bdda0] [c015c1a8] device_release+0x78/0x80 (unreliable) [c34bddb0] [c01354cc] kobject_cleanup+0x80/0xbc [c34bddd0] [c01365f0] kref_put+0x54/0x6c [c34bdde0] [c013543c] kobject_put+0x24/0x34 [c34bddf0] [c015c384] put_device+0x1c/0x2c [c34bde00] [c0180e84] mdiobus_unregister+0x2c/0x58 ... Though actually there is nothing broken, it just device subsystem core expects another "pattern" of resource managment. This patch implement phy device's release function, thus we're getting rid of this badness. Also small hidden bug fixed, hope none other introduced. ;-) Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-04 15:06:33 -05:00
Linus Torvalds	1a2edea9af	Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix x86-32 early fixmap initialization. x86: disable hpet legacy replacement for kdump x86: disable hpet on shutdown	2007-12-03 08:45:15 -08:00
Linus Torvalds	f589b86d4b	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Remove xmon from ml300 and ml403 defconfig in arch/ppc Revert "[POWERPC] Fix RTAS os-term usage on kernel panic"	2007-12-03 08:23:58 -08:00
Linus Torvalds	ca6435f188	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: cpu accounting controller (V2)	2007-12-03 08:21:06 -08:00
OGAWA Hirofumi	c86c7fbc82	x86: disable hpet on shutdown If HPET was enabled by pci quirks, we use i8253 as initial clockevent because pci quirks doesn't run until pci is initialized. The above means the kernel (or something) is assuming HPET legacy replacement is disabled and can use i8253 at boot. If we used kexec, it isn't true. So, this patch disables HPET legacy replacement for kexec in machine_shutdown(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-12-03 17:17:10 +01:00
Linus Torvalds	8002cedc1a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (27 commits) [INET]: Fix inet_diag dead-lock regression [NETNS]: Fix /proc/net breakage [TEXTSEARCH]: Do not allow zero length patterns in the textsearch infrastructure [NETFILTER]: fix forgotten module release in xt_CONNMARK and xt_CONNSECMARK [NETFILTER]: xt_TCPMSS: remove network triggerable WARN_ON [DECNET]: dn_nl_deladdr() almost always returns no error [IPV6]: Restore IPv6 when MTU is big enough [RXRPC]: Add missing select on CRYPTO mac80211: rate limit wep decrypt failed messages rfkill: fix double-mutex-locking mac80211: drop unencrypted frames if encryption is expected mac80211: Fix behavior of ieee80211_open and ieee80211_close ieee80211: fix unaligned access in ieee80211_copy_snap mac80211: free ifsta->extra_ie and clear IEEE80211_STA_PRIVACY_INVOKED SCTP: Fix build issues with SCTP AUTH. SCTP: Fix chunk acceptance when no authenticated chunks were listed. SCTP: Fix the supported extensions paramter SCTP: Fix SCTP-AUTH to correctly add HMACS paramter. SCTP: Fix the number of HB transmissions. [TCP] illinois: Incorrect beta usage ...	2007-12-03 08:15:36 -08:00
Paul Mackerras	8f51506164	Revert "[POWERPC] Fix RTAS os-term usage on kernel panic" This reverts commit `a2b51812a4`. It turns out that this change caused some machines to fail to come back up when being rebooted, and generated an error in the hypervisor error log on some machines. The platform architecture (PAPR) is a little unclear on exactly when the RTAS ibm,os-term function should be called. Until that is clarified I'm reverting this commit. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-03 09:39:45 +11:00
Srivatsa Vaddagiri	d842de871c	sched: cpu accounting controller (V2) Commit `cfb5285660` removed a useful feature for us, which provided a cpu accounting resource controller. This feature would be useful if someone wants to group tasks only for accounting purpose and doesnt really want to exercise any control over their cpu consumption. The patch below reintroduces the feature. It is based on Paul Menage's original patch (Commit `62d0df6406`), with these differences: - Removed load average information. I felt it needs more thought (esp to deal with SMP and virtualized platforms) and can be added for 2.6.25 after more discussions. - Convert group cpu usage to be nanosecond accurate (as rest of the cfs stats are) and invoke cpuacct_charge() from the respective scheduler classes - Make accounting scalable on SMP systems by splitting the usage counter to be per-cpu - Move the code from kernel/cpu_acct.c to kernel/sched.c (since the code is not big enough to warrant a new file and also this rightly needs to live inside the scheduler. Also things like accessing rq->lock while reading cpu usage becomes easier if the code lived in kernel/sched.c) The patch also modifies the cpu controller not to provide the same accounting information. Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com> Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran some simple tests like cpuspin (spin on the cpu), ran several tasks in the same group and timed them. Compared their time stamps with cpuacct.usage. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-12-02 20:04:49 +01:00
Kim Phillips	7d400a4c58	phylib: add PHY interface modes for internal delay for tx and rx only Allow phylib specification of cases where hardware needs to configure PHYs for Internal Delay only on either RX or TX (not both). Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Tested-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Li Yang <leoli@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-01 16:32:30 -05:00
Jeff Garzik	c99da91e7a	Merge branch 'master' into upstream-fixes	2007-12-01 16:18:56 -05:00
Eric W. Biederman	2b1e300a9d	[NETNS]: Fix /proc/net breakage Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify ->lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-12-02 00:33:17 +11:00
Linus Torvalds	1811534a80	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Fix build. [MIPS] Fix use of smp_processor_id() in preemptible code.	2007-11-30 17:07:38 -08:00
Pavel Kiryukhin	54fd6441e0	[MIPS] Fix use of smp_processor_id() in preemptible code. Freeing prom memory: 956kb freed Freeing firmware memory: 978944k freed Freeing unused kernel memory: 180k freed BUG: using smp_processor_id() in preemptible [00000000] code: swapper/1 caller is r4k_dma_cache_wback_inv+0x144/0x2a0 Call Trace: [<80117af8>] r4k_dma_cache_wback_inv+0x144/0x2a0 [<802e4b84>] debug_smp_processor_id+0xd4/0xf0 [<802e4b7c>] debug_smp_processor_id+0xcc/0xf0 ... CONFIG_DEBUG_PREEMPT is enabled. -- Bug cause is blast_dcache_range() in preemptible code [in r4k_dma_cache_wback_inv()]. blast_dcache_range() is constructed via __BUILD_BLAST_CACHE_RANGE that uses cpu_dcache_line_size(). It uses current_cpu_data that use smp_processor_id() in turn. In case of CONFIG_DEBUG_PREEMPT smp_processor_id emits BUG if we are executing with preemption enabled. Cpu options of cpu0 are assumed to be the superset of all processors. Can I make the same assumptions for cache line size and fix this issue the following way: Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-12-01 00:39:37 +00:00

1 2 3 4 5 ...

17766 Commits