Move SKB trim before we lookup the socket so we don't have to
put it on failure.
Based upon an initial patch by Jarek Poplawski and suggestions
from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
libata currently imposes a UDMA5 max transfer rate and 200 sector max
transfer size for SATA devices that sit behind a pata-sata bridge. Lots
of devices have known good bridges that don't need this limit applied.
The MTRON SSD disks are such devices. Transfer rates are increased by
20-30% with the restriction removed.
So add a "blacklist" entry for the MTRON devices, with a flag indicating
that the bridge is known good.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
5287 used to be treated as vt6420 but it didn't work. It's new family
of controllers called vt8251 which hosts four SATA ports as M/S of the
two ATA ports. This configuration is rather peculiar in that although
the M/S devices are on the same port, each have its own SCR (or
equivalent link status/control) registers which screws up the
port-link-device hierarchy assumed by libata. Another controller
which falls into this category is ata_piix w/ SIDPR access.
libata now has facility to deal with this class of controllers named
slave_link. A low level driver for such controllers can just call
ata_slave_link_init() on the respective ports and libata will handle
all the difficult parts like following up with single SRST after
hardresetting both ports.
This patch creates new controller class vt8251, implements slave_link
aware init sequence and config space based SCR access for it and moves
5287 to the new class.
This patch is based on Joseph Chan's larger patch which was created
before slave_link was implemented in libata.
http://thread.gmane.org/gmane.linux.kernel.commits.mm/40640
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Joseph Chan <JosephChan@via.com.tw>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
In ata_tf_to_lba48(), when evaluating
(tf->hob_lbal & 0xff) << 24
the expression is promoted to signed int (since int can hold all values
of u8). However, if hob_lbal is 128 or more, then it is treated as a
negative signed value and sign-extended when promoted to u64 to | into
sectors, which leads to the MSB 32 bits of section getting set
incorrectly.
For example, Phillip O'Donnell <phillip.odonnell@gmail.com> reported
that a 1.5GB drive caused:
ata3.00: HPA detected: current 2930277168, native 18446744072344861488
where 2930277168 == 0xAEA87B30 and 18446744072344861488 == 0xffffffffaea87b30
which shows the problem when hob_lbal is 0xae.
Fix this by adding a cast to u64, just as is used by for hob_lbah and
hob_lbam in the function.
Reported-by: Phillip O'Donnell <phillip.odonnell@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Remove excess kernel-doc function parameter notation from drivers/ata/:
Warning(drivers/ata/libata-core.c:1622): Excess function parameter or struct member 'fn' description in 'ata_pio_queue_task'
Warning(drivers/ata/libata-core.c:4655): Excess function parameter or struct member 'err_mask' description in 'ata_qc_complete'
Warning(drivers/ata/ata_piix.c:751): Excess function parameter or struct member 'udma' description in 'do_pata_set_dmamode'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The size of the pm_signal_local array should be equal to the
number of SPUs being configured in the array. Currently, the
array is of size 4 (NR_PHYS_CTRS) but being indexed by a for
loop from 0 to 7 (NUM_SPUS_PER_NODE). This could potentially
cause an oops or random memory corruption since the pm_signal_local
array is on the stack. This fixes it.
Signed-off-by: Carl Love <carll@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The Freescale implementation of MPIC only allows a single CPU destination
for non-IPI interrupts. We add a flag to the mpic_init to distinquish
these variants of MPIC. We pull in the irq_choose_cpu from sparc64 to
select a single CPU as the destination of the interrupt.
This is to deal with the fact that the default smp affinity was
changed by commit 1840475676 ("genirq:
Expose default irq affinity mask (take 3)") to be all CPUs.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
After the merge of the 32 and 64bit DMA code, dma_direct_ops lost
their map/unmap_single() functions but gained map/unmap_page(). This
caused a problem for Cell because Cell's dma_iommu_fixed_ops called
the dma_direct_ops if the fixed linear mapping was to be used or the
iommu ops if the dynamic window was to be used. So in order to fix
this problem we need to update the 64bit DMA code to use
map/unmap_page.
First, we update the generic IOMMU code so that iommu_map_single()
becomes iommu_map_page() and iommu_unmap_single() becomes
iommu_unmap_page(). Then we propagate these changes up through all
the callers of these two functions and in the process update all the
dma_mapping_ops so that they have map/unmap_page rahter than
map/unmap_single. We can do this because on 64bit there is no HIGHMEM
memory so map/unmap_page ends up performing exactly the same function
as map/unmap_single, just taking different arguments.
This has no affect on drivers because the dma_map_single_attrs() just
ends up calling the map_page() function of the appropriate
dma_mapping_ops and similarly the dma_unmap_single_attrs() calls
unmap_page().
This fixes an oops on Cell blades, which oops on boot without this
because they call dma_direct_ops.map_single, which is NULL.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A typo/thinko made us pass the wrong argument to __flush_hash_table_range
when unplugging bridges, thus not flushing all the translations for
the IO space on unplug. The third parameter to __flush_hash_table_range
is `end', not `size'.
This causes the hypervisor to refuse unplugging slots.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Resources for PHB's that are dynamically added to a system are not
properly allocated in the resource tree.
Not having these resources allocated causes an oops when removing
the PHB when we try to release them.
The diff appears a bit messy, this is mainly due to moving everything
one tab to the left in the pcibios_allocate_bus_resources routine.
The functionality change in this routine is only that the
list_for_each_entry() loop is pulled out and moved to the necessary
calling routine.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, the numa_node of OF-devices will be overwritten during
device_register, which simply sets the node to -1. On cell machines,
this means that devices can't find their IOMMU, which is referenced
through the device's numa node.
Set the numa node for OF devices with no parent, and use the
lower-level device_initialize and device_add functions, so that the
node is preserved.
We can remove the call to set_dev_node in of_device_alloc, as it
will be overwritten during register.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since VSX support was added, we now have two sizes of ucontext_t;
the older, smaller size without the extra VSX state, and the new
larger size with the extra VSX state. A program using the
sys_swapcontext system call and supplying smaller ucontext_t
structures will currently get an EINVAL error if the task has
used VSX (e.g. because of calling library code that uses VSX) and
the old_ctx argument is non-NULL (i.e. the program is asking for
its current context to be saved). Thus the program will start
getting EINVAL errors on calls that previously worked.
This commit changes this behaviour so that we don't send an EINVAL in
this case. It will now return the smaller context but the VSX MSR bit
will always be cleared to indicate that the ucontext_t doesn't include
the extra VSX state, even if the task has executed VSX instructions.
Both 32 and 64 bit cases are updated.
[paulus@samba.org - also fix some access_ok() and get_user() calls]
Thanks to Ben Herrenschmidt for noticing this problem.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fixes this warning:
arch/powerpc/kernel/setup_64.c:447:5: warning: "kernstart_addr" is not defined
which arises because PHYSICAL_START is no longer a constant when
CONFIG_RELOCATABLE=y.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 549e8152de ("powerpc: Make the
64-bit kernel as a position-independent executable") added lines to
vmlinux.lds.S to add the extra sections needed to implement a
relocatable kernel. However, those lines seem to trigger a bug in
older versions of GNU ld (such as 2.16.1) when building a
non-relocatable kernel. Since ld 2.16.1 is still a popular choice for
cross-toolchains, this adds an #ifdef to vmlinux.lds.S so the added
lines are only included when building a relocatable kernel.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The __kdump_flag ABI is overly constraining for future development.
As of 2.6.27, the kernel entry point has 4 constraints: Offset 0 is
the starting point for the master (boot) cpu (entered with r3 pointing
to the device tree structure), offset 0x60 is code for the slave cpus
(entered with r3 set to their device tree physical id), offset 0x20 is
used by the iseries hypervisor, and secondary cpus must be well behaved
when the first 256 bytes are copied to address 0.
Placing the __kdump_flag at 0x18 is bad because:
- It was taking the last 8 bytes before the iseries hypervisor data.
- It was 8 bytes for a boolean flag
- It had no way of identifying that the flag was present
- It does leave any room for the master to add any additional code
before branching, which hurts debug.
- It will be unnecessarily hard for 32 bit code to be common (8 bytes)
Now that we have eliminated the use of __kdump_flag in favor of
the standard is_kdump_kernel(), this flag only controls run without
relocating the kernel to PHYSICAL_START (0), so rename it __run_at_load.
Move the flag to 0x5c, 1 word before the secondary cpu entry point at
0x60. Initialize it with "run0" to say it will run at 0 unless it is
set to 1. It only exists if we are relocatable.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
linux/crash_dump.h defines is_kdump_kernel() to be used by code that
needs to know if the previous kernel crashed instead of a (clean) boot
or reboot.
This updates the just added powerpc code to use it. This is needed
for the next commit, which will remove __kdump_flag.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Commit 54622f10a6 ("powerpc: Support for
relocatable kdump kernel") added a magic flag value in a register to
tell purgatory that it should be a panic kernel. This part is wrong
and is reverted by this commit.
The kernel gets a list of memory blocks and a entry point from user space.
Its job is to copy the blocks into place and then branch to the designated
entry point (after turning "off" the mmu).
The user space tool inserts a trampoline, called purgatory, that runs
before the user supplied code. Its job is to establish the entry
environment for the new kernel or other application based on the contents
of memory. The purgatory code is compiled and embedded in the tool,
where it is later patched using the elf symbol table using elf symbols.
Since the tool knows it is creating a purgatory that will run after a
kernel crash, it should just patch purgatory (or the kernel directly)
if something needs to happen.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The link may be up already via the chip's reset strapping, or though action
of U-Boot, or from the last time the interface was brought up. Resetting
the link causes it to go down for several seconds. This can significantly
increase the time from power-on to DHCP completion and a device being
accessible to the network.
Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The init_phy() function attaches to the PHY, then configures the
SerDes<->TBI link (in SGMII mode). The TBI is on the MDIO bus with the PHY
(sort of) and is accessed via the gianfar's MDIO registers, using the
functions gfar_local_mdio_read/write(), which don't do any locking.
The previously attached PHY will start a work-queue on a timer, and
probably an irq handler as well, which will talk to the PHY and thus use
the MDIO bus. This uses phy_read/write(), which have locking, but not
against the gfar_local_mdio versions.
The result is that PHY code will try to use the MDIO bus at the same time
as the SerDes setup code, corrupting the transfers.
Setting up the SerDes before attaching to the PHY will insure that there is
no race between the SerDes code and *our* PHY, but doesn't fix everything.
Typically the PHYs for all gianfar devices are on the same MDIO bus, which
is associated with the first gianfar device. This means that the first
gianfar's SerDes code could corrupt the MDIO transfers for a different
gianfar's PHY.
The lock used by phy_read/write() is contained in the mii_bus structure,
which is pointed to by the PHY. This is difficult to access from the
gianfar drivers, as there is no link between a gianfar device and the
mii_bus which shares the same MDIO registers. As far as the device layer
and drivers are concerned they are two unrelated devices (which happen to
share registers).
Generally all gianfar devices' PHYs will be on the bus associated with the
first gianfar. But this might not be the case, so simply locking the
gianfar's PHY's mii bus might not lock the mii bus that the SerDes setup
code is going to use.
We solve this by having the code that creates the gianfar platform device
look in the device tree for an mdio device that shares the gianfar's
registers. If one is found the ID of its platform device is saved in the
gianfar's platform data.
A new function in the gianfar mii code, gfar_get_miibus(), can use the bus
ID to search through the platform devices for a gianfar_mdio device with
the right ID. The platform device's driver data is the mii_bus structure,
which the SerDes setup code can use to lock the current bus.
Signed-off-by: Trent Piepho <tpiepho@freescale.com>
CC: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When the at91_ether driver is using a GPIO for its PHY interrupt,
be sure to request (and later, if needed, free) that GPIO.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Commit 401c0aabec introduced a regression
in the atl1 driver by storing the VLAN tag in the wrong TX descriptor
field.
This patch causes the VLAN tag to be stored in its proper location.
Tested-by: Ramon Casellas <ramon.casellas@cttc.es>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Use mmiowb() to ensure "stop" and "go" commands are sent in order on ia64.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
A panic was discovered with bonding when using mode 5 or 6 and trying to
remove the slaves from the bond after the interface was taken down.
When calling 'ifconfig bond0 down' the following happens:
bond_close()
bond_alb_deinitialize()
tlb_deinitialize()
kfree(bond_info->tx_hashtbl)
bond_info->tx_hashtbl = NULL
Unfortunately if there are still slaves in the bond, when removing the
module the following happens:
bonding_exit()
bond_free_all()
bond_release_all()
bond_alb_deinit_slave()
tlb_clear_slave()
tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl
u32 next_index = tx_hash_table[index].next
As you might guess we panic when trying to access a few entries into the
table that no longer exists.
I experimented with several options (like moving the calls to
tlb_deinitialize somewhere else), but it really makes the most sense to
be part of the bond_close routine. It also didn't seem logical move
tlb_clear_slave around too much, so the simplest option seems to add a
check in tlb_clear_slave to make sure we haven't already wiped the
tx_hashtbl away before searching for all the non-existent hash-table
entries that used to point to the slave as the output interface.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch reworks the resource free logic performed at the time
a bonding device is released. This (a) closes two resource leaks, one
for workqueues and one for multicast lists, and (b) improves commonality
of code between the "destroy one" and "destroy all" paths by performing
final free activity via destructor instead of explicitly (and differently)
in each path.
"Sean E. Millichamp" <sean@bruenor.org> reported the workqueue
leak, and included a different patch.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
During the rework of the mii monitor for:
commit f0c76d6177
Author: Jay Vosburgh <fubar@us.ibm.com>
Date: Wed Jul 2 18:21:58 2008 -0700
bonding: refactor mii monitor
I left out the increment of the link failure counter. This
patch corrects that omission.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: fix irq vectors.
lguest: fix early_ioremap.
lguest: fix example launcher compile after moved asm-x86 dir.
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: remove sched-design.txt from 00-INDEX
sched: change sched_debug's mode to 0444
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ftrace: handle archs that do not support irqs_disabled_flags
do_IRQ: cannot handle IRQ -1 vector 0x20 cpu 0
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/irq_32.c:219!
We're not ISA: we have a 1:1 mapping from vectors to irqs.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
dmi_scan_machine breaks under lguest:
lguest: unhandled trap 14 at 0xc04edeae (0xffa00000)
This is because we use current_cr3 for the read_cr3() paravirt
function, and it isn't set until the first cr3 change. We got away
with it until this happened.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
fix:
arch/x86/kernel/cpu/common.c: In function 'early_identify_cpu':
arch/x86/kernel/cpu/common.c:553: error: 'struct cpuinfo_x86' has no member named 'cpu_index'
as cpu_index is only available on SMP.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix /proc/cpuinfo output on x86/Voyager
Ever since
| commit 92cb7612ae
| Author: Mike Travis <travis@sgi.com>
| Date: Fri Oct 19 20:35:04 2007 +0200
|
| x86: convert cpuinfo_x86 array to a per_cpu array
We've had an extra field in cpuinfo_x86 which is cpu_index.
Unfortunately, voyager has never initialised this, although the only
noticeable impact seems to be that /proc/cpuinfo shows all zeros for
the processor ids.
Anyway, fix this by initialising the boot CPU properly and setting the
index when the secondaries update.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix on x86/Voyager
Given commits like this:
| Author: Suresh Siddha <suresh.b.siddha@intel.com>
| Date: Tue Jul 29 10:29:19 2008 -0700
|
| x86, xsave: enable xsave/xrstor on cpus with xsave support
Which deliberately expose boot cpu dependence to pieces of the system,
I think it's time to explicitly have a variable for it to prevent this
continual misassumption that the boot CPU is zero.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix on non-lockdep architectures
Some architectures do not support a way to read the irq flags that
is set from "local_irq_save(flags)" to determine if interrupts were
disabled or enabled. Ftrace uses this information to display to the user
if the trace occurred with interrupts enabled or disabled.
Besides the fact that those archs that do not support this will fail to
compile, unless they fix it, we do not want to have the trace simply
say interrupts were not disabled or they were enabled, without knowing
the real answer.
This patch adds a 'X' in the output to let the user know that the
architecture they are running on does not support a way for the tracer
to determine if interrupts were enabled or disabled. It also lets those
same archs compile with tracing enabled.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: allow /dev/mem mmaps on non-PAT CPUs/platforms
Fix mmap to /dev/mem when CONFIG_X86_PAT is off and CONFIG_STRICT_DEVMEM is
off
mmap to /dev/mem on kernel memory has been failing since the
introduction of PAT (CONFIG_STRICT_DEVMEM=n case). Seems like
the check to avoid cache aliasing with PAT is kicking in even
when PAT is disabled. The bug seems to have crept in 2.6.26.
This patch makes sure that the mmap to regular
kernel memory succeeds if CONFIG_STRICT_DEVMEM=n and
PAT is disabled, and the checks to avoid cache aliasing
still happens if PAT is enabled.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Tested-by: Tim Sirianni <tim@scalemp.com>
Cc: <stable@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix build failure on x86/Voyager
Before:
| commit 329513a35d
| Author: Yinghai Lu <yhlu.kernel@gmail.com>
| Date: Wed Jul 2 18:54:40 2008 -0700
|
| x86: move prefill_possible_map calling early
prefill_possible_mask() was hidden under CONFIG_HOTPLUG_CPU rendering
it invisitble to voyager. Since this commit it's exposed, but not
provided by the voyager subarch, so add a dummy stub to fix the link
breakage.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix x86/Voyager boot
CONFIG_SMP is used for features which work on *all* x86 boxes.
CONFIG_X86_SMP is used for standard PC like x86 boxes (for things like
multi core and apics)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: boot up secondary CPUs as well on x86/Voyager systems
This commit:
| commit 3e9704739d
| Author: Glauber Costa <gcosta@redhat.com>
| Date: Wed May 28 13:01:54 2008 -0300
|
| x86: boot secondary cpus through initial_code
removed the use of initialize_secondary. However, it didn't update
voyager, so the secondary cpus no longer boot. Fix this by adding the
initial_code switch to voyager as well.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The nlm_host_rebooted() function uses nlm_cmp_addr() to find an
nsm_handle that matches the rebooted peer. In order for this to work,
the passed-in address must have a proper address family.
This fixes a post-2.6.28 regression introduced by commit 781b61a6, which
added AF_INET6 support to nlm_cmp_addr(). Before that commit,
nlm_cmp_addr() didn't care about the address family; it compared only
the sin_addr.s_addr field for equality.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>