pppoe_flush_dev() kicks all sockets bound to a device that is going down.
In doing so, locks must be taken in the right order consistently (sock lock,
followed by the pppoe_hash_lock). However, the scan process is based on
us holding the sock lock. So, when something is found in the scan we must
release the lock we're holding and grab the sock lock.
This patch fixes race conditions between this code and pppoe_release(),
both of which perform similar functions but would naturally prefer to grab
locks in opposing orders. Both code paths are now going after these locks
in a consistent manner.
pppoe_hash_lock protects the contents of the "pppox_sock" objects that reside
inside the hash. Thus, NULL'ing out the pppoe_dev field should be done
under the protection of this lock.
Signed-off-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
below you find a patch that fixes a memory leak when a PPPoE socket is
release()d after it has been connect()ed, but before the PPPIOCGCHAN ioctl
ever has been called on it.
This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.
Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...
Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
below you find a patch that (hopefully) fixes a race between an interface
going down and a connect() to a peer on that interface. Before,
connect() would determine that an interface is up, then the interface
could go down and all entries referring to that interface in the
item_hash_table would be marked as ZOMBIEs and their references to
the device would be freed, and after that, connect() would put a new
entry into the hash table referring to the device that meanwhile is
down already - which also would cause unregister_netdevice() to wait
until the socket has been release()d.
This patch does not suffice if we are not allowed to accept connect()s
referring to a device that we already acked a NETDEV_GOING_DOWN for
(that is: all references are only guaranteed to be freed after
NETDEV_DOWN has been acknowledged, not necessarily after the
NETDEV_GOING_DOWN already). And if we are allowed to, we could avoid
looking through the hash table upon NETDEV_GOING_DOWN completely and
only do that once we get the NETDEV_DOWN ...
mostrows:
pppoe_flush_dev is called on NETDEV_GOING_DOWN and NETDEV_DOWN to deal with
this "late connect" issue. Ideally one would hope to notify users at the
"NETDEV_GOING_DOWN" phase (just to pretend to be nice). However, it is the
NETDEV_DOWN scan that takes all the responsibility for ensuring nobody is
hanging around at that time.
Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
below is a patch that just removes dead code/initializers without any
effect (first access is an assignment) that I stumbled accross while
reading the source.
Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Michal Ostrowski <mostrows@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rusty added a new 'stats' field to struct net_device.
loopback driver can use it instead of declaring another struct
net_device_stats This saves some memory.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When csum_offset was introduced we did a conversion from csum to
csum_offset where applicable. A couple of drivers were missed in
this process.
It was harmless to begin with since the two fields coincided. Now
that we've made them different with the addition of csum_start, the
missed drivers must be converted or they can't send packets out at
all that require checksum offload.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
We have to put back the cast to "char *" because these
pointers are volatile.
Reported by Andrew Morton.
Signed-off-by: David S. Miller <davem@davemloft.net>
Network drivers which keep stats allocate their own stats structure
then write a get_stats() function to return them. It would be nice if
this were done by default.
1) Add a new "stats" field to "struct net_device".
2) Add a new feature field to say "this driver uses the internal one"
3) Have a default "get_stats" which returns NULL if that feature not set.
4) Change callers to check result of get_stats call for NULL, not if
->get_stats is set.
This should not break backwards compatibility with older drivers, yet
allow modern drivers to shed some boilerplate code.
Lightly tested: works for a modified lguest network driver.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to get more randomness for secure_tcpv6_sequence_number(),
secure_tcp_sequence_number(), secure_dccp_sequence_number() functions,
we can use the high resolution time services, providing nanosec
resolution.
I've also done two kmalloc()/kzalloc() conversions.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For consistency with other skb data accessors, reducing the number of direct
accesses to skb->data.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reducing the number of skb->data direct accesses.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At that point it is equivalent to what was being used, skb->end - skb->data,
and the need is clearly the one skb_tailroom satisfies.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)
Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SMC SuperIO Chip LPC47N227 used for IrDA is not detected because its device
identification byte can be 0x7A instead of 0x5A.
Patch from Peter Kovar <peter.kovar@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len,
that is precalculated tho, don't think we need to bloat skb with one more
member, so just use this new helper, reducing the number of non-skbuff.h
references to the layer headers even more.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.
Ditched a no-op in bnx2 in the process.
I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the quite common 'skb->h.raw - skb->data' sequence.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple cases:
skb->h.raw = skb->data;
skb->h.raw = {skb_push|[__]skb_pull}()
The next ones will handle the slightly more "complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common sequence "skb->nh.iph->ihl * 4", removing a good number of open
coded skb->nh.iph uses, now to go after the rest...
Just out of curiosity, here are the idioms found to get the same result:
skb->nh.iph->ihl << 2
skb->nh.iph->ihl<<2
skb->nh.iph->ihl * 4
skb->nh.iph->ihl*4
(skb->nh.iph)->ihl * sizeof(u32)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the network header, it is still legal
to touch skb->nh.raw directly if just adding to, subtracting from or setting it
to another layer header.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the quite common 'skb->nh.raw - skb->data' sequence.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For consistency with all the other skb->nh.raw accessors.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For consistency with all the other skb->nh.raw accessors.
Also do some really obvious simplifications in pppoe_recvmsg, well the
kfree_skb one is not so obvious, but free() and kfree() have the same behaviour
(hint :-) ).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the mac header, it is still legal to
touch skb->mac.raw directly if just adding to, subtracting from or setting it
to another layer header.
This one also converts some more cases to skb_reset_mac_header() that my
regex missed as it had no spaces before nor after '=', ugh.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the cases where we want to set skb->mac.raw to an offset from skb->data.
Simple cases first, the memmove ones and specially pktgen will be left for later.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One less thing for drivers writers to worry about.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For consistency with other skb->mac.raw users.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now all the _type_trans routines are consistent in this regard.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the following not or no longer used exports:
- drivers/char/random.c: secure_tcp_sequence_number
- net/dccp/options.c: sysctl_dccp_feat_sequence_window
- net/netlink/af_netlink.c: netlink_set_err
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[PARPORT] SUNBPP: Fix OOPS when debugging is enabled.
[SPARC] openprom: Switch to ref counting PCI API
The packet driver is assuming (reasonably) that the (undocumented)
request.errors is an errno. But it is in fact some mysterious bitfield. When
things go wrong we return weird positive numbers to the VFS as pointers and it
goes oops.
Thanks to William Heimbigner for reporting and diagnosis.
(It doesn't oops, but this driver still doesn't work for William)
Cc: William Heimbigner <icxcnika@mar.tar.cc>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All RDMA drivers except ehca set class_dev->dev to their dma_device
value (ehca leaves this unset). dma_device is the only value that
makes any sense, so move this assignment to core/sysfs.c. This reduce
the duplicated code in the rest of the drivers and gives ehca a nice
/sys/class/infiniband/ehcaX/device symlink.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add "Modify Port" verb support to eHCA driver. The IB communication
manager needs this to set the IsCM port capability bit when
initializing.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There are quite a few places in ipoib_cm.c where we know IRQs are
enabled because we do something that sleeps in the same function, so
we can convert several occurrences of spin_lock_irqsave() to a plain
spin_lock_irq(). This cleans up the source a little and makes the
code smaller too:
add/remove: 0/0 grow/shrink: 1/5 up/down: 3/-51 (-48)
function old new delta
ipoib_cm_tx_reap 403 406 +3
ipoib_cm_stale_task 146 145 -1
ipoib_cm_dev_stop 173 172 -1
ipoib_cm_tx_handler 964 956 -8
ipoib_cm_rx_handler 956 937 -19
ipoib_cm_skb_reap 212 190 -22
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Clarify code by changing return values from SMI functions to named
enum values instead of magic 0/1 values.
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We need to set the SGID index for routed MADs and pass received
GRH information to userspace when a MAD is received.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
To support destinations that are not on the local IB subnet, IPoIB
should include the GRH information when constructing an address
handle. Using the existing ib_init_ah_from_path() call will do this
for us.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
mthca_free_qp() already has local variables to hold the QP's send_cq
and recv_cq, so we can slightly clean up the calls to mthca_cq_clean()
by using those local variables instead of expressions like
to_mcq(qp->ibqp.send_cq).
Also, by cleaning the recv_cq first, we can avoid worrying about
whether the QP is attached to an SRQ for the second call, because we
would only clean send_cq if send_cq is not equal to recv_cq, and that
means send_cq cannot have any receive completions from the QP being
destroyed.
All this work even improves the generated code a bit:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function old new delta
mthca_free_qp 510 505 -5
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Commit b2875d4c ("IB/mthca: Always fill MTTs from CPU") causes a crash
in mthca_write_mtt() with non-memfree HCAs that have their memory
hidden (that is, have only two PCI BARs instead of having a third BAR
that allows access to the RAM attached to the HCA) on 64-bit
architectures. This is because the commit just before, c20e20ab
("IB/mthca: Merge MR and FMR space on 64-bit systems") makes
dev->mr_table.fmr_mtt_buddy equal to &dev->mr_table.mtt_buddy and
hence mthca_write_mtt() tries to write directly into the HCA's MTT
table. However, since that table is in the HCA's memory, this is
impossible without the PCI BAR that gives access to that memory.
This causes a crash because mthca_tavor_write_mtt_seg() basically
tries to dereference some offset of a NULL pointer. Fix this by
adding a test of MTHCA_FLAG_FMR in mthca_write_mtt() so that we always
use the WRITE_MTT firmware command rather than writing directly if
FMRs are not enabled.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Tweak a register setting to prevent the tx mailbox from halting.
Update version to 1.5.8.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sparc64:
drivers/net/hamradio/baycom_ser_fdx.c: In function `ser12_open':
drivers/net/hamradio/baycom_ser_fdx.c:417: error: `NR_IRQS' undeclared (first us
e in this function)
drivers/net/hamradio/baycom_ser_fdx.c:417: error: (Each undeclared identifier is
reported only once
drivers/net/hamradio/baycom_ser_fdx.c:417: error: for each function it appears i
n.)
Cc: Folkert van Heusden <folkert@vanheusden.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Broken by 4a1728a28a which switched the
return semantics of read_mii_word() but didn't fix usage of
read_mii_word() to conform to the new semantics.
Setting carrier to off based on the NO_CARRIER flag is also incorrect as
that flag only triggers on TX failure and therefore isn't correct when
no frames are being transmitted. Since there is already a 2*HZ MII
carrier check going on, defer to that.
Add a TRUST_LINK_STATUS feature flag for adapters where the LINK_STATUS
flag is actually correct, and use that rather than the NO_CARRIER flag.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The sis900 driver appears to have a bug in which the receive routine
passes the skbuff holding the received frame to the network stack before
refilling the buffer in the rx ring. If a new skbuff cannot be allocated, the
driver simply leaves a hole in the rx ring, which causes the driver to stop
receiving frames and become non-recoverable without an rmmod/insmod according to
reporters. This patch reverses that order, attempting to allocate a replacement
buffer first, and receiving the new frame only if one can be allocated. If no
skbuff can be allocated, the current skbuf in the rx ring is recycled, dropping
the current frame, but keeping the NIC operational.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The following patch fixes a kernel bug in depca_platform_probe().
We don't use a dynamic pointer for pldev->dev.platform_data, so it seems
that the correct way to proceed if platform_device_add(pldev) fails is
to explicitly set the pldev->dev.platform_data pointer to NULL, before
calling the platform_device_put(pldev), or it will be kfree'ed by
platform_device_release().
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Commit 40b36daa introduced possibility that serial8250_backup_timeout() ->
serial8250_handle_port() locks port.lock without disabling irqs, thus
allowing deadlock against interrupt handler (port.lock is acquired in
serial8250_interrupt()).
Spotted by lockdep.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two functions are called from __devinit context, but they are marked as
__init. Fix this.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On ia64, kernel headers define REGION_OFFSET so we can't use that.
Reported by Andrew Morton.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: David Hubbard <david.c.hubbard@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
irq values are u32, not u8. Large irq numbers will be truncated,
free_irq may free a different irq.
Remove incorrectly sized struct member and use the one from pci_dev.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use relative time, not absolute. Discovered by Jung-Ik (John) Lee
<jilee@google.com>.
Cc: Jung-Ik (John) Lee <jilee@google.com>
Acked-by: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pcd_lock and pf_spin_lock are passed to blk_init_queue() which, seeing them
as valid lock pointer, sets it as ->queue_lock.
The problem is that pcd_lock and pf_spin_lock aren't initialized anywhere.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There was schedule() missing in the TIOCMIWAIT ioctl. Solve it by moving
the code to the wait_event_interruptible.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Jan Yenya Kasprzak <kas@fi.muni.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There was schedule() missing in the TIOCMIWAIT ioctl. Solve it by moving
the code to the wait_event_interruptible.
Cc: Jan "Yenya" Kasprzak <kas@fi.muni.cz>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I encountered the following kernel panic. The cause of this problem was
NULL pointer access in check_modem_status() in 8250.c. I confirmed this
problem is fixed by the attached patch, but I don't know this is the
correct fix.
sadc[4378]: NaT consumption 2216203124768 [1]
Modules linked in: binfmt_misc dm_mirror dm_mod thermal processor fan
container button sg e100 eepro100 mii ehci_hcd ohci_hcd
Pid: 4378, CPU 0, comm: sadc
psr : 00001210085a2010 ifs : 8000000000000289 ip : [<a000000100482071>]
Not tainted
ip is at check_modem_status+0xf1/0x360
Call Trace:
[<a000000100013940>] show_stack+0x40/0xa0
[<a0000001000145a0>] show_regs+0x840/0x880
[<a0000001000368e0>] die+0x1c0/0x2c0
[<a000000100036a30>] die_if_kernel+0x50/0x80
[<a000000100037c40>] ia64_fault+0x11e0/0x1300
[<a00000010000bdc0>] ia64_leave_kernel+0x0/0x280
[<a000000100482070>] check_modem_status+0xf0/0x360
[<a000000100482300>] serial8250_get_mctrl+0x20/0xa0
[<a000000100478170>] uart_read_proc+0x250/0x860
[<a0000001001c16d0>] proc_file_read+0x1d0/0x4c0
[<a0000001001394b0>] vfs_read+0x1b0/0x300
[<a000000100139cd0>] sys_read+0x70/0xe0
[<a00000010000bc20>] ia64_ret_from_syscall+0x0/0x20
[<a000000000010620>] __kernel_syscall_via_break+0x0/0x20
Fix the possible NULL pointer access in check_modem_status() in 8250.c. The
check_modem_status() would access 'info' member of uart_port structure, but it
is not initialized before uart_open() is called. The check_modem_status() can
be called through /proc/tty/driver/serial before uart_open() is called.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Taku Izumi <izumi2005@soft.fujitsu.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The debugging code would dereference __iomem pointers instead
of going through sbus_{read,write}{b,w,l}().
Signed-off-by: David S. Miller <davem@davemloft.net>
USRobotics Wireless Adapter (Model 5423) works well with current
zd1211rw driver also (i have tested 2.6.18, 2.6.20 and 2.6.21-rc7).
It just needs its ID added to the list of devices.
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The MAC address assignment at module loading is simply forgotten.
The bug at module unloading is caused by an incorrect call.
The bug at module unloading does not only happen for sunqe,
sunlance and sunhme (sbus) suffer from it too.
I've tested this on my SS20.
Signed-off-by: Marcel van Nies <morcles@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replacing kmalloc/memset combination with kzalloc.
Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ide_hwif_to_major[] has only 10 entries as there are 10 major numbers
reserved for IDE (if somebody needs more it shouldn't be hard to fix).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The driver crashes the kernel on HPT302N chips due to the missing initializer
for 'hpt302n.settings' having been unfortunately overlooked so far. :-<
Much thanks to Mike Mattie for pin-pointing the reason of crash.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add PCI ID for a newer variant of cardbus CF/IDE adapter card.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This reverts commit 60cba200f1. It's been
linked to lockups of the e1000 hardware, see for example
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229603
but it's likely that the commit itself is not really introducing the
bug, but just allowing an unrelated problem to rear its ugly head (ie
one current working theory is that the code exposes us to a hardware
race condition by decreasing the amount of time we spend in each NAPI
poll cycle).
We'll revert it until root cause is known. Intel has a repeatable
reproduction on two different machines and bus traces of the hardware
doing something bad.
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg KH <gregkh@suse.de>
Cc: Dave Jones <davej@redhat.com>
Cc: Auke Kok <auke-jan.h.kok@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A small number of SiS setups require special handling (not many judging
by how long this dumb bug survived). A couple of Fedora 7 devel testers
hit an Oops on pata_sis loading which is caused by terminal confusion
between chipset as 'the chipset we have found' and chipset as 'array
iterator'
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
From: Paul Mackerras <paulus@samba.org>
This fixes:
Subject: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6
process_input_packet() treats the case where the first byte is 0xff
(PPP_ALLSTATIONS) but the second byte is 0x03 (PPP_UI) as indicating a
packet with a PPP protocol number of 0xff. Arguably that's wrong
since PPP protocol 0xff is reserved, and the RFC does envision the
possibility of receiving frames where the control field has values
other than 0x03.
Signed-off-by: David S. Miller <davem@davemloft.net>
The Yukon FE (100mbit only) chips do not support large packets.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The Yukon EC Ultra chips have transmit settings for store and
forward and PCI buffering. By setting these appropriately, normal
performance goes from 750Mbytes/sec to 940Mbytes/sec (non-jumbo).
It is also possible to do Jumbo mode, but it means turning off
TSO and checksum offload so the performance gets worse. There isn't
enough buffering for checksum offload to work.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Need to make sure and disable ASF on all chip types. Otherwise, there may be
random reboots.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There should never be descriptor error unless hardware or driver is buggy.
But if an error occurs, print useful information, clear irq, and recover.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This device is having all sorts of problems that lead to data corruption
and system instability. It gets receive status and data out of order,
it generates descriptor and TSO errors, etc.
Until the problems are resolved, it should not be used by anyone
who cares about there system.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The basic structure of "normal" UDP/IP/Ethernet
frames (that actually work):
- It starts with the Ethernet header (dest MAC, src MAC, etc.)
- The next part is occupied by the IP header (version info, length of
packet, id=0, fragment offset=0, checksum, from / to address, etc.)
- Then comes the UDP header (src / dest port, length, checksum)
- Actual payload
- Ethernet checksum
Now what's different for IP fragment:
- The IP header has id set to some value (same for all fragments),
offset is set appropriately (i.e. 0 for first fragment, following
according to size of other fragments), size is the length of the frame.
- UDP header is unchanged. I.e. length is according to full UDP
datagram, not just the part within the actual frame! But this is only
true within the first frame: all following frames don't have a valid
UDP-header at all.
The spidernet silicon seems to be quite intelligent: It's able to
compute (IP / UDP / Ethernet) checksums on the fly and tests if frames
are conforming to RFC -- at least conforming to RFC on complete frames.
But IP fragments are different as explained above:
I.e. for IP fragments containing part of a UDP datagram it sees
incompatible length in the headers for IP and UDP in the first frame
and, thus, skips this frame. But the content *is* correct for IP
fragments. For all following frames it finds (most probably) no valid
UDP header at all. But this *is* also correct for IP fragments.
The Linux IP-stack seems to be clever in this point. It expects the
spidernet to calculate the checksum (since the module claims to be able
to do so) and marks the skb's for "normal" frames accordingly
(ip_summed set to CHECKSUM_HW).
But for the IP fragments it does not expect the driver to be capable to
handle the frames appropriately. Thus all checksums are allready
computed. This is also flaged within the skb (ip_summed set to
CHECKSUM_NONE).
Unfortunately the spidernet driver ignores that hints. It tries to send
the IP fragments of UDP datagrams as normal UDP/IP frames. Since they
have different structure the silicon detects them the be not
"well-formed" and skips them.
The following one-liner against 2.6.21-rc2 changes this behavior. If the
IP-stack claims to have done the checksumming, the driver should not
try to checksum (and analyze) the frame but send it as is.
Signed-off-by: Norbert Eicker <n.eicker@fz-juelich.de>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove assumption that PHY interrupts use GPIOs 3 and 5.
Deal with PHY interrupts connected to any GPIO pins.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reuse the incoming skb when a clientless abort req is recieved.
The release of RDMA connections HW resources might be deferred in
low memory situations.
Ensure that no further activity is passed up to the RDMA driver
for these connections.
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Nonpae guest pdes are shadowed by two pae ptes, so we double the offset
twice: once to account for the pte size difference, and once because we
need to shadow pdes for a single guest pde.
But when writing to the upper guest pde we also need to truncate the
lower bits, otherwise the multiply shifts these bits into the pde index
and causes an access to the wrong shadow pde. If we're at the end of the
page (accessing the very last guest pde) we can even overflow into the
next host page and oops.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The kernel ib_wc structure now uses a QP pointer, but the user space
equivalent uses a QP number instead. This means we can no longer use
a simple structure copy to copy stuff into user space.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Don't let userspace use the direct-physical-map L_key or R_key.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
At some point things changed so that all the affinity bits can be set,
but cpus_full() macro is not true. This caused problems with the unit
selection logic on multi-unit (board) configurations.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Move the code that shuts down the IB link earlier in the unload
process, to be sure no new packets can arrive while we are unloading.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
To prevent random utility reads and writes of the diag interface to the
chip, we first require a handshake of reading from offset 0 and writing
to offset 0 before any other reads or writes can be done through the
diags device. Otherwise chip errors can be triggered.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the chip is no longer usable, LEDs should be turned off so system
can be found easily in the cluster.
Also some minor reorganizing so both chips print hardware error
message at same point and only if there were unrecovered errors
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Re-init of the kernel structures after a chip reset was leaving the
portdata structure for port zero in an inconsistent state, and a
pointer to it either stale (in re-init code) or NULL (in devdata)
Fixing the order of operations on this struct, and the condition for
interrupt access, prevents the crashes.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Due to a chip bug, the PIOAvail register is not always updated to
memory. This patch allows userspace to force an update.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In initialization, if we bailed at chip specific initialization, we
forgot to clean up the irq we had requested.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a bug where multicast packets without a GRH were not
being dropped as per the IB spec.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the module parameter "kpiobufs" is set too high, the calculation to
reset it to a sane value was incorrect.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix RDMA read response length checking for RDMA_READ_RESPONSE_ONLY to
allow a zero length response. RDMA read responses which don't match
the expected length or occur in response to some other operation
should generate a completion queue error (see table 56, ch. 9.9.2.3 in
the IB spec).
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Improve port-sharing performance by allowing any process to receive
packets from the shared hardware port under a spin lock for mutual
exclusion. Previously, one process was nominated as the master and
that process was responsible for receiving all packets from the shared
hardware port and either consuming them or forwarding them to their
destination. This led to starvation problems for other processes when
the master process was busy in computation phases.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The port sharing feature mixed kernel virtual addresses as well as
physical addresses for the offset used to describe the mmap address to
map the InfiniPath hardware into user space. This had a conflict on
powerpc. The new scheme converts it to a physical address so it
doesn't conflict with chip addresses and yet still fits in 40/44 bits
so it isn't truncated by 32-bit applications calling mmap64().
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a receive work request has been removed from the queue but has not
had a CQ entry generated for it and the QP is modified to the error
state, the completion entry generated is incorrect. This patch fixes
the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Code was converted from a &= ~mask to clear_bit, but the bit was left
shifted instead of being used directly, so we were either trashing
memory several pages away, or sometimes taking a kernel page fault on
an invalid page.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some types of packet errors are moderately common with longer IB
cables and large clusters, and are not reported with prints by other
IB HCA drivers. This suppresses those messages unless the new
__IPATH_ERRPKTDBG bit is set in ipath_debug. Reporting of temporarily
disabled frequent error interrupts was also made clearer
We also distinguish between chip errors, and bad packets sent or
received in the wording of the messages.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch fixes a number of bugs with updating the PSN for retries of
RC requests.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When switching to the QP error state, the completion queue entries
(error or flush) were not being generated correctly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
ipath_dbg doesn't need the same prefixes that printk does.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds support for multiple RDMA reads and atomics to be sent
before an ACK is required to be seen by the requester.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If a post send is done in loopback and there is no receive queue
entry, the sending QP is put on a timeout list for a while so the
receiver has a chance to post a receive buffer. If the another post
send is done, the code incorrectly tried to put the QP on the timeout
list again an corrupted the timeout list. This eventually leads to a
spin lock deadlock NMI due to the timer function looping forever with
the lock held.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A silly programming error causes a CQ entry to not be generated if a
SRQ limit event is generated.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
A recent change was made to allocate memory for a port after CPU
affinity is set. That change didn't account for subports and was
trying to allocate memory for the port twice.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The chip documentation on the expected TID vs eager TID parity error
bits was reversed from what was implemented in the RTL, for both
chips. This corrects the definitions.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The loop which initializes the user memory region from an array of
pages was using the wrong limit for the array. This worked OK when
dma_map_sg() returned the same number as the number of pages. This
patch fixes the problem.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This is a sticky state. It is useful for diagnosing problems with
boards versus cable/switch problems.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There's no point in printing the opcode field in the completion
handling debugging output, since the type of completion is already
printed at the beginning of the line. In fact the opcode field is not
even defined for completions with a status other than success.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The current ib_umad code never accesses bits past IB_UMAD_MAX_PORTS in
dev_map[]. We shouldn't declare it to be twice as big.
Pointed-out-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Hal Rosenstock <halr@voltaire.com>
Since commit b1c1b6a3 ("IB/ipath: merge ipath_core and ib_ipath
drivers"), CONFIG_IPATH_CORE no longer exists, so there's no reason to
have a line for it in drivers/Makefile.
Pointed out by Robert P. J. Day <rpjday@mindspring.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>