Commit 1018b5c016 ("Set rt->rt_iif more
sanely on output routes.") breaks rt_is_{output,input}_route.
This became the cause to return "IP_PKTINFO's ->ipi_ifindex == 0".
To fix it, this does:
1) Add "int rt_route_iif;" to struct rtable
2) For input routes, always set rt_route_iif to same value as rt_iif
3) For output routes, always set rt_route_iif to zero. Set rt_iif
as it is done currently.
4) Change rt_is_{output,input}_route() to test rt_route_iif
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c6e1a0d12c broken the calc
(net: Allow no-cache copy from user on transmit)
of checksum, which may cause some tcp packets be dropped because
incorrect checksum. ssh does not work under today's net-next-2.6
tree.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an interrupt occurs, the INT pin is driven low by the
MCP251x controller (falling edge) but in some cases the INT
pin can be connected to the MPU through a transistor or level
translator which inverts this signal. In this case interrupt
should be configured in rising edge.
This patch adds support to pass the IRQ flags via
mcp251x_platform_data.
Signed-off-by: Enric Balletbo i Serra <eballetbo@iseebcn.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool ETHTOOL_PHYS_ID command runs for an arbitrarily long
period of time, holding the RTNL lock. This blocks routing updates,
device enumeration, and various important operations that one might
want to keep running while hunting for the flashing LED.
We need to drop the RTNL lock during this operation, but currently the
core implementation is a thin wrapper around a driver operation and
drivers may well depend upon holding the lock.
Define a new driver operation 'set_phys_id' with an argument that sets
the ID indicator on/off/inactive/active (the last optional, for any
driver or firmware that prefers to handle blinking asynchronously).
When this is defined, the ethtool core drops the lock while waiting
and only acquires it around calls to this operation.
Deprecate the 'phys_id' operation in favour of this. It can be
removed once all in-tree drivers are converted.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Briefly document all operations (except get_rx_ntuple), including
whether they may return an error code and whether they are deprecated.
Also mention some things that should be handled by the ethtool core
rather than by drivers.
Briefly document general requirements for callers.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This patch uses __copy_from_user_nocache on transmit to bypass data
cache for a performance improvement. skb_add_data_nocache and
skb_copy_to_page_nocache can be called by sendmsg functions to use
this feature, initial support is in tcp_sendmsg. This functionality is
configurable per device using ethtool.
Presumably, this feature would only be useful when the driver does
not touch the data. The feature is turned on by default if a device
indicates that it does some form of checksum offload; it is off by
default for devices that do no checksum offload or indicate no checksum
is necessary. For the former case copy-checksum is probably done
anyway, in the latter case the device is likely loopback in which case
the no cache copy is probably not beneficial.
This patch was tested using 200 instances of netperf TCP_RR with
1400 byte request and one byte reply. Platform is 16 core AMD x86.
No-cache copy disabled:
672703 tps, 97.13% utilization
50/90/99% latency:244.31 484.205 1028.41
No-cache copy enabled:
702113 tps, 96.16% utilization,
50/90/99% latency 238.56 467.56 956.955
Using 14000 byte request and response sizes demonstrate the
effects more dramatically:
No-cache copy disabled:
79571 tps, 34.34 %utlization
50/90/95% latency 1584.46 2319.59 5001.76
No-cache copy enabled:
83856 tps, 34.81% utilization
50/90/95% latency 2508.42 2622.62 2735.88
Note especially the effect on latency tail (95th percentile).
This seems to provide a nice performance improvement and is
consistent in the tests I ran. Presumably, this would provide
the greatest benfits in the presence of an application workload
stressing the cache and a lot of transmit data happening.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The description for buf_size was misleading and
just said you couldn't TX larger aggregates, but
of course you can't TX aggregates in a way that
would exceed the window either, which is possible
even if the aggregates are shorter than that.
Expand the description, thanks to Emmanuel for
explaining this to me.
Cc: Emmanuel Grumbach <egrumbach@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is an implementation of the Quick Fair Queue scheduler developed
by Fabio Checconi. The same algorithm is already implemented in ipfw
in FreeBSD. Fabio had an earlier version developed on Linux, I just
cleaned it up. Thanks to Eric Dumazet for testing this under load.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6 fib lookup can set RT6_LOOKUP_F_IFACE flag to restrict search
to an interface, but this flag cannot be set via struct flowi.
Also, it cannot be set via ip6_route_output: this function uses the
passed sock struct to determine if this flag is required
(by testing for nonzero sk_bound_dev_if).
Work around this by passing in an artificial struct sk in case
'strict' argument is true.
This is required to replace the rt6_lookup call in xt_addrtype.c with
nf_afinfo->route().
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This is required to eventually replace the rt6_lookup call in
xt_addrtype.c with nf_afinfo->route().
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
ipvsadm -ln --daemon will trigger a Null pointer exception because
ip_vs_genl_dump_daemons() uses skb_net() instead of skb_sknet().
To prevent others from NULL ptr a check is made in ip_vs.h skb_net().
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The timeout variant of the list:set type must reference the member sets.
However, its garbage collector runs at timer interrupt so the mutex
protection of the references is a no go. Therefore the reference protection
is converted to rwlock.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Issue FEAT_CHANGE notification when features are changed by
netdev_update_features(). This will allow changes made by extra constraints
on e.g. MTU change to be properly propagated like changes via ethtool.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
auth_hmacs field of struct sctp_cookie is used for store
Requested HMAC Algorithm Parameter, and each HMAC Identifier
is 2 bytes, so the length should be:
SCTP_AUTH_NUM_HMACS * sizeof(__u16) + 2
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_ethtool_get_rx_csum() won't report rx checksumming when it's not
changeable and driver is converted to hw_features and friends. Fix this.
(dev->hw_features & NETIF_F_RXCSUM) check is dropped - if the
ethtool_ops->get_rx_csum is set, then driver is not coverted, yet.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The documentation for the USB ethernet devices suggests that
only some devices are supposed to use usb0 as the network interface
name instead of eth0. The logic used there, and documented in
Kconfig for CDC is that eth0 will be used when the mac address
is a globally assigned one, but usb0 is used for the locally
managed range that is typically used on point-to-point links.
Unfortunately, this has caused a lot of pain on the smsc95xx
device that is used on the popular pandaboard without an
EEPROM to store the MAC address, which causes the driver to
call random_ether_address().
Obviously, there should be a proper MAC addressed assigned to
the device, and discussions are ongoing about how to solve
this, but this patch at least makes sure that the default
interface naming gets a little saner and matches what the
user can expect based on the documentation, including for
new devices.
The approach taken here is to flag whether a device might be a
point-to-point link with the new FLAG_POINTTOPOINT setting in
the usbnet driver_info. A driver can set both FLAG_POINTTOPOINT
and FLAG_ETHER if it is not sure (e.g. cdc_ether), or just one
of the two. The usbnet framework only looks at the MAC address
for device naming if both flags are set, otherwise it trusts the
flag.
Signed-off-by: Arnd Bergmann <arnd.bergmann@linaro.org>
Tested-by: Andy Green <andy.green@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On-stack initialization via assignment of flow structures are
expensive because GCC emits a memset() to clear the entire
structure out no matter what.
Add a helper for ipv4 output flow key setup which we can use to avoid
the memset.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commits 01a16b21 (netlink: kill eff_cap from struct netlink_skb_parms)
and c53fa1ed (netlink: kill loginuid/sessionid/sid members from struct
netlink_skb_parms) removed some members from struct netlink_skb_parms
that depend on the current context, all netlink users are now required
to do synchronous message processing.
connector however queues received messages and processes them in a work
queue, which is not valid anymore. This patch converts connector to do
synchronous message processing by invoking the registered callback handler
directly from the netlink receive function.
In order to avoid invoking the callback with connector locks held, a
reference count is added to struct cn_callback_entry, the reference
is taken when finding a matching callback entry on the device's queue_list
and released after the callback handler has been invoked.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't flap VCs when carrier state changes; higher-level protocols
can detect loss of connectivity and act accordingly. This is more
consistent with how other network interfaces work.
We no longer use release_vccs() so we can delete it.
release_vccs() was duplicated from net/atm/common.c; make the
corresponding function exported, since other code duplicates it
and could leverage it if it were public.
Signed-off-by: Philip A. Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a driver for the CDC Ethernet part of this modem. The
device's ID is blacklisted in cdc_ether.c and is white-listed in
this new driver because of the quirks needed to make it useful.
The modem's firmware exposes a CDC ACM port for modem control and a
CDC Ethernet port for network data. The descriptors look fine but
both ports actually are some sort of multiplexers requiring non-
standard headers added/removed from every packet or they get
ignored. All information is based on a usb traffic log from a
Windows machine.
On the Verizon 4G network I've seen speeds up to 1.1MB/s so far with
this driver, a speed-o-meter site reports 16.2Mbps/10.5Mbps.
Userspace scripts are required to talk to the CDC ACM port.
Signed-off-by: Andrzej Zaborowski <balrogg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My commit 6d55cb91a0 (gre: fix hard header destination
address checking) broke multicast.
The reason is that ip_gre used to get ipgre_header() calls with
zero destination if we have NOARP or multicast destination. Instead
the actual target was decided at ipgre_tunnel_xmit() time based on
per-protocol dissection.
Instead of allowing the "abuse" of ->header() calls with invalid
destination, this creates multicast mappings for ip_gre. This also
fixes "ip neigh show nud noarp" to display the proper multicast
mappings used by the gre device.
Reported-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit a715dea3c8 ("net: Always
allocate at least 16 skb frags regardless of page size"), the value
of MAX_SKB_FRAGS can now take on either an "unsigned long" or an
"int" value.
This causes warnings like:
net/packet/af_packet.c: In function ‘tpacket_fill_skb’:
net/packet/af_packet.c:948: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 2 has type ‘int’
Fix by forcing the constant to be unsigned long, otherwise we have
a situation where the type of a system wide constant is variable.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (26 commits)
mmc: SDHI should depend on SUPERH || ARCH_SHMOBILE
mmc: tmio_mmc: Move some defines into a shared header
mmc: tmio: support aggressive clock gating
mmc: tmio: fix power-mode interpretation
mmc: tmio: remove work-around for unmasked SDIO interrupts
sh: fix SDHI IO address-range
ARM: mach-shmobile: fix SDHI IO address-range
mmc: tmio: only access registers above 0xff, if available
mfd: remove now redundant sh_mobile_sdhi.h header
sh: convert boards to use linux/mmc/sh_mobile_sdhi.h
ARM: mach-shmobile: convert boards to use linux/mmc/sh_mobile_sdhi.h
mmc: tmio: convert the SDHI MMC driver from MFD to a platform driver
sh: ecovec: use the CONFIG_MMC_TMIO symbols instead of MFD
mmc: tmio: split core functionality, DMA and MFD glue
mmc: tmio: use PIO for short transfers
mmc: tmio-mmc: Improve DMA stability on sh-mobile
mmc: fix mmc_app_send_scr() for dma transfer
mmc: sdhci-esdhc: enable esdhc on imx53
mmc: sdhci-esdhc: use writel/readl as general APIs
mmc: sdhci: add the abort CMDTYPE bits definition
...
* 'frv' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-frv:
FRV: Use generic show_interrupts()
FRV: Convert genirq namespace
frv: Select GENERIC_HARDIRQS_NO_DEPRECATED
frv: Convert cpu irq_chip to new functions
frv: Convert mb93493 irq_chip to new functions
frv: Convert mb93093 irq_chip to new function
frv: Convert mb93091 irq_chip to new functions
frv: Fix typo from __do_IRQ overhaul
frv: Remove stale irq_chip.end
FRV: Do some cleanups
FRV: Missing node arg in alloc_thread_info_node() macro
NOMMU: implement access_remote_vm
NOMMU: support SMP dynamic percpu_alloc
NOMMU: percpu should use is_vmalloc_addr().
* 'irq-final-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (111 commits)
gpio: ab8500: Mark broken
genirq: Remove move_*irq leftovers
genirq: Remove compat code
drivers: Final irq namespace conversion
mn10300: Use generic show_interrupts()
mn10300: Cleanup irq_desc access
mn10300: Convert genirq namespace
frv: Use generic show_interrupts()
frv: Convert genirq namespace
frv: Select GENERIC_HARDIRQS_NO_DEPRECATED
frv: Convert cpu irq_chip to new functions
frv: Convert mb93493 irq_chip to new functions
frv: Convert mb93093 irq_chip to new function
frv: Convert mb93091 irq_chip to new functions
frv: Fix typo from __do_IRQ overhaul
frv: Remove stale irq_chip.end
m68k: Convert irq function namespace
xen: Use new irq_move functions
xen: Cleanup genirq namespace
unicore32: Use generic show_interrupts()
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
xfrm: Restrict extended sequence numbers to esp
xfrm: Check for esn buffer len in xfrm_new_ae
xfrm: Assign esn pointers when cloning a state
xfrm: Move the test on replay window size into the replay check functions
netdev: bfin_mac: document TE setting in RMII modes
drivers net: Fix declaration ordering in inline functions.
cxgb3: Apply interrupt coalescing settings to all queues
net: Always allocate at least 16 skb frags regardless of page size
ipv4: Don't ip_rt_put() an error pointer in RAW sockets.
net: fix ethtool->set_flags not intended -EINVAL return value
mlx4_en: Fix loss of promiscuity
tg3: Fix inline keyword usage
tg3: use <linux/io.h> and <linux/uaccess.h> instead <asm/io.h> and <asm/uaccess.h>
net: use CHECKSUM_NONE instead of magic number
Net / jme: Do not use legacy PCI power management
myri10ge: small rx_done refactoring
bridge: notify applications if address of bridge device changes
ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo()
can: c_can: Fix tx_bytes accounting
can: c_can_platform: fix irq check in probe
...
When we clone a xfrm state we have to assign the replay_esn
and the preplay_esn pointers to the state if we use the
new replay detection method. To this end, we add a
xfrm_replay_clone() function that allocates memory for
the replay detection and takes over the necessary values
from the original state.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When analysing performance of the cxgb3 on a ppc64 box I noticed that
we weren't doing much GRO merging. It turns out we are limited by the
number of SKB frags:
#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
With a 4kB page size we have 18 frags, but with a 64kB page size we
only have 3 frags.
I ran a single stream TCP bandwidth test to compare the performance of
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'irq-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
vlynq: Convert irq functions
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq; Fix cleanup fallout
genirq: Fix typo and remove unused variable
genirq: Fix new kernel-doc warnings
genirq: Add setter for AFFINITY_SET in irq_data state
genirq: Provide setter inline for IRQD_IRQ_INPROGRESS
genirq: Remove handle_IRQ_event
arm: Ns9xxx: Remove private irq flow handler
powerpc: cell: Use the core flow handler
genirq: Provide edge_eoi flow handler
genirq: Move INPROGRESS, MASKED and DISABLED state flags to irq_data
genirq: Split irq_set_affinity() so it can be called with lock held.
genirq: Add chip flag for restricting cpu_on/offline calls
genirq: Add chip hooks for taking CPUs on/off line.
genirq: Add irq disabled flag to irq_data state
genirq: Reserve the irq when calling irq_set_chip()
* 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (45 commits)
Btrfs: fix __btrfs_map_block on 32 bit machines
btrfs: fix possible deadlock by clearing __GFP_FS flag
btrfs: check link counter overflow in link(2)
btrfs: don't mess with i_nlink of unlocked inode in rename()
Btrfs: check return value of btrfs_alloc_path()
Btrfs: fix OOPS of empty filesystem after balance
Btrfs: fix memory leak of empty filesystem after balance
Btrfs: fix return value of setflags ioctl
Btrfs: fix uncheck memory allocations
btrfs: make inode ref log recovery faster
Btrfs: add btrfs_trim_fs() to handle FITRIM
Btrfs: adjust btrfs_discard_extent() return errors and trimmed bytes
Btrfs: make btrfs_map_block() return entire free extent for each device of RAID0/1/10/DUP
Btrfs: make update_reserved_bytes() public
btrfs: return EXDEV when linking from different subvolumes
Btrfs: Per file/directory controls for COW and compression
Btrfs: add datacow flag in inode flag
btrfs: use GFP_NOFS instead of GFP_KERNEL
Btrfs: check return value of read_tree_block()
btrfs: properly access unaligned checksum buffer
...
Fix up trivial conflicts in fs/btrfs/volumes.c due to plug removal in
the block layer.
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (81 commits)
xo15-ebook: Remove device.wakeup_count
ips: use interruptible waits in ips-monitor
acer-wmi: does not poll device status when WMI event is available
acer-wmi: does not set persistence state by rfkill_init_sw_state
platform-drivers: x86: fix common misspellings
acer-wmi: use pr_<level> for messages
asus-wmi: potential NULL dereference in show_call()
asus-wmi: signedness bug in read_brightness()
platform-driver-x86: samsung-laptop: make dmi_check_cb to return 1 instead of 0
platform-driver-x86: fix wrong merge for compal-laptop.c
msi-laptop: use pr_<level> for messages
Platform: add Samsung Laptop platform driver
acer-wmi: Fix WMI ID
acer-wmi: deactive mail led when power off
msi-laptop: send out touchpad on/off key
acer-wmi: set the touchpad toggle key code to KEY_TOUCHPAD_TOGGLE
platform-driver-x86: intel_mid_thermal: fix unterminated platform_device_id table
sony-laptop: potential null dereference
sony-laptop: handle allocation failures
sony-laptop: return negative on failure in sony_nc_add()
...
* 'for-torvalds' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
mach-ux500: configure board for the TPS61052 regulator v2
mach-ux500: provide ab8500 init vector
mach-ux500: board support for AB8500 GPIO driver
gpio: driver for 42 AB8500 GPIO pins
Fix new irq-related kernel-doc warnings in 2.6.38:
Warning(kernel/irq/manage.c:149): No description found for parameter 'mask'
Warning(kernel/irq/manage.c:149): Excess function parameter 'cpumask' description in 'irq_set_affinity'
Warning(include/linux/irq.h:161): No description found for parameter 'state_use_accessors'
Warning(include/linux/irq.h:161): Excess struct/union/enum/typedef member 'state_use_accessor' description in 'irq_data'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <20110318093356.b939558d.randy.dunlap@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some archs want to prevent the default affinity being set on their
chips in the reqeust_irq() path.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>