Commit Graph

245305 Commits

Author SHA1 Message Date
Selvin Xavier
005d569600 be2net: Stats for Lancer
Added Lancer stats implementation.

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 14:13:53 -04:00
Ajit Khaparde
89a88ab84b be2net: Support for version 1 of stats for BE3
Added support to get version 1 of the stats for BE3.
Use old stats command for BE2.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 14:13:52 -04:00
Mahesh Bandewar
538dd2e397 bnx2x: Allow ethtool to enable/disable loopback.
This patch updates the bnx2x_set_features() to handle loopback mode.
When enabled; it sets internal-MAC loopback.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 13:51:45 -04:00
David S. Miller
c5be24ff62 ipv4: Trivial rt->rt_src conversions in net/ipv4/route.c
At these points we have a fully filled in value via the IP
header the form of ip_hdr(skb)->saddr

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 13:49:05 -04:00
Eric Dumazet
1a8218e962 net: ping: dont call udp_ioctl()
udp_ioctl() really handles UDP and UDPLite protocols.

1) It can increment UDP_MIB_INERRORS in case first_packet_length() finds
a frame with bad checksum.

2) It has a dependency on sizeof(struct udphdr), not applicable to
ICMP/PING

If ping sockets need to handle SIOCINQ/SIOCOUTQ ioctl, this should be
done differently.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 11:49:39 -04:00
Shan Wei
534ea99b06 net: drivers: kill two unused macro definitions
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 18:01:15 -04:00
sjur.brandeland@stericsson.com
3f874adc4a caif: remove unesesarry exports
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:56 -04:00
sjur.brandeland@stericsson.com
33b2f5598b caif: Bugfix debugfs directory name must be unique.
Race condition caused debugfs_create_dir() to fail due to duplicate
name. Use atomic counter to create unique directory name.

net_ratelimit() is introduced to limit debug printouts.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:56 -04:00
sjur.brandeland@stericsson.com
c85c2951d4 caif: Handle dev_queue_xmit errors.
Do proper handling of dev_queue_xmit errors in order to
avoid double free of skb and leaks in error conditions.
In cfctrl pending requests are removed when CAIF Link layer goes down.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:56 -04:00
sjur.brandeland@stericsson.com
bee925db9a caif: prepare support for namespaces
Use struct net to reference CAIF configuration object instead of static variables.
Refactor functions caif_connect_client, caif_disconnect_client and squach
files cfcnfg.c and caif_config_utils.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:55 -04:00
sjur.brandeland@stericsson.com
b3ccfbe409 caif: Protected in-flight packets using dev or sock refcont.
CAIF Socket Layer and ip-interface registers reference counters
in CAIF service layer. The functions sock_hold, sock_put and
dev_hold, dev_put are used by CAIF Stack to protect from freeing
memory while packets are in-flight.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:55 -04:00
sjur.brandeland@stericsson.com
43e3692101 caif: Move refcount from service layer to sock and dev.
Instead of having reference counts in caif service layers,
we hook into existing refcount handling in socket layer and netdevice.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:55 -04:00
sjur.brandeland@stericsson.com
cb3cb423a0 caif: Add ref-count to framing layer
Introduce Per-cpu reference for lower part of CAIF Stack.
Before freeing payload is disabled, synchronize_rcu() is called,
and then ref-count verified to be zero.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:55 -04:00
sjur.brandeland@stericsson.com
f362144084 caif: Use RCU and lists in cfcnfg.c for managing caif link layers
RCU lists are used for handling the link layers instead of array.
When generating CAIF phy-id, ifindex is used as base. Legal range is 1-6.
Introduced set_phy_state() for managing CAIF Link layer state.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:54 -04:00
sjur.brandeland@stericsson.com
bd30ce4bc0 caif: Use RCU instead of spin-lock in caif_dev.c
RCU read_lock and refcount is used to protect in-flight packets.

Use RCU and counters to manage freeing lower part of the CAIF stack if
CAIF-link layer is removed. Old solution based on delaying removal of
device is removed.

When CAIF link layer goes down the use of CAIF link layer is disabled
(by calling caif_set_phy_state()), but removal and freeing of the
lower part of the CAIF stack is done when Link layer is unregistered.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:54 -04:00
sjur.brandeland@stericsson.com
0b1e9738de caif: Use rcu_read_lock in CAIF mux layer.
Replace spin_lock with rcu_read_lock when accessing lists to layers
and cache. While packets are in flight rcu_read_lock should not be held,
instead ref-counters are used in combination with RCU.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 17:45:54 -04:00
Eric Dumazet
1b1cb1f78a net: ping: small changes
ping_table is not __read_mostly, since it contains one rwlock,
and is static to ping.c

ping_port_rover & ping_v4_lookup are static

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-15 01:22:21 -04:00
David S. Miller
89c64d755f Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6 2011-05-15 01:08:23 -04:00
David S. Miller
4dc6ec26fe Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-merge 2011-05-14 22:47:51 -04:00
Don Skidmore
4f6290cf61 ixgbe: Add support for new 82599 adapter
This patch adds support for a new adapter in the 82599 family.  Included
in that support is a new media_type ixgbe_media_type_fiber_lco.

Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:10:32 -07:00
Emil Tantilov
c050999e2c ixgbe: fix sparse warning
error: bad constant expression

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:09:39 -07:00
Alexander Duyck
34cecbbfad ixgbe: cleanup some minor issues in ixgbe_down()
This patch cleans up two minor issues in ixgbe_down.  Specifically it
addresses the fact that the VFs should not be pinged until after interrupts
are disabled otherwise they might still get a response.  It also drops the
use of the txdctl temporary variable since the only bit we should be
writing to the TXDCTL registers during a shutdown is the flush bit.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:08:46 -07:00
Alexander Duyck
f0f9778d04 ixgbe: Merge over-temp task into service task
This change merges the over-temp task into the service task.  As a result
all tasklets are finally combined into once single tasklet for easier
management.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:08:09 -07:00
Alexander Duyck
d034acf185 ixgbe: Merge ATR reinit into the service task
This change merges the ATR table reinitialization into the service task.
This is yet another opportunity to avoid any race conditions as we don't
want to be attempting to reinitialize the table during a possible reset.

In addition this change adds a counter for table reinitialization so that
it can be tracked as part of the regular statistics.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:05:37 -07:00
Alexander Duyck
c83c6cbdbf ixgbe: merge reset task into service task
This change is meant to further help to reduce possible configuration
collisions between the various tasklets.  This change combines the device
reset with the service task.  As a result it is now not possible to be
updating the link on the device while also resetting the part.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:04:28 -07:00
Alexander Duyck
93c52dd003 ixgbe: Merge watchdog functionality into service task
This patch is meant to merge the functionality of the ixgbe watchdog task
into the service task.  By doing this all link state functionality will be
controlled by a single task.  As a result the reliability of the interface
will be improved as the likelihood of any race conditions is further
reduced.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:03:24 -07:00
Alexander Duyck
7086400d87 ixgbe: Combine SFP and multi-speed fiber task into single service task
This change is meant to address several race conditions with multi-speed
fiber SFP+ modules in 82599 adapters.  Specifically issues have been seen
in which both the SFP configuration and the multi-speed fiber configuration
are running simultaneously which will result in the device getting into an
erroneous link down state.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 18:01:13 -07:00
Alexander Duyck
e606bfe74d ixgbe: move flags and state into the same cacheline
This change moves flags and state into the same cacheline.  The reason for
this change is because both are frequently read around the same time and
infrequently written.  By combining them into the same cacheline this
should help to reduce memory utilization.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Evan Swanson <evan.swanson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 17:59:20 -07:00
Emil Tantilov
51275d37a8 ixgbe: force unlock on timeout
The semaphore can be in locked state upon driver load, particularly
on 82598 if a machine is rebooted due to panic and the semaphore was
acquired just prior to the panic.

This patch unlocks the semaphore if it times out.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 17:57:01 -07:00
Greg Rose
a1cbb15c13 ixgbe: Add macvlan support for VF
Add infrastructure in the PF driver to support macvlan in the VF driver.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 17:53:09 -07:00
Greg Rose
46ec20ff7d ixgbevf: Add macvlan support in the set rx mode op
Implement setup of unicast address list in the VF driver's set_rx_mode
netdev op.  Unicast addresses are sent to the PF via a mailbox message
and the PF will check if it has room in the RAR table and if so set the
filter for the VF.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 17:50:44 -07:00
Bruce Allan
d64a6f4dca e1000e: minor comment cleanups
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-14 17:50:01 -07:00
Marek Lindner
ca06c6eb9a batman-adv: reset broadcast flood protection on error
The broadcast flood protection should be reset to its original value
if the primary interface could not be retrieved.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-05-15 00:02:06 +02:00
Sven Eckelmann
6d5808d4ae batman-adv: Add missing hardif_free_ref in forw_packet_free
add_bcast_packet_to_list increases the refcount for if_incoming but the
reference count is never decreased. The reference count must be
increased for all kinds of forwarded packets which have the primary
interface stored and forw_packet_free must decrease them. Also
purge_outstanding_packets has to invoke forw_packet_free when a work
item was really cancelled.

This regression was introduced in
32ae9b221e.

Reported-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-05-15 00:02:06 +02:00
David S. Miller
7be799a70b ipv4: Remove rt->rt_dst reference from ip_forward_options().
At this point iph->daddr equals what rt->rt_dst would hold.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 17:31:02 -04:00
David S. Miller
8e36360ae8 ipv4: Remove route key identity dependencies in ip_rt_get_source().
Pass in the sk_buff so that we can fetch the necessary keys from
the packet header when working with input routes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 17:29:41 -04:00
David S. Miller
22f728f8f3 ipv4: Always call ip_options_build() after rest of IP header is filled in.
This will allow ip_options_build() to reliably look at the values of
iph->{daddr,saddr}

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 17:21:27 -04:00
David S. Miller
0374d9ceb0 ipv4: Kill spurious write to iph->daddr in ip_forward_options().
This code block executes when opt->srr_is_hit is set.  It will be
set only by ip_options_rcv_srr().

ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR
option addresses, and when it matches one 1) looks up the route for
that nexthop and 2) on route lookup success it writes that nexthop
value into iph->daddr.

ip_forward_options() runs later, and again walks the SRR option
addresses looking for the option matching the destination of the route
stored in skb_rtable().  This route will be the same exact one looked
up for the nexthop by ip_options_rcv_srr().

Therefore "rt->rt_dst == iph->daddr" must be true.

All it really needs to do is record the route's source address in the
matching SRR option adddress.  It need not write iph->daddr again,
since that has already been done by ip_options_rcv_srr() as detailed
above.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 17:15:50 -04:00
Alexey Dobriyan
48e2046722 olympic: convert to seq_file
->read_proc interface is going away, switch to seq_file.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:50:49 -04:00
Peter Pan(潘卫平)
0696c3a8ac net:set valid name before calling ndo_init()
In commit 1c5cae815d (net: call dev_alloc_name from register_netdevice),
a bug of bonding was involved, see example 1 and 2.

In register_netdevice(), the name of net_device is not valid until
dev_get_valid_name() is called. But dev->netdev_ops->ndo_init(that is
bond_init) is called before dev_get_valid_name(),
and it uses the invalid name of net_device.

I think register_netdevice() should make sure that the name of net_device is
valid before calling ndo_init().

example 1:
modprobe bonding
ls  /proc/net/bonding/bond%d

ps -eLf
root      3398     2  3398  0    1 21:34 ?        00:00:00 [bond%d]

example 2:
modprobe bonding max_bonds=3

[  170.100292] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[  170.101090] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
[  170.102469] ------------[ cut here ]------------
[  170.103150] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
[  170.104075] Hardware name: VirtualBox
[  170.105065] proc_dir_entry 'bonding/bond%d' already registered
[  170.105613] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
[  170.108397] Pid: 3457, comm: modprobe Not tainted 2.6.39-rc2+ #14
[  170.108935] Call Trace:
[  170.109382]  [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
[  170.109911]  [<c051a42a>] ? proc_register+0x126/0x157
[  170.110329]  [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
[  170.110846]  [<c051a42a>] proc_register+0x126/0x157
[  170.111870]  [<c051a4dd>] proc_create_data+0x82/0x98
[  170.112335]  [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
[  170.112905]  [<f94dd806>] bond_init+0x77/0xa5 [bonding]
[  170.113319]  [<c0721ac6>] register_netdevice+0x8c/0x1d3
[  170.113848]  [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
[  170.114322]  [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
[  170.114879]  [<c0401240>] do_one_initcall+0x76/0x122
[  170.115317]  [<f94f4000>] ? 0xf94f3fff
[  170.115799]  [<c0463f1e>] sys_init_module+0x1286/0x140d
[  170.116879]  [<c07c6d9f>] sysenter_do_call+0x12/0x28
[  170.117404] ---[ end trace 64e4fac3ae5fff1a ]---
[  170.117924] bond%d: Warning: failed to register to debugfs
[  170.128728] ------------[ cut here ]------------
[  170.129360] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
[  170.130323] Hardware name: VirtualBox
[  170.130797] proc_dir_entry 'bonding/bond%d' already registered
[  170.131315] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
[  170.133731] Pid: 3457, comm: modprobe Tainted: G        W   2.6.39-rc2+ #14
[  170.134308] Call Trace:
[  170.134743]  [<c0438f3b>] warn_slowpath_common+0x6a/0x7f
[  170.135305]  [<c051a42a>] ? proc_register+0x126/0x157
[  170.135820]  [<c0438fc3>] warn_slowpath_fmt+0x2b/0x2f
[  170.137168]  [<c051a42a>] proc_register+0x126/0x157
[  170.137700]  [<c051a4dd>] proc_create_data+0x82/0x98
[  170.138174]  [<f94e6af6>] bond_create_proc_entry+0x3f/0x73 [bonding]
[  170.138745]  [<f94dd806>] bond_init+0x77/0xa5 [bonding]
[  170.139278]  [<c0721ac6>] register_netdevice+0x8c/0x1d3
[  170.139828]  [<f94e0e30>] bond_create+0x6c/0x90 [bonding]
[  170.140361]  [<f94f4763>] bonding_init+0x763/0x7b1 [bonding]
[  170.140927]  [<c0401240>] do_one_initcall+0x76/0x122
[  170.141494]  [<f94f4000>] ? 0xf94f3fff
[  170.141975]  [<c0463f1e>] sys_init_module+0x1286/0x140d
[  170.142463]  [<c07c6d9f>] sysenter_do_call+0x12/0x28
[  170.142974] ---[ end trace 64e4fac3ae5fff1b ]---
[  170.144949] bond%d: Warning: failed to register to debugfs

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:49:49 -04:00
Giuseppe CAVALLARO
64c7f304b8 stmmac: fix autoneg in set_pauseparam
This patch fixes a bug in the set_pauseparam
function that didn't well manage the ANE
field and returned broken values when use
ethtool -A|-a.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:12:07 -04:00
David Decotigny
1334cb6082 stmmac: don't go through ethtool to start auto-negotiation
The driver used to call phy's ethtool configuration routine to start
auto-negotiation. This change has it call directly phy's routine to
start auto-negotiation.

The initial version was hiding phy_start_aneg() return value,
this patch returns it (<0 upon error).

Tested: module compiles, tested on STM HDK7108 STB.

Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:12:07 -04:00
Julia Lawall
5310cbce90 drivers/isdn/hisax: Drop unused list
The file st5481_init.c locally defines and initializes the adapter_list
variable, but does not use it for anything.  Removing the list makes it
possible to remove the list field from the st5481_adapter data structure.
In the function probe_st5481, it also makes it possible to free the locally
allocated adapter value on an error exit.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:10:00 -04:00
Vasiliy Kulikov
c319b4d76b net: ipv4: add IPPROTO_ICMP socket kind
This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, PROT_ICMP)

Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes, port 0 is reserved for API ("let the
kernel pick a free number"). There is no notion of remote ports, remote
port numbers provided by the user (e.g. in connect()) are ignored.

Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport headers values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.

ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:08:13 -04:00
KOSAKI Motohiro
f20190302e convert old cpumask API into new one
Adapt new API.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:21 -04:00
Ursula Braun
9f6298a6ca af_iucv: get rid of compile warning
-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:21 -04:00
Ursula Braun
5db79c0679 iucv: get rid of compile warning
-Wunused-but-set-variable generates a compile warning. The affected
variable is removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:21 -04:00
Ursula Braun
ff2aed7da1 ctcm: get rid of compile warning
-Wunused-but-set-variable generates compile warnings. The affected
variables are removed.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:20 -04:00
Heiko Carstens
424f73b3ec lcs: get rid of compile warning
-Wunused-but-set-variable generates a compile warning for lcs' tasklet
function. Invoked functions contain already error handling; thus
additional return code checking is not needed here.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:20 -04:00
Heiko Carstens
38ed18ff5e claw: remove unused return code handling
Remove unused return code handling. The claw driver is mostly dead, so
just make sure it keeps compiling without warnings.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 14:55:20 -04:00