linux

Author	SHA1	Message	Date
Rick Jones	66846048f5	enable virtio_net to return bus_info in ethtool -i consistent with emulated NICs Add a new .bus_name to virtio_config_ops then modify virtio_net to call through to it in an ethtool .get_drvinfo routine to report bus_info in ethtool -i output which is consistent with other emulated NICs and the output of lspci. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-16 17:26:46 -05:00
Srinivas Kandagatla	64882709ef	mdio-gpio: Add reset functionality to mdio-gpio driver(v2). This patch adds phy reset functionality to the mdio-gpio driver. mdio_gpio_platform_data now has a new member, a function pointer that can be filled in at the bsp level for a callback from the phy infrastructure. The mdio-bitbang driver also fills in the reset function of the mii_bus structure. Without this patch the bsp level code has to take care of resetting the PHYs on the bus, which becomes a bit hacky for every bsp, and the phy infrastructure is ignored as well. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-15 16:56:17 -05:00
Matti Vaittinen	229a66e3be	IPv6: Removing unnecessary NULL checks. This patch removes unnecessary NULL checks noticed by Dan Carpenter. Checks were introduced in commit `4a287eba2d` to net-next. Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-15 16:54:20 -05:00
RongQing.Li	ad79eefc42	ipv4: fix a memory leak in ic_bootp_send_if when dev_hard_header() failed, the newly allocated skb should be freed. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:37:24 -05:00
Dmitry Kravkov	5219e4c93c	bnx2x: add endline at end of message Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:36:40 -05:00
Matti Vaittinen	4a287eba2d	IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag Add support for NLM_F_* flags in IPv6 routing requests. If the NLM_F_CREATE flag is not set for an RTM_NEWROUTE request, a warning is printed, but no error is returned. Instead the new route is added. Later NLM_F_CREATE may be required for new route creation. The exception is when the NLM_F_REPLACE flag is given without NLM_F_CREATE, and no matching route is found. In this case it should be safe to assume that the request issuer is familiar with NLM_F_* flags, and really does not want a route to be created. Specifying the NLM_F_REPLACE flag will now make the kernel search for a matching route and replace it with the new one. If no route is found and NLM_F_CREATE is specified as well, then a new route is created. Also, specifying NLM_F_EXCL will return an error if a matching route is found. Patch created against linux-3.2-rc1 Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:35:33 -05:00
Matti Vaittinen	d71314b4ac	IPv6 routing, NLM_F_* flag support: warn if new route is created without NLM_F_CREATE The support for NLM_F_* flags at IPv6 routing requests. Warn if NLM_F_CREATE flag is not defined for RTM_NEWROUTE request, creating new table. Later NLM_F_CREATE may be required for new route creation. Patch created against linux-3.2-rc1 Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:35:33 -05:00
Wolfgang Grandegger	abbd00b82a	net/can/mscan: Fix buggy listen only mode setting This patch fixes an issue introduced recently with commit `452448f928`. CC: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:30:05 -05:00
Rick Jones	612a94d6f2	Sweep the last of the active .get_drvinfo floors under ethernet/ This round of floor sweeping converts strncpy calls in various .get_drvinfo routines to the preferred strlcpy. It also does a modicum of other cleaning in those routines. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:13:31 -05:00
Eric Dumazet	e52fcb2462	bnx2x: uses build_skb() in receive path bnx2x uses following formula to compute its rx_buf_sz : dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8 + 2 Then core network adds NET_SKB_PAD and SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) Final allocated size for skb head on x86_64 (L1_CACHE_BYTES = 64, MTU=1500) : 2112 bytes : SLUB/SLAB round this to 4096 bytes. Since skb truesize is then bigger than SK_MEM_QUANTUM, we have a lot of false sharing because of mem_reclaim in UDP stack. One possible way to halve truesize is to reduce the need by 64 bytes (2112 -> 2048 bytes) Instead of allocating a full cache line at the end of packet for alignment, we can use the fact that skb_shared_info sits at the end of skb->head, and we can use this room, if we convert bnx2x to new build_skb() infrastructure. skb_shared_info will be initialized after hardware finished its transfer, so we can eventually overwrite the final padding. Using build_skb() also reduces cache line misses in the driver, since we use cache hot skb instead of cold ones. Number of in-flight sk_buff structures is lower, they are recycled while still hot. Performance results : (820.000 pps on a rx UDP monothread benchmark, instead of 720.000 pps) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Eilon Greenstein <eilong@broadcom.com> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Tom Herbert <therbert@google.com> CC: Jamal Hadi Salim <hadi@mojatatu.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Thomas Graf <tgraf@infradead.org> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:13:30 -05:00
Eric Dumazet	b2b5ce9d1c	net: introduce build_skb() One of the thing we discussed during netdev 2011 conference was the idea to change some network drivers to allocate/populate their skb at RX completion time, right before feeding the skb to network stack. In old days, we allocated skbs when populating the RX ring. This means bringing into cpu cache sk_buff and skb_shared_info cache lines (since we clear/initialize them), then 'queue' skb->data to NIC. By the time NIC fills a frame in skb->data buffer and host can process it, cpu probably threw away the cache lines from its caches, because lot of things happened between the allocation and final use. So the deal would be to allocate only the data buffer for the NIC to populate its RX ring buffer. And use build_skb() at RX completion to attach a data buffer (now filled with an ethernet frame) to a new skb, initialize the skb_shared_info portion, and give the hot skb to network stack. build_skb() is the function to allocate an skb, caller providing the data buffer that should be attached to it. Drivers are expected to call skb_reserve() right after build_skb() to adjust skb->data to the Ethernet frame (usually skipping NET_SKB_PAD and NET_IP_ALIGN, but some drivers might add a hardware provided alignment) Data provided to build_skb() MUST have been allocated by a prior kmalloc() call, with enough room to add SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) bytes at the end of the data without corrupting incoming frame. data = kmalloc(NET_SKB_PAD + NET_IP_ALIGN + 1536 + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), GFP_ATOMIC); ... skb = build_skb(data); if (!skb) { recycle_data(data); } else { skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); ... } Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Eilon Greenstein <eilong@broadcom.com> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Tom Herbert <therbert@google.com> CC: Jamal Hadi Salim <hadi@mojatatu.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Thomas Graf <tgraf@infradead.org> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 14:13:30 -05:00
Baruch Siach	c3e072f8a6	net: fsl_pq_mdio: fix non tbi phy access Since `952c5ca1` (fsl_pq_mdio: Clean up tbi address configuration) .probe returns -EBUSY when the "tbi-phy" node is missing. Fix this. Cc: Andy Fleming <afleming@freescale.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 01:44:55 -05:00
Marc Kleine-Budde	452448f928	net/can/mscan: add listen only mode This patch adds listen only mode to the mscan controller. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:51:22 -05:00
Eric Dumazet	8b5c171bb3	neigh: new unresolved queue limits On Wednesday 9 November 2011 at 16:21 -0500, David Miller wrote: > From: David Miller <davem@davemloft.net> > Date: Wed, 09 Nov 2011 16:16:44 -0500 (EST) > > > From: Eric Dumazet <eric.dumazet@gmail.com> > > Date: Wed, 09 Nov 2011 12:14:09 +0100 > > > >> unres_qlen is the number of frames we are able to queue per unresolved > >> neighbour. Its default value (3) was never changed and is responsible > >> for strange drops, especially if IP fragments are used, or multiple > >> sessions start in parallel. Even a single tcp flow can hit this limit. > > ... > > > > Ok, I've applied this, let's see what happens :-) > > Early answer, build fails. > > Please test build this patch with DECNET enabled and resubmit. The > decnet neigh layer still refers to the removed ->queue_len member. > > Thanks. Ouch, this was fixed on one machine yesterday, but not the other one I used this morning, sorry. [PATCH V5 net-next] neigh: new unresolved queue limits unres_qlen is the number of frames we are able to queue per unresolved neighbour. Its default value (3) was never changed and is responsible for strange drops, especially if IP fragments are used, or multiple sessions start in parallel. Even a single tcp flow can hit this limit. $ arp -d 192.168.20.108 ; ping -c 2 -s 8000 192.168.20.108 PING 192.168.20.108 (192.168.20.108) 8000(8028) bytes of data. 8008 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.322 ms Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:47:54 -05:00
stephen hemminger	292d139898	bridge: add NTF_USE support More changes to the recent code to support control of forwarding database via netlink. * Support NTF_USE like neighbour table * Validate state bits from application * Only send notifications (and change bits) if new entry is different. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:41:54 -05:00
Rick Jones	23020ab353	Sweep additional floors of strcpy in .get_drvinfo routines Perform another round of floor sweeping, converting the .get_drvinfo routines of additional drivers from strcpy to strlcpy along with some conversion of sprintf to snprintf. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:35:46 -05:00
Andy Fleming	952c5ca14e	fsl_pq_mdio: Clean up tbi address configuration The code for setting the address of the internal TBI PHY was convoluted enough without a maze of ifdefs. Clean it up a bit so we allow the logic to fail down to -ENODEV at the end of the if/else ladder, rather than using ifdefs to repeat the same failure code over and over. Also, remove the support for the auto-configuration. I'm not aware of anyone using it, and it ends up using the bus mutex before it's been initialized. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:26:11 -05:00
Sanjay Hortikar	e19df76a11	net-forcedeth: Add internal loopback support for forcedeth NICs. Support enabling/disabling/querying internal loopback mode for forcedeth NICs using ethtool. Signed-off-by: Sanjay Hortikar <horti@google.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:22:46 -05:00
alex.bluesman.smirnov@gmail.com	63ce40e4fd	6LoWPAN: update documentation This patch adds chapter to documentation which describes how to use 6lowpan technology. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:19:43 -05:00
alex.bluesman.smirnov@gmail.com	f8b1b5d231	6LoWPAN: UDP header decompression This patch provides possibility to decompress UDP headers. Derived from Contiki OS. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:19:43 -05:00
alex.bluesman.smirnov@gmail.com	3bd5b958c2	6LoWPAN: UDP header compression This patch adds support for UDP header compression. Derived from Contiki OS. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:19:43 -05:00
alex.bluesman.smirnov@gmail.com	4d039f6843	6LoWPAN: set proper netdev flags This patch fixes settings for device initialization which makes possible to use NDISC and TCP. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:19:43 -05:00
alex.bluesman.smirnov@gmail.com	e86586ba8c	6LoWPAN: disable debugging by default This patch disables debug output enabled by default. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Acked-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:19:42 -05:00
alex.bluesman.smirnov@gmail.com	719269afbc	6LoWPAN: add fragmentation support This patch adds support for frame fragmentation. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:19:42 -05:00
Eric Dumazet	2a24444f8f	ipv6: reduce percpu needs for icmpv6msg mibs Reading /proc/net/snmp6 on a machine with a lot of cpus is very expensive (can be ~88000 us). This is because ICMPV6MSG MIB uses 4096 bytes per cpu, and folding values for all possible cpus can read 16 Mbytes of memory (32MBytes on non x86 arches) ICMP messages are not considered as fast path on a typical server, and eventually few cpus handle them anyway. We can afford an atomic operation instead of using percpu data. This saves 4096 bytes per cpu and per network namespace. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-14 00:12:26 -05:00
Jiri Pirko	3d249d4ca7	net: introduce ethernet teaming device This patch introduces a new network device called team. It is supposed to be a very fast, simple, userspace-driven alternative to the existing bonding driver. A userspace library called libteam with a couple of demo apps is available here: https://github.com/jpirko/libteam Note it's still in its diapers atm. team<->libteam use generic netlink for communication. That and rtnl are supposed to be the only way to configure a team device, no sysfs etc. Python binding of libteam was recently introduced. Daemon providing arpmon/miimon active-backup functionality will be introduced shortly. All that's necessary is already implemented in the kernel team driver. v7->v8: - check ndo_ndo_vlan_rx_[add/kill]_vid functions before calling them. - use dev_kfree_skb_any() instead of dev_kfree_skb() v6->v7: - transmit and receive functions are not checked in hot paths. That also resolves memory leak on transmit when no port is present v5->v6: - changed couple of _rcu calls to non _rcu ones in non-readers v4->v5: - team_change_mtu() uses team->lock while traversing through port list - mac address changes are moved completely to jurisdiction of userspace daemon. This way the daemon can do FOM1, FOM2 and possibly other weird things with mac addresses. Only round-robin mode sets up all ports to bond's address when enslaved. - Extended Kconfig text v3->v4: - remove redundant synchronize_rcu from __team_change_mode() - revert "set and clear of mode_ops happens per pointer, not per byte" - extend comment of function __team_change_mode() v2->v3: - team_change_mtu() uses rcu version of list traversal to unwind - set and clear of mode_ops happens per pointer, not per byte - port hashlist changed to be embedded into team structure - error branch in team_port_enter() does cleanup now - fixed rtln->rtnl v1->v2: - modes are made as modules. Makes team more modular and extendable. - several commenters' nitpicks found on v1 were fixed - several other bugs were fixed. - note I ignored Eric's comment about roundrobin port selector as Eric's way may be easily implemented as another mode (mode "random") in future. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:10:10 -05:00
Dmitry Kravkov	5d70b88cd4	bnx2x: update driver version to 1.70.35-0 Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:57 -05:00
Ariel Elior	72754080d1	bnx2x: Remove on-stack napi struct variable Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:56 -05:00
Dmitry Kravkov	4a025f49d3	bnx2x: prevent race in statistics flow The race may cause access of registers while MAC hw block is in reset state. As a result syslog will show error messages. We can prevent this by using state from local variable. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:56 -05:00
Ariel Elior	8304859adc	bnx2x: add fan failure event handling Shut down the device in case of fan failure to prevent HW damage. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:56 -05:00
Dmitry Kravkov	46fa1309fe	bnx2x: remove unused #define Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:55 -05:00
Dmitry Kravkov	b363782761	bnx2x: simplify definition of RX_SGE_MASK_LEN and use it. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:55 -05:00
Dmitry Kravkov	f9c058b633	bnx2x: DCBX: use #define instead of magic Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:54 -05:00
Dmitry Kravkov	00253a8cf3	bnx2x: propagate DCBX negotiation We need propagate the DCBX results from PMF to other functions on the same port, in order to properly update netdev structure and allow following new ETS and PFC configurations. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:54 -05:00
Dmitry Kravkov	b306f5edf6	bnx2x: separate FCoE and iSCSI license initialization. FCoE license info must be initialized at probe(), but iSCSI at open(). Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:54 -05:00
Dmitry Kravkov	ad756594a8	bnx2x: remove unused variable Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:53 -05:00
Dmitry Kravkov	f233cafe1a	bnx2x: use rx_queue index for skb_record_rx_queue() Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:53 -05:00
Dmitry Kravkov	62ac0dc9ec	bnx2x: allow FCoE and DCB for 578xx Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-13 16:03:52 -05:00
Sathya Perla	6589ade019	be2net: stop issuing FW cmds if any cmd times out A FW cmd timeout (with a sufficiently large timeout value in the order of tens of seconds) indicates an unresponsive FW. In this state issuing further cmds and waiting for a completion will only stall the process. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-12 17:59:36 -05:00
Sathya Perla	434b3648e9	be2net: don't log more than one error on detecting EEH/UE errors Currently we're spamming error messages each time a FW cmd call is made while in EEH/UE error state. One log msg on error detection is enough. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-12 17:59:36 -05:00
Sathya Perla	72f0248562	be2net: stop checking the UE registers after an EEH error Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-12 17:59:35 -05:00
Sathya Perla	30128031d7	be2net: init (vf)_if_handle/vf_pmac_id to handle failure scenarios Initialize if_handle, vf_if_handle and vf_pmac_id with "-1" so that in failure cases when be_clear() is called, we can skip over if_destroy/pmac_del cmds if they have not been created. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-12 17:59:35 -05:00
Eric Dumazet	d826eb14ec	ipv4: PKTINFO doesnt need dst reference On Monday 7 November 2011 at 15:33 +0100, Eric Dumazet wrote: > At least, in recent kernels we dont change dst->refcnt in forwarding > path (using NOREF skb->dst) > > One particular point is the atomic_inc(dst->refcnt) we have to perform > when queuing an UDP packet if socket asked PKTINFO stuff (for example a > typical DNS server has to setup this option) > > I have one patch somewhere that stores the information in skb->cb[] and > avoid the atomic_{inc|dec}(dst->refcnt). > OK I found it, I did some extra tests and believe its ready. [PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference When a socket uses IP_PKTINFO notifications, we currently force a dst reference for each received skb. Reader has to access dst to get needed information (rt_iif & rt_spec_dst) and must release dst reference. We also forced a dst reference if skb was put in socket backlog, even without IP_PKTINFO handling. This happens under stress/load. We can instead store the needed information in skb->cb[], so that only softirq handler really access dst, improving cache hit ratios. This removes two atomic operations per packet, and false sharing as well. On a benchmark using a mono threaded receiver (doing only recvmsg() calls), I can reach 720.000 pps instead of 570.000 pps. IP_PKTINFO is typically used by DNS servers, and any multihomed aware UDP application. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-09 16:36:27 -05:00
Eric Dumazet	acb32ba3de	ipv4: reduce percpu needs for icmpmsg mibs Reading /proc/net/snmp on a machine with a lot of cpus is very expensive (can be ~88000 us). This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values for all possible cpus can read 16 Mbytes of memory. ICMP messages are not considered as fast path on a typical server, and eventually few cpus handle them anyway. We can afford an atomic operation instead of using percpu data. This saves 4096 bytes per cpu and per network namespace. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-09 16:04:20 -05:00
Eric Dumazet	e56c57d0d3	net: rename sk_clone to sk_clone_lock Make clear that sk_clone() and inet_csk_clone() return a locked socket. Add _lock() prefix and kerneldoc. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-08 17:07:07 -05:00
Eric Dumazet	9ecd04bc04	sch_choke: use skb_header_pointer() Remove the assumption that skb_get_rxhash() makes IP header and ports linear, and use skb_header_pointer() instead in choke_match_flow() This permits __skb_get_rxhash() to use skb_header_pointer() eventually. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-08 16:41:31 -05:00
Ricardo Ribalda	8d8bdfe803	ll_temac: Add support for phy_mii_ioctl This patch enables the ioctl support for the driver. So userspace programs like mii-tool can work. Resend in merge window Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-08 15:13:38 -05:00
Maciej Żenczykowski	2563fa5954	net: make ipv6 PKTINFO honour freebind This just makes it possible to spoof source IPv6 address on a socket without having to create and bind a new socket for every source IP we wish to spoof. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-08 15:13:03 -05:00
Maciej Żenczykowski	f74024d9f0	net: make ipv6 bind honour freebind This makes native ipv6 bind follow the precedent set by: - native ipv4 bind behaviour - dual stack ipv4-mapped ipv6 bind behaviour. This does allow an unprivileged process to spoof its source IPv6 address, just like it currently can spoof its source IPv4 address (for example when using UDP). Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-08 15:13:03 -05:00
Rick Jones	68aad78c50	sweep the floors and convert some .get_drvinfo routines to strlcpy Per the mention made by Ben Hutchings that strlcpy is now the preferred string copy routine for a .get_drvinfo routine, do a bit of floor sweeping and convert some of the as-yet unconverted ethernet drivers to it. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-08 15:11:57 -05:00
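
The NLM_F_* handling described in commit 4a287eba2d can be read as a small decision table. The following is a minimal userspace sketch of that table, not the kernel's implementation. The `SKETCH_` constants mirror the values in `<linux/netlink.h>`; the enum, the struct, and `decide_newroute()` are hypothetical names introduced for illustration. The specific error codes, the choice to check NLM_F_EXCL before NLM_F_REPLACE, and the plain add when a match exists with neither flag set are assumptions, since the commit message does not state them.

```c
#include <errno.h>
#include <stdbool.h>

/* Same numeric values as NLM_F_REPLACE / NLM_F_EXCL / NLM_F_CREATE in
 * <linux/netlink.h>, redefined here so the sketch builds without kernel
 * headers. */
#define SKETCH_NLM_F_REPLACE 0x100
#define SKETCH_NLM_F_EXCL    0x200
#define SKETCH_NLM_F_CREATE  0x400

/* Hypothetical outcome type for an RTM_NEWROUTE request. */
enum route_action {
    ROUTE_ADD,              /* new route is created */
    ROUTE_ADD_WITH_WARNING, /* created, but missing NLM_F_CREATE is logged */
    ROUTE_REPLACE,          /* matching route is replaced */
    ROUTE_ERROR             /* request is rejected */
};

struct route_decision {
    enum route_action action;
    int err; /* 0 on success, negative errno on ROUTE_ERROR (assumed codes) */
};

/* Decide what to do with an RTM_NEWROUTE request, given the netlink flags
 * and whether a matching route already exists. */
struct route_decision decide_newroute(unsigned int flags, bool match_found)
{
    struct route_decision d = { ROUTE_ADD, 0 };

    if (match_found) {
        if (flags & SKETCH_NLM_F_EXCL) {
            /* EXCL: a matching route is an error. */
            d.action = ROUTE_ERROR;
            d.err = -EEXIST; /* assumed error code */
        } else if (flags & SKETCH_NLM_F_REPLACE) {
            d.action = ROUTE_REPLACE;
        }
        /* Neither flag: treated as a plain add (assumption). */
        return d;
    }

    if (flags & SKETCH_NLM_F_CREATE)
        return d; /* no match and CREATE given: create the route */

    if (flags & SKETCH_NLM_F_REPLACE) {
        /* REPLACE without CREATE and nothing to replace: do not create. */
        d.action = ROUTE_ERROR;
        d.err = -ENOENT; /* assumed error code */
    } else {
        /* No CREATE: warn, but still add the route for now. */
        d.action = ROUTE_ADD_WITH_WARNING;
    }
    return d;
}
```

For example, `decide_newroute(SKETCH_NLM_F_REPLACE, false)` yields `ROUTE_ERROR`, matching the exception described in the commit message, while `decide_newroute(0, false)` yields `ROUTE_ADD_WITH_WARNING`.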


275210 Commits
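
Several of the commits listed above (68aad78c50, 23020ab353, 612a94d6f2) convert .get_drvinfo routines from strcpy/strncpy to strlcpy. The sketch below illustrates the likely motivation in userspace C: a bounded copy that always NUL-terminates. `sketch_strlcpy()` is a hypothetical stand-in written for this example (not the kernel's function), and `struct sketch_drvinfo` is a deliberately tiny, made-up cousin of `struct ethtool_drvinfo` so that truncation is easy to see.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(): copies at most size-1 bytes, always
 * NUL-terminates when size > 0, and returns strlen(src) so callers can
 * detect truncation (return value >= size). */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = len < size ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Made-up, much smaller analogue of struct ethtool_drvinfo. */
struct sketch_drvinfo {
    char driver[8];
    char bus_info[8];
};

/* Shape of a converted .get_drvinfo routine: every field is filled with a
 * bounded, always-terminated copy sized from the destination itself. */
void sketch_get_drvinfo(struct sketch_drvinfo *info,
                        const char *driver, const char *bus_info)
{
    sketch_strlcpy(info->driver, driver, sizeof(info->driver));
    sketch_strlcpy(info->bus_info, bus_info, sizeof(info->bus_info));
}
```

With strncpy, a source string at least as long as the destination leaves the field without a terminating NUL; strcpy would overrun it. The bounded copy truncates instead, so a later read of the field as a C string stays inside the buffer.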