linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-22 10:56:40 +00:00

Author	SHA1	Message	Date
Eric Dumazet	c70eba7453	igmp: fix new sparse errors Fix following sparse errors : net/ipv4/igmp.c:1222:25: warning: cast from restricted __be32 net/ipv4/igmp.c:1234:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1234:31: expected struct ip_mc_list [noderef] <asn:4>next_hash net/ipv4/igmp.c:1234:31: got struct ip_mc_list <noident> net/ipv4/igmp.c:1250:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1250:31: expected struct ip_mc_list [noderef] <asn:4>next_hash net/ipv4/igmp.c:1250:31: got struct ip_mc_list <noident> net/ipv4/igmp.c:2380:37: warning: cast from restricted __be32 These were added by commit `e989707135` ("igmp: hash a hash table to speedup ip_check_mc_rcu()") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 14:14:55 -07:00
Claudiu Manoil	5eaedf3131	gianfar: Add backwards compatible Single Queue mode polling Older Single Queue (SQ_SG_MODE) devices like TSEC (i.e. mpc83xx) don't feature the frame receive indication bits (RXF) in RSTAT. For these and for the rest of the SQ_SG_MODE devices, provide the appropriate polling routine that handles a single pair of Rx/Tx BD rings, removing the overhead incurred by the multiple queues/ multiple interrupt group devices (veTSEC/ eTSEC2.0 devices). So this is primarily a fix for the TSEC devices. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:16:20 -07:00
Ben Hutchings	6602041b83	sfc: Store port number in private data, not net_device::dev_id We should not use net_device::dev_id to indicate the port number, as this affects the way the local part of IPv6 addresses is normally generated. This field was intended for use where multiple devices may share a single assigned MAC address and need to have different IPv6 addresses. Siena's two ports each have their own MAC addresses. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:15:02 -07:00
Rami Rosen	503b47eafc	ipv4: remove is_data also from ip_options documentation. commit `ef722495c8` ( [IPV4]: Remove unused ip_options->is_data) removed the unused is_data member from ip_options struct. This patch removes is_data also from the documentation of the ip_options struct. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:13:50 -07:00
Jiri Pirko	735d381fa5	team: remove synchronize_rcu() called during port disable Check the unlikely case of team->en_port_count == 0 before modulo operation. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:06:16 -07:00
Jiri Pirko	d80b35beac	team: use kfree_rcu instead of synchronize_rcu in team_port_dev Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:05:54 -07:00
Jiri Pirko	6c31ff366c	team: remove synchronize_rcu() called during queue override change This patch removes synchronize_rcu() from function __team_queue_override_port_del(). That can be done because it is ok to do list_del_rcu() and list_add_tail_rcu() on the same list_head member without calling synchronize_rcu() in between. A bit of refactoring needed to be done because INIT_LIST_HEAD needed to be removed (to not kill the forward pointer) as well. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:05:53 -07:00
Narendra K	dffebd2c5c	doc:networking: Update comment for dev_id field in netdevice.h This patch updates the comment for 'dev_id' field in 'include/linux/netdevice.h' to reflect the intended usage of 'dev_id'. References: http://marc.info/?l=linux-netdev&m=136992115300526&w=2 References: http://marc.info/?l=linux-netdev&m=137062569014612&w=2 Signed-off-by: Narendra K <narendra_k@dell.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 03:03:47 -07:00
Tushar Behera	32766fff24	net: can: Convert to use devm_ioremap_resource Commit `75096579c3` ("lib: devres: Introduce devm_ioremap_resource()") introduced devm_ioremap_resource() and deprecated the use of devm_request_and_ioremap(). Signed-off-by: Tushar Behera <tushar.behera@linaro.org> CC: netdev@vger.kernel.org CC: linux-can@vger.kernel.org CC: Marc Kleine-Budde <mkl@pengutronix.de> CC: Wolfgang Grandegger <wg@grandegger.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 02:22:35 -07:00
Tushar Behera	eed5d29d78	net: emaclite: Convert to use devm_ioremap_resource Commit `75096579c3` ("lib: devres: Introduce devm_ioremap_resource()") introduced devm_ioremap_resource() and deprecated the use of devm_request_and_ioremap(). Signed-off-by: Tushar Behera <tushar.behera@linaro.org> CC: netdev@vger.kernel.org CC: "David S. Miller" <davem@davemloft.net> CC: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 02:22:35 -07:00
Tushar Behera	941e173a53	net: fec: Convert to use devm_ioremap_resource Commit `75096579c3` ("lib: devres: Introduce devm_ioremap_resource()") introduced devm_ioremap_resource() and deprecated the use of devm_request_and_ioremap(). Signed-off-by: Tushar Behera <tushar.behera@linaro.org> CC: netdev@vger.kernel.org CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 02:22:35 -07:00
Sergei Shtylyov	afd6eae13c	3c59x: consolidate error cleanup in vortex_init_one() The PCI driver's probe() method duplicates the error cleanup code each time it has to do error exit. Consolidate the error cleanup code in one place and use goto to jump to the right places. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Steffen Klassert <klassert@mathematik.tu-chemnitz.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 01:53:52 -07:00
Hong zhi guo	3f8b96379a	veth: remove redundant call of dev_alloc_name it's called in the following register_netdevice. No need to call it here. Tested with "ip link add type veth" and "ip link add xxx%d type veth". Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 01:21:20 -07:00
Daniel Borkmann	7a6e288d27	pktgen: ipv6: numa: consolidate skb allocation to pktgen_alloc_skb We currently allow for numa-node aware skb allocation only within the fill_packet_ipv4() path, but not in fill_packet_ipv6(). Consolidate that code to a common allocation helper to enable numa-node aware skb allocation for ipv6, and use it in both paths. This also makes both functions a bit more readable. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:47:25 -07:00
Daniel Borkmann	da5bab079f	net: udp4: move GSO functions to udp_offload Similarly to TCP offloading and UDPv6 offloading, move all related UDPv4 functions to udp_offload.c to make things more explicit. Also, by this, we can make those functions static. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:47:25 -07:00
Shawn Bohrer	946d3bd723	igmp: remove unnecessary in_device member zeroing ip_mc_init_dev() is passed a freshly kzalloc'd in_device so it is unnecessary to explicitly zero out the members. Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:41:15 -07:00
Eric Dumazet	e989707135	igmp: hash a hash table to speedup ip_check_mc_rcu() After IP route cache removal, multicast applications using a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu() Add a per in_device hash table to get faster lookup. This hash table is created only if the number of items in mc_list is above 4. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:25:23 -07:00
Eric Dumazet	64153ce0a7	net_sched: htb: do not setup default rate estimators With a thousand htb classes, est_timer() spends ~5 million cpu cycles and throws out cpu cache, because each htb class has a default rate estimator (est 4sec 16sec). Most users do not use default rate estimators, so switch htb to not setup ones. Add a module parameter (htb_rate_est) so that users relying on this default rate estimator can revert the behavior. echo 1 >/sys/module/sch_htb/parameters/htb_rate_est Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-12 00:14:21 -07:00
Eric Dumazet	130d3d68b5	net_sched: psched_ratecfg_precompute() improvements Before allowing 64bits bytes rates, refactor psched_ratecfg_precompute() to get better comments and increased accuracy. rate_bps field is renamed to rate_bytes_ps, as we only have to worry about bytes per second. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 22:39:47 -07:00
Eric Dumazet	45203a3b38	net_sched: add 64bit rate estimators struct gnet_stats_rate_est contains u32 fields, so the bytes per second field can wrap at 34360Mbit. Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields, and switch the kernel to use this structure natively. This structure is dumped to user space as a new attribute : TCA_STATS_RATE_EST64 Old tc command will now display the capped bps (to 34360Mbit), instead of wrapped values, and updated tc command will display correct information. Old tc command output, after patch : eric:~# tc -s -d qd sh dev lo qdisc pfifo 8001: root refcnt 2 limit 1000p Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0) rate 34360Mbit 189696pps backlog 0b 0p requeues 0 This patch carefully reorganizes "struct Qdisc" layout to get optimal performance on SMP. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:51:03 -07:00
Peter Pan(潘卫平)	b41abb42bf	net: pass correct parameter to skb_headers_offset_update() Since commit 1a37e412a022(net: Use 16bits for _headers fields of struct skbuff), skb->_header are relative to skb->head, so copy_skb_header() should not call skb_headers_offset_update() now, and we should pass correct parameter to skb_headers_offset_update() in pskb_expand_head() and skb_copy_expand(). Signed-off-by: Weiping Pan <panweiping3@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:48:06 -07:00
Gao feng	da12c90e09	netlink: Add compare function for netlink_table As we know, netlink sockets are private resource of net namespace, they can communicate with each other only when they are in the same net namespace. This works well until we try to add namespace support for other subsystems which use netlink. Unlike ipv4 and the route table, it is not suitable to make these subsystems belong to the net namespace; subsystems such as audit and crypto are more suited to the user namespace. So we must have the ability to let netlink sockets in the same user namespace communicate with each other. This patch adds a new function pointer "compare" for netlink_table, we can decide if the netlink sockets can communicate with each other through this netlink_table self-defined compare function. The behavior isn't changed if we don't provide the compare function for netlink_table. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:39:42 -07:00
Li RongQing	8249152c47	xen-netfront: use skb_partial_csum_set() to simplify the codes use skb_partial_csum_set() to simplify the codes Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:34:36 -07:00
David S. Miller	2e42206948	Merge branch 'bridge_flags' Vlad Yasevich says: ==================== The following series adds 2 new flags to bridge. One flag allows the user to control whether mac learning is performed on the interface or not. By default mac learning is on. The other flag allows the user to control whether unicast traffic is flooded (send without an fdb) to a given unicast port. Default is on. Changes since v4: - Implemented Stephen's suggestions. Changes since v2: - removed unused "unlock" tag. Changes since v1: - Integrated suggestion from MST to not impact RTM_NEWNEIGH and to skip lookups when learning is disabled. Vlad Yasevich (2): bridge: Add flag to control mac learning. bridge: Add a flag to control unicast packet flood. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:04:43 -07:00
Vlad Yasevich	867a59436f	bridge: Add a flag to control unicast packet flood. Add a flag to control flood of unicast traffic. By default, flood is on and the bridge will flood unicast traffic if it doesn't know the destination. When the flag is turned off, unicast traffic without an FDB will not be forwarded to the specified port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:04:32 -07:00
Vlad Yasevich	9ba18891f7	bridge: Add flag to control mac learning. Allow user to control whether mac learning is enabled on the port. By default, mac learning is enabled. Disabling mac learning will cause new dynamic FDB entries to not be created for a particular port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:04:32 -07:00
Cong Wang	30f3a40f9a	net: remove last caller of skb_tail_offset() and itself Similar to the following commits: commit `00f97da17a` (netpoll: fix position of network header) commit `525cebedb3` (pktgen: Fix position of ip and udp header) using skb_tail_offset() seems not correct since the offset is based on head pointer. With the last caller removed, skb_tail_offset() can be killed finally. Cc: Thomas Graf <tgraf@suug.ch> Cc: Daniel Borkmann <dborkmann@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 22:22:23 -07:00
David S. Miller	0a4db187a9	Merge branch 'll_poll' Eliezer Tamir says: ==================== This patch set adds the ability for the socket layer code to poll directly on an Ethernet device's RX queue. This eliminates the cost of the interrupt and context switch and with proper tuning allows us to get very close to the HW latency. This is a follow up to Jesse Brandeburg's Kernel Plumbers talk from last year http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-Low-Latency-Sockets-slides-brandeburg.pdf Patch 1 adds a napi_id and a hashing mechanism to lookup a napi by id. Patch 2 adds an ndo_ll_poll method and the code that supports it. Patch 3 adds support for busy-polling on UDP sockets. Patch 4 adds support for TCP. Patch 5 adds the ixgbe driver code implementing ndo_ll_poll. Patch 6 adds additional statistics to the ixgbe driver for ndo_ll_poll.
Performance numbers:
           setup                        TCP_RR               UDP_RR
kernel   Config     C3/6  rx-usecs   tps   cpu%  S.dem    tps   cpu%  S.dem
patched  optimized  on    100        87k   3.13  11.4     94K   3.17  10.7
patched  optimized  on    0          71k   3.12  14.0     84k   3.19  12.0
patched  optimized  on    adaptive   80k   3.13  12.5     90k   3.46  12.2
patched  typical    on    100        72    3.13  14.0     79k   3.17  12.8
patched  typical    on    0          60k   2.13  16.5     71k   3.18  14.0
patched  typical    on    adaptive   67k   3.51  16.7     75k   3.36  14.5
3.9      optimized  on    adaptive   25k   1.0   12.7     28k   0.98  11.2
3.9      typical    off   0          48k   1.09  7.3      52k   1.11  4.18
3.9      typical    off   adaptive   35k   1.12  4.08     38k   0.65  5.49
3.9      optimized  off   adaptive   40k   0.82  4.83     43k   0.70  5.23
3.9      optimized  off   0          57k   1.17  4.08     62k   1.04  3.95
Test setup details:
Machines: each with two Intel Xeon 2680 CPUs and X520 (82599) optical NICs
Tests: Netperf tcp_rr and udp_rr, 1 byte (round trips per second)
Kernel: unmodified 3.9 and patched 3.9
Config: typical is derived from RH6.2, optimized is a stripped down config.
Interrupt coalescing (ethtool rx-usecs) settings: 0=off, 1=adaptive, 100 us
When C3/6 states were turned on (via BIOS) the performance governor was used.
These performance numbers were measured with v2 of the patch set. Performance of the optimized config with an rx-usecs setting of 100 (the first line in the table above) was tracked during the evolution of the patches and has never varied by more than 1%. Design: A global hash table that allows us to look up a struct napi by a unique id was added. A napi_id field was added both to struct sk_buff and struct sk. This is used to track which NAPI we need to poll for a specific socket. The device driver marks every incoming skb with this id. This is propagated to the sk when the socket is looked up in the protocol handler. When the socket code does not find any more data on the socket queue, it now may call ndo_ll_poll which will crank the device's rx queue and feed incoming packets to the stack directly from the context of the socket. A sysctl value (net.core.low_latency_poll) controls how many microseconds we busy-wait before giving up. (setting to 0 globally disables busy-polling) Locking: 1. Locking between napi poll and ndo_ll_poll: Since what needs to be locked between a device's NAPI poll and ndo_ll_poll, is highly device / configuration dependent, we do this inside the Ethernet driver. For example, when packets for high priority connections are sent to separate rx queues, you might not need locking between napi poll and ndo_ll_poll at all. For ixgbe we only lock the RX queue. ndo_ll_poll does not touch the interrupt state or the TX queues. (earlier versions of this patchset did touch them, but this design is simpler and works better.) If a queue is actively polled by a socket (on another CPU) napi poll will not service it, but will wait until the queue can be locked and cleaned before doing a napi_complete(). If a socket can't lock the queue because another CPU has it, either from napi or from another socket polling on the queue, the socket code can busy wait on the socket's skb queue. 
Ndo_ll_poll does not have preferential treatment for the data from the calling socket vs. data from others, so if another CPU is polling, you will see your data on this socket's queue when it arrives. Ndo_ll_poll is called with local BHs disabled, so it won't race on the same CPU with net_rx_action, which calls the napi poll method. 2. Napi_hash The napi hash mechanism uses RCU. napi_by_id() must be called under rcu_read_lock(). After a call to napi_hash_del(), caller must take care to wait an rcu grace period before freeing the memory containing the napi struct. (Ixgbe already had this because the queue vector structure uses rcu to protect the statistics counters in it.) how to test: 1. The patchset should apply cleanly to net-next. (don't forget to configure INET_LL_RX_POLL). 2. The ethtool -c setting for rx-usecs should be on the order of 100. 3. Use ethtool -K to disable GRO and LRO (You are encouraged to try it both ways. If you find that your workload does better with GRO on do tell us.) 4. Sysctl value net.core.low_latency_poll controls how long (in us) to busy-wait for more data, You are encouraged to play with this and see what works for you. The default is now 0 so you need to set it to turn the feature on. I recommend a value around 50. 5. benchmark thread and IRQ should be bound to separate cores. Both cores should be on the same CPU NUMA node as the NIC. When the app and the IRQ run on the same CPU you get a small penalty. If interrupt coalescing is set to a low value this penalty can be very large. 6. If you suspect that your machine is not configured properly, use numademo to make sure that the CPU to memory BW is OK. numademo 128m memcpy local copy numbers should be more than 8GB/s on a properly configured machine. Change log: v10 - removed select/poll support. (we will work on this some more and try again) v9 - correct sysctl proc_handler, reported by Eric Dumazet and Amir Vadai. - more int -> bool changes, reported by Eric Dumazet. 
- better mask testing in sock_poll(), reported by Eric Dumazet. v8 - split out udp and select/poll into separate patches. what used to be patch 2/5 is now three patches. - type corrections from Amir Vadai and Cong Wang: one unsigned long that was left when changing to cycles_t int -> bool - more detailed patch descriptions. v7 - suggested by Ben Hutchings and Eric Dumazet: type fixes, static for globals in net/core.c, avoid napi_id collisions in napi_hash_add() v6 - many small fixes suggested by Eric Dumazet: data locality, typos, documentation protect napi_hash insert/delete with a spinlock (napi_gen_id is no longer atomic_t since it's only accessed with the spinlock held.) - added IPv6 TCP and UDP support (only minimally tested) v5 - corrections suggested by Ben Hutchings: fixed typos, moved the config option and sysctl value from IPv4 to net - moved sk_mark_ll() to the protocol handlers - removed global id mechanism, replaced with a hashed napi_id. based on code sample from Eric Dumazet Note that ixgbe_free_q_vector() already waits an rcu grace period before freeing the q_vector, so nothing additional needs to be done when adding a call to napi_hash_del(). - simple poll/select support v4 - removed separate config option for TCP as suggested Eric Dumazet. - added linux mib counter for packets received through the low latency path, as suggested by Andi Kleen. - re-allow module unloading, remove module param, use a global generation id instead to prevent the use of a stale napi pointer, as suggested by Eric Dumazet - updated Documentation/networking/ip-sysctl.txt text v3 - coding style changes suggested by Dave Miller v2 - the sysctl knob is now in microseconds. The default value is now 0 (off). - for now the code depends at configure time on CONFIG_X86_TSC - the napi reference in struct skb is now a union with the dma cookie since the former is only used on RX and the latter on TX, as suggested by Eric Dumazet. 
- we do a better job at honoring non-blocking operations. - removed busy-polling support for tcp_read_sock() - remove dynamic disabling of GRO - coding style fixes - disallow unloading the device module after the feature has been used Credit: Jesse Brandeburg, Arun Chekhov Ilango, Julie Cummings, Alexander Duyck, Eric Geisler, Jason Neighbors, Yadong Li, Mike Polehn, Anil Vasudevan, Don Wood Special thanks for finding bugs in earlier versions: Willem de Bruijn and Andi Kleen ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:23:57 -07:00
Eliezer Tamir	7e15b90ff9	ixgbe: add extra stats for ndo_ll_poll Add additional statistics to the ixgbe driver for ndo_ll_poll Defined under LL_EXTENDED_STATS Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:36 -07:00
Eliezer Tamir	5a85e737f3	ixgbe: add support for ndo_ll_poll Add the ixgbe driver code implementing ndo_ll_poll. Adds ndo_ll_poll method and locking between it and the napi poll. When receiving a packet we use skb_mark_ll to record the napi it came from. Add each napi to the napi_hash right after netif_napi_add(). Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:36 -07:00
Eliezer Tamir	d30e383bb8	tcp: add low latency socket poll support. Adds low latency socket poll support for TCP. In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id from the skb to the sk. In tcp_recvmsg(), when there is no data in the socket we busy-poll. This is a good example of how to add busy-poll support to more protocols. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:36 -07:00
Eliezer Tamir	a5b50476f7	udp: add low latency socket poll support Add support for busy-polling on UDP sockets. In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id from the skb into the sk. This is done at the earliest possible moment, right after we identify which socket this skb is for. In __skb_recv_datagram, when there is no data and the user tries to read, we busy poll. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:36 -07:00
Eliezer Tamir	0602129286	net: add low latency socket poll Adds an ndo_ll_poll method and the code that supports it. This method can be used by low latency applications to busy-poll Ethernet device queues directly from the socket code. sysctl_net_ll_poll controls how many microseconds to poll. Default is zero (disabled). Individual protocol support will be added by subsequent patches. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:35 -07:00
Eliezer Tamir	af12fa6e46	net: add napi_id and hash Adds a napi_id and a hashing mechanism to lookup a napi by id. This will be used by subsequent patches to implement low latency Ethernet device polling. Based on a code sample by Eric Dumazet. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 21:22:35 -07:00
Maxime Bizon	6f00a02296	bcm63xx_enet: add support for Broadcom BCM63xx integrated gigabit switch Newer Broadcom BCM63xx SoCs: 6328, 6362 and 6368 have an integrated switch which needs to be driven slightly differently from the traditional external switches. This patch introduces changes in arch/mips/bcm63xx in order to: - register a bcm63xx_enetsw driver instead of bcm63xx_enet driver - update DMA channels configuration & state RAM base addresses - add a new platform data configuration knob to define the number of ports per switch/device and force link on some ports - define the required switch registers On the driver side, the following changes are required: - the switch ports need to be polled to ensure the link is up and running and RX/TX can properly work - basic switch configuration needs to be performed for the switch to forward packets to the CPU - update the MIB counters since the integrated Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Jonas Gorski <jogo@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 14:28:28 -07:00
Maxime Bizon	0ae99b5fed	bcm63xx_enet: split DMA channel register accesses The current bcm63xx_enet driver always uses bcmenet_shared_base whenever it needs to access DMA channel configuration space or access the DMA channel state RAM. Split these register in 3 parts to be more accurate: - global DMA configuration - per DMA channel configuration space - per DMA channel state RAM space This is preliminary to support new chips where the global DMA configuration remains the same, but there is a varying number of DMA channels located at a different memory offset. Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Jonas Gorski <jogo@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 14:28:27 -07:00
Maxime Bizon	7260aac974	bcm63xx_enet: implement reset autoneg ethtool callback Implement the nway_reset ethtool callback which uses libphy generic autonegotiation restart function. Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 14:28:27 -07:00
Jason Wang	df09b36f22	macvtap: enable multiqueue flag To notify the userspace about our capability of multiqueue. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:09 -07:00
Jason Wang	815f236d62	macvtap: add TUNSETQUEUE ioctl This patch adds TUNSETQUEUE ioctl to let userspace temporarily disable or enable a queue of macvtap. This keeps macvtap compatible with tuntap at the API layer, which simplifies queue management for userspace. This is done through introducing a linked list to track all taps while using vlan->taps array to only track active taps. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:09 -07:00
Jason Wang	376b1aabe1	macvtap: eliminate linear search Linear search was used in both get_slot() and macvtap_get_queue(), this is because: - macvtap didn't reshuffle the array of taps when creating or destroying a queue, so when adding a new queue, macvtap must do linear search to find a location for the new queue. This will also complicate the TUNSETQUEUE implementation for multiqueue API. - the queue itself didn't track the queue index, so we must do a linear search in the array to find the location of an existing queue. The solution is straightforward: reshuffle the array and introduce a queue_index to macvtap_queue. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:09 -07:00
Jason Wang	f0afce01aa	macvlan: change the max number of queues to 16 Macvtap should be at least compatible with tap, so change the max number to 16. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:09 -07:00
Jason Wang	8f475a318a	macvtap: introduce macvtap_get_vlan() Factor out the device holding logic to a macvtap_get_vlan(), this will be also used by multiqueue API. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:08 -07:00
Jason Wang	1d2f41ed23	macvlan: switch to use IS_ENABLED() Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:08 -07:00
Jason Wang	89cee917de	macvtap: do not add self to waitqueue if doing a nonblock read There's no need to add self to waitqueue if doing a nonblock read. This could help to avoid the spinlock contention. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:08 -07:00
Jason Wang	ed0483fa06	macvtap: fix a possible race between queue selection and changing queues The compiler may generate code that re-reads vlan->numvtaps during macvtap_get_queue(). This may lead to a race if vlan->numvtaps is changed at the same time, which can lead to an unexpected result (e.g. a very large value). We need to prevent the compiler from generating such code by adding an ACCESS_ONCE() to make sure vlan->numvtaps is only read once. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:49:08 -07:00
David S. Miller	e403d29581	sh_eth: Fix warnings on 64-bit. Don't cast a plain integer to a pointer. drivers/net/ethernet/renesas/sh_eth.c: In function ‘sh_eth_chip_reset_giga’: drivers/net/ethernet/renesas/sh_eth.c:482:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] drivers/net/ethernet/renesas/sh_eth.c:483:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] drivers/net/ethernet/renesas/sh_eth.c:492:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] drivers/net/ethernet/renesas/sh_eth.c:493:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:40:41 -07:00
Sergei Shtylyov	eb770bf430	sh_eth: remove dependencies from Kconfig Since dependence on the certain SoCs is no longer necessary to compile the driver, remove the dependency list from its Kconfig entry which is a popular demand anyway... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:38:24 -07:00
Sergei Shtylyov	589ebdef7e	sh_eth: get R8A777x support out of #ifdef Get the R-Car code/data in the driver out of #ifdef by adding "r8a777x-ether" to the platform driver's ID table; since it's the last #ifdef, we remove CARDNAME from the ID table and no longer check the driver data before assigning it to 'mdp->cd'... Change the Ether platform device's name in the ARM platform code accordingly. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:38:24 -07:00
Sergei Shtylyov	9c3beaabb9	sh_eth: get SH7724 support out of #ifdef Get the SH7724 code/data in the driver out of #ifdef by adding "r8a7724-ether" to the platform driver's ID table. Change the Ether platform device's name in the SH platform code accordingly. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:38:24 -07:00
Sergei Shtylyov	24549e2a0f	sh_eth: get SH7757 support out of #ifdef Get the SH7757 code/data in the driver out of #ifdef by adding "sh7757-ether" and "sh7757-gether" to the platform driver's ID table. Note that we can remove SH_ETH_HAS_BOTH_MODULES and sh_eth_get_cpu_data(). Change the Ether/GEther platform devices' names in the SH platform code accordingly. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-07 23:38:24 -07:00
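
The 34360Mbit ceiling quoted in commit 45203a3b38 ("net_sched: add 64bit rate estimators") follows from keeping a bytes-per-second value in a u32 field. The arithmetic can be sketched in Python as below. This is only an illustration, not kernel code, and the names `U32_MAX`, `capped_rate_bits` and `wrapped_rate_bits` are hypothetical helpers introduced for this example.

```python
# Illustration only: models a u32 bytes-per-second field in plain Python.
U32_MAX = 2**32 - 1  # largest value a 32-bit unsigned field can hold


def capped_rate_bits(bytes_per_sec: int) -> int:
    """Bit rate reported when the byte rate is capped at the u32 maximum."""
    return min(bytes_per_sec, U32_MAX) * 8


def wrapped_rate_bits(bytes_per_sec: int) -> int:
    """Bit rate reported when the byte rate silently wraps modulo 2**32."""
    return (bytes_per_sec & U32_MAX) * 8


if __name__ == "__main__":
    rate = 10 * 10**9  # 10 GB/s, i.e. 80 Gbit/s of real traffic
    print(round(capped_rate_bits(rate) / 1e6))   # 34360 (Mbit/s)
    print(round(wrapped_rate_bits(rate) / 1e6))  # wrapped value, far below the real rate
```

The capped figure matches the "rate 34360Mbit" shown in the old tc output in that commit message; a 64-bit field avoids both the cap and the wrap for realistic link speeds.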


377157 Commits