linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-08 13:11:45 +00:00

Author	SHA1	Message	Date
Pablo Neira Ayuso	0ca743a559	netfilter: nf_tables: add compatibility layer for x_tables This patch adds the x_tables compatibility layer. This allows you to use existing x_tables matches and targets from nf_tables. This compatibility later allows us to use existing matches/targets for features that are still missing in nf_tables. We can progressively replace them with native nf_tables extensions. It also provides the userspace compatibility software that allows you to express the rule-set using the iptables syntax but using the nf_tables kernel components. In order to get this compatibility layer working, I've done the following things: * add NFNL_SUBSYS_NFT_COMPAT: this new nfnetlink subsystem is used to query the x_tables match/target revision, so we don't need to use the native x_table getsockopt interface. * emulate xt structures: this required extending the struct nft_pktinfo to include the fragment offset, which is already obtained from ip[6]_tables and that is used by some matches/targets. * add support for default policy to base chains, required to emulate x_tables. * add NFTA_CHAIN_USE attribute to obtain the number of references to chains, required by x_tables emulation. * add chain packet/byte counters using per-cpu. * support 32-64 bits compat. For historical reasons, this patch includes the following patches that were posted in the netfilter-devel mailing list. From Pablo Neira Ayuso: * nf_tables: add default policy to base chains * netfilter: nf_tables: add NFTA_CHAIN_USE attribute * nf_tables: nft_compat: private data of target and matches in contiguous area * nf_tables: validate hooks for compat match/target * nf_tables: nft_compat: release cached matches/targets * nf_tables: x_tables support as a compile time option * nf_tables: fix alias for xtables over nftables module * nf_tables: add packet and byte counters per chain * nf_tables: fix per-chain counter stats if no counters are passed * nf_tables: don't bump chain stats * nf_tables: add protocol and flags for xtables over nf_tables * nf_tables: add ip[6]t_entry emulation * nf_tables: move specific layer 3 compat code to nf_tables_ipv[4\|6] * nf_tables: support 32bits-64bits x_tables compat * nf_tables: fix compilation if CONFIG_COMPAT is disabled From Patrick McHardy: * nf_tables: move policy to struct nft_base_chain * nf_tables: send notifications for base chain policy changes From Alexander Primak: * nf_tables: remove the duplicate NF_INET_LOCAL_OUT From Nicolas Dichtel: * nf_tables: fix compilation when nf-netlink is a module Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 18:00:04 +02:00
Pablo Neira Ayuso	9370761c56	netfilter: nf_tables: convert built-in tables/chains to chain types This patch converts built-in tables/chains to chain types that allows you to deploy customized table and chain configurations from userspace. After this patch, you have to specify the chain type when creating a new chain: add chain ip filter output { type filter hook input priority 0; } ^^^^ ------ The existing chain types after this patch are: filter, route and nat. Note that tables are just containers of chains with no specific semantics, which is a significant change with regards to iptables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 17:16:11 +02:00
Patrick McHardy	c29b72e025	netfilter: nft_payload: add optimized payload implementation for small loads Add an optimized payload expression implementation for small (up to 4 bytes) aligned data loads from the linear packet area. This patch also includes original Patrick McHardy's entitled (nf_tables: inline nft_payload_fast_eval() into main evaluation loop). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 17:16:10 +02:00
Patrick McHardy	cb7dbfd039	netfilter: nf_tables: add optimized data comparison for small values Add an optimized version of nft_data_cmp() that only handles values of to 4 bytes length. This patch includes original Patrick McHardy's patch entitled (nf_tables: inline nft_cmp_fast_eval() into main evaluation loop). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 17:16:09 +02:00
Patrick McHardy	ef1f7df917	netfilter: nf_tables: expression ops overloading Split the expression ops into two parts and support overloading of the runtime expression ops based on the requested function through a ->select_ops() callback. This can be used to provide optimized implementations, for instance for loading small aligned amounts of data from the packet or inlining frequently used operations into the main evaluation loop. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 17:16:08 +02:00
Patrick McHardy	20a69341f2	netfilter: nf_tables: add netlink set API This patch adds the new netlink API for maintaining nf_tables sets independently of the ruleset. The API supports the following operations: - creation of sets - deletion of sets - querying of specific sets - dumping of all sets - addition of set elements - removal of set elements - dumping of all set elements Sets are identified by name, each table defines an individual namespace. The name of a set may be allocated automatically, this is mostly useful in combination with the NFT_SET_ANONYMOUS flag, which destroys a set automatically once the last reference has been released. Sets can be marked constant, meaning they're not allowed to change while linked to a rule. This allows to perform lockless operation for set types that would otherwise require locking. Additionally, if the implementation supports it, sets can (as before) be used as maps, associating a data value with each key (or range), by specifying the NFT_SET_MAP flag and can be used for interval queries by specifying the NFT_SET_INTERVAL flag. Set elements are added and removed incrementally. All element operations support batching, reducing netlink message and set lookup overhead. The old "set" and "hash" expressions are replaced by a generic "lookup" expression, which binds to the specified set. Userspace is not aware of the actual set implementation used by the kernel anymore, all configuration options are generic. Currently the implementation selection logic is largely missing and the kernel will simply use the first registered implementation supporting the requested operation. Eventually, the plan is to have userspace supply a description of the data characteristics and select the implementation based on expected performance and memory use. This patch includes the new 'lookup' expression to look up for element matching in the set. This patch includes kernel-doc descriptions for this set API and it also includes the following fixes. From Patrick McHardy: * netfilter: nf_tables: fix set element data type in dumps * netfilter: nf_tables: fix indentation of struct nft_set_elem comments * netfilter: nf_tables: fix oops in nft_validate_data_load() * netfilter: nf_tables: fix oops while listing sets of built-in tables * netfilter: nf_tables: destroy anonymous sets immediately if binding fails * netfilter: nf_tables: propagate context to set iter callback * netfilter: nf_tables: add loop detection From Pablo Neira Ayuso: * netfilter: nf_tables: allow to dump all existing sets * netfilter: nf_tables: fix wrong type for flags variable in newelem Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 17:16:07 +02:00
Patrick McHardy	96518518cc	netfilter: add nftables This patch adds nftables which is the intended successor of iptables. This packet filtering framework reuses the existing netfilter hooks, the connection tracking system, the NAT subsystem, the transparent proxying engine, the logging infrastructure and the userspace packet queueing facilities. In a nutshell, nftables provides a pseudo-state machine with 4 general purpose registers of 128 bits and 1 specific purpose register to store verdicts. This pseudo-machine comes with an extensible instruction set, a.k.a. "expressions" in the nftables jargon. The expressions included in this patch provide the basic functionality, they are: * bitwise: to perform bitwise operations. * byteorder: to change from host/network endianess. * cmp: to compare data with the content of the registers. * counter: to enable counters on rules. * ct: to store conntrack keys into register. * exthdr: to match IPv6 extension headers. * immediate: to load data into registers. * limit: to limit matching based on packet rate. * log: to log packets. * meta: to match metainformation that usually comes with the skbuff. * nat: to perform Network Address Translation. * payload: to fetch data from the packet payload and store it into registers. * reject (IPv4 only): to explicitly close connection, eg. TCP RST. Using this instruction-set, the userspace utility 'nft' can transform the rules expressed in human-readable text representation (using a new syntax, inspired by tcpdump) to nftables bytecode. nftables also inherits the table, chain and rule objects from iptables, but in a more configurable way, and it also includes the original datatype-agnostic set infrastructure with mapping support. This set infrastructure is enhanced in the follow up patch (netfilter: nf_tables: add netlink set API). This patch includes the following components: * the netlink API: net/netfilter/nf_tables_api.c and include/uapi/netfilter/nf_tables.h * the packet filter core: net/netfilter/nf_tables_core.c * the expressions (described above): net/netfilter/nft_.c the filter tables: arp, IPv4, IPv6 and bridge: net/ipv4/netfilter/nf_tables_ipv4.c net/ipv6/netfilter/nf_tables_ipv6.c net/ipv4/netfilter/nf_tables_arp.c net/bridge/netfilter/nf_tables_bridge.c * the NAT table (IPv4 only): net/ipv4/netfilter/nf_table_nat_ipv4.c * the route table (similar to mangle): net/ipv4/netfilter/nf_table_route_ipv4.c net/ipv6/netfilter/nf_table_route_ipv6.c * internal definitions under: include/net/netfilter/nf_tables.h include/net/netfilter/nf_tables_core.h * It also includes an skeleton expression: net/netfilter/nft_expr_template.c and the preliminary implementation of the meta target net/netfilter/nft_meta_target.c It also includes a change in struct nf_hook_ops to add a new pointer to store private data to the hook, that is used to store the rule list per chain. This patch is based on the patch from Patrick McHardy, plus merged accumulated cleanups, fixes and small enhancements to the nftables code that has been done since 2009, which are: From Patrick McHardy: * nf_tables: adjust netlink handler function signatures * nf_tables: only retry table lookup after successful table module load * nf_tables: fix event notification echo and avoid unnecessary messages * nft_ct: add l3proto support * nf_tables: pass expression context to nft_validate_data_load() * nf_tables: remove redundant definition * nft_ct: fix maxattr initialization * nf_tables: fix invalid event type in nf_tables_getrule() * nf_tables: simplify nft_data_init() usage * nf_tables: build in more core modules * nf_tables: fix double lookup expression unregistation * nf_tables: move expression initialization to nf_tables_core.c * nf_tables: build in payload module * nf_tables: use NFPROTO constants * nf_tables: rename pid variables to portid * nf_tables: save 48 bits per rule * nf_tables: introduce chain rename * nf_tables: check for duplicate names on chain rename * nf_tables: remove ability to specify handles for new rules * nf_tables: return error for rule change request * nf_tables: return error for NLM_F_REPLACE without rule handle * nf_tables: include NLM_F_APPEND/NLM_F_REPLACE flags in rule notification * nf_tables: fix NLM_F_MULTI usage in netlink notifications * nf_tables: include NLM_F_APPEND in rule dumps From Pablo Neira Ayuso: * nf_tables: fix stack overflow in nf_tables_newrule * nf_tables: nft_ct: fix compilation warning * nf_tables: nft_ct: fix crash with invalid packets * nft_log: group and qthreshold are 2^16 * nf_tables: nft_meta: fix socket uid,gid handling * nft_counter: allow to restore counters * nf_tables: fix module autoload * nf_tables: allow to remove all rules placed in one chain * nf_tables: use 64-bits rule handle instead of 16-bits * nf_tables: fix chain after rule deletion * nf_tables: improve deletion performance * nf_tables: add missing code in route chain type * nf_tables: rise maximum number of expressions from 12 to 128 * nf_tables: don't delete table if in use * nf_tables: fix basechain release From Tomasz Bursztyka: * nf_tables: Add support for changing users chain's name * nf_tables: Change chain's name to be fixed sized * nf_tables: Add support for replacing a rule by another one * nf_tables: Update uapi nftables netlink header documentation From Florian Westphal: * nft_log: group is u16, snaplen u32 From Phil Oester: * nf_tables: operational limit match Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 17:15:48 +02:00
Pablo Neira Ayuso	f59cb0453c	netfilter: nf_nat: move alloc_null_binding to nf_nat_core.c Similar to nat_decode_session, alloc_null_binding is needed for both ip_tables and nf_tables, so move it to nf_nat_core.c. This change is required by nf_tables. This is an adapted version of the original patch from Patrick McHardy. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 11:29:39 +02:00
Patrick McHardy	795aa6ef6a	netfilter: pass hook ops to hookfn Pass the hook ops to the hookfn to allow for generic hook functions. This change is required by nf_tables. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-14 11:29:31 +02:00
Eric Dumazet	ccdbb6e96b	tcp: tcp_transmit_skb() optimizations 1) We need to take a timestamp only for skb that should be cloned. Other skbs are not in write queue and no rtt estimation is done on them. 2) the unlikely() hint is wrong for receivers (they send pure ACK) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: MF Nowlan <fitz@cs.yale.edu> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-By: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:48:18 -04:00
David S. Miller	29b67c39dc	Included changes: - update emails for A. Quartulli and M. Lindner in MAINTAINERS - switch to the next on-the-wire protocol version - introduce the T(ype) V(ersion) L(ength) V(alue) framework - adjust the existing components to make them use the new TVLV code - make the TT component use CRC32 instead of CRC16 - totally remove the VIS functionality (has been moved to userspace) - reorder packet types and flags - add static checks on packet format - remove __packed from batadv_ogm_packet -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJSVa1BAAoJEADl0hg6qKeOiZkP/iD8Gehkb+jII3J1AE4+TdaW X5XzB40A6bRr/RhiqqFXA6FhjLPCwPlaYwo+FL9QC9fIbV6Drofd8B4Vxvp8daQI A5ZTzA7GFJhUkzM4vvw5eyF5848OYGITAJh5kSkiYiFSC6x9X9UFrG1EBBX7ikGd 7MXCjNa5GXaO7BsSnSbGuuMYw6CuZxwwxUZ3F/fG031yTlRqz88TiBpOGCZXSM5J eYOf68byDPPsRvj/RgpmYTt7MzWnAgp9gJotp5Owuqpb3kvZC6SkA8uj4O2f29rn nmQ+1q5onjHpfqukezPOIx9HEKo2XCq3qbFiW74dyNi+EQlYJ6RAGyRBkTTl4hse cTOeWENANoFco/SPiOi1PjGrytuflAiIegwkVugAvnmYlbsVIi6MXV+BGo/ARybv RwPBFULCaxO5HNkYz6elcu8cwFY9s3PROEdMVpRj+iNTeQq3ZQHj+QfPsHdedkAz mkHqiZQo/pgBeWd+KiN98YJtQ4nPV1Qi+yGiayc3hmsk5fIABaOHxP3NV0NrTeOO PYKWFcKfdLgtvrfcOXbcWpPz0d8SUkyF8zjlzIscXlpJVWK1sOw8oEER3IrDgzDy PmzHkEL+qfDh4I5hEPOIEXxzPw/E4b+G3WI34bUN/4chGSmJdo6yyaC6XcXe7r9g AsxmABE84uT2npNSOSkN =YQD5 -----END PGP SIGNATURE----- Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - update emails for A. Quartulli and M. Lindner in MAINTAINERS - switch to the next on-the-wire protocol version - introduce the T(ype) V(ersion) L(ength) V(alue) framework - adjust the existing components to make them use the new TVLV code - make the TT component use CRC32 instead of CRC16 - totally remove the VIS functionality (has been moved to userspace) - reorder packet types and flags - add static checks on packet format - remove __packed from batadv_ogm_packet	2013-10-11 16:28:55 -04:00
David S. Miller	58308451e9	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to i40e only. Alex provides the majority of the patches against i40e, where he does cleanup of the Tx and RX queues and to align the code with the known good Tx/Rx queue code in the ixgbe driver. Anjali provides an i40e patch to update link events to not print to the log until the device is administratively up. Catherine provides a patch to update the driver version. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 15:29:44 -04:00
Eric Dumazet	b44084c2c8	inet: rename ir_loc_port to ir_num In commit `634fb979e8` ("inet: includes a sock_common in request_sock") I forgot that the two ports in sock_common do not have same byte order : skc_dport is __be16 (network order), but skc_num is __u16 (host order) So sparse complains because ir_loc_port (mapped into skc_num) is considered as __u16 while it should be __be16 Let rename ir_loc_port to ireq->ir_num (analogy with inet->inet_num), and perform appropriate htons/ntohs conversions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 14:37:35 -04:00
Catherine Sullivan	d04795d663	i40e: Bump version Update the version number of the driver. Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 23:25:33 -07:00
Alexander Duyck	980e9b1186	i40e: Add support for 64 bit netstats This change brings support for 64 bit netstats to the driver. Previously the stats were 64 bit but highly racy due to the fact that 64 bit transactions are not atomic on 32 bit systems. This change makes is so that the 64 bit byte and packet stats are reliable on all architectures. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 23:16:27 -07:00
Alexander Duyck	9f65e15b4f	i40e: Move rings from pointer to array to array of pointers Allocate the queue pairs individually instead of as a group. This allows for much easier queue management as it is possible to dynamically resize the queues without having to free and allocate the entire block. Ease statistic collection by treating Tx/Rx queue pairs as a single unit. Each pair is allocated together and starts with a Tx queue and ends with an Rx queue. By ordering them this way it is possible to know the Rx offset based on a pointer to the Tx queue. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 22:48:35 -07:00
Alexander Duyck	cd0b6fa656	i40e: Replace ring container array with linked list This replaces the ring container array with a linked list. The idea is to make the logic much easier to deal with since this will allow us to call a simple helper function from the q_vectors to go through the entire list. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 22:39:30 -07:00
Alexander Duyck	493fb30011	i40e: Move q_vectors from pointer to array to array of pointers Allocate the q_vectors individually. The advantage to this is that it allows for easier freeing and allocation. In addition it makes it so that we could do node specific allocations at some point in the future. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 22:32:10 -07:00
Alexander Duyck	a114d0a6ac	i40e: Split bytes and packets from Rx/Tx stats This makes it so that the Tx and Rx byte and packet counts are separated from the rest of the statistics. This allows for better isolation of these stats when we move them into the 64 bit statistics. Simplify things by re-ordering how the stats display in ethtool. Instead of displaying all of the Tx queues as a block, followed by all the Rx queues, the new order is Tx[0], Rx[0], Tx[1], Rx[1], ..., Tx[n], Rx[n]. This reduces the loops and cleans up the display for testing purposes since it is very easy to verify if flow director is doing the right thing as the Tx and Rx queue pair are shown in pairs. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 22:21:39 -07:00
Alexander Duyck	7070ce0a64	i40e: Add support for Tx byte queue limits Implement BQL (byte queue limit) support in i40e. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 22:13:08 -07:00
Alexander Duyck	b194130627	i40e: Drop dead code and flags from Tx hotpath Drop Tx flag and TXSW which is tested but never set. As a result of this change we can drop a complicated check that always resulted in the final result of i40e_tx_csum being equal to the CHECKSUM_PARTIAL value. As such we can replace the entire function call with just a check for skb->summed == CHECKSUM_PARTIAL. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 22:06:02 -07:00
Alexander Duyck	a5e9c57264	i40e: clean up Tx fast path Sync the fast path for i40e_tx_map and i40e_clean_tx_irq so that they are similar to igb and ixgbe. - Only update the Tx descriptor ring in tx_map - Make skb mapping always on the first buffer in the chain - Drop the use of MAPPED_AS_PAGE Tx flag - Only store flags on the first buffer_info structure Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 21:58:04 -07:00
Alexander Duyck	fc4ac67bc9	i40e: Do not directly increment Tx next_to_use Avoid directly incrementing next_to_use for multiple reasons. The main reason being that if we directly increment it then it can attain a state where it is equal to the ring count. Technically this is a state it should not be able to reach but the way this is written it now can. This patch pulls the value off into a register and then increments it and writes back either the value or 0 depending on if the value is equal to the ring count. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 21:22:01 -07:00
Alexander Duyck	35a1e2ad17	i40e: Cleanup Tx buffer info layout - drop the mapped_as_page u8 from the Tx buffer info as it was unused - use the DMA unmap accessors for Tx DMA - replace checks of DMA with checks of the unmap length to verify if an unmap is needed - update the Tx buffer layout to make it consistent with igb, ixgbe Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 21:14:41 -07:00
Eric Dumazet	ba537427d7	tcp: use ACCESS_ONCE() in tcp_update_pacing_rate() sk_pacing_rate is read by sch_fq packet scheduler at any time, with no synchronization, so make sure we update it in a sensible way. ACCESS_ONCE() is how we instruct compiler to not do stupid things, like using the memory location as a temporary variable. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Eric Dumazet	634fb979e8	inet: includes a sock_common in request_sock TCP listener refactoring, part 5 : We want to be able to insert request sockets (SYN_RECV) into main ehash table instead of the per listener hash table to allow RCU lookups and remove listener lock contention. This patch includes the needed struct sock_common in front of struct request_sock This means there is no more inet6_request_sock IPv6 specific structure. Following inet_request_sock fields were renamed as they became macros to reference fields from struct sock_common. Prefix ir_ was chosen to avoid name collisions. loc_port -> ir_loc_port loc_addr -> ir_loc_addr rmt_addr -> ir_rmt_addr rmt_port -> ir_rmt_port iif -> ir_iif Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Eric Dumazet	8a29111c7c	net: gro: allow to build full sized skb skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb, typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags. It's relatively easy to extend the skb using frag_list to allow more frags to be appended into the last sk_buff. This still builds very efficient skbs, and allows reaching 45 MSS per skb. (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional sk_buff) High speed TCP flows benefit from this extension by lowering TCP stack cpu usage (less packets stored in receive queue, less ACK packets processed) Forwarding setups could be hurt, as such skbs will need to be linearized, although its not a new problem, as GRO could already provide skbs with a frag_list. We could make the 65536 bytes threshold a tunable to mitigate this. (First time we need to linearize skb in skb_needs_linearize(), we could lower the tunable to ~16*1460 so that following skb_gro_receive() calls build smaller skbs) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
baker.zhang	4c60f1d67f	fib_trie: only calc for the un-first node This is a enhancement. for the first node in fib_trie, newpos is 0, bit is 1. Only for the leaf or node with unmatched key need calc pos. Signed-off-by: baker.zhang <baker.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Gao feng	5c70ef85a2	veth: allow to setup multicast address for veth device We can only setup multicast address for network device when net_device_ops->ndo_set_rx_mode is not null. Some configurations need to add multicast address for net device, such as netfilter cluster match module. Add a fake ndo_set_rx_mode function to allow this operation. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Alexander Duyck	c304fdac6c	i40e: Drop unused completed stat The Tx "completed" stat was part of the original rewrite for detecting Tx hangs. However some time ago in ixgbe I determined that we could just use the packets stat instead. Since then this stat was removed from ixgbe and it serves no purpose in i40e so it can be dropped. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 21:05:10 -07:00
Anjali Singhai	6d779b41f7	i40e: Link code updates Link events should not print to the log until the device is administratively up. Signed-off-by: Anjali Singhai <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2013-10-09 20:46:28 -07:00
Ajit Khaparde	b68656b22f	be2net: change the driver version number to 4.9.224.0 Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 15:50:51 -04:00
Ajit Khaparde	461ae37922	be2net: Display RoCE specific counters in ethtool -S SkyHawk-R can support RoCE. Add code to display RoCE specific counters maintained in hardware. Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 15:50:51 -04:00
Ajit Khaparde	61000861e8	be2net: Call version 2 of GET_STATS ioctl for Skyhawk-R Moving to version 2 of GET_STATS command as SkyHawk-R supports higher number of rings. Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 15:50:51 -04:00
Simon Wunderlich	18c68d5960	batman-adv: reorder batadv_iv_flags The vis flag is not needed anymore, and since we do a compat bump we can start with the first bit again Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:35 +02:00
Simon Wunderlich	9284a47e8b	batman-adv: remove packed from batadv_ogm_packet As we decreased the struct size from 26 to 24 byte, we can remove __packed as the compiler will not add any more padding. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:34 +02:00
Simon Wunderlich	a1f1ac5c4d	batman-adv: reorder packet types Reordering the packet type numbers allows us to handle unicast packets in a general way - even if we don't know the specific packet type, we can still forward it. There was already code handling this for a couple of unicast packets, and this is the more generalized version to do that. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:34 +02:00
Simon Wunderlich	80067c8320	batman-adv: add build check macros for packet member offset Since we removed the __packed from most of the packets, we should make sure that the offset generated by the compiler are correct for sent/received data. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:33 +02:00
Simon Wunderlich	9f4980e68b	batman-adv: remove vis functionality This is replaced by a userspace program, we don't need this functionality to bloat the kernel. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:32 +02:00
Antonio Quartulli	0035f97e65	batman-adv: move BATADV_TT_CLIENT_TEMP to higher bit Client flags from bit 0 to 7 are sent over the wire. BATADV_TT_CLIENT_TEMP is a local flag and is not supposed to be sent to the network. Therefore it has occupy a higher bit. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-09 21:22:32 +02:00
Antonio Quartulli	ced72933a5	batman-adv: use CRC32C instead of CRC16 in TT code CRC32C has to be preferred to CRC16 because of its possible HW native support and because of the reduced collision probability. With this change the Translation Table component now uses CRC32C to compute the local and global table checksum. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-09 21:22:31 +02:00
Marek Lindner	122edaa059	batman-adv: tvlv - convert roaming adv packet to use tvlv unicast packets Instead of generating roaming specific packets the TVLV unicast API is used to send roaming information. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:30 +02:00
Marek Lindner	335fbe0f5d	batman-adv: tvlv - convert tt query packet to use tvlv unicast packets Instead of generating TT specific packets the TVLV unicast API is used to send translation table data. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:30 +02:00
Marek Lindner	e1bf0c1409	batman-adv: tvlv - convert tt data sent within OGMs The translation table meta data (version number, crc checksum, etc) as well as the translation table diff propgated within OGMs now uses the newly introduced tvlv infrastructure. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:29 +02:00
Marek Lindner	3f4841ffb3	batman-adv: tvlv - add network coding container Create network coding container to announce network coding capabilities (if enabled). Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:28 +02:00
Marek Lindner	17cf0ea455	batman-adv: tvlv - add distributed arp table container Create DAT container to announce DAT capabilities (if enabled). Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:27 +02:00
Marek Lindner	414254e342	batman-adv: tvlv - gateway download/upload bandwidth container Prior to this patch batman-adv read the advertised uplink bandwidth from userspace and compressed this information into a single byte called "gateway class". Now the download & upload bandwidth information is sent as-is. No userspace change is necessary since the sysfs API always allowed to specify a bandwidth. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Spyros Gasteratos <morfeas3000@gmail.com> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:27 +02:00
Marek Lindner	ef26157747	batman-adv: tvlv - basic infrastructure The goal is to provide the infrastructure for sending, receiving and parsing information 'containers' while preserving backward compatibility. TVLV (based on the commonly known Type Length Value technique) was chosen as the format for those containers. Even if a node does not know the tvlv type of a certain container it can simply skip the current container and proceed with the next. Past experience has shown features evolve over time, so a 'version' field was added right from the start to allow differentiating between feature variants - hence the name: T(ype) V(ersion) L(ength) V(alue). This patch introduces the basic TVLV infrastructure: * register / unregister tvlv containers to be sent with each OGM (on primary interfaces only) * register / unregister callback handlers to be called upon finding the corresponding tvlv type in a tvlv buffer * unicast tvlv send / receive API calls Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Spyros Gasteratos <morfeas3000@gmail.com> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:26 +02:00
Antonio Quartulli	60cf7981b7	batman-adv: switch to a new packet compatibility version With this change batman-adv is breaking compatibility with older versions and it is moving to compat-version 15. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-10-09 21:22:25 +02:00
Antonio Quartulli	207df49e75	MAINTAINERS: batman-adv - update emails Update my and Marek Lindner's email in the MAINTAINERS file Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:25 +02:00

1 2 3 4 5 ...

400985 Commits