linux

Author	SHA1	Message	Date
Jesse Gross	46b1e4f911	geneve: Check family when reusing sockets. When searching for an existing socket to reuse, the address family is not taken into account - only port number. This means that an IPv4 socket could be used for IPv6 traffic and vice versa, which is sure to cause problems when passing packets. It is not possible to trigger this problem currently because the only user of Geneve creates just IPv4 sockets. However, that is likely to change in the near future. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Jesse Gross	df5dba8e52	geneve: Remove socket hash table. The hash table for open Geneve ports is used only on creation and deletion time. It is not performance critical and is not likely to grow to a large number of items. Therefore, this can be changed to use a simple linked list. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Jesse Gross	829a3ada9c	geneve: Simplify locking. The existing Geneve locking scheme was pulled over directly from VXLAN. However, VXLAN has a number of built in mechanisms which make the locking more complex and are unlikely to be necessary with Geneve. This simplifies the locking to use a basic scheme of a mutex when doing updates plus RCU on receive. In addition to making the code easier to read, this also avoids the possibility of a race when creating or destroying sockets since UDP sockets and the list of Geneve sockets are protected by different locks. After this change, the entire operation is atomic. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
Jesse Gross	61f3cade76	geneve: Remove workqueue. The work queue is used only to free the UDP socket upon destruction. This is not necessary with Geneve and generally makes the code more difficult to reason about. It also introduces nondeterministic behavior such as when a socket is rapidly deleted and recreated, which could fail as the deletion happens asynchronously. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-04 22:21:33 -05:00
David S. Miller	7beceebf5b	Merge branch 'rhashtable-next' Thomas Graf says: ==================== rhashtable: Per bucket locks & deferred table resizing Prepares for and introduces per bucket spinlocks and deferred table resizing. This allows for parallel table mutations in different hash buckets from atomic context. The resizing occurs in the background in a separate worker thread while lookups, inserts, and removals can continue. Also modified the chain linked list to be terminated with a special nulls marker to allow entries to move between multiple lists. Last but not least, reintroduces lockless netlink_lookup() with deferred Netlink socket destruction to avoid the side effect of increased netlink_release() runtime. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:33:03 -05:00
Thomas Graf	21e4902aea	netlink: Lockless lookup with RCU grace period in socket release Defers the release of the socket reference using call_rcu() to allow using an RCU read-side protected call to rhashtable_lookup() This restores behaviour and performance gains as previously introduced by `e341694` ("netlink: Convert netlink_lookup() to use RCU protected hash table") without the side effect of severely delayed socket destruction. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	f89bd6f87a	rhashtable: Supports for nulls marker In order to allow for wider usage of rhashtable, use a special nulls marker to terminate each chain. The reason for not using the existing nulls_list is that the prev pointer usage would not be valid as entries can be linked in two different buckets at the same time. The 4 nulls base bits can be set through the rhashtable_params structure like this: struct rhashtable_params params = { [...] .nulls_base = (1U << RHT_BASE_SHIFT), }; This reduces the hash length from 32 bits to 27 bits. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	97defe1ecf	rhashtable: Per bucket locks & deferred expansion/shrinking Introduces an array of spinlocks to protect bucket mutations. The number of spinlocks per CPU is configurable and selected based on the hash of the bucket. This allows for parallel insertions and removals of entries which do not share a lock. The patch also defers expansion and shrinking to a worker queue which allows insertion and removal from atomic context. Insertions and deletions may occur in parallel to it and are only held up briefly while the particular bucket is linked or unzipped. Mutations of the bucket table pointer are protected by a new mutex, read access is RCU protected. In the event of an expansion or shrinking, the new bucket table allocated is exposed as a so called future table as soon as the resize process starts. Lookups, deletions, and insertions will briefly use both tables. The future table becomes the main table after an RCU grace period and initial linking of the old to the new table was performed. Optimization of the chains to make use of the new number of buckets follows only once the new table is in use. The side effect of this is that during that RCU grace period, a bucket traversal using any rht_for_each() variant on the main table will not see any insertions performed during the RCU grace period which would at that point land in the future table. The lookup will see them as it searches both tables if needed. Having multiple insertions and removals occur in parallel requires nelems to become an atomic counter. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	113948d841	spinlock: Add spin_lock_bh_nested() Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	897362e446	nft_hash: Remove rhashtable_remove_pprev() The removal function of nft_hash currently stores a reference to the previous element during lookup which is used to optimize removal later on. This was possible because a lock is held throughout calling rhashtable_lookup() and rhashtable_remove(). With the introduction of deferred table resizing in parallel to lookups and insertions, the nftables lock will no longer synchronize all table mutations and the stored pprev may become invalid. Removing this optimization makes removal slightly more expensive on average but allows taking the resize cost out of the insert and remove path. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	b8e1943e9f	rhashtable: Factor out bucket_tail() function Subsequent patches will require access to the bucket tail. Access to the tail is relatively cheap as the automatic resizing of the table should keep the number of entries per bucket to no more than 0.75 on average. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:57 -05:00
Thomas Graf	88d6ed15ac	rhashtable: Convert bucket iterators to take table and index This patch is in preparation to introduce per bucket spinlocks. It extends all iterator macros to take the bucket table and bucket index. It also introduces a new rht_dereference_bucket() to handle protected accesses to buckets. It introduces a barrier() to the RCU iterators to prevent the compiler from caching the first element. The lockdep verifier is introduced as a stub which always succeeds and is properly implemented in the next patch when the locks are introduced. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:56 -05:00
Thomas Graf	a4b18cda4c	rhashtable: Use rht_obj() instead of manual offset calculation Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:56 -05:00
Thomas Graf	8d24c0b431	rhashtable: Do hashing inside of rhashtable_lookup_compare() Hash the key inside of rhashtable_lookup_compare() like rhashtable_lookup() does. This allows to simplify the hashing functions and keep them private. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-03 14:32:56 -05:00
David S. Miller	dd95539888	Merge branch 'timecounter-next' Richard Cochran says: ==================== Fixing the "Time Counter fixes and improvements" For this series I had only tested the build with ARCH=x86 and arm, but others like sparc64, microblaze, powerpc, and s390 will fail because they somehow don't indirectly include clocksource.h for the drivers in question. This series fixes the build issues reported by: kbuild test robot <fengguang.wu@intel.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:51 -05:00
Richard Cochran	5ce07a5cef	microblaze: include the new timecounter header. The timecounter/cyclecounter code has moved, so users need the new include. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:36 -05:00
Richard Cochran	d9f393734a	mlx4: include clocksource.h again This driver uses the function, clocksource_khz2mult, and so it really must include clocksource.h. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:36 -05:00
Richard Cochran	d312da293f	ixgbe: convert to CYCLECOUNTER_MASK macro. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:36 -05:00
Richard Cochran	b57c894040	igb: convert to CYCLECOUNTER_MASK macro. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:36 -05:00
Richard Cochran	4d045b4c06	e1000e: convert to CYCLECOUNTER_MASK macro. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:35 -05:00
Richard Cochran	f28ba401db	bnx2x: convert to CYCLECOUNTER_MASK macro. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:35 -05:00
Richard Cochran	1891172aa5	timecounter: provide a macro to initialize the cyclecounter mask field. There is no need for users of the timecounter/cyclecounter code to include clocksource.h just for a single macro. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:35 -05:00
Pravin B Shelar	b422da7c36	MAINTAINERS: Update Open vSwitch entry. OVS development is moved to netdev mailing list. Update tree and list in MAINTAINERS file. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:46:20 -05:00
David S. Miller	c2f471f9c3	Merge tag 'wireless-drivers-next-for-davem-2015-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Changes: * ath9k: enable Transmit Power Control (TPC) for ar9003 chips * rtlwifi: cleanup and updates from the vendor driver * rsi: fix memory leak related to firmware image * ath: parameter fix for FCC DFS pattern Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:40:44 -05:00
Rickard Strandqvist	3adc0becfe	net: ethernet: cisco: enic: enic_dev: Remove some unused functions Removes some functions that are not used anywhere: enic_dev_enable2_done() enic_dev_enable2() enic_dev_deinit_done() enic_dev_init_prov2() enic_vnic_dev_deinit() This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:36:08 -05:00
Rickard Strandqvist	50ab97d71b	isdn: hisax: hfc4s8s_l1: Remove some unused functions Removes some functions that are not used anywhere: Read_hfc32() Write_hfc32() Write_hfc16() This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:36:08 -05:00
Rickard Strandqvist	6af01a70f4	net: fddi: skfp: smt.c: Remove unused function Remove the function smt_ifconfig() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:36:07 -05:00
Rickard Strandqvist	7841d5d622	net: ethernet: chelsio: cxgb3: mc5.c: Remove some unused functions Removes some functions that are not used anywhere: dbgi_rd_rsp3() dbgi_wr_addr3() This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:32:37 -05:00
Florian Westphal	e8768f9715	net: skbuff: don't zero tc members when freeing skb Not needed, only four cases: - kfree_skb (or one of its aliases). Don't need to zero, memory will be freed. - kfree_skb_partial and head was stolen: memory will be freed. - skb_morph: The skb header fields (including tc ones) will be copied over from the 'to-be-morphed' skb right after skb_release_head_state returns. - skb_segment: Same as before, all the skb header fields are copied over from the original skb right away. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:04:29 -05:00
David S. Miller	6c032edc8a	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2014-12-31 Here's the first batch of bluetooth patches for 3.20. - Cleanups & fixes to ieee802154 drivers - Fix synchronization of mgmt commands with respective HCI commands - Add self-tests for LE pairing crypto functionality - Remove 'BlueFritz!' specific handling from core using a new quirk flag - Public address configuration support for ath3012 - Refactor debugfs support into a dedicated file - Initial support for LE Data Length Extension feature from Bluetooth 4.2 Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:58:21 -05:00
Joe Stringer	a4c9ea5e8f	geneve: Add Geneve GRO support This results in an approximately 30% increase in throughput when handling encapsulated bulk traffic. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:46:41 -05:00
Jesse Gross	9b174d88c2	net: Add Transparent Ethernet Bridging GRO support. Currently the only tunnel protocol that supports GRO with encapsulated Ethernet is VXLAN. This pulls out the Ethernet code into a proper layer so that it can be used by other tunnel protocols such as GRE and Geneve. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:46:41 -05:00
Shaohui Xie	05930b5ec1	net/fsl: remove hardcoded clock setting from xgmac_mdio There is no need to set the clock speed in read/write which will be performed unnecessarily for each mdio access. Init it during probe is enough. Also, the hardcoded clock value is not a proper way for all SoCs. Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:39:49 -05:00
Shaohui Xie	aa84247804	net/fsl: remove irq assignment from xgmac_mdio Which is wrong and not used, so no extra space needed by mdiobus_alloc_size(), use mdiobus_alloc() instead. Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:39:49 -05:00
Shaohui Xie	cef27f971e	net/fsl: remove reset from xgmac_mdio Since the reset is just clock setting, individual mdio reset is not available. Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:39:49 -05:00
David S. Miller	01aa29978b	Merge branch 'gmac-next' Roger Chen says: ==================== support GMAC driver for RK3288 Roger Chen (6): patch1: add driver for Rockchip RK3288 SoCs integrated GMAC patch2: define clock ID used for GMAC patch3: modify CRU config for Rockchip RK3288 SoCs integrated GMAC patch4: dts: rockchip: add gmac info for rk3288 patch5: dts: rockchip: enable gmac on RK3288 evb board patch6: add document for Rockchip RK3288 GMAC Tested on rk3288 evb board: Execute the following command to enable ethernet, set local IP and ping a remote host. busybox ifconfig eth0 up busybox ifconfig eth0 192.168.1.111 ping 192.168.1.1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:52 -05:00
Roger Chen	53a8393037	GMAC: add document for Rockchip RK3288 GMAC The document describes how to add properties for GMAC in device tree. change since v2: 1. remove power-gpio, reset-gpio, phyirq-gpio, pmu_regulator setting 2. add "snps,reset-gpio", "snps,reset-active-low;" "snps,reset-delays-us" Signed-off-by: Roger Chen <roger.chen@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:43 -05:00
Roger Chen	e35e47ac52	ARM: dts: rockchip: enable gmac on RK3288 evb board enable gmac in rk3288-evb-rk808.dts changes since v2: 1. add fixed regulator for PHY 2. remove power-gpio, reset-gpio, phyirq-gpio, pmu_regulator setting 3. add "snps,reset-gpio", "snps,reset-active-low;" "snps,reset-delays-us" Signed-off-by: Roger Chen <roger.chen@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:18 -05:00
Roger Chen	3d3fb74afc	ARM: dts: rockchip: add gmac info for rk3288 add gmac info in rk3288.dtsi for GMAC driver changes since v2: 1. add drive-strength in the pinctrl settings Signed-off-by: Roger Chen <roger.chen@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:18 -05:00
Roger Chen	7f186025c7	GMAC: modify CRU config for Rockchip RK3288 SoCs integrated GMAC modify CRU config for GMAC driver changes since v2: 1. remove SCLK_MAC_PLL Signed-off-by: Roger Chen <roger.chen@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:18 -05:00
Roger Chen	3cf8e53a48	GMAC: define clock ID used for GMAC changes since v2: 1. remove SCLK_MAC_PLL Signed-off-by: Roger Chen <roger.chen@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:18 -05:00
Roger Chen	7ad269ea1a	GMAC: add driver for Rockchip RK3288 SoCs integrated GMAC This driver is based on stmmac driver. changes since v2: - use tab instead of space for macros - use HIWORD_UPDATE macro for GMAC_CLK_RX_DL_CFG and GMAC_CLK_TX_DL_CFG - remove drive-strength setting in the driver and set it in the pinctrl settings - use dev_err instead of pr_err - remove clock names's macros, just use the real name of the clock - use devm_clk_get() instead of clk_get() - remove clk_set_parent(bsp_priv->clk_mac, bsp_priv->clk_mac_pll) - remove gpio setting for LDO, just use regulator API - remove phy reset using gpio in the glue layer, it has been handled in the stmmac driver - remove handling phy interrupt (mii interrupt) changes since v1: - use BIT() to set register - combine two remap_write() operations into one for the same register - use macros for register value setting - remove grf fail check in rk_gmac_setup() and save all the check in set_rgmii_speed() - remove .tx_coe=1 in rk_gmac_data Signed-off-by: Roger Chen <roger.chen@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 19:14:05 -05:00
David S. Miller	9aacfb2023	igb_ptp: Include clocksource.h to get CLOCKSOURCE_MASK. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:32:40 -05:00
David S. Miller	54da5083b7	e1000e: Include clocksource.h to get CLOCKSOURCE_MASK. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:32:25 -05:00
David S. Miller	e495f78d78	Merge branch 'fib_trie-next' Alexander Duyck says: ==================== fib_trie: Reduce time spent in fib_table_lookup by 35 to 75% These patches are meant to address several performance issues I have seen in the fib_trie implementation, and fib_table_lookup specifically. With these changes in place I have seen a reduction of up to 35 to 75% for the total time spent in fib_table_lookup depending on the type of search being performed. On a VM running in my Corei7-4930K system with a trie of maximum depth of 7 this resulted in a reduction of over 370ns per packet in the total time to process packets received from an ixgbe interface and route them to a dummy interface. This represents a failed lookup in the local trie followed by a successful search in the main trie.

                               Baseline         Refactor
ixgbe->dummy routing           1.20Mpps         2.21Mpps
------------------------------------------------------------
processing time per packet        835ns            453ns
fib_table_lookup           50.1%  418ns     25.0%  113ns
check_leaf.isra.9           7.9%   66ns        --     --
ixgbe_clean_rx_irq          5.3%   44ns      9.8%   44ns
ip_route_input_noref        2.9%   25ns      4.6%   21ns
pvclock_clocksource_read    2.6%   21ns      4.6%   21ns
ip_rcv                      2.6%   22ns      4.0%   18ns

In the simple case of receiving a frame and dropping it before it can reach the socket layer I saw a reduction of 40ns per packet. This represents a trip through the local trie with the correct leaf found with no need for any backtracing.

                               Baseline         Refactor
ixgbe->local receive           2.65Mpps         2.96Mpps
------------------------------------------------------------
processing time per packet        377ns            337ns
fib_table_lookup           25.1%   95ns     25.8%   87ns
ixgbe_clean_rx_irq          8.7%   33ns      9.0%   30ns
check_leaf.isra.9           7.2%   27ns        --     --
ip_rcv                      5.7%   21ns      6.5%   22ns

These changes have resulted in several functions being inlined such as check_leaf and fib_find_node, but due to the code simplification the overall size of the code has been reduced.

   text    data     bss     dec     hex filename
  16932     376      16   17324    43ac net/ipv4/fib_trie.o - before
  15259     376       8   15643    3d1b net/ipv4/fib_trie.o - after

Changes since RFC: Replaced this_cpu_ptr with correct call to this_cpu_inc in patch 1 Changed test for leaf_info mismatch to (key ^ n->key) & li->mask_plen in patch 10 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:26:02 -05:00
Alexander Duyck	5405afd1a3	fib_trie: Add tracking value for suffix length This change adds a tracking value for the maximum suffix length of all prefixes stored in any given tnode. With this value we can determine if we need to backtrace or not based on if the suffix is greater than the pos value. By doing this we can reduce the CPU overhead for lookups in the local table as many of the prefixes there are 32b long and have a suffix length of 0 meaning we can immediately backtrace to the root node without needing to test any of the nodes between it and where we ended up. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	21d1f11db0	fib_trie: Remove checks for index >= tnode_child_length from tnode_get_child For some reason the compiler doesn't seem to understand that when we are in a loop that runs from tnode_child_length - 1 to 0 we don't expect the value of tn->bits to change. As such every call to tnode_get_child was rerunning tnode_child_length which ended up consuming quite a bit of space in the resultant assembly code. I have gone through and verified that in all cases where tnode_get_child is used we are either winding through a fixed loop from tnode_child_length - 1 to 0, or are in a fastpath case where we are verifying the value by either checking for any remaining bits after shifting index by bits and testing for leaf, or by using tnode_child_length. size net/ipv4/fib_trie.o Before: text data bss dec hex filename 15506 376 8 15890 3e12 net/ipv4/fib_trie.o After: text data bss dec hex filename 14827 376 8 15211 3b6b net/ipv4/fib_trie.o Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	12c081a5c8	fib_trie: inflate/halve nodes in a more RCU friendly way This change pulls the node_set_parent functionality out of put_child_reorg and instead leaves that to the function to take care of as well. By doing this we can fully construct the new cluster of tnodes and all of the pointers out of it before we start routing pointers into it. I am suspecting this will likely fix some concurrency issues though I don't have a good test to show as such. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	fc86a93b46	fib_trie: Push tnode flushing down to inflate/halve This change pushes the tnode freeing down into the inflate and halve functions. It makes more sense here as we have a better grasp of what is going on and when a given cluster of nodes is ready to be freed. I believe this may address a bug in the freeing logic as well. For some reason if the freelist got to a certain size we would call synchronize_rcu(). I'm assuming that what they meant to do is call synchronize_rcu() after they had handed off that much memory via call_rcu(). As such that is what I have updated the behavior to be. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
Alexander Duyck	ff181ed876	fib_trie: Push assignment of child to parent down into inflate/halve This change makes it so that the assignment of the tnode to the parent is handled directly within whatever function is currently handling the node be it inflate, halve, or resize. By doing this we can avoid some of the need to set NULL pointers in the tree while we are resizing the subnodes. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-31 18:25:55 -05:00
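
The socket-reuse rule described in the geneve commits in this log (46b1e4f911, df5dba8e52) can be illustrated with a small userspace sketch: open sockets are kept on a simple linked list, and a lookup matches on address family as well as port. Every name below (`tunnel_sock`, `find_sock`, the `FAMILY_*` constants) is hypothetical and chosen for illustration; this is not the kernel's geneve code, and locking is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not kernel structures. */
enum addr_family { FAMILY_IPV4 = 4, FAMILY_IPV6 = 6 };

struct tunnel_sock {
    enum addr_family family;   /* address family the socket was opened for */
    uint16_t port;             /* UDP port the socket listens on */
    struct tunnel_sock *next;  /* simple singly linked list, no hash table */
};

/*
 * Return the first socket that matches BOTH family and port, or NULL.
 * Matching on port alone would let an IPv4 socket be handed out for
 * IPv6 traffic (and vice versa), which is the problem the log describes.
 */
static struct tunnel_sock *find_sock(struct tunnel_sock *head,
                                     enum addr_family family,
                                     uint16_t port)
{
    for (struct tunnel_sock *s = head; s != NULL; s = s->next) {
        if (s->family == family && s->port == port)
            return s;
    }
    return NULL;
}
```

A linear walk is adequate under the assumption stated in the log: the list is consulted only when sockets are created or deleted and is not expected to grow large.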


494876 Commits