linux

Author	SHA1	Message	Date
Heiner Kallweit	215d08a85b	net: phy: consider PHY_IGNORE_INTERRUPT in phy_start_aneg_priv The situation described in the comment can occur also with PHY_IGNORE_INTERRUPT, therefore change the condition to include it. Fixes: `f555f34fdc` ("net: phy: fix auto-negotiation stall due to unavailable interrupt") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:20:38 -07:00
David S. Miller	1a03f867aa	Merge branch 'qed-Fix-series-II' Sudarsana Reddy Kalluru says: ==================== qed: Fix series II. The patch series fixes few issues in the qed driver. Please consider applying it to 'net' branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:19:04 -07:00
Sudarsana Reddy Kalluru	25c020a909	qed: Correct Multicast API to reflect existence of 256 approximate buckets. FW hsi contains 256 approximation buckets which are split in ramrod into eight u32 values, but driver is using eight 'unsigned long' variables. This patch fixes the mcast logic by making the API utilize u32. Fixes: `83aeb933` ("qed*: Trivial modifications") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:19:04 -07:00
Sudarsana Reddy Kalluru	58874c7b24	qed: Fix possible race for the link state value. There's a possible race where driver can read link status in mid-transition and see that virtual-link is up yet speed is 0. Since in this mid-transition we're guaranteed to see a mailbox from MFW soon, we can afford to treat this as link down. Fixes: `cc875c2e` ("qed: Add link support") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:19:04 -07:00
Sudarsana Reddy Kalluru	4ad95a93a7	qed: Fix link flap issue due to mismatching EEE capabilities. Apparently, MFW publishes EEE capabilities even for Fiber-boards that don't support them, and later since qed internally sets adv_caps it would cause link-flap avoidance (LFA) to fail when driver would initiate the link. This in turn delays the link, causing traffic to fail. Driver has been modified to not ask MFW for any EEE config if EEE isn't to be enabled. Fixes: `645874e5` ("qed: Add support for Energy efficient ethernet.") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:19:04 -07:00
Gustavo A. R. Silva	baa2d2b17e	net: sched: use PTR_ERR_OR_ZERO macro in tcf_block_cb_register This line open-codes what the PTR_ERR_OR_ZERO macro already does. So, make use of PTR_ERR_OR_ZERO rather than an open-coded version. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:17:08 -07:00
YueHaibing	64119e05f7	net: caif: Add a missing rcu_read_unlock() in caif_flow_cb Add a missing rcu_read_unlock in the error path Fixes: `c95567c803` ("caif: added check for potential null return") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 16:14:39 -07:00
Linus Torvalds	490fc05386	mm: make vm_area_alloc() initialize core fields Like vm_area_dup(), it initializes the anon_vma_chain head, and the basic mm pointer. The rest of the fields end up being different for different users, although the plan is to also initialize the 'vm_ops' field to a dummy entry. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 15:24:03 -07:00
Linus Torvalds	95faf6992d	mm: make vm_area_dup() actually copy the old vma data .. and re-initialize the anon_vma_chain head. This removes some boiler-plate from the users, and also makes it clear why it didn't need to use the 'zalloc()' version. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 14:48:45 -07:00
Linus Torvalds	3928d4f5ee	mm: use helper functions for allocating and freeing vm_area structs The vm_area_struct is one of the most fundamental memory management objects, but the management of it is entirely open-coded everywhere, ranging from allocation and freeing (using kmem_cache_[z]alloc and kmem_cache_free) to initializing all the fields. We want to unify this in order to end up having some unified initialization of the vmas, and the first step to this is to at least have basic allocation functions. Right now those functions are literally just wrappers around the kmem_cache_*() calls. This is a purely mechanical conversion: # new vma: kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc() # copy old vma kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old) # free vma kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma) to the point where the old vma passed in to the vm_area_dup() function isn't even used yet (because I've left all the old manual initialization alone). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 13:48:51 -07:00
Linus Torvalds	191a3afa98	Merge branch 'akpm' (patches from Andrew) Merge fixes from Andrew Morton: "5 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: memcg: fix use after free in mem_cgroup_iter() mm/huge_memory.c: fix data loss when splitting a file pmd fat: fix memory allocation failure handling of match_strdup() MAINTAINERS: Peter has moved mm/memblock: add missing include <linux/bootmem.h>	2018-07-21 13:14:17 -07:00
Jing Xia	9f15bde671	mm: memcg: fix use after free in mem_cgroup_iter() It was reported that a kernel crash happened in mem_cgroup_iter(), which can be triggered if the legacy cgroup-v1 non-hierarchical mode is used. Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f ...... Call trace: mem_cgroup_iter+0x2e0/0x6d4 shrink_zone+0x8c/0x324 balance_pgdat+0x450/0x640 kswapd+0x130/0x4b8 kthread+0xe8/0xfc ret_from_fork+0x10/0x20 mem_cgroup_iter(): ...... if (css_tryget(css)) <-- crash here break; ...... The crashing reason is that mem_cgroup_iter() uses the memcg object whose pointer is stored in iter->position, which has been freed before and filled with POISON_FREE(0x6b). And the root cause of the use-after-free issue is that invalidate_reclaim_iterators() fails to reset the value of iter->position to NULL when the css of the memcg is released in non- hierarchical mode. Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.com Fixes: `6df38689e0` ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim") Signed-off-by: Jing Xia <jing.xia.mail@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <chunyan.zhang@unisoc.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 12:50:46 -07:00
Hugh Dickins	e1f1b1572e	mm/huge_memory.c: fix data loss when splitting a file pmd __split_huge_pmd_locked() must check if the cleared huge pmd was dirty, and propagate that to PageDirty: otherwise, data may be lost when a huge tmpfs page is modified then split then reclaimed. How has this taken so long to be noticed? Because there was no problem when the huge page is written by a write system call (shmem_write_end() calls set_page_dirty()), nor when the page is allocated for a write fault (fault_dirty_shared_page() calls set_page_dirty()); but when allocated for a read fault (which MAP_POPULATE simulates), no set_page_dirty(). Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils Fixes: `d21b9e57c7` ("thp: handle file pages in split_huge_pmd()") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Ashwin Chaugule <ashwinch@google.com> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 12:50:46 -07:00
OGAWA Hirofumi	35033ab988	fat: fix memory allocation failure handling of match_strdup() In parse_options(), if match_strdup() failed, parse_options() leaves opts->iocharset in an unexpected state (i.e. still pointing to the freed string). And this can be the cause of a double free. To fix this, always initialize opts->iocharset when freeing. Link: http://lkml.kernel.org/r/8736wp9dzc.fsf@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: syzbot+90b8e10515ae88228a92@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 12:50:46 -07:00
Peter Senna Tschudin	5a6964944c	MAINTAINERS: Peter has moved Update my E-mail address in the MAINTAINERS file. Link: http://lkml.kernel.org/r/20180710144702.1308-1-peter.senna@gmail.com Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Acked-by: Martyn Welch <martyn.welch@collabora.co.uk> Cc: David S. Miller <davem@davemloft.net> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Martin Donnelly <martin.donnelly@ge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 12:50:46 -07:00
Mathieu Malaterre	1937367205	mm/memblock: add missing include <linux/bootmem.h> Commit `26f09e9b3a` ("mm/memblock: add memblock memory allocation apis") introduced two new function definitions: memblock_virt_alloc_try_nid_nopanic() memblock_virt_alloc_try_nid() and commit `ea1f5f3712` ("mm: define memblock_virt_alloc_try_nid_raw") introduced the following function definition: memblock_virt_alloc_try_nid_raw() This commit adds an include of header file <linux/bootmem.h> to provide the missing function prototypes. This silences the following gcc warning (W=1): mm/memblock.c:1334:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_raw' [-Wmissing-prototypes] mm/memblock.c:1371:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_nopanic' [-Wmissing-prototypes] mm/memblock.c:1407:15: warning: no previous prototype for `memblock_virt_alloc_try_nid' [-Wmissing-prototypes] Also adds #ifdef blockers to prevent compilation failure on mips/ia64 where CONFIG_NO_BOOTMEM=n as could be seen in commit `6cc22dc08a` ("revert "mm/memblock: add missing include <linux/bootmem.h>""). Because Makefile already does: obj-$(CONFIG_HAVE_MEMBLOCK) += memblock.o The #ifdef has been simplified from: #if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM) to simply: #if defined(CONFIG_NO_BOOTMEM) Link: http://lkml.kernel.org/r/20180626184422.24974-1-malat@debian.org Signed-off-by: Mathieu Malaterre <malat@debian.org> Suggested-by: Tony Luck <tony.luck@intel.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-07-21 12:50:46 -07:00
David S. Miller	d1afdc5139	Merge branch 'tcp-improve-setsockopt-TCP_USER_TIMEOUT-accuracy' Jon Maxwell says: ==================== tcp: improve setsockopt() TCP_USER_TIMEOUT accuracy The patch was becoming bigger based on feedback therefore I have implemented a series of 3 commits instead in V4. This series is a continuation based on V3 here and associated feedback: https://patchwork.kernel.org/patch/10516195/ Suggestions by Neal Cardwell: 1) Fix up units mismatch regarding msec/jiffies. 2) Address possibility of time_remaining being negative. 3) Add a helper routine tcp_clamp_rto_to_user_timeout() to do the rto calculation. 4) Move start_ts logic into helper routine tcp_retrans_stamp() to validate tcp_sk(sk)->retrans_stamp. 5) Some u32 declaration and return refactoring. 6) Return 0 instead of false in tcp_retransmit_stamp(), it's not a bool. Suggestions by David Laight: 1) Don't cache rto in tcp_clamp_rto_to_user_timeout(). Suggestions by Eric Dumazet: 1) Make u32 declarations consistent. 2) Use patch series for easier review. 3) Convert icsk->icsk_user_timeout to milliseconds to avoid jiffies to msec dance. 4) Use separate titles for each commit in the series. 5) Fix fuzzy indentation and line wrap issues. 6) Make commit titles descriptive. Changes: 1) Call tcp_clamp_rto_to_user_timeout(sk) as an argument to inet_csk_reset_xmit_timer() to save on rto declaration. Every time the TCP retransmission timer fires, it checks to see if there is a timeout before scheduling the next retransmit timer. The retransmit interval between each retransmission increases exponentially. The issue is that in order for the timeout to occur the retransmit timer needs to fire again. If the user timeout check happens after the 9th retransmit, for example, it needs to wait for the 10th retransmit timer to fire in order to evaluate whether a timeout has occurred or not. If the interval is large enough then the timeout will be inaccurate. 
For example, with a TCP_USER_TIMEOUT of 10 seconds without the patch: 1st retransmit: 22:25:18.973488 IP host1.49310 > host2.search-agent: Flags [.] Last retransmit: 22:25:26.205499 IP host1.49310 > host2.search-agent: Flags [.] Timeout: send: Connection timed out Sun Jul 1 22:25:34 EDT 2018 We can see that the last retransmit took ~7 seconds, which pushed the total timeout to ~15 seconds instead of the expected 10 seconds. This gets more inaccurate the larger the TCP_USER_TIMEOUT value, as the interval increases. Add tcp_clamp_rto_to_user_timeout() to determine whether the user rto has expired or whether the rto interval needs to be recalculated. Use the original interval if user rto is not set. Test results with the patch show the expected 10 second timeout: 1st retransmit: 01:37:59.022555 IP host1.49310 > host2.search-agent: Flags [.] Last retransmit: 01:38:06.486558 IP host1.49310 > host2.search-agent: Flags [.] Timeout: send: Connection timed out Mon Jul 2 01:38:09 EDT 2018 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:28:55 -07:00
Jon Maxwell	b701a99e43	tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy Create the tcp_clamp_rto_to_user_timeout() helper routine to calculate the correct rto, so that the TCP_USER_TIMEOUT socket option is more accurate. Taking suggestions and feedback into account from Eric Dumazet, Neal Cardwell and David Laight. Due to the 1st commit we can avoid the msecs_to_jiffies() and jiffies_to_msecs() dance. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:28:55 -07:00
Jon Maxwell	a7fa37703d	tcp: Add tcp_retransmit_stamp() helper routine Create a separate helper routine as per Neal Cardwell's suggestion, to be used by the final commit in this series and retransmits_timed_out(). Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:28:55 -07:00
Jon Maxwell	9bcc66e198	tcp: convert icsk_user_timeout from jiffies to msecs This is a preparatory commit, part of the series that improves the socket TCP_USER_TIMEOUT option accuracy. Implement Eric Dumazet's idea to convert icsk->icsk_user_timeout from jiffies to msecs, to eliminate the msecs_to_jiffies() and jiffies_to_msecs() dance in future. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:28:55 -07:00
Jarod Wilson	c1f897ce18	bonding: set default miimon value for non-arp modes if not set For some time now, if you load the bonding driver and configure bond parameters via sysfs using minimal config options, such as specifying nothing but the mode, relying on defaults for everything else, modes that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all wind up with both arp_interval=0 (as it should be) and miimon=0, which means the miimon monitor thread never actually runs. This is particularly problematic for 802.3ad. For example, from an LNST recipe I've set up: $ modprobe bonding max_bonds=0 $ echo "+t_bond0" > /sys/class/net/bonding_masters $ ip link set t_bond0 down $ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode $ ip link set ens1f1 down $ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves $ ip link set ens1f0 down $ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves $ ethtool -i t_bond0 $ ip link set ens1f1 up $ ip link set ens1f0 up $ ip link set t_bond0 up $ ip addr add 192.168.9.1/24 dev t_bond0 $ ip addr add 2002::1/64 dev t_bond0 This bond comes up okay, but things look slightly suspect in /proc/net/bonding/t_bond0 output: $ grep -i mii /proc/net/bonding/t_bond0 MII Status: up MII Polling Interval (ms): 0 MII Status: up MII Status: up Now, pull a cable on one of the ports in the bond, then reconnect it, and you'll see: Slave Interface: ens1f0 MII Status: down Speed: 1000 Mbps Duplex: full I believe this became a major issue as of commit `4d2c0cda07`, which for 802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about relying on link monitoring via miimon to set it correctly, but since the miimon work queue never runs, the link just stays marked down. If we simply tweak bond_option_mode_set() slightly, we can check for the non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON, which gets things back in full working order. 
This problem exists as far back as 4.14, and might be worth fixing in all stable trees since, though the work-around is to simply specify an miimon value yourself. Reported-by: Bob Ball <ball@umich.edu> Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:26:21 -07:00
David S. Miller	a6fc8594a5	mlx5-fixes-2018-07-18 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJbT+cLAAoJEEg/ir3gV/o+I5QH/3LQemGzH33iNsg4khpPeNA+ Q4mGd2jqbwfL17FTSGpTsPje6rpwzR+j8W1fGTx1vzYmE79ZyDu4EwHS7YZJcGyz q8P0HgrUe4NrJV8mlOpbIRbTuSwfqultw2qRpmCfLf5kK1nqSIPpUHIfBUMqwy0o O7GJrytUI4Av+r5Px/6bjb5kBaVe5YBe0tg8nSrN2vtzHVQWm+5/uaNRW2SrCN+4 5SI2AsWyMwfGCC+IE8i9OlIFCy6Iu2vwcUabK+6EeGKP4Wb6rukyG01TkQPSd7gy ozcAjvj+ppHmVFath1uzLCFU3RbKt6GbVRGaFQg5jO5vvK3uzFJnm59Vqw/WzNs= =UXsy -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2018-07-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-07-18 The following series provides fixes to mlx5 core and net device driver. Please pull and let me know if there's any problem. For -stable v4.7 net/mlx5e: Don't allow aRFS for encapsulated packets net/mlx5e: Fix quota counting in aRFS expire flow For -stable v4.15 net/mlx5e: Only allow offloading decap egress (egdev) flows net/mlx5e: Refine ets validation function net/mlx5: Adjust clock overflow work period For -stable v4.17 net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:18:28 -07:00
David S. Miller	975cd350c2	Merge branch 's390-qeth-updates' Julian Wiedmann says: ==================== s390/qeth: updates 2018-07-19 please apply one more round of qeth patches to net-next. This brings additional performance improvements for the transmit code, and some refactoring to pave the way for using netdev_priv. Also, two minor fixes for rare corner cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	5f89eca577	s390/qeth: speed up L2 IQD xmit Modify the L2 OSA xmit path so that it also supports L2 IQD devices (in particular, their HW header requirements). This allows IQD devices to advertise NETIF_F_SG support, and eliminates the allocation overhead for the HW header. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	a7c2f4a332	s390/qeth: add support for constrained HW headers Some transmit modes require that the HW header is located in the same page as the initial protocol headers in skb->data. Let callers specify the size of this contiguous header range, and enforce it when building the HW header. While at it, apply some gentle renaming to the relevant L2 code so that it matches the L3 code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	ba86ceee9d	s390/qeth: merge linearize-check into HW header construction When checking whether an skb needs to be linearized to fit into an IO buffer, it's desirable to consider the skb's final size and layout (i.e. after the HW header was added). But a subsequent linearization can then cause the re-positioned HW header to violate its alignment restrictions. Dealing with this situation in two different code paths is quite tricky. This patch integrates a) linearize-check and b) HW header construction into one 3-step sequence: 1. evaluate how the HW header needs to be added (to identify if it takes up an additional buffer element), then 2. check if the required buffer elements exceed the device's limit. Linearize when necessary and re-evaluate the HW header placement. 3. Add the HW header in the best-possible way: a) push, without taking up an additional buffer element b) push, but consume another buffer element c) allocate a header object from the cache. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	d2a274b25b	s390/qeth: add statistics for consumed buffer elements Nowadays an skb fragment typically spans over multiple pages. So replace the obsolete, SG-only 'fragments' counter with one that tracks the consumed buffer elements. This is what actually matters for performance. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	72f219da79	s390/qeth: use core MTU range checking qeth's ndo_change_mtu() only applies some trivial bounds checking. Set up dev->min_mtu properly, so that dev_set_mtu() can do this for us. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	8ce7a9e064	s390/qeth: simplify max MTU handling When the MPC initialization code discovers the HW-specific max MTU, apply the resulting changes straight to the netdevice. If this is the device's first initialization, also set its MTU (HiperSockets: the max MTU; else: a layer-specific default value). Then cap the current MTU by the new max MTU. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	92d2720969	s390/qeth: don't cache HW port number The netdevice is always available now, so get the portno from there. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:30 -07:00
Julian Wiedmann	d3d1b205e8	s390/qeth: allocate netdevice early Allocation of the netdevice is currently delayed until a qeth card first goes online. This complicates matters in several places, where we need to cache values instead of applying them straight to the netdevice. Improve on this by moving the allocation up to where the qeth card itself is created. This is also one step in direction of eventually placing the qeth card into netdev_priv(). In all subsequent code, remove the now redundant checks whether card->dev is valid. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:29 -07:00
Julian Wiedmann	addc5ee872	s390/qeth: remove redundant netif_carrier_ok() checks netif_carrier_off() does its own checking. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:29 -07:00
Julian Wiedmann	70551dc46f	s390/qeth: reset layer2 attribute on layer switch After the subdriver's remove() routine has completed, the card's layer mode is undetermined again. Reflect this in the layer2 field. If qeth_dev_layer2_store() hits an error after remove() was called, the card _always_ requires a setup(), even if the previous layer mode is requested again. But qeth_dev_layer2_store() bails out early if the requested layer mode still matches the current one. So unless we reset the layer2 field, re-probing the card back to its previous mode is currently not possible. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:29 -07:00
Julian Wiedmann	a702349a40	s390/qeth: fix race in used-buffer accounting By updating q->used_buffers only _after_ do_QDIO() has completed, there is a potential race against the buffer's TX completion. In the unlikely case that the TX completion path wins, qeth_qdio_output_handler() would decrement the counter before qeth_flush_buffers() even incremented it. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 10:12:29 -07:00
David S. Miller	d528114bd3	Merge branch 'hns3-misc-cleanups' Salil Mehta says: ==================== Misc. cleanups for HNS3 ethernet driver This patch-set presents some cleanups for HNS3 Ethernet Driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:24 -07:00
Jian Shen	d71d8381c5	net: hns3: Add SPDX tags to HNS3 PF driver Add the SPDX identifiers to HNS3 PF driver. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	584b464f83	net: hns3: Remove unused struct member and definition The struct hclge_desc_cb and hclge_desc_cb are never used anywhere. This patch removes them. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	ef0c500961	net: hns3: Fix misleading parameter name The input parameter "dev" of hns3_irq_handle() is actually used as a tqp vector, which is misleading. The struct member "flag" is used to indicate ring type, so rename it. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	c79301d8d9	net: hns3: Modify inconsistent bit mask macros Use BIT() and GENMASK() to convert the bit mask, modify the inconsistent ones, and remove useless ones. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	f8a91784a1	net: hns3: Use decimal for bit offset macros Using hex for bit offsets is inconsistent with the rest of the file. Change them to decimal. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	fdace1bc4a	net: hns3: Correct unreasonable code comments This patch fixes some comment spelling errors, removes redundant comments, rewrites misleading comments, and adds some necessary comments. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	a10829c4ae	net: hns3: Remove extra space and brackets Remove extra space and brackets. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	3f639907e0	net: hns3: Standardize the handle of return value Apply the standard minor cleanup by returning ret outside the brackets. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
Jian Shen	646cb51228	net: hns3: Remove some redundant assignments Remove some redundant assignments, because the fields are already set to zero when hdev is allocated. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-21 08:44:23 -07:00
David S. Miller	eae249b27f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-20 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add sharing of BPF objects within one ASIC: this allows for reuse of the same program on multiple ports of a device, and therefore gains better code store utilization. On top of that, this now also enables sharing of maps between programs attached to different ports of a device, from Jakub. 2) Cleanup in libbpf and bpftool's Makefile to reduce unneeded feature detections and unused variable exports, also from Jakub. 3) First batch of RCU annotation fixes in prog array handling, i.e. there are several __rcu markers which are not correct as well as some of the RCU handling, from Roman. 4) Two fixes in BPF sample files related to checking of the prog_cnt upper limit from sample loader, from Dan. 5) Minor cleanup in sockmap to remove a set but not used variable, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:58:30 -07:00
David S. Miller	f1d66bf9ab	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-07-20 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix in BPF Makefile to detect llvm-objcopy in a more robust way which is needed for pahole's BTF converter and minor UAPI tweaks in BTF_INT_BITS() to shrink the mask before eventual UAPI freeze, from Martin. 2) Fix a segfault in bpftool when prog pin id has no further arguments such as id value or file specified, from Taeung. 3) Fix powerpc JIT handling of XADD which has jumps to exit path that would potentially bypass verifier expectations e.g. with subprog calls. Also add a test case to make sure XADD is not mangling src/dst register, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:57:03 -07:00
David S. Miller	c59e18b876	Merge branch 'Make-sys-class-net-per-net-namespace-objects-belong-to-container' Tyler Hicks says: ==================== Make /sys/class/net per net namespace objects belong to container This is a revival of an older patch set from Dmitry Torokhov: https://lore.kernel.org/lkml/1471386795-32918-1-git-send-email-dmitry.torokhov@gmail.com/ My submission of v2 is here: https://lore.kernel.org/lkml/1531497949-1766-1-git-send-email-tyhicks@canonical.com/ Here's Dmitry's description: There are objects in /sys hierarchy (/sys/class/net/) that logically belong to a namespace/container. Unfortunately all sysfs objects start their life belonging to global root, and while we could change ownership manually, keeping track of all objects that come and go is cumbersome. It would be better if kernel created them using correct uid/gid from the beginning. This series changes kernfs to allow creating objects with arbitrary uid/gid, adds get_ownership() callback to ktype structure so subsystems could supply their own logic (likely tied to namespace support) for determining ownership of kobjects, and adjusts sysfs code to make use of this information. Lastly net-sysfs is adjusted to make sure that objects in net namespace are owned by the root user from the owning user namespace. Note that we do not adjust ownership of objects moved into a new namespace (as when moving a network device into a container) as userspace can easily do it. I'm reviving this patch set because we would like this feature for system containers. One specific use case that we have is that libvirt is unable to configure its bridge device inside of a system container due to the bridge files in /sys/class/net/ being owned by init root instead of container root. The last two patches in this set are patches that I've added to Dmitry's original set to allow such configuration of the bridge device. 
Eric had previously provided feedback that he didn't favor these changes affecting all layers of the stack and that most of the changes could remain local to drivers/base/core.c. That feedback is certainly sensible but I wanted to send out v2 of the patch set without making that large of a change since quite a bit of time has passed and the bridge changes in the last patch of this set shows that not all of the changes will be local to drivers/base/core.c. I'm happy to make the changes if the original request still stands. * Changes since v2: - Added my Co-Developed-by and Signed-off-by tags to all of Dmitry's patches that I've modified - Patch 1 received build failure fixes in arch/x86/kernel/cpu/intel_rdt_rdtgroup.c - Patch 2 was updated to drop the declaration of sysfs_add_file() from sysfs.h since the patch removed all other uses of the function - Patch 5 is a new patch that prevents tx_maxrate from being written to from inside of a container + Maybe I'm being too cautious here but the restriction can always be loosened up later - Patches 6 and 7 were updated to make net_ns_get_ownership() always initialize uid and gid, even when the network namespace is NULL, so that it isn't a dangerous function to reuse + Requested by Christian Brauner - I've looked at all sysfs attributes affected by this patch set and feel comfortable about the changes. There are quite a few affected attributes that don't have any capable()/ns_capable() checks in their store operations (per_bond_attrs, at91_sysfs_attrs, sysfs_grcan_attrs, ican3_sysfs_attrs, cdc_ncm_sysfs_attrs, qmi_wwan_sysfs_attrs) but I think this is acceptable. It means that container root, rather than specifically CAP_NET_ADMIN inside of the network namespace that the device belongs to, can write to those device attributes. It's the same situation that those devices have today in that init root is able to write to the attributes without necessarily having CAP_NET_ADMIN. 
I think that this should probably be fixed in order to be consistent with what netdev_store() does by verifying CAP_NET_ADMIN in the network namespace but that it doesn't need to happen in this patch set. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:44:36 -07:00
Tyler Hicks	705e0dea4d	bridge: make sure objects belong to container's owner When creating various bridge objects in /sys/class/net/... make sure that they belong to the container's owner instead of global root (if they belong to a container/namespace). Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:44:36 -07:00
Tyler Hicks	fbdeaed408	net: create reusable function for getting ownership info of sysfs inodes Make net_ns_get_ownership() reusable by networking code outside of core. This is useful, for example, to allow bridge related sysfs files to be owned by container root. Add a function comment since this is a potentially dangerous function to use given the way that kobject_get_ownership() works by initializing uid and gid before calling .get_ownership(). Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:44:36 -07:00
Dmitry Torokhov	b0e37c0d8a	net-sysfs: make sure objects belong to container's owner When creating various objects in /sys/class/net/... make sure that they belong to container's owner instead of global root (if they belong to a container/namespace). Co-Developed-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-20 23:44:35 -07:00

769330 Commits