Another TCP issue is triggered by ECN.
Under pressure, receiver gets ECN marks, and send back ACK packets
with ECE TCP flag. Senders enter CA_CWR state.
In this state, tcp_tso_should_defer() is short cut :
if (icsk->icsk_ca_state != TCP_CA_Open)
goto send_now;
This means that about all ACK packets we receive are triggering
a partial send, and because cwnd is kept small, we can only send
a small amount of data for each incoming ACK,
which in return generate more ACK packets.
Allowing CA_Open and CA_CWR states to enable TSO defer in
tcp_tso_should_defer() brings performance back :
TSO autodefer has more chance to defer under pressure.
This patch increases TSO and LRO/GRO efficiency back to normal levels,
and does not impact overall ECN behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With sysctl_tcp_min_tso_segs being 4, it is very possible
that tcp_tso_should_defer() decides not sending last 2 MSS
of initial window of 10 packets. This also applies if
autosizing decides to send X MSS per GSO packet, and cwnd
is not a multiple of X.
This patch implements an heuristic based on age of first
skb in write queue : If it was sent very recently (less than half srtt),
we can predict that no ACK packet will come in less than half rtt,
so deferring might cause an under utilization of our window.
This is visible on initial send (IW10) on web servers,
but more generally on some RPC, as the last part of the message
might need an extra RTT to get delivered.
Tested:
Ran following packetdrill test
// A simple server-side test that sends exactly an initial window (IW10)
// worth of packets.
`sysctl -e -q net.ipv4.tcp_min_tso_segs=4`
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.1 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.1 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4
+0 write(4, ..., 14600) = 14600
+0 > . 1:5841(5840) ack 1 win 457
+0 > . 5841:11681(5840) ack 1 win 457
// Following packet should be sent right now.
+0 > P. 11681:14601(2920) ack 1 win 457
+.1 < . 1:1(0) ack 14601 win 257
+0 close(4) = 0
+0 > F. 14601:14601(0) ack 1
+.1 < F. 1:1(0) ack 14602 win 257
+0 > . 14602:14602(0) ack 2
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TSO relies on ability to defer sending a small amount of packets.
Heuristic is to wait for future ACKS in hope to send more packets at once.
Current algorithm uses a per socket tso_deferred field as a pseudo timer.
This pseudo timer relies on future ACK, but there is no guarantee
we receive them in time.
Fix would be to use a real timer, but cost of such timer is probably too
expensive for typical cases.
This patch changes the logic to test the time of last transmit,
because we should not add bursts of more than 1ms for any given flow.
We've used this patch for about two years at Google, before FQ/pacing
as it would reduce a fair amount of bursts.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to this patch, sending a packet with the source MAC address of one
of the CPSW interfaces to one of the CPSW slave ports while it's configured in
dual_emac mode would update the port_num field of the VLAN/Unicast Address
Table Entry. This would cause it to discard all incoming traffic addressed to
that MAC address, essentially rendering the port useless until the ALE table is
cleared (by starting and stopping the interface or rebooting.)
For example, if eth0 has a MAC address of 90:59:af:8f:43:e9 it will have
an ALE table entry:
00 00 00 00 59 90 02 30 e9 43 8f af
(VLAN Addr vlan_id=2 unicast type=0 port_num=0 addr=90:59:af:8f:43:e9)
If you configure another device with the same MAC address and connect it
to the first CPSW slave port and send some traffic the ALE table entry
becomes:
04 00 00 00 59 90 02 30 e9 43 8f af
(VLAN Addr vlan_id=2 unicast type=0 port_num=1 addr=90:59:af:8f:43:e9)
>From this point forward all incoming traffic addressed to
90:59:af:8f:43:e9 will be dropped.
Setting the SECURE bit for the VLAN/Unicast address table entry for each
interface's MAC address corrects the problem.
Signed-off-by: George McCollister <george.mccollister@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the usbnet core does not update the tx_packets statistic for
drivers with FLAG_MULTI_PACKET and there is no hook in the TX
completion path where they could do this.
cdc_ncm and dependent drivers are bumping tx_packets stat on the
transmit path while asix and sr9800 aren't updating it at all.
Add a packet count in struct skb_data so these drivers can fill it
in, initialise it to 1 for other drivers, and add the packet count
to the tx_packets statistic on completion.
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm fixes from Dave Airlie:
"Just general fixes: radeon, i915, atmel, tegra, amdkfd and one core
fix"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (28 commits)
drm: atmel-hlcdc: remove clock polarity from crtc driver
drm/radeon: only enable DP audio if the monitor supports it
drm/radeon: fix atom aux payload size check for writes (v2)
drm/radeon: fix 1 RB harvest config setup for TN/RL
drm/radeon: enable SRBM timeout interrupt on EG/NI
drm/radeon: enable SRBM timeout interrupt on SI
drm/radeon: enable SRBM timeout interrupt on CIK v2
drm/radeon: dump full IB if we hit a packet error
drm/radeon: disable mclk switching with 120hz+ monitors
drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh
drm/radeon: enable native backlight control on old macs
drm/i915: Fix frontbuffer false positve.
drm/i915: Align initial plane backing objects correctly
drm/i915: avoid processing spurious/shared interrupts in low-power states
drm/i915: Check obj->vma_list under the struct_mutex
drm/i915: Fix a use after free, and unbalanced refcounting
drm: atmel-hlcdc: remove useless pm_runtime_put_sync in probe
drm: atmel-hlcdc: reset layer A2Q and UPDATE bits when disabling it
drm: Fix deadlock due to getconnector locking changes
drm/i915: Dell Chromebook 11 has PWM backlight
...
Pull block layer fixes from Jens Axboe:
"Two smaller fixes for this cycle:
- A fixup from Keith so that NVMe compiles without BLK_INTEGRITY,
basically just moving the code around appropriately.
- A fixup for shm, fixing an oops in shmem_mapping() for mapping with
no inode. From Sasha"
[ The shmem fix doesn't look block-layer-related, but fixes a bug that
happened due to the backing_dev_info removal.. - Linus ]
* 'for-linus' of git://git.kernel.dk/linux-block:
mm: shmem: check for mapping owner before dereferencing
NVMe: Fix for BLK_DEV_INTEGRITY not set
This update contains:
o ensure quota type is reset in on-disk dquots
o fix missing partial EOF block data flush on truncate extension
o fix transaction leak in error handling for new pnfs block layout
support
o add missing target_ip check to RENAME_EXCHANGE
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJU78w2AAoJEK3oKUf0dfod68IQALzcN8py4QxvmxVXf8F7+ymo
PrUc/ZiP8EOS+q2wk4V0RgyoCAFA02pFjCEpWVm3PBdyfsd9DC12w7VYBlbDMO8f
wApPots48NbqYVQA2+YLzC2+dgHwxLWzzJFyS6jDb/xtrVarHZtbhJU6hvl3a1gH
8RwEW+mplMmIN8Qh7vxJ2/2K+97lfS2AW0jAnnOZKCsx98XWvSgeCk+3VszwZWjD
obQn2WrvlfUSSERs0z2sygx5GxR/3Wnm5LrzpiX/+gH6LdPED53o6K/tKf5ncbmF
maXkYUMxvTs3tOO9ZPohtL4Zc9JarPu2U6sKmMxULOaRgZLwmk6W2cyoCbdW2du5
0ardLB89fUvGCJGMXojVtxZ6BX8IEoyhSDUX1qGF9/HFr0Rz5zIkeeqAWkj89+Cj
VYvR/AmLBYwdaUPL+aHmG3P6B07u42n4650UQIVYw29rGEpxYOaBr7BAEYgyWFoM
Omizf05rsz5aAxXCTjfUl+s9VsO6H0lNCjRyNs+QRIqkGf9rgxJGIAJuoh+bNNOm
+WcId+5BPInuAy1YFP9Z02fe1NqIkSihTbL6daIlGIYralauXG+wyrsm9DaMsNSq
VPul6HFMUwv2g5ECjvhiGZcvElOcBKcVQEUBJP3izFczP9o2i5NKcIOVFW/AxwTZ
NW1qOYsLAQmD/hYpx1p2
=kTai
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs fixes from Dave Chinner:
"These are fixes for regressions/bugs introduced in the 4.0 merge cycle
and problems discovered during the merge window that need to be pushed
back to stable kernels ASAP.
This contains:
- ensure quota type is reset in on-disk dquots
- fix missing partial EOF block data flush on truncate extension
- fix transaction leak in error handling for new pnfs block layout
support
- add missing target_ip check to RENAME_EXCHANGE"
* tag 'xfs-for-linus-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: cancel failed transaction in xfs_fs_commit_blocks()
xfs: Ensure we have target_ip for RENAME_EXCHANGE
xfs: ensure truncate forces zeroed blocks to disk
xfs: Fix quota type in quota structures when reusing quota file
There is a discrepancy here because the niu_class_to_ethflow() returns
zero on failure and one on success but the caller expected zero on
success and negative on failure.
The problem means that we allow the user to pass classes and flow_types
which we don't want. I've looked at it a bit and I don't see it as a
very serious bug.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
table levels folded. Usually, these defines are provided by
<asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.
But some architectures fold page table levels in a custom way. They
need to define these macros themself. This patch adds missing defines.
The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
and __pud_alloc() on architectures without these page table levels.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Historically, !__GFP_FS allocations were not allowed to invoke the OOM
killer once reclaim had failed, but nevertheless kept looping in the
allocator.
Commit 9879de7373 ("mm: page_alloc: embed OOM killing naturally into
allocation slowpath"), which should have been a simple cleanup patch,
accidentally changed the behavior to aborting the allocation at that
point. This creates problems with filesystem callers (?) that currently
rely on the allocator waiting for other tasks to intervene.
Revert the behavior as it shouldn't have been changed as part of a
cleanup patch.
Fixes: 9879de7373 ("mm: page_alloc: embed OOM killing naturally into allocation slowpath")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org> [3.19.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's a uname workaround for broken userspace which can't handle kernel
versions of 3.x. Update it for 4.x.
Signed-off-by: Jon DeVree <nuxi@vault24.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The memcg control knobs indicate the highest possible value using the
symbolic name "infinity", which is long and awkward to type.
Switch to the string "max", which is just as descriptive but shorter and
sweeter.
This changes a user interface, so do it before the release and before
the development flag is dropped from the default hierarchy.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
max_used_pages is defined as atomic_long_t so we need to use unsigned
long to keep temporary value for it rather than int which is smaller
than unsigned long in a 64 bit system.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a conditional statement checking for NULL in both
ds1685_rtc_sysfs_time_regs_show and ds1685_rtc_sysfs_time_regs_store
that was using a logical AND when it should be using a logical OR so
that we fail out of the function properly if the condition ever
evaluates to true.
Fixes: aaaf5fbf56 ("rtc: add driver for DS1685 family of real time clocks")
Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Each inode of nilfs2 stores a root node of a b-tree, and it turned out to
have a memory overrun issue:
Each b-tree node of nilfs2 stores a set of key-value pairs and the number
of them (in "bn_nchildren" member of nilfs_btree_node struct), as well as
a few other "bn_*" members.
Since the value of "bn_nchildren" is used for operations on the key-values
within the b-tree node, it can cause memory access overrun if a large
number is incorrectly set to "bn_nchildren".
For instance, nilfs_btree_node_lookup() function determines the range of
binary search with it, and too large "bn_nchildren" leads
nilfs_btree_node_get_key() in that function to overrun.
As for intermediate b-tree nodes, this is prevented by a sanity check
performed when each node is read from a drive, however, no sanity check
has been done for root nodes stored in inodes.
This patch fixes the issue by adding missing sanity check against b-tree
root nodes so that it's called when on-memory inodes are read from ifile,
inode metadata file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This got lost during the initial merge process: Python requires an
__init__.py script, even if empty, in order to accept a directory as
package. Add it, this time as a non-empty file.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/rtc/rtc-ds1685.c: In function `ds1685_rtc_read_alarm':
drivers/rtc/rtc-ds1685.c:402: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:409: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:416: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c: In function `ds1685_rtc_set_alarm':
drivers/rtc/rtc-ds1685.c:475: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:478: warning: comparison is always true due to limited range of data type
drivers/rtc/rtc-ds1685.c:481: warning: comparison is always true due to limited range of data type
u8 cannot contain a value larger than 0xff, hence drop the checks.
Wrapping the checks in unlikely() indicated some sense of humor, though ;-)
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Joshua Kinard <kumba@gentoo.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The newly added ds1685 driver causes a build error when enabled without
CONFIG_RTC_INTF_DEV:
drivers/rtc/rtc-ds1685.c:919:22: error: 'ds1685_rtc_alarm_irq_enable' undeclared here (not in a function)
.alarm_irq_enable = ds1685_rtc_alarm_irq_enable,
Apparently the driver was incorrectly changed to reflect the interface
change from 16380c153a ("RTC: Convert rtc drivers to use the
alarm_irq_enable method"), which removed the respective #ifdef from all
other rtc drivers.
This does the same change that was merged for the other drivers before and
removes the #ifdef, allowing the interrupts to be enabled through the
in-kernel rtc interface independent of the existence of /dev/rtc.
Fixes: aaaf5fbf56 ("rtc: add driver for DS1685 family of real time clocks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Joshua Kinard <kumba@gentoo.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A memcg is considered low limited even when the current usage is equal to
the low limit. This leads to interesting side effects e.g.
groups/hierarchies with no memory accounted are considered protected and
so the reclaim will emit MEMCG_LOW event when encountering them.
Another and much bigger issue was reported by Joonsoo Kim. He has hit a
NULL ptr dereference with the legacy cgroup API which even doesn't have
low limit exposed. The limit is 0 by default but the initial check fails
for memcg with 0 consumption and parent_mem_cgroup() would return NULL if
use_hierarchy is 0 and so page_counter_read would try to dereference NULL.
I suppose that the current implementation is just an overlook because the
documentation in Documentation/cgroups/unified-hierarchy.txt says:
"The memory.low boundary on the other hand is a top-down allocated
reserve. A cgroup enjoys reclaim protection when it and all its
ancestors are below their low boundaries"
Fix the usage and the low limit comparision in mem_cgroup_low accordingly.
Fixes: 241994ed86 (mm: memcontrol: default hierarchy interface for memory)
Reported-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maxime reported the following memory leak regression due to commit
dbc8358c72 ("mm/nommu: use alloc_pages_exact() rather than its own
implementation").
On v3.19, I am facing a memory leak. Each time I run a command one page
is lost. Here an example with busybox's free command:
/ # free
total used free shared buffers cached
Mem: 7928 1972 5956 0 0 492
-/+ buffers/cache: 1480 6448
/ # free
total used free shared buffers cached
Mem: 7928 1976 5952 0 0 492
-/+ buffers/cache: 1484 6444
/ # free
total used free shared buffers cached
Mem: 7928 1980 5948 0 0 492
-/+ buffers/cache: 1488 6440
/ # free
total used free shared buffers cached
Mem: 7928 1984 5944 0 0 492
-/+ buffers/cache: 1492 6436
/ # free
total used free shared buffers cached
Mem: 7928 1988 5940 0 0 492
-/+ buffers/cache: 1496 6432
At some point, the system fails to sastisfy 256KB allocations:
free: page allocation failure: order:6, mode:0xd0
CPU: 0 PID: 67 Comm: free Not tainted 3.19.0-05389-gacf2cf1-dirty #64
Hardware name: STM32 (Device Tree Support)
show_stack+0xb/0xc
warn_alloc_failed+0x97/0xbc
__alloc_pages_nodemask+0x295/0x35c
__get_free_pages+0xb/0x24
alloc_pages_exact+0x19/0x24
do_mmap_pgoff+0x423/0x658
vm_mmap_pgoff+0x3f/0x4e
load_flat_file+0x20d/0x4f8
load_flat_binary+0x3f/0x26c
search_binary_handler+0x51/0xe4
do_execveat_common+0x271/0x35c
do_execve+0x19/0x1c
ret_fast_syscall+0x1/0x4a
Mem-info:
Normal per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:123 dirty:0 writeback:0 unstable:0
free:1515 slab_reclaimable:17 slab_unreclaimable:139
mapped:0 shmem:0 pagetables:0 bounce:0
free_cma:0
Normal free:6060kB min:352kB low:440kB high:528kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:492kB isolated(anon):0ks
lowmem_reserve[]: 0 0
Normal: 23*4kB (U) 22*8kB (U) 24*16kB (U) 23*32kB (U) 23*64kB (U) 23*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6060kB
123 total pagecache pages
2048 pages of RAM
1538 free pages
66 reserved pages
109 slab pages
-46 pages shared
0 pages swap cached
nommu: Allocation of length 221184 from process 67 (free) failed
Normal per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:123 dirty:0 writeback:0 unstable:0
free:1515 slab_reclaimable:17 slab_unreclaimable:139
mapped:0 shmem:0 pagetables:0 bounce:0
free_cma:0
Normal free:6060kB min:352kB low:440kB high:528kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:492kB isolated(anon):0ks
lowmem_reserve[]: 0 0
Normal: 23*4kB (U) 22*8kB (U) 24*16kB (U) 23*32kB (U) 23*64kB (U) 23*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6060kB
123 total pagecache pages
Unable to allocate RAM for process text/data, errno 12 SEGV
This problem happens because we allocate ordered page through
__get_free_pages() in do_mmap_private() in some cases and we try to free
individual pages rather than ordered page in free_page_series(). In
this case, freeing pages whose refcount is not 0 won't be freed to the
page allocator so memory leak happens.
To fix the problem, this patch changes __get_free_pages() to
alloc_pages_exact() since alloc_pages_exact() returns
physically-contiguous pages but each pages are refcounted.
Fixes: dbc8358c72 ("mm/nommu: use alloc_pages_exact() rather than its own implementation").
Reported-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Tested-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org> [3.19]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We (the Ocfs2 project) recently moved the location of our ocfs2-tools
git tree and project web page. The pertinent discussion can be seen
here:
https://oss.oracle.com/pipermail/ocfs2-devel/2015-February/010579.html
The following patch updates the Ocfs2 documentation in MAINTAINERS,
ocfs2.txt, and dlmfs.txt. I added our new official web page, changed
the location of our tools git tree and removed the link to Joel's
ancient kernel git tree - Andrew has handled our patches for a while
now.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The smc91x driver traditionally gets configured at compile-time
for whichever hardware it runs on. This no longer works on
ARM as we continue to move to building all-in-one kernels.
Most ARM configurations with this driver already use run-time
configuration through DT or through platform_data, but a
few have not been converted yet.
I've checked all ARM boards that use this driver in their
legacy board files, and converted the ones that were using
compile-time configuration in smc91x.h to behave like the
other ones and provide the interrupt polarity along with
the MMIO configuration (width, stride) at platform device
creation time.
In particular, these combinations were previously selectable
in Kconfig but in fact broken:
- sa1100 assabet plus pleb
- msm combined with any other armv6/v7 platform
- pxa-idp combined with any non-DMA pxa variant
- LogicPD PXA270 combined with any other pxa
- nomadik combined with any other armv4/v5 platform,
e.g. versatile.
None of these seem critical enough to warrant a backport
to stable, but it would be nice to clean this up for good.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
----
I would like the patch to get merged through netdev, after
Robert and/or Linus have verified it on at least some hardware.
There are a few other non-ARM platforms using this driver,
I could do the same patch for those if we want to take
it further.
arch/arm/mach-msm/board-halibut.c | 8 ++++-
arch/arm/mach-msm/board-qsd8x50.c | 8 ++++-
arch/arm/mach-pxa/idp.c | 5 +++
arch/arm/mach-pxa/lpd270.c | 8 ++++-
arch/arm/mach-realview/core.c | 7 ++++
arch/arm/mach-realview/realview_eb.c | 2 +-
arch/arm/mach-sa1100/neponset.c | 6 ++++
arch/arm/mach-sa1100/pleb.c | 7 ++++
drivers/net/ethernet/smsc/smc91x.c | 9 +++--
drivers/net/ethernet/smsc/smc91x.h | 114 ++----------------------------------------------------------
10 files changed, 57 insertions(+), 117 deletions(-)
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit:
1e02ce4ccc ("x86: Store a per-cpu shadow copy of CR4")
added a shadow CR4 such that reads and writes that do not
modify the CR4 execute much faster than always reading the
register itself.
The change modified cpu_init() in common.c, so that the
shadow CR4 gets initialized before anything uses it.
Unfortunately, there's two cpu_init()s in common.c. There's
one for 64-bit and one for 32-bit. The commit only added
the shadow init to the 64-bit path, but the 32-bit path
needs the init too.
Link: http://lkml.kernel.org/r/20150227125208.71c36402@gandalf.local.home Fixes: 1e02ce4ccc "x86: Store a per-cpu shadow copy of CR4"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150227145019.2bdd4354@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It is possible that _ART/_TRT tables are missing or have errors.
Ignore those failures, as INT3400 thermal zone is still required
for _OSC or mode switch.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Enable Intel Powerclamp driver on Atom* Processor C2000 Product
Family for Microservers (Avoton). Avoton - SoCs for micro-servers
has package C-states which can be used for idle injection.
Reported-by: Jose Navarro <jose.navarro@intel.com>
Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Jose Carlos Venegas Munoz <jos.c.venegas.munoz@intel.com>
Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
gcc complains about the 'cols' variable being unused. This is
unavoidable, given the ncurses getmaxyx() macro-based API, which wants
to assign to a variable directly, even when we're not going to use it.
Warning:
gcc -O1 -Wall -Wshadow -W -Wformat -Wimplicit-function-declaration -Wimplicit-int -fstack-protector -D VERSION=\"1.0\" -c -o tui.o tui.c
tui.c: In function ‘show_dialogue’:
tui.c:288:12: warning: variable ‘cols’ set but not used [-Wunused-but-set-variable]
int rows, cols;
^
So, add a hack to get rid of that warning.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Some distros (e.g., Arch Linux) don't package the tinfo library
separately from ncurses, so don't unconditionally include it. Instead,
use pkg-config.
The $(STATIC) ugliness is to handle the reported build case from commit
6b533269fb ("tools/thermal: tmon: fix compilation errors when building
statically"), where a developer wants to be able to build with:
make LDFLAGS=-static
which requires an additional pkg-config flag.
Finally, support a lowest common denominator fallback (-lpanel
-lncurses) for build systems that don't have pkg-config entries for
ncurses.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
We might want to prepare CFLAGS outside of this Makefile, so don't
overwrite its initial value.
Then, support $(CROSS_COMPILE), so we can use a cross-compile toolchain.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The number of rows in the dialog vary according to the number of cooling
devices. However, some of the windowing computations were assuming a
fixed number of rows. This computation is OK when we have between 4 and
9 cooling devices (and they wrap to the next column), but with fewer
devices, we end up printing off the end of the window.
This unifies the row computation into a single function and uses that
throughout the TUI code. This also accounts for increasing the number of
rows when there are more than 9 total cooling devices.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
We can use the ncurses API to get the number of rows.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
If we launch in daemon mode (--daemon), we don't have the ncurses UI,
but we might want to set the target temperature still. For example,
someone might stick the following in their boot script:
tmon --control intel_powerclamp --target-temp 90 --log --daemon
This would turn on CPU idle injection when we're around 90 degrees
celsius, and would log temperature and throttling info to
/var/tmp/tmon.log.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The arm-soc bug fixes this time around are mostly for the omap
platform, coming from a pull request from Tony Lindgren and are
almost entirely fixing dts files.
The other two changes enable support for the shmobile platform
in generic armv7 kernels and change some properties in the
ARM64 reference board dts files.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAVPDl6WCrR//JCVInAQJHDBAAhrvnfpj2vz8mzkI38LrQrBbygu5/oVbK
6+mmHDBVgJwnK2Y2W8XWlKBJyWySLS+WKH0iJL1LArC32BXn83vt5yA4wVex1BoM
luX3NzoB4fC1DApbH4xQqp69gwiU8RnPoyASE3BzEENuIms2P3ymiEOY4ursCAdj
zhUe5m/ksbJCsce+zVXYKLwtDDNByo/JQPstfd7l3e43jzdBtV6IBs+xznUMyWND
B0OR62WSVQ8UOAxvtuLMxLuMlPlcJ6o51fiTyke1tjy2aXr1QXuphZnYmK5oAsuO
qihLdmfR5wbU8QmYDzoUomZ6aPAjMOW1O5J35ZfteO3WgB79GS2N/21WspbvDeFh
J6aq2VNTnpFaUqOcqgkUuOzO5PqPN5vEbt60gx918PROFjOrOJFidDQOw7YnKUOq
IIV/vp0v8xo3zx96b3mOgVtq8cZn4o8HyAvTx7+YnNw22aF4i7JndsIZtDOoNtZW
a5rkdQloyhpfAgrxmp8gS+o8VhuxZNutAMIQw8rtXCX1Z4EMrtN3FA3GBCkRvzPb
R7XNI+dyTHPe3KKlM4cJqKf+HU+bAvPQ40+l/jgd9Dipmt0+noKQin45ooB6NzOs
0Z6+vBB8AtgrLsJomqYTsXfCIskh9B+KQWrt53bwZnPOzAHHJzxcynwt+tfusz4I
L1/Ew6lJtaM=
=tYZw
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"The arm-soc bug fixes this time around are mostly for the omap
platform, coming from a pull request from Tony Lindgren and are almost
entirely fixing dts files.
The other two changes enable support for the shmobile platform in
generic armv7 kernels and change some properties in the ARM64
reference board dts files"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: multi_v7_defconfig: Enable shmobile platforms
arm64: Add L2 cache topology to ARM Ltd boards/models
ARM: dts: am335x-bone*: usb0 is hardwired for peripheral
ARM: dts: dra7x-evm: beagle-x15: Fix USB Host
ARM: omap2plus_defconfig: Fix SATA boot
ARM: omap2plus_defconfig: Enable OMAP NAND BCH driver
ARM: dts: dra7: Correct the dma controller's property names
ARM: dts: omap5: Correct the dma controller's property names
ARM: dts: omap4: Correct the dma controller's property names
ARM: dts: omap3: Correct the dma controller's property names
ARM: dts: omap2: Correct the dma controller's property names
ARM: dts: am437x-idk: fix sleep pinctrl state
ARM: omap2plus_defconfig: enable TPS62362 regulator
ARM: dts: am437x-idk: fix TPS62362 i2c bus
ARM: dts: n900: Fix offset for smc91x ethernet
ARM: dts: n900: fix i2c bus numbering
ARM: dts: Fix USB dts configuration for dm816x
ARM: dts: OMAP5: Fix SATA PHY node
ARM: dts: DRA7: Fix SATA PHY node
- ftrace branch generation fix
- branch instruction encoding fix
- include files, guards and unused prototypes clean-up
- minor VDSO ABI fix (clock_getres)
- PSCI functions moved to .S to avoid compilation error with gcc 5
- pte_modify fix to not ignore the mapping type
- crypto: AES interleaved increased to 4x (for performance reasons)
- text patching fix for modules
- swiotlb increased back to 64MB
- copy_siginfo_to_user32() fix for big endian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU8LqKAAoJEGvWsS0AyF7xE/IQALJa+y7Ow3vkXEaMmDwTWXDD
o2G/8ecZXgKhLnO/1XokHOOpRvkmWRlo4gQSTRmhEtDl4+4x0KNNOJ2CnctTZpvc
rIUS0PrjR3EWigU9WLdXCg4FtyO44w1hTBUimmPddzD1LcxsgMJd041HQcvsWhAL
gHq0Y+asB3cAHSEdRuiBevDFJ0Y+O+JocR911hPbfgv5do/rzcCoHlvOmgOaFe1S
voAq40XR0w2nRLxRj33EMYFngtkZgvzEEV2kCCSOn4yOhX91kX/Sg4gQL7AhcAJ/
Vla8Im7lmZiOSxFrqc536TQ5GG6x7dKaFh2OftC1vdMBtaS3oe104/2tF7IIpErP
t6OTkbdAUMImf0m+XhNpP4pUQneQIfl58WxqBOPk+H/YBoGOXzOrbozfxTp13OPS
LrOk5f8fdisgCYDvEODJZVtnRiqGNtkWZiNEmSFj+EpqBxJTlNTEsEAQe/ThuSY2
/Gha+Ar64zwKRY+KtqcuCAuNEvJpyg4z7PAGMMtznSXtBZ6Xn+H6ME8IB/B+y9j6
URqKlcMnPNiLJAE/ecEx7EoNsvg+DLO30fxs+LPhQ11zoEAWIQTMQiw0fTYdMgU5
KXXCelXTrIyX5uC/aJulDzknNWohXZQlMR/9hwKxRtNnGvMjQM19oRsgM3g3WVRT
U1Xnv69tFHJG3kSh2bY5
=IGAn
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Various arm64 fixes:
- ftrace branch generation fix
- branch instruction encoding fix
- include files, guards and unused prototypes clean-up
- minor VDSO ABI fix (clock_getres)
- PSCI functions moved to .S to avoid compilation error with gcc 5
- pte_modify fix to not ignore the mapping type
- crypto: AES interleaved increased to 4x (for performance reasons)
- text patching fix for modules
- swiotlb increased back to 64MB
- copy_siginfo_to_user32() fix for big endian"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpuidle: add asm/proc-fns.h inclusion
arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big endian
arm64: Increase the swiotlb buffer size 64MB
arm64: Fix text patching logic when using fixmap
arm64: crypto: increase AES interleave to 4x
arm64: enable PTE type bit in the mask for pte_modify
arm64: mm: remove unused functions and variable protoypes
arm64: psci: move psci firmware calls out of line
arm64: vdso: minor ABI fix for clock_getres
arm64: guard asm/assembler.h against multiple inclusions
arm64: insn: fix compare-and-branch encodings
arm64: ftrace: fix ftrace_modify_graph_caller for branch replace
Erik Hugne says:
====================
tipc: bug fix and some improvements
Most important is a fix for a nullptr exception that would occur when
name table subscriptions fail. The remaining patches are performance
improvements and cosmetic changes.
v2: remove unnecessary whitespace in patch #2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With the exception of infiniband media which does not use media
offsets, the media address is always located at offset 4 in the
media info field as defined by the protocol, so we move the
definition to the generic bearer.h
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIPC_MEDIA_ADDR_SIZE and TIPC_MEDIA_ADDR_OFFSET names
are misleading, as they actually define the size and offset of
the whole media info field and not the address part. This patch
does not have any functional changes.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a bearer is disabled by manual intervention, all links over that
bearer should be purged, indicated with the 'shutting_down' flag.
Otherwise tipc will get confused if a new bearer is enabled using
a different media type.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a subscription request is sent to a topology server
connection, and any error occurs (malformed request, oom
or limit reached) while processing this request, TIPC should
terminate the subscriber connection. While doing so, it tries
to access fields in an already freed (or never allocated)
subscription element leading to a nullpointer exception.
We fix this by removing the subscr_terminate function and
terminate the connection immediately upon any subscription
failure.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TIPC name distributor pushes topology updates to the cluster
neighbors. Currently this is done in a unicast manner, and the
skb holding the update is cloned for each cluster member. This
is unnecessary, as we only modify the destnode field in the header
so we change it to do pskb_copy instead.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a hash table has 128 slots and 16384 elems, expand to 256 slots
takes more than one second. For larger sets, a soft lockup is detected.
Holding cpu for that long, even in a work queue is a show stopper
for non preemptable kernels.
cond_resched() at strategic points to allow process scheduler
to reschedule us.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-02-26
This series contains fixes for i40e and i40evf only.
Alexey Khoroshilov found a possible leak of 'cmd_buf' when copy_from_user()
failed in i40e_dbg_command_write(), so resolved by calling kfree().
Shannon provides a fix to ensure the shift and bitwise precedences do not
work backwards for us by adding parans. Fixed the driver by preventing
the driver from allowing stray interrupts or causing system logs from
un-handled interrupts by combining the ICR0 shutdown with the standard
interrupt shutdown and add the interrupt clearing to the PCI shutdown
path. Fixed an issue where a NVM write times out before a transaction
can complete, so Shannon added logic to make another attempt by
reacquiring the semaphore, then retry the write, if the one retry fails,
we will then give up. Adds checks to pointers before their use to ensure
we do not try to dereference NULL pointers when returning values from the
AdminQ calls.
Akeem adds a check to bail out if the device is already down when checking
for Tx hang subtask.
Anjali fixes TSO with more than 8 frags per segment issue. The hardware
has some limitations which the driver needs to adhere to:
1) no more than 8 descriptors per packet on the wire
2) no header can span more than 3 descriptors
If one of these events happens, the hardware will generate an internal
error and freeze the Tx queue, so Anjali fixes this by linearizes the skb
to avoid these situations. Fixed an issue where the per Traffic Class
queue count was higher than queues enabled, which will fix a warning
with multiple function mode where systems regularly have more cores than
vectors. Fixed TCP/IPv6 over VXLAN Tx checksum offload, where we were
checking the outer protocol flags and deciding the flow for the inner
header.
Jesse fixes a race condition in the transmit hang detection. Before we
were having issues of false Tx hang detection, no the driver makes more
direct with the checks for progress forward by directly checking the head
write back address and tail register when determining progress. This
avoids Tx hangs where the software gets behind, because we are directly
checking hardware state when determining a hang state.
Neerav fixes the transmit ring Qset handle when DCB reconfigures. The issue
was when DCB is reconfigured to a single traffic class (TC) and the driver
did not reset the Tx ring Qset handle to correct the mapping, which caused
the Tx queue to disable timeouts. Also as part of DCB reconfiguration flow
if the Tx queue disable times out, then issue a PF reset to do some level
of recovery.
Mitch stops flow director on shutdown because, in some cases, the hardware
would continue to try to access the FDIR ring after entering D3Hot state,
which would cause either PCIe errors or NMIs, depending upon the system
configuration.
* NOTE * I have verified that this series of patches for net will not cause
any merge issues when you sync up your net tree with your net-next tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible that the hardware may not have been properly shutdown
before this driver gets control, through use by firmware, for example.
Until the driver is loaded, interrupts associated with the hardware
could go pending. When the IRQs are requested napi support has not
been initialized yet, but the ISR will get control and schedule napi
processing resulting in a kernel panic because the poll routine has not
been set.
Adjust the code so that the driver is fully ready to handle and process
interrupts as soon as the IRQs are requested. This involves requesting
and freeing IRQs during start and stop processing and ordering the napi
add and delete calls appropriately.
Also adjust the powerup and powerdown routines to match the start and
stop routines in regards to the ordering of tasks, including napi
related calls.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just another AX88178-based 10/100/1000 USB-to-Ethernet dongle. This one
shows up in lsusb as: "Sitecom Europe B.V. LN-028 Network USB 2.0 Adapter".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-usb@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>