Wolfgang Walter reported this oops on his via C3 using padlock for
AES-encryption:
##################################################################
BUG: unable to handle kernel NULL pointer dereference at 000001f0
IP: [<c01028c5>] __switch_to+0x30/0x117
*pde = 00000000
Oops: 0002 [#1] PREEMPT
Modules linked in:
Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
EIP is at __switch_to+0x30/0x117
EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
Call Trace:
[<c03b5b43>] ? schedule+0x285/0x2ff
[<c0131856>] ? pm_qos_requirement+0x3c/0x53
[<c0239f54>] ? acpi_processor_idle+0x0/0x434
[<c01025fe>] ? cpu_idle+0x73/0x7f
[<c03a4dcd>] ? rest_init+0x61/0x63
=======================
Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
around the padlock instructions fix the oops.
Suresh wrote:
These padlock instructions though don't use/touch SSE registers, but it behaves
similar to other SSE instructions. For example, it might cause DNA faults
when cr0.ts is set. While this is a spurious DNA trap, it might cause
oops with the recent fpu code changes.
This is the code sequence that is probably causing this problem:
a) new app is getting exec'd and it is somewhere in between
start_thread() and flush_old_exec() in the load_xyz_binary()
b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
cleared.
c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
routines in the network stack. This generates a math fault (as
cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
in the task's xstate.
d) Return to exec code path, which does start_thread() which does
free_thread_xstate() and sets xstate pointer to NULL while
the TS_USEDFPU is still set.
e) At the next context switch from the new exec'd task to another task,
we have a scenarios where TS_USEDFPU is set but xstate pointer is null.
This can cause an oops during unlazy_fpu() in __switch_to()
Now:
1) This should happen with or with out pre-emption. Viro also encountered
similar problem with out CONFIG_PREEMPT.
2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
kernel_fpu_begin() will manually do a clts() and won't run in to the
situation of setting TS_USEDFPU in step "c" above.
3) This was working before the fpu changes, because its a spurious
math fault which doesn't corrupt any fpu/sse registers and the task's
math state was always in an allocated state.
With out the recent lazy fpu allocation changes, while we don't see oops,
there is a possible race still present in older kernels(for example,
while kernel is using kernel_fpu_begin() in some optimized clear/copy
page and an interrupt/softirq happens which uses these padlock
instructions generating DNA fault).
This is the failing scenario that existed even before the lazy fpu allocation
changes:
0. CPU's TS flag is set
1. kernel using FPU in some optimized copy routine and while doing
kernel_fpu_begin() takes an interrupt just before doing clts()
2. Takes an interrupt and ipsec uses padlock instruction. And we
take a DNA fault as TS flag is still set.
3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts
4. We complete the padlock routine
5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
the optimized copy routine and does kernel_fpu_end(). At this point,
we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll
set and not cleared.
6. Now kernel resumes its user operation. And at the next context
switch, kernel sees it has do a FP save as TS_USEDFPU is still set
and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu()
will take a DNA fault, as cr0.ts is '1' and now, because we are
in __switch_to(), math_state_restore() will get confused and will
restore the next task's FP state and will save it in prev tasks's FP state.
Remember, in __switch_to() we are already on the stack of the next task
but take a DNA fault for the prev task.
This causes the fpu leakage.
Fix the padlock instruction usage by calling them inside the
context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
manually in the interrupt context. This will not generate spurious DNA
in the context of the interrupt which will fix the oops encountered and
the possible FPU leakage issue.
Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The changeset ca786dc738
crypto: hash - Fixed digest size check
missed one spot for the digest type. This patch corrects that
error.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
My changeset 4b22f0ddb6
crypto: tcrpyt - Remove unnecessary kmap/kunmap calls
introduced a typo that broke AEAD chunk testing. In particular,
axbuf should really be xbuf.
There is also an issue with testing the last segment when encrypting.
The additional part produced by AEAD wasn't tested. Similarly, on
decryption the additional part of the AEAD input is mistaken for
corruption.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Later SEC revision requires the link table (used for scatter/gather)
to have an extra entry to account for the total length in descriptor [4],
which contains cipher Input and ICV.
This only applies to decrypt, not encrypt.
Without this change, on 837x, a gather return/length error results
when a decryption uses a link table to gather the fragments.
This is observed by doing a ping with size of 1447 or larger with AES,
or a ping with size 1455 or larger with 3des.
So, add check for SEC compatible "fsl,3.0" for using extra link table entry.
Signed-off-by: Lee Nipper <lee.nipper@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/ehca: Discard double CQE for one WR
IB/ehca: Check idr_find() return value
IB/ehca: Repoll CQ on invalid opcode
IB/ehca: Rename goto label in ehca_poll_cq_one()
IB/ehca: Update qp_state on cached modify_qp()
IPoIB/cm: Use vmalloc() to allocate rx_rings
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] use bcd2bin/bin2bcd
[IA64] Ensure cpu0 can access per-cpu variables in early boot code
Various cleanup the drivers/firmware/memmap (after review by AKPM):
- fix kdoc to conform to the standard
- move kdoc from header to implementation files
- remove superfluous WARN_ON() after kmalloc()
- WARN_ON(x); if (!x) -> if(!WARN_ON(x))
- improve some comments
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The attached patch seems to already exist in a number of branches -- it
keeps popping up on Google for me, and is certainly already in Debian --
but is strangely absent from mainstream.
The problem appears to be that the patched file ends up as part of the
target toolchain, but unfortunately the gcc constant folding doesn't
appear to eliminate the __invalid_size_argument_for_IOC value early
enough. Certainly compiling C++ programs which use _IO... macros as
constants fails without this patch.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix printf format type warnings (seen on alpha & ia64):
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 6 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 7 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 8 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 9 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 12 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 13 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 16 has type '__u64'
Documentation/accounting/getdelays.c:206: warning: format '%15llu' expects type 'long long unsigned int', but argument 17 has type '__u64'
Documentation/accounting/getdelays.c:214: warning: format '%15llu' expects type 'long long unsigned int', but argument 4 has type '__u64'
Documentation/accounting/getdelays.c:214: warning: format '%15llu' expects type 'long long unsigned int', but argument 5 has type '__u64'
Documentation/accounting/getdelays.c:221: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type '__u64'
Documentation/accounting/getdelays.c:221: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type '__u64'
Documentation/accounting/getdelays.c:221: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type '__u64'
Documentation/accounting/getdelays.c:221: warning: format '%llu' expects type 'long long unsigned int', but argument 5 has type '__u64'
Documentation/accounting/getdelays.c:221: warning: format '%llu' expects type 'long long unsigned int', but argument 6 has type '__u64'
Documentation/accounting/getdelays.c:236: warning: 'cmd_type' may be used uninitialized in this function
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation/networking/ifenslave.c:1084: warning: pointer targets in assignment differ in signedness
>From include/linux/socket.h:
* 1003.1g requires sa_family_t and that sa_data is char.
and from SUSv3:
(http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/socket.h.html)
The <sys/socket.h> header shall define the sockaddr structure that includes at least the following members:
sa_family_t sa_family Address family.
char sa_data[] Socket address (variable-length data).
<end SUSv3>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add MODULE_LICENSE() to DocBook/procfs_example.c since modpost complained
about a missing license there.
Remove tty procfs removal since the creation was deleted long ago
(http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=5ad9cb65e9b15e5b83e2dd1c10a4bcaccc4ec644).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <J.A.K.Mouw@its.tudelft.nl>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently source files in the Documentation/ sub-dir can easily bit-rot
since they are not generally buildable, either because they are hidden in
text files or because there are no Makefile rules for them. This needs to
be fixed so that the source files remain usable and good examples of code
instead of bad examples.
Add the ability to build source files that are in the Documentation/ dir.
Add to Kconfig as "BUILD_DOCSRC" config symbol.
Use "CONFIG_BUILD_DOCSRC=1 make ..." to build objects from the
Documentation/ sources. Or enable BUILD_DOCSRC in the *config system.
However, this symbol depends on HEADERS_CHECK since the header files need
to be installed (for userspace builds).
Built (using cross-tools) for x86-64, i386, alpha, ia64, sparc32,
sparc64, powerpc, sh, m68k, & mips.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Collect the implementations from include/linux/byteorder/swab.h, swabb.h
in swab.h
The functionality provided covers:
u16 swab16(u16 val) - return a byteswapped 16 bit value
u32 swab32(u32 val) - return a byteswapped 32 bit value
u64 swab64(u64 val) - return a byteswapped 64 bit value
u32 swahw32(u32 val) - return a wordswapped 32 bit value
u32 swahb32(u32 val) - return a high/low byteswapped 32 bit value
Similar to above, but return swapped value from a naturally-aligned pointer
u16 swab16p(u16 *p)
u32 swab32p(u32 *p)
u64 swab64p(u64 *p)
u32 swahw32p(u32 *p)
u32 swahb32p(u32 *p)
Similar to above, but swap the value in-place (in-situ)
void swab16s(u16 *p)
void swab32s(u32 *p)
void swab64s(u64 *p)
void swahw32s(u32 *p)
void swahb32s(u32 *p)
Arches can override any of these with an optimized version by defining an
inline in their asm/byteorder.h (example given for swab16()):
u16 __arch_swab16() {}
#define __arch_swab16 __arch_swab16
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch /proc/irq/*/smp_affinity , /proc/irq/default_smp_affinity to
seq_files.
cat(1) reads with 1024 chunks by default, with high enough NR_CPUS, there
will be -EINVAL.
As side effect, there are now two less users of the ->read_proc interface.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Short enough reads from /proc/irq/*/smp_affinity return -EINVAL for no
good reason.
This became noticed with NR_CPUS=4096 patches, when length of printed
representation of cpumask becase 1152, but cat(1) continued to read with
1024-byte chunks. bitmap_scnprintf() in good faith fills buffer, returns
1023, check returns -EINVAL.
Fix it by switching to seq_file, so handler will just fill buffer and
doesn't care about offsets, length, filling EOF and all this crap.
For that add seq_bitmap(), and wrappers around it -- seq_cpumask() and
seq_nodemask().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix wrong conversion function used by strict_strtou*
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Reported-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adjust and honor the vc_scrl_erase_char for 256 and 512 character fonts.
It fixes the issue with disappearing cursor during scrolling
(http://bugzilla.kernel.org/show_bug.cgi?id=11258). The issue was
reported and tracked by Peter Hanzel.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Reported-by: Peter Hanzel <hanzelpeter@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Specify how much physically continuous, DMA capable memory will be
allocated at driver initialization time. This allow to create framebuffer
device with larger virtual resolution. Combine with y-panning this can be
used to implement double buffering acceleration method.
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Panning in the y-direction can be done by simply changing the DMA base
address. This code is already in place, but FBIOPAN_DISPLAY will
currently fail because ypanstep is 0.
Set ypanstep to 1 to indicate that we do support y-panning and also set
the necessary acceleration flags on AT91 (AVR32 already have them.)
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The legacy i2c model is going away soon, so switch to the new model.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Petr Vandrovec <VANDROVE@vc.cvut.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clean up the use of structure templates in i2c-matroxfb. In this case
it's more efficient to initialize the few fields we need individually.
This makes i2c-matroxfb.ko 16% smaller on my system.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Petr Vandrovec <VANDROVE@vc.cvut.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I broke an error path with d03c21ec0b,
sorry about that.
The machine will crash if the i2c_attach_client() or maven_init_client()
calls fail, although nobody has yet reported this happening.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Petr Vandrovec <VANDROVE@vc.cvut.cz>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some chips appear to have the 2D engine hang during screen redraw,
typically in a sequence of copyarea operations. This appear to be
solved by adding a flush of the engine destination pixel cache
and waiting for the engine to be idle before issuing the accel
operation. The performance impact seems to be fairly small.
Here is a trace on an RV370 (PCI device ID 0x5b64), it records the
RBBM_STATUS register, then the source x/y, destination x/y, and
width/height used for the copy:
----------------------------------------
radeonfb_prim_copyarea: STATUS[00000140] src[210:70] dst[210:60] wh[a0:10]
radeonfb_prim_copyarea: STATUS[00000140] src[2b8:70] dst[2b8:60] wh[88:10]
radeonfb_prim_copyarea: STATUS[00000140] src[348:70] dst[348:60] wh[40:10]
radeonfb_prim_copyarea: STATUS[80020140] src[390:70] dst[390:60] wh[88:10]
radeonfb_prim_copyarea: STATUS[8002613f] src[40:80] dst[40:70] wh[28:10]
radeonfb_prim_copyarea: STATUS[80026139] src[a8:80] dst[a8:70] wh[38:10]
radeonfb_prim_copyarea: STATUS[80026133] src[e8:80] dst[e8:70] wh[80:10]
radeonfb_prim_copyarea: STATUS[8002612d] src[170:80] dst[170:70] wh[30:10]
radeonfb_prim_copyarea: STATUS[80026127] src[1a8:80] dst[1a8:70] wh[8:10]
radeonfb_prim_copyarea: STATUS[80026121] src[1b8:80] dst[1b8:70] wh[88:10]
radeonfb_prim_copyarea: STATUS[8002611b] src[248:80] dst[248:70] wh[68:10]
----------------------------------------
When things are going fine the copies complete before the next ROP is
even issued, but all of a sudden the 2D unit becomes active (bit 17 in
RBBM_STATUS) and the FIFO retry (bit 13) and FIFO pipeline busy (bit
14) are set as well. The FIFO begins to backup until it becomes full.
What happens next is the radeon_fifo_wait() times out, and we access
the chip illegally leading to a bus error which usually wedges the
box. None of this makes it to the console screen, of course :-)
radeon_fifo_wait() should be modified to reset the accelerator when
this timeout happens instead of programming the chip anyways.
----------------------------------------
radeonfb: FIFO Timeout !
ERROR(0): Cheetah error trap taken afsr[0010080005000000] afar[000007f900800e40] TL1(0)
ERROR(0): TPC[595114] TNPC[595118] O7[459788] TSTATE[11009601]
ERROR(0): TPC<radeonfb_copyarea+0xfc/0x248>
ERROR(0): M_SYND(0), E_SYND(0), Privileged
ERROR(0): Highest priority error (0000080000000000) "Bus error response from system bus"
ERROR(0): D-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000]
ERROR(0): D-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[0000000000000000]
ERROR(0): I-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000] u[0000000000000000] l[00\
ERROR(0): I-cache INSN0[0000000000000000] INSN1[0000000000000000] INSN2[0000000000000000] INSN3[0000000000000000]
ERROR(0): I-cache INSN4[0000000000000000] INSN5[0000000000000000] INSN6[0000000000000000] INSN7[0000000000000000]
ERROR(0): E-cache idx[800e40] tag[000000000e049f4c]
ERROR(0): E-cache data0[fffff8127d300180] data1[00000000004b5384] data2[0000000000000000] data3[0000000000000000]
Ker:xnel panic - not syncing: Irrecoverable deferred error trap.
----------------------------------------
Another quirk is that these copyarea calls will not happen until the
first drivers/char/vt.c:redraw_screen() occurs. This will only happen
if you 1) VC switch or 2) run "consolechars" or 3) unblank the screen.
This seems to happen because until a redraw_screen() the screen scrolling
method used by fbcon is not finalized yet. I've seen this with other fb
drivers too.
So if all you do is boot straight into X you will never see this bug on
the relevant chips.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[Andrew this should replace the previous version which did not check
the returns from the region prepare for errors. This has been tested by
us and Gerald and it looks good.
Bah, while reviewing the locking based on your previous email I spotted
that we need to check the return from the vma_needs_reservation call for
allocation errors. Here is an updated patch to correct this. This passes
testing here.]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the normal case, hugetlbfs reserves hugepages at map time so that the
pages exist for future faults. A struct file_region is used to track when
reservations have been consumed and where. These file_regions are
allocated as necessary with kmalloc() which can sleep with the
mm->page_table_lock held. This is wrong and triggers may-sleep warning
when PREEMPT is enabled.
Updates to the underlying file_region are done in two phases. The first
phase prepares the region for the change, allocating any necessary memory,
without actually making the change. The second phase actually commits the
change. This patch makes use of this by checking the reservations before
the page_table_lock is taken; triggering any necessary allocations. This
may then be safely repeated within the locks without any allocations being
required.
Credit to Mel Gorman for diagnosing this failure and initial versions of
the patch.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These attributes are really sysdev class attributes. The incorrect
definition leads to an oops because of recent changes which make sysdev
attributes use a different prototype.
Based on Andi's f718cd4add ("sched: make
scheduler sysfs attributes sysdev class devices")
Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WARNING: vmlinux.o(.text+0x2fdf): Section mismatch in reference from the variable .LM3 to the variable .init.text:___alloc_bootmem
The function .LM3() references
the variable __init ___alloc_bootmem.
This is often because .LM3 lacks a __init
annotation or the annotation of ___alloc_bootmem is wrong.
WARNING: vmlinux.o(.text+0x2ff5): Section mismatch in reference from the variable .LM4 to the variable .init.text:___alloc_bootmem
The function .LM4() references
the variable __init ___alloc_bootmem.
This is often because .LM4 lacks a __init
annotation or the annotation of ___alloc_bootmem is wrong.
WARNING: vmlinux.o(.text+0x300b): Section mismatch in reference from the variable .LM5 to the variable .init.text:___alloc_bootmem
The function .LM5() references
the variable __init ___alloc_bootmem.
This is often because .LM5 lacks a __init
annotation or the annotation of ___alloc_bootmem is wrong.
WARNING: vmlinux.o(.text+0x304b): Section mismatch in reference from the variable .LM10 to the variable .init.text:_free_area_init
The function .LM10() references
the variable __init _free_area_init.
This is often because .LM10 lacks a __init
annotation or the annotation of _free_area_init is wrong.
WARNING: vmlinux.o(.text+0x30a3): Section mismatch in reference from the variable .LM17 to the variable .init.text:_free_all_bootmem
The function .LM17() references
the variable __init _free_all_bootmem.
This is often because .LM17 lacks a __init
annotation or the annotation of _free_all_bootmem is wrong.
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert commit 51a776fa7a ("rtc: cdev
lock_kernel() pushdown"). The RTC framework does not need BKL
protection.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Got an oops in mem_cgroup_shrink_usage() when testing loop over tmpfs:
yes, of course, loop0 has no mm: other entry points check but this didn't.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
s390 doesn't support the additional_cpus kernel parameter anymore since a
long time. So we better update the code and documentation to reflect
that.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
.. since a failed allocation is being (initially) handled gracefully, and
panic()-ed upon failure explicitly in the function if retries with smaller
sizes failed.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The s390 software large page emulation implements shared page tables by
using page->index of the first tail page from a compound large page to
store page table information. This is set up in arch_prepare_hugepage(),
which is called from alloc_fresh_huge_page_node().
A similar call to arch_prepare_hugepage() is missing for surplus large
pages that are allocated in alloc_buddy_huge_page(), which breaks the
software emulation mode for (surplus) large pages on s390. This patch
adds the missing call to arch_prepare_hugepage(). It will have no effect
on other architectures where arch_prepare_hugepage() is a nop.
Also, use the correct order in the error path in alloc_fresh_huge_page_node().
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that the driver is removed we should also remove the entry in
Documentation/feature-removal-schedule.txt
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes ia64 to use the new bcd2bin/bin2bcd functions instead
of the obsolete BCD2BIN/BIN2BCD macros.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>