Commit Graph

310438 Commits

Author SHA1 Message Date
Ben Myers
11159a0500 xfs: shutdown xfs_sync_worker before the log
Revert commit 1307bbd, which uses the s_umount semaphore to provide
exclusion between xfs_sync_worker and unmount, in favor of shutting down
the sync worker before freeing the log in xfs_log_unmount.  This is a
cleaner way of resolving the race between xfs_sync_worker and unmount
than using s_umount.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2012-06-21 13:41:04 -05:00
Jan Kara
bcf62ab64d xfs: Fix overallocation in xfs_buf_allocate_memory()
Commit de1cbee which removed b_file_offset in favor of b_bn introduced a bug
causing xfs_buf_allocate_memory() to overestimate the number of necessary
pages. The problem is that xfs_buf_alloc() sets b_bn to -1 and thus effectively
every buffer is straddling a page boundary which causes
xfs_buf_allocate_memory() to allocate two pages and use vmalloc() for access
which is unnecessary.

Dave says xfs_buf_alloc() doesn't need to set b_bn to -1 anymore since the
buffer is inserted into the cache only after being fully initialized now.
So just make xfs_buf_alloc() fill in proper block number from the beginning.

CC: David Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 13:38:25 -05:00
Dave Chinner
079da28c64 xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
When we fail to find an matching extent near the requested extent
specification during a left-right distance search in
xfs_alloc_ag_vextent_near, we fail to free the original cursor that
we used to look up the XFS_BTNUM_CNT tree and hence leak it.

Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 12:32:43 -05:00
Brian Foster
76e8f13866 xfs: check for stale inode before acquiring iflock on push
An inode in the AIL can be flush locked and marked stale if
a cluster free transaction occurs at the right time. The
inode item is then marked as flushing, which causes xfsaild
to spin and leaves the filesystem stalled. This is
reproduced by running xfstests 273 in a loop for an
extended period of time.

Check for stale inodes before the flush lock. This marks
the inode as pinned, leads to a log flush and allows the
filesystem to proceed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-21 10:53:43 -05:00
Chen Baozi
51c84223af xfs: fix typo in comment of xfs_dinode_t.
There should be "XFS_DFORK_DPTR, XFS_DFORK_APTR, and XFS_DFORK_PTR" instead
of "XFS_DFORK_PTR, XFS_DFORK_DPTR, and XFS_DFORK_PTR".

Signed-off-by: Chen Baozi <baozich@gmail.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:26 -05:00
Dave Chinner
5276432997 xfs: kill copy and paste segment checks in xfs_file_aio_read
The generic segment check code now returns a count of the number of
bytes in the iovec, so we don't need to roll our own anymore.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:25 -05:00
Dave Chinner
32972383ca xfs: make largest supported offset less shouty
XFS_MAXIOFFSET() is just a simple macro that resolves to
mp->m_maxioffset. It doesn't need to exist, and it just makes the
code unnecessarily loud and shouty.

Make it quiet and easy to read.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:24 -05:00
Dave Chinner
d2c2819117 xfs: m_maxioffset is redundant
The m_maxioffset field in the struct xfs_mount contains the same
value as the superblock s_maxbytes field. There is no need to carry
two copies of this limit around, so use the VFS superblock version.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:22 -05:00
Jeff Liu
0f2cf9d3d9 xfs: fix debug_object WARN at xfs_alloc_vextent()
Fengguang reports:

[  780.529603] XFS (vdd): Ending clean mount
[  781.454590] ODEBUG: object is on stack, but not annotated
[  781.455433] ------------[ cut here ]------------
[  781.455433] WARNING: at /c/kernel-tests/sound/lib/debugobjects.c:301 __debug_object_init+0x173/0x1f1()
[  781.455433] Hardware name: Bochs
[  781.455433] Modules linked in:
[  781.455433] Pid: 26910, comm: kworker/0:2 Not tainted 3.4.0+ #51
[  781.455433] Call Trace:
[  781.455433]  [<ffffffff8106bc84>] warn_slowpath_common+0x83/0x9b
[  781.455433]  [<ffffffff8106bcb6>] warn_slowpath_null+0x1a/0x1c
[  781.455433]  [<ffffffff814919a5>] __debug_object_init+0x173/0x1f1
[  781.455433]  [<ffffffff81491c65>] debug_object_init+0x14/0x16
[  781.455433]  [<ffffffff8108842a>] __init_work+0x20/0x22
[  781.455433]  [<ffffffff8134ea56>] xfs_alloc_vextent+0x6c/0xd5

Use INIT_WORK_ONSTACK in xfs_alloc_vextent instead of INIT_WORK.

Reported-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:21 -05:00
Alain Renaud
7d0fa3ecba xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
On filesytems with a block size smaller than PAGE_SIZE we currently have
a problem with unwritten extents.  If a we have multi-block page for
which an unwritten extent has been allocated, and only some of the
buffers have been written to, and they are not contiguous, we can expose
stale data from disk in the blocks between the writes after extent
conversion.

Example of a page with unwritten and real data.
buffer  content
0       empty  b_state = 0
1       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
2       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
3       empty  b_state = 0
4       empty  b_state = 0
5       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
6       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
7       empty  b_state = 0

Buffers 1, 2, 5, and 6 have been written to, leaving 0, 3, 4, and 7
empty.  Currently buffers 1, 2, 5, and 6 are added to a single ioend,
and when IO has completed, extent conversion creates a real extent from
block 1 through block 6, leaving 0 and 7 unwritten.  However buffers 3
and 4 were not written to disk, so stale data is exposed from those
blocks on a subsequent read.

Fix this by setting iomap_valid = 0 when we find a buffer that is not
Uptodate.  This ensures that buffers 5 and 6 are not added to the same
ioend as buffers 1 and 2.  Later these blocks will be converted into two
separate real extents, leaving the blocks in between unwritten.

Signed-off-by: Alain Renaud <arenaud@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:20 -05:00
Linus Torvalds
f8f5701bda Linux 3.5-rc1 2012-06-02 18:29:26 -07:00
Linus Torvalds
912afc3616 Improve multipath's retrying mechanism in some defined circumstances
and provide a simple reserve/release mechanism for userspace tools to
 access thin provisioning metadata while the pool is in use.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPyqIdAAoJEK2W1qbAHj1nSPoQAJvAb/6UHufTWC/lufbEyo7t
 ft6uwZZ4S/VV1Gdx8V5YXo3rxkVIZj/CV0hiJIDctmDMKGPMlzup39kCgjD/rOUF
 mzcFAE8sEr3QEavkfjSWw2RHIIlhnJpvqVnb8nu3p/mSgAB4qYGgaDjBpi+W60PV
 aqQSSWgwH1uNhfGDBIxQoJ8OIjjYvKPIf2Ir2FAXam/dNi9chWO9nzFdj3q2LccP
 nZir094BDsFac1BF0FYW3J+rgT1FfPO7RRGAQct6WNJ197IZlYWYjKH3XehxnUHE
 wgiJmjfUO8vrho1hhWmWDOesKJPPWFN67EQnl5FqAu9itP7c7k8bd7Ay4jWgtZQU
 QIx10uiAgAuFUmTdWGK1fLlE8HGKUFINYLp63N5n5NZ4TDJrgo8e7CIID3rvYf/O
 EtmL7HzAyztL9Uc6oaXzCK6TgMUtd/ht8OJCDFhjitzQTNjbrfAGz6m+RHnEZyyj
 dtOVK7WBlmuKEANl2vDFGuVVF0+MwJLTlvPx1/b/ejFvnHI/R5Wuk9EH7t/DO4LB
 nCmiwzB6uWMzU3y3vnZG72AYSF5NTKSvnAl5B8U/0rI1MZU+6PehjeviJNx6ddJN
 2YheHBLU4vbBV/LF4XIpaHK2aiHN1ltaKCp8INo3EKhCwpR4ZdlVvnAGU9ocf9+c
 qoaFTOP7zGD9zgPeGjoG
 =wCpY
 -----END PGP SIGNATURE-----

Merge tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper updates from Alasdair G Kergon:
 "Improve multipath's retrying mechanism in some defined circumstances
  and provide a simple reserve/release mechanism for userspace tools to
  access thin provisioning metadata while the pool is in use."

* tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm thin: provide userspace access to pool metadata
  dm thin: use slab mempools
  dm mpath: allow ioctls to trigger pg init
  dm mpath: delay retry of bypassed pg
  dm mpath: reduce size of struct multipath
2012-06-02 17:39:40 -07:00
Joe Thornber
cc8394d86f dm thin: provide userspace access to pool metadata
This patch implements two new messages that can be sent to the thin
pool target allowing it to take a snapshot of the _metadata_.  This,
read-only snapshot can be accessed by userland, concurrently with the
live target.

Only one metadata snapshot can be held at a time.  The pool's status
line will give the block location for the current msnap.

Since version 0.1.5 of the userland thin provisioning tools, the
thin_dump program displays the msnap as follows:

    thin_dump -m <msnap root> <metadata dev>

Available here: https://github.com/jthornber/thin-provisioning-tools

Now that userland can access the metadata we can do various things
that have traditionally been kernel side tasks:

     i) Incremental backups.

     By using metadata snapshots we can work out what blocks have
     changed over time.  Combined with data snapshots we can ensure
     the data doesn't change while we back it up.

     A short proof of concept script can be found here:

     https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb

     ii) Migration of thin devices from one pool to another.

     iii) Merging snapshots back into an external origin.

     iv) Asyncronous replication.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:30:01 +01:00
Mike Snitzer
a24c25696b dm thin: use slab mempools
Use dedicated caches prefixed with a "dm_" name rather than relying on
kmalloc mempools backed by generic slab caches so the memory usage of
thin provisioning (and any leaks) can be accounted for independently.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:30:00 +01:00
Mikulas Patocka
35991652ba dm mpath: allow ioctls to trigger pg init
After the failure of a group of paths, any alternative paths that
need initialising do not become available until further I/O is sent to
the device.  Until this has happened, ioctls return -EAGAIN.

With this patch, new paths are made available in response to an ioctl
too.  The processing of the ioctl gets delayed until this has happened.

Instead of returning an error, we submit a work item to kmultipathd
(that will potentially activate the new path) and retry in ten
milliseconds.

Note that the patch doesn't retry an ioctl if the ioctl itself fails due
to a path failure.  Such retries should be handled intelligently by the
code that generated the ioctl in the first place, noting that some SCSI
commands should not be retried because they are not idempotent (XOR write
commands).  For commands that could be retried, there is a danger that
if the device rejected the SCSI command, the path could be errorneously
marked as failed, and the request would be retried on another path which
might fail too.  It can be determined if the failure happens on the
device or on the SCSI controller, but there is no guarantee that all
SCSI drivers set these flags correctly.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:29:58 +01:00
Mike Christie
f220fd4efb dm mpath: delay retry of bypassed pg
If I/O needs retrying and only bypassed priority groups are available,
set the pg_init_delay_retry flag to wait before retrying.

If, for example, the reason for the bypass is that the controller is
getting reset or there is a firmware upgrade happening, retrying right
away would cause a flood of log messages and retries for what could be a
few seconds or even several minutes.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:29:45 +01:00
Mike Snitzer
1fbdd2b3a3 dm mpath: reduce size of struct multipath
Move multipath structure's 'lock' and 'queue_size' members to eliminate
two 4-byte holes.  Also use a bit within a single unsigned int for each
existing flag (saves 8-bytes).  This allows future flags to be added
without each consuming an unsigned int.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-03 00:29:43 +01:00
Linus Torvalds
4fc3acf291 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) Make syn floods consume significantly less resources by

    a) Not pre-COW'ing routing metrics for SYN/ACKs
    b) Mirroring the device queue mapping of the SYN for the SYN/ACK
       reply.

    Both from Eric Dumazet.

 2) Fix calculation errors in Byte Queue Limiting, from Hiroaki SHIMODA.

 3) Validate the length requested when building a paged SKB for a
    socket, so we don't overrun the page vector accidently.  From Jason
    Wang.

 4) When netlabel is disabled, we abort all IP option processing when we
    see a CIPSO option.  This isn't the right thing to do, we should
    simply skip over it and continue processing the remaining options
    (if any).  Fix from Paul Moore.

 5) SRIOV fixes for the mellanox driver from Jack orgenstein and Marcel
    Apfelbaum.

 6) 8139cp enables the receiver before the ring address is properly
    programmed, which potentially lets the device crap over random
    memory.  Fix from Jason Wang.

 7) e1000/e1000e fixes for i217 RST handling, and an improper buffer
    address reference in jumbo RX frame processing from Bruce Allan and
    Sebastian Andrzej Siewior, respectively.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  fec_mpc52xx: fix timestamp filtering
  mcs7830: Implement link state detection
  e1000e: fix Rapid Start Technology support for i217
  e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats()
  r8169: call netif_napi_del at errpaths and at driver unload
  tcp: reflect SYN queue_mapping into SYNACK packets
  tcp: do not create inetpeer on SYNACK message
  8139cp/8139too: terminate the eeprom access with the right opmode
  8139cp: set ring address before enabling receiver
  cipso: handle CIPSO options correctly when NetLabel is disabled
  net: sock: validate data_len before allocating skb in sock_alloc_send_pskb()
  bql: Avoid possible inconsistent calculation.
  bql: Avoid unneeded limit decrement.
  bql: Fix POSDIFF() to integer overflow aware.
  net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP
  net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap
  net/mlx4_core: Fixes for VF / Guest startup flow
  net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event
  net/mlx4_core: Fix number of EQs used in ICM initialisation
  net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int
2012-06-02 16:22:51 -07:00
Linus Torvalds
63004afa71 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull straggler x86 fixes from Peter Anvin:
 "Three groups of patches:

  - EFI boot stub documentation and the ability to print error messages;
  - Removal for PTRACE_ARCH_PRCTL for x32 (obsolete interface which
    should never have been ported, and the port is broken and
    potentially dangerous.)
  - ftrace stack corruption fixes.  I'm not super-happy about the
    technical implementation, but it is probably the least invasive in
    the short term.  In the future I would like a single method for
    nesting the debug stack, however."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32
  x86, efi: Add EFI boot stub documentation
  x86, efi; Add EFI boot stub console support
  x86, efi: Only close open files in error path
  ftrace/x86: Do not change stacks in DEBUG when calling lockdep
  x86: Allow nesting of the debug stack IDT setting
  x86: Reset the debug_stack update counter
  ftrace: Use breakpoint method to update ftrace caller
  ftrace: Synchronize variable setting with breakpoints
2012-06-02 16:17:03 -07:00
Linus Torvalds
f309532bf3 tty: Revert the tty locking series, it needs more work
This reverts the tty layer change to use per-tty locking, because it's
not correct yet, and fixing it will require some more deep surgery.

The main revert is d29f3ef39b ("tty_lock: Localise the lock"), but
there are several smaller commits that built upon it, they also get
reverted here. The list of reverted commits is:

  fde86d3108 - tty: add lockdep annotations
  8f6576ad47 - tty: fix ldisc lock inversion trace
  d3ca8b64b9 - pty: Fix lock inversion
  b1d679afd7 - tty: drop the pty lock during hangup
  abcefe5fc3 - tty/amiserial: Add missing argument for tty_unlock()
  fd11b42e35 - cris: fix missing tty arg in wait_event_interruptible_tty call
  d29f3ef39b - tty_lock: Localise the lock

The revert had a trivial conflict in the 68360serial.c staging driver
that got removed in the meantime.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-02 15:21:43 -07:00
Stephan Gatzka
9ca3cc6f30 fec_mpc52xx: fix timestamp filtering
skb_defer_rx_timestamp was called with a freshly allocated skb but must
be called with rskb instead.

Signed-off-by: Stephan Gatzka <stephan@gatzka.org>
Cc: stable <stable@vger.kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-02 17:09:08 -04:00
Ondrej Zary
b1ff4f96fd mcs7830: Implement link state detection
Add .status callback that detects link state changes.
Tested with MCS7832CV-AA chip (9710:7830, identified as rev.C by the driver).
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=28532

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-02 17:09:08 -04:00
Linus Torvalds
233e562eac Merge 'for-linus' branches from git://git.kernel.org/pub/scm/linux/kernel/git/viro/{vfs,signal}
Pull vfs fix and a fix from the signal changes for frv from Al Viro.

The __kernel_nlink_t for powerpc got scrogged because 64-bit powerpc
actually depended on the default "unsigned long", while 32-bit powerpc
had an explicit override to "unsigned short".  Al didn't notice, and
made both of them be the unsigned short.

The frv signal fix is fallout from simplifying the do_notify_resume()
code, and leaving an extra parenthesis.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  powerpc: Fix size of st_nlink on 64bit

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  frv: Remove bogus closing parenthesis
2012-06-02 09:03:54 -07:00
Anton Blanchard
0fd7bee1e9 powerpc: Fix size of st_nlink on 64bit
commit e57f93cc53 (powerpc: get rid of nlink_t uses, switch to
explicitly-sized type) changed the size of st_nlink on ppc64 from
a long to a short, resulting in boot failures.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-02 10:44:11 -04:00
Geert Uytterhoeven
a393624969 frv: Remove bogus closing parenthesis
Introduced by commit 6fd84c0831
("TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set")

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-02 10:38:19 -04:00
Bruce Allan
6d7407bfba e1000e: fix Rapid Start Technology support for i217
The definition of I217_PROXY_CTRL must use the BM_PHY_REG() macro instead
of the PHY_REG() macro for PHY page 800 register 70 since it is for a PHY
register greater than the maximum allowed by the latter macro, and fix a
typo setting the I217_MEMPWR register in e1000_suspend_workarounds_ich8lan.

Also for clarity, rename a few defines as bit definitions instead of masks.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-06-02 00:12:33 -07:00
Sebastian Andrzej Siewior
281a8f2462 e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats()
This is another fixup where the data is not transfered into buffer
addressed by skb->data but into a page.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-06-02 00:04:19 -07:00
Linus Torvalds
829f51dbd8 Merge branch 'akpm' (Fixups for Andrew's patchbomb)
Merge fixups for the mac NLS tables from Andrew.

* emailed from Andrew Morton, and one cleanup by me:
  nls: fix (and rename) mac NLS table files and config options
  fs/nls/Makefile: remove bogus CONFIG_ assignments
2012-06-01 19:56:23 -07:00
Linus Torvalds
8b8c0daa2c nls: fix (and rename) mac NLS table files and config options
The config options in the Kconfig file (with _CODEPAGE_ in the name)
didn't match the config option name in the Makefile (no _CODEPAGE_).

And both of them were of the hard-to-read MACXYZZY variety, which made
them hard to parse for normal humans: MACROMAN easily reads as "macro
man", not as "Mac Roman".

So rename the options to be consistent, and be NLS_MAC_xyzzy.  Rename
the files to be mac-xyzzy.c too, and drop the "nls" part entirely (it's
already in the directory name).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-01 19:51:22 -07:00
Andrew Morton
92a8956364 fs/nls/Makefile: remove bogus CONFIG_ assignments
These were debug things which snuck through.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Cc: Vladimir Serbinenko <phcoder@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-01 19:47:26 -07:00
Linus Torvalds
804ce9866d fbdev updates for 3.5
It includes:
 - driver for AUO-K1900 and AUO-K1901 epaper controller
 - large updates for OMAP (e.g. decouple HDMI audio and video)
 - some updates for Exynos and SH Mobile
 - various other small fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPyAhmAAoJECSVL5KnPj1PBcoQAIWftuoXo3sk94f5jKcV4Ucx
 MthEc5iEpMVs8xaEruHHNHXWv8ic0x/PfdC2xrpKOEbNXQcNPlb/QE2xWmBRxmT1
 ucDyu10HJ36jKcwcK4ra5IQwOW+GtbTBEoBZT+WNAjxHZtJmxzjQGM4C12zVQpdJ
 +qV2RP93JmsJoVBL9aKVAg1Ko135LLfD8TcKd+z8TmgFnLfSwKhfl7Jtd2xXwyvz
 /hmW3kJUEnD8E5wuj+/g8sKJhQkGalEiITTqG2j2vJyFgxHSqyLSw8BBixrFW1uT
 B9VnZsHF35ccCo+96UZRH4QsGJTx08+rea/qsv8IMSGczyRp5ey1ufjL+CzKiiIN
 FWfex6fY0HHqZGAopQhjag54e914SIbSxdBwWS/iRrtVt3e9d03BzkhYs4rXl4Ey
 CTC5obzWNTbQ6hLEjgWfVKkKcrF56BnRn3zGPgCTKGp2NK3vODdBkt/EmzUFvCWR
 CcyQhh+PvZzEWp3XsdOGossYs/0aP4bO+7XPGJxZaa3+WVcRaZwAG/uZvJXXBfnp
 DGRFy4wPsTTwKYIx4+t/KrsLtNVKioSMS5GEtuM1YEb8pA7mkUIkqwJv1I261h58
 heTr6vWUsviUqHlKALJ+1CdwWGr3CtktCZssGsSUri61nm8CvlSRn2Nr2aJ/L3RN
 AkemC/33RE5X/+lfkdMx
 =tmIU
 -----END PGP SIGNATURE-----

Merge tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6

Pull fbdev updates from Florian Tobias Schandinat:
 - driver for AUO-K1900 and AUO-K1901 epaper controller
 - large updates for OMAP (e.g. decouple HDMI audio and video)
 - some updates for Exynos and SH Mobile
 - various other small fixes and cleanups

* tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6: (130 commits)
  video: bfin_adv7393fb: Fix cleanup code
  video: exynos_dp: reduce delay time when configuring video setting
  video: exynos_dp: move sw reset prioir to enabling sw defined function
  video: exynos_dp: use devm_ functions
  fb: handle NULL pointers in framebuffer release
  OMAPDSS: HDMI: OMAP4: Update IRQ flags for the HPD IRQ request
  OMAPDSS: Apply VENC timings even if panel is disabled
  OMAPDSS: VENC/DISPC: Delay dividing Y resolution for managers connected to VENC
  OMAPDSS: DISPC: Support rotation through TILER
  OMAPDSS: VRFB: remove compiler warnings when CONFIG_BUG=n
  OMAPFB: remove compiler warnings when CONFIG_BUG=n
  OMAPDSS: remove compiler warnings when CONFIG_BUG=n
  OMAPDSS: DISPC: fix usage of dispc_ovl_set_accu_uv
  OMAPDSS: use DSI_FIFO_BUG workaround only for manual update displays
  OMAPDSS: DSI: Support command mode interleaving during video mode blanking periods
  OMAPDSS: DISPC: Update Accumulator configuration for chroma plane
  drivers/video: fsl-diu-fb: don't initialize the THRESHOLDS registers
  video: exynos mipi dsi: support reverse panel type
  video: exynos mipi dsi: Properly interpret the interrupt source flags
  video: exynos mipi dsi: Avoid races in probe()
  ...
2012-06-01 16:57:51 -07:00
Linus Torvalds
f5e7e844a5 - More robust parsing especially of xattr data in JFFS2
- Updates to mxc_nand and gpmi drivers to support new boards and device tree
  - Improve consistency of information about ECC strength in NAND devices
  - Clean up partition handling of plat_nand
  - Support NAND drivers without dedicated access to OOB area
  - BCH hardware ECC support for OMAP
  - Other fixes and cleanups, and a few new device IDs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iEYEABECAAYFAk/JG1wACgkQdwG7hYl686M80wCglN4kutx20j+KJWuZofkr9Hog
 weEAoI4jrqEWEdW9EcT2CIWQw7eG+1v+
 =7tdo
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd

Pull mtd update from David Woodhouse:
 - More robust parsing especially of xattr data in JFFS2
 - Updates to mxc_nand and gpmi drivers to support new boards and device tree
 - Improve consistency of information about ECC strength in NAND devices
 - Clean up partition handling of plat_nand
 - Support NAND drivers without dedicated access to OOB area
 - BCH hardware ECC support for OMAP
 - Other fixes and cleanups, and a few new device IDs

Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to
added include files next to each other.

* tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits)
  mtd: mxc_nand: move ecc strengh setup before nand_scan_tail
  mtd: block2mtd: fix recursive call of mtd_writev
  mtd: gpmi-nand: define ecc.strength
  mtd: of_parts: fix breakage in Kconfig
  mtd: nand: fix scan_read_raw_oob
  mtd: docg3 fix in-middle of blocks reads
  mtd: cfi_cmdset_0002: Slight cleanup of fixup messages
  mtd: add fixup for S29NS512P NOR flash.
  jffs2: allow to complete xattr integrity check on first GC scan
  jffs2: allow to discriminate between recoverable and non-recoverable errors
  mtd: nand: omap: add support for hardware BCH ecc
  ARM: OMAP3: gpmc: add BCH ecc api and modes
  mtd: nand: check the return code of 'read_oob/read_oob_raw'
  mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw'
  mtd: m25p80: Add support for Winbond W25Q80BW
  jffs2: get rid of jffs2_sync_super
  jffs2: remove unnecessary GC pass on sync
  jffs2: remove unnecessary GC pass on umount
  jffs2: remove lock_super
  mtd: gpmi: add gpmi support for mx6q
  ...
2012-06-01 16:55:42 -07:00
Linus Torvalds
48445159e9 Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform driver updates from Matthew Garrett:
 "Some significant improvements for the Sony driver on newer machines,
  but other than that mostly just minor fixes and a patch to remove the
  broken rfkill code from the Dell driver."

* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86: (35 commits)
  apple-gmux: Fix up the suspend/resume patch
  dell-laptop: Remove rfkill code
  toshiba_acpi: Fix mis-merge
  dell-laptop: Add touchpad led support for Dell V3450
  acer-wmi: add 3 laptops to video backlight vendor mode quirk table
  sony-laptop: add touchpad enable/disable function
  sony-laptop: add missing Fn key combos for 0x100 handlers
  sony-laptop: add support for more WWAN modems
  sony-laptop: new keyboard backlight handle
  sony-laptop: add high speed battery charging function
  sony-laptop: support automatic resume on lid open
  sony-laptop: adjust error handling in finding SNC handles
  sony-laptop: add thermal profiles support
  sony-laptop: support battery care functions
  sony-laptop: additional debug statements
  sony-laptop: improve SNC initialization and acpi notify callback code
  sony-laptop: use kstrtoul to parse sysfs values
  sony-laptop: generalise ACPI calls into SNC functions
  sony-laptop: fix return path when no ACPI buffer is allocated
  sony-laptop: use soft rfkill status stored in hw
  ...
2012-06-01 16:51:37 -07:00
Linus Torvalds
af4f8ba31a Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
Pull slab updates from Pekka Enberg:
 "Mainly a bunch of SLUB fixes from Joonsoo Kim"

* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  slub: use __SetPageSlab function to set PG_slab flag
  slub: fix a memory leak in get_partial_node()
  slub: remove unused argument of init_kmem_cache_node()
  slub: fix a possible memory leak
  Documentations: Fix slabinfo.c directory in vm/slub.txt
  slub: fix incorrect return type of get_any_partial()
2012-06-01 16:50:23 -07:00
H. Peter Anvin
40b46a7d29 Merge remote-tracking branch 'rostedt/tip/perf/urgent-2' into x86-urgent-for-linus 2012-06-01 15:55:31 -07:00
Linus Torvalds
efff0471b0 Merge branch 'ux500/hickup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm fixes for ux500 mismerge mishap from Arnd Bergmann:
 "The device tree conversion for arm/ux500 in 3.5 turns out to be
  incomplete because of a mismerge done by Linus Walleij that I failed
  to notice early enough and that Lee Jones as the original author of
  those patches did not manage to fix during the -next cycle.  While we
  originally to get a much larger set of ux500 device tree enablement
  patches merged, this did not happen in time.

  After some discussion at Linaro Connect conference this week, Lee has
  been able to do damage control and provide a series to put the broken
  platform back into usable shape for both DT and non-DT based booting.

  This series has not been part of linux-next and is based on top of the
  current state of the upstream kernel rather than an -rc, but this is
  the best we could manage given the earlier breakage."

* 'ux500/hickup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: ux500: Enable probing of pinctrl through Device Tree
  ARM: ux500: Add support for ab8500 regulators into the Device Tree
  ARM: ux500: Provide regulator support for SMSC911x via Device Tree
  ARM: ux500: Allow PRCMU regulator to be probed during a DT enabled boot
  ARM: ux500: Apply db8500-prcmu regulator information to db8500 Device Tree
  ARM: ux500: Only initialise STE's UIBs on boards which support them
  ARM: ux500: Disable platform setup of the ab8500 when DT is enabled
  ARM: ux500: Use correct format for dynamic IRQ assignment
  ARM: ux500: Re-enable SMSC911x platform code registration during non-DT boots
  ARM: ux500: PRCMU related configuration and layout corrections for Device Tree
  ARM: ux500: Remove DB8500 PRCMU platform registration when DT is enabled
  ARM: ux500: Disable SMSC911x platform code registration when DT is enabled
  ARM: ux500: New DT:ed u8500_init_devices for one-by-one device enablement
  ARM: ux500: New DT:ed snowball_platform_devs for one-by-one device enablement
  pinctrl-nomadik: Allow Device Tree driver probing
2012-06-01 15:46:46 -07:00
Devendra Naga
ad1be8d345 r8169: call netif_napi_del at errpaths and at driver unload
when register_netdev fails, the init'ed NAPIs by netif_napi_add must be
deleted with netif_napi_del, and also when driver unloads, it should
delete the NAPI before unregistering netdevice using unregister_netdev.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-01 18:44:32 -04:00
Linus Torvalds
3ded7acfdd Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "A bunch of fixes:
   - vmware memory corruption
   - ttm spinlock balance
   - cirrus/mgag200 work in the presence of efifb
  and finally Alex and Jerome managed to track down a magic set of bits
  that on certain rv740 and evergreen cards allow the correct use of the
  complete set of render backends, this makes the cards operate
  correctly in a number of scenarios we had issues in before, it also
  manages to boost speed on benchmarks my large amounts on these
  specific gpus."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/edid: Make the header fixup threshold tunable
  drm/radeon: fix regression in UMS CS ioctl
  drm/vmwgfx: Fix nasty write past alloced memory area
  drm/ttm: Fix spinlock imbalance
  drm/radeon: fixup tiling group size and backendmap on r6xx-r9xx (v4)
  drm/radeon: fix HD6790, HD6570 backend programming
  drm/radeon: properly program gart on rv740, juniper, cypress, barts, hemlock
  drm/radeon: fix bank information in tiling config
  drm/mgag200: kick off conflicting framebuffers earlier.
  drm/cirrus: kick out conflicting framebuffers earlier
  cirrus: avoid crash if driver fails to load
2012-06-01 15:40:29 -07:00
Linus Torvalds
37b22400f8 Sound fixes for 3.5-rc1
Just a few trivial driver-specific fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQIcBAABAgAGBQJPyGJVAAoJEGwxgFQ9KSmktdIQAJ+P5L4imwwNgU2jPyT+Qq4B
 bolQsPCmKFCskV8QkZxhVCzQoKuOyfQZDqhtFgEO4ALN9EHpHFYCFugnuwpZ805C
 D7ToKhsNgAkS7S8s/fyDof9JwBbD0e/c9k+GqeR5CmhMUqhmPwQD/hxAawE022UC
 jOqX5ZSEOMkMZDKLMo1Ss8C1pwuU0J+F2U0C4471dopM8gTnwrvwHXGYbfxkQtmH
 z+eCk13YOO5ZjP4wOVftk07VrF1aFtjbbet53/rN9SBURCxLavA3rvLU9aLY2m0K
 Jrx3NWq2qgdcsExqcUCU+T8Znyn5xo6UclXiCm36VxEG6TYgjxKuhewNQY/lCWhu
 4I+hNzu7oI6kPniok3IYarRFG7lfHyeY942ZfdOXx0CBOV/iy5o7nITXk7k22t9P
 MPXa8Nv+mlp7vyLtWEQKZd5jBobCYnC6ERJB86lQ+/TI1I1+/L6REkENfKEfHmGU
 sO4qcDhBb9UEJRg3s1OQWuTJ7MKXdL6rF6UY3xuuoVgwA+bYNXQhfN/cIn0c+QFS
 IvnjTtEikG7h5h/e1WtcdeA+jEDVQOXRAqYKFnCqXwkVfjbDy/gBp1oqW5d/Sb90
 Dt3okYpNiPtIOv/1cKCXbURS0r+pdA7bwW2bebte72OwCGDgerJisdxxLhd6gbsF
 virl4c4jGnjUy8kS9Ftb
 =Oc6V
 -----END PGP SIGNATURE-----

Merge tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just a few trivial driver-specific fixes."

* tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hdspm - Work around broken DDS value on PCI RME MADI
  ALSA: usb-audio: fix rate_list memory leak
  ASoC: fsi: bugfix: ensure dma is terminated
  ASoC: fsi: bugfix: correct dma area
  ASoC: fsi: bugfix: enable master clock control on DMA stream
  ASoC: imx-ssi: Use clk_prepare_enable/clk_disable_unprepare
2012-06-01 15:39:26 -07:00
H.J. Lu
bad1a753d4 x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32
When I added x32 ptrace to 3.4 kernel, I also include PTRACE_ARCH_PRCTL
support for x32 GDB  For ARCH_GET_FS/GS, it takes a pointer to int64.  But
at user level, ARCH_GET_FS/GS takes a pointer to int32.  So I have to add
x32 ptrace to glibc to handle it with a temporary int64 passed to kernel and
copy it back to GDB as int32.  Roland suggested that PTRACE_ARCH_PRCTL
is obsolete and x32 GDB should use fs_base and gs_base fields of
user_regs_struct instead.

Accordingly, remove PTRACE_ARCH_PRCTL completely from the x32 code to
avoid possible memory overrun when pointer to int32 is passed to
kernel.

Link: http://lkml.kernel.org/r/CAMe9rOpDzHfS7NH7m1vmD9QRw8SSj4Sc%2BaNOgcWm_WJME2eRsQ@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org> v3.4
2012-06-01 13:54:21 -07:00
Sascha Hauer
4a43faf54e mtd: mxc_nand: move ecc strengh setup before nand_scan_tail
Since commit 6a918bade9, the mxc_nand driver
fails with:

Driver must set ecc.strength when using hardware ECC

This is because nand_scan_tail checks for correct ecc strength
settings, so we must set them up before nand_scan_tail.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: stable@vger.kernel.org [3.4+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-06-01 20:23:29 +01:00
Gabor Juhos
2e24e32e27 mtd: block2mtd: fix recursive call of mtd_writev
The 'mtd_writev' interface calls the function assigned
to the '_write' field of a given mtd device if that is
not NULL. The block2mtd driver sets the '_writev' field
to the 'mtd_writev' function itself and thus causes a
endless loop.

This is caused by 1dbebd3256
(mtd: harmonize mtd_writev usage).

Remove the assignment from the block2mtd driver to fix the
issue.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: stable@kernel.org [3.3+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-06-01 20:23:19 +01:00
Marek Vasut
5636ce0f07 mtd: gpmi-nand: define ecc.strength
Fix an issue which was introduced by the recent addition of ecc.strength.

The ecc.strength wasn't set in gpmi-nand, resulting in the following crash:
[    2.550000] kernel BUG at drivers/mtd/nand/nand_base.c:3347!
...
[    2.550000] [<c020841c>] (nand_scan_tail+0x328/0x650) from [<c02f68e0>] (gpmi_nand_probe+0x43c/0x5a4)
[    2.550000] [<c02f68e0>] (gpmi_nand_probe+0x43c/0x5a4) from [<c01f6618>] (platform_drv_probe+0x14/0x18)
[    2.550000] [<c01f6618>] (platform_drv_probe+0x14/0x18) from [<c01f55b0>] (driver_probe_device+0x74/0x1fc)
[    2.550000] [<c01f55b0>] (driver_probe_device+0x74/0x1fc) from [<c01f57cc>] (__driver_attach+0x94/0x98)
[    2.550000] [<c01f57cc>] (__driver_attach+0x94/0x98) from [<c01f3d40>] (bus_for_each_dev+0x50/0x80)
[    2.550000] [<c01f3d40>] (bus_for_each_dev+0x50/0x80) from [<c01f4e18>] (bus_add_driver+0x188/0x25c)
[    2.550000] [<c01f4e18>] (bus_add_driver+0x188/0x25c) from [<c01f5a70>] (driver_register+0x78/0x138)
[    2.550000] [<c01f5a70>] (driver_register+0x78/0x138) from [<c043dc7c>] (gpmi_nand_init+0xc/0x30)
[    2.550000] [<c043dc7c>] (gpmi_nand_init+0xc/0x30) from [<c0008824>] (do_one_initcall+0x108/0x17c)
[    2.550000] [<c0008824>] (do_one_initcall+0x108/0x17c) from [<c042a8b8>] (kernel_init+0xfc/0x1bc)
[    2.550000] [<c042a8b8>] (kernel_init+0xfc/0x1bc) from [<c000fab4>] (kernel_thread_exit+0x0/0x8)

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-06-01 20:22:21 +01:00
Matthew Garrett
a2f01a8993 apple-gmux: Fix up the suspend/resume patch
I incorporated the wrong version of the suspend/resume patch for gmux,
and so lost David Woodhouse's fix to leave the backlight level unchanged
over suspend/resume. This fixes it up to v2.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
2012-06-01 15:18:52 -04:00
Frank Svendsboe
2e929d001e mtd: of_parts: fix breakage in Kconfig
MTD_OF_PARTS and the default setting is not working due to using 'Y'
instead of 'y', introduced in commit
d6137badef. This made our board, and
possibly other boards using DTS defined partitions and not having
CONFIG_MTD_OF_PARTS=y defined in the defconfig, fail to mount root.

Signed-off-by: Frank Svendsboe <frank.svendsboe@gmail.com>
Cc: stable@kernel.org [3.2+]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-06-01 20:06:59 +01:00
Linus Torvalds
86c47b70f6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull third pile of signal handling patches from Al Viro:
 "This time it's mostly helpers and conversions to them; there's a lot
  of stuff remaining in the tree, but that'll either go in -rc2
  (isolated bug fixes, ideally via arch maintainers' trees) or will sit
  there until the next cycle."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  x86: get rid of calling do_notify_resume() when returning to kernel mode
  blackfin: check __get_user() return value
  whack-a-mole with TIF_FREEZE
  FRV: Optimise the system call exit path in entry.S [ver #2]
  FRV: Shrink TIF_WORK_MASK [ver #2]
  FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions
  new helper: signal_delivered()
  powerpc: get rid of restore_sigmask()
  most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
  set_restore_sigmask() is never called without SIGPENDING (and never should be)
  TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set
  don't call try_to_freeze() from do_signal()
  pull clearing RESTORE_SIGMASK into block_sigmask()
  sh64: failure to build sigframe != signal without handler
  openrisc: tracehook_signal_handler() is supposed to be called on success
  new helper: sigmask_to_save()
  new helper: restore_saved_sigmask()
  new helpers: {clear,test,test_and_clear}_restore_sigmask()
  HAVE_RESTORE_SIGMASK is defined on all architectures now
2012-06-01 11:53:44 -07:00
Eric Dumazet
fff3269907 tcp: reflect SYN queue_mapping into SYNACK packets
While testing how linux behaves on SYNFLOOD attack on multiqueue device
(ixgbe), I found that SYNACK messages were dropped at Qdisc level
because we send them all on a single queue.

Obvious choice is to reflect incoming SYN packet @queue_mapping to
SYNACK packet.

Under stress, my machine could only send 25.000 SYNACK per second (for
200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues.

After patch, not a single SYNACK is dropped.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-01 14:22:11 -04:00
Eric Dumazet
7433819a1e tcp: do not create inetpeer on SYNACK message
Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
larger and larger, using lots of memory and cpu time.

tcp_v4_send_synack()
->inet_csk_route_req()
 ->ip_route_output_flow()
  ->rt_set_nexthop()
   ->rt_init_metrics()
    ->inet_getpeer( create = true)

This is a side effect of commit a4daad6b09 (net: Pre-COW metrics for
TCP) added in 2.6.39

Possible solution :

Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS

Before patch :

# grep peer /proc/slabinfo
inet_peer_cache   4175430 4175430    192   42    2 : tunables    0    0    0 : slabdata  99415  99415      0

Samples: 41K of event 'cycles', Event count (approx.): 30716565122
+  20,24%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_getpeer
+   8,19%      ksoftirqd/0  [kernel.kallsyms]           [k] peer_avl_rebalance.isra.1
+   4,81%      ksoftirqd/0  [kernel.kallsyms]           [k] sha_transform
+   3,64%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_table_lookup
+   2,36%      ksoftirqd/0  [ixgbe]                     [k] ixgbe_poll
+   2,16%      ksoftirqd/0  [kernel.kallsyms]           [k] __ip_route_output_key
+   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] kernel_map_pages
+   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] ip_route_input_common
+   2,01%      ksoftirqd/0  [kernel.kallsyms]           [k] __inet_lookup_established
+   1,83%      ksoftirqd/0  [kernel.kallsyms]           [k] md5_transform
+   1,75%      ksoftirqd/0  [kernel.kallsyms]           [k] check_leaf.isra.9
+   1,49%      ksoftirqd/0  [kernel.kallsyms]           [k] ipt_do_table
+   1,46%      ksoftirqd/0  [kernel.kallsyms]           [k] hrtimer_interrupt
+   1,45%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_alloc
+   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_csk_search_req
+   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] __netif_receive_skb
+   1,16%      ksoftirqd/0  [kernel.kallsyms]           [k] copy_user_generic_string
+   1,15%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_free
+   1,02%      ksoftirqd/0  [kernel.kallsyms]           [k] tcp_make_synack
+   0,93%      ksoftirqd/0  [kernel.kallsyms]           [k] _raw_spin_lock_bh
+   0,87%      ksoftirqd/0  [kernel.kallsyms]           [k] __call_rcu
+   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] rt_garbage_collect
+   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_rules_lookup

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-01 14:22:11 -04:00
Jason Wang
0bc777bca4 8139cp/8139too: terminate the eeprom access with the right opmode
Currently, we terminate the eeprom access through clearing the CS by:

RTL_W8 (Cfg9346, ~EE_CS); or writeb (~EE_CS, ee_addr);

This would left the eeprom into "Config. Register Write Enable:"
state which is not expcted as the highest two bits were set to
0x11 ( expected is the "Normal" mode (0x00)). Solving this by write
0x0 instead of ~EE_CS when terminating the eeprom access.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-01 14:22:11 -04:00
Jason Wang
b01af4579e 8139cp: set ring address before enabling receiver
Currently, we enable the receiver before setting the ring address which could
lead the card DMA into unexpected areas. Solving this by set the ring address
before enabling the receiver.

btw. I find and test this in qemu as I didn't have a 8139cp card in hand. please
review it carefully.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-01 14:22:11 -04:00