Commit Graph

245210 Commits

Author SHA1 Message Date
Sage Weil
9d6fcb081a ceph: check return value for start_request in writepages
Since we pass the nofail arg, we should never get an error; BUG if we do.
(And fix the function to not return an error if __map_request fails.)

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:05 -07:00
Sage Weil
6b4a3b517a ceph: remove useless check
rc is only ever 0 or negative in this method.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:05 -07:00
Sage Weil
a2a79609c0 libceph: add missing breaks in addr_set_port
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:05 -07:00
Sage Weil
0417788226 libceph: fix TAG_WAIT case
If we get a WAIT as a client something went wrong; error out.  And don't
fall through to an unrelated case.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:04 -07:00
Sage Weil
da39822c65 ceph: fix broken comparison in readdir loop
Both off and fi->offset are unsigned, so the difference is always >= 0.
Compare them directly instead of the sign of the difference.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:04 -07:00
Sage Weil
31456665a0 libceph: fix osdmap timestamp assignment
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:03 -07:00
Sage Weil
3540303f87 ceph: fix rare potential cap leak
If we grab new_cap, retake the lock, and find we already have a cap now
for the given mds, release new_cap.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:03 -07:00
Sage Weil
12a2f643b0 libceph: use snprintf for unknown addrs
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:03 -07:00
Sage Weil
2dab036b8c libceph: use snprintf for formatting object name
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:02 -07:00
Sage Weil
ae59808301 ceph: use snprintf for dirstat content
We allocate a buffer for rstats if the dirstat option is enabled.  Use
snprintf.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:02 -07:00
Sage Weil
e8f54ce169 libceph: fix uninitialized value when no get_authorizer method is set
If there is no get_authorizer method we set the out_kvec to a bogus
pointer.  The length is also zero in that case, so it doesn't much matter,
but it's better not to add the empty item in the first place.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:02 -07:00
Sage Weil
1b36698577 libceph: remove unused variable
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:24:17 -07:00
Sage Weil
0da5d70369 libceph: handle connection reopen race with callbacks
If a connection is closed and/or reopened (ceph_con_close, ceph_con_open)
it can race with a callback.  con_work does various state checks for
closed or reopened sockets at the beginning, but drops con->mutex before
making callbacks.  We need to check for state bit changes after retaking
the lock to ensure we restart con_work and execute those CLOSED/OPENING
tests or else we may end up operating under stale assumptions.

In Jim's case, this was causing 'bad tag' errors.

There are four cases where we re-take the con->mutex inside con_work: catch
them all and return EAGAIN from try_{read,write} so that we can restart
con_work.

Reported-by: Jim Schutt <jaschut@sandia.gov>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:21:05 -07:00
Sage Weil
3b66378034 ceph: take reference on mds request r_unsafe_dir
We put ourselves on an inode list for the parent directory of metadata
operations so that an fsync on the directory will wait for metadata updates
to commit to disk.  We weren't holding a reference to that directory,
however, and under certain workloads (fsstress in this case) the directory
can go away.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:20:07 -07:00
Linus Torvalds
61c4f2c81c Linux 2.6.39 2011-05-18 21:06:34 -07:00
Linus Torvalds
3f80fbff5f Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  configfs: Fix race between configfs_readdir() and configfs_d_iput()
  configfs: Don't try to d_delete() negative dentries.
  ocfs2/dlm: Target node death during resource migration leads to thread spin
  ocfs2: Skip mount recovery for hard-ro mounts
  ocfs2/cluster: Heartbeat mismatch message improved
  ocfs2/cluster: Increase the live threshold for global heartbeat
  ocfs2/dlm: Use negotiated o2dlm protocol version
  ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes.
  ocfs2: Initialize data_ac (might be used uninitialized)
2011-05-18 16:50:28 -07:00
Linus Torvalds
fce519588a Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  drivercore: revert addition of of_match to struct device
  of: fix race when matching drivers
2011-05-18 13:25:57 -07:00
Linus Torvalds
7103dbed8e Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Kludge IP27 build for 2.6.39.
  MIPS: AR7: Fix GPIO register size for Titan variant.
  MIPS: Fix duplicate invocation of notify_die.
  MIPS: RB532: Fix iomap resource size miscalculation.
2011-05-18 13:21:43 -07:00
Grant Likely
b1608d69cb drivercore: revert addition of of_match to struct device
Commit b826291c, "drivercore/dt: add a match table pointer to struct
device" added an of_match pointer to struct device to cache the
of_match_table entry discovered at driver match time.  This was unsafe
because matching is not an atomic operation with probing a driver.  If
two or more drivers are attempted to be matched to a driver at the
same time, then the cached matching entry pointer could get
overwritten.

This patch reverts the of_match cache pointer and reworks all users to
call of_match_device() directly instead.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-18 12:32:23 -06:00
Milton Miller
01294d8262 of: fix race when matching drivers
If two drivers are probing devices at the same time, both will write
their match table result to the dev->of_match cache at the same time.

Only write the result if the device matches.

In a thread titled "SBus devices sometimes detected, sometimes not",
Meelis reported his SBus hme was not detected about 50% of the time.
From the debug suggested by Grant it was obvious another driver matched
some devices between the call to match the hme and the hme discovery
failling.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Milton Miller <miltonm@bga.com>
[grant.likely: modified to only call of_match_device() once]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-18 10:19:36 -06:00
Linus Torvalds
a2b9c1f620 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: don't delay blk_run_queue_async
  scsi: remove performance regression due to async queue run
  blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup
  block: rescan partitions on invalidated devices on -ENOMEDIA too
  cdrom: always check_disk_change() on open
  block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
2011-05-18 06:49:02 -07:00
Ralf Baechle
a5602a3273 MIPS: Kludge IP27 build for 2.6.39.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-05-18 14:18:27 +01:00
Florian Fainelli
3e9957b486 MIPS: AR7: Fix GPIO register size for Titan variant.
The 'size' variable contains the correct register size for both AR7
and Titan, but we never used it to ioremap the correct register size.
This problem only shows up on Titan.

[ralf@linux-mips.org: Fixed the fix.  The original patch as in patchwork
recognizes the problem correctly then fails to fix it ...]

Reported-by: Alexander Clouter <alex@digriz.org.uk>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/2380/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-05-18 14:18:27 +01:00
Ralf Baechle
10423c91ff MIPS: Fix duplicate invocation of notify_die.
Initial patch by Yury Polyanskiy <ypolyans@princeton.edu>.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2373/
2011-05-18 14:18:26 +01:00
Ralf Baechle
3436830af5 MIPS: RB532: Fix iomap resource size miscalculation.
This is the MIPS portion of Joe Perches <joe@perches.com>'s
https://patchwork.linux-mips.org/patch/2172/ which seems to have been
lost in time and space.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-05-18 14:18:26 +01:00
Joel Becker
24307aa1e7 configfs: Fix race between configfs_readdir() and configfs_d_iput()
configfs_readdir() will use the existing inode numbers of inodes in the
dcache, but it makes them up for attribute files that aren't currently
instantiated.  There is a race where a closing attribute file can be
tearing down at the same time as configfs_readdir() is trying to get its
inode number.

We want to get the inode number of open attribute files, because they
should match while instantiated.  We can't lock down the transition
where dentry->d_inode is set to NULL, so we just check for NULL there.
We can, however, ensure that an inode we find isn't iput() in
configfs_d_iput() until after we've accessed it.

Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-05-18 04:08:16 -07:00
Joel Becker
df7f99670a configfs: Don't try to d_delete() negative dentries.
When configfs is faking mkdir() on its subsystem or default group
objects, it starts by adding a negative dentry.  It then tries to
instantiate the group.  If that should fail, it must clean up after
itself.

I was using d_delete() here, but configfs_attach_group() promises to
return an empty dentry on error.  d_delete() explodes with the entry
dentry.  Let's try d_drop() instead.  The unhashing is what we want for
our dentry.

Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-05-18 03:30:58 -07:00
Shaohua Li
3ec717b7ca block: don't delay blk_run_queue_async
Let's check a scenario:
1. blk_delay_queue(q, SCSI_QUEUE_DELAY);
2. blk_run_queue_async();
the second one will became a noop, because q->delay_work already has
WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after
SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed
work runs immediately.

Fix this by doing a cancel on potentially pending delayed work
before queuing an immediate run of the workqueue.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-18 12:24:03 +02:00
Linus Torvalds
2e9521fd65 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] V4L: soc-camera: regression fix: calculate .sizeimage in soc_camera.c
  [media] v4l2-subdev: fix broken subdev control enumeration
  [media] Fix cx88 remote control input
  [media] v4l: Release module if subdev registration fails
2011-05-18 03:16:38 -07:00
Linus Torvalds
39dcfa552c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, AMD: Fix ARAT feature setting again
  Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors"
  x86, apic: Fix spurious error interrupts triggering on all non-boot APs
  x86, mce, AMD: Fix leaving freed data in a list
  x86: Fix UV BAU for non-consecutive nasids
  x86, UV: Fix NMI handler for UV platforms
2011-05-18 03:14:34 -07:00
Linus Torvalds
7f12b72bd8 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf evlist: Fix per thread mmap setup
  perf tools: Honour the cpu list parameter when also monitoring a thread list
  kprobes, x86: Disable irqs during optimized callback
2011-05-18 03:13:46 -07:00
Linus Torvalds
8864f5ee12 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: fix cifsConvertToUCS() for the mapchars case
  cifs: add fallback in is_path_accessible for old servers
2011-05-18 03:13:11 -07:00
Randy Dunlap
f12a20fc9b procfs: add stub for proc_mkdir_mode()
Provide a stub for proc_mkdir_mode() when CONFIG_PROC_FS is not
enabled, just like the stub for proc_mkdir().

Fixes this linux-next build error:

  drivers/net/wireless/airo.c:4504: error: implicit declaration of function 'proc_mkdir_mode'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-18 02:55:24 -07:00
Richard Weinberger
b2db21997f um: fix abort
os_dump_core() uses abort() to terminate UML in case of an fatal error.

glibc's abort() calls raise(SIGABRT) which makes use of tgkill().
tgkill() has no effect within UML's kernel threads because they are not
pthreads.  As fallback abort() executes an invalid instruction to
terminate the process.  Therefore UML gets killed by SIGSEGV and leaves a
ugly log entry in the host's kernel ring buffer.

To get rid of this we use our own abort routine.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-18 02:55:23 -07:00
KAMEZAWA Hiroyuki
d6c438b6cd memcg: fix zone congestion
ZONE_CONGESTED should be a state of global memory reclaim.  If not, a busy
memcg sets this and give unnecessary throttoling in wait_iff_congested()
against memory recalim in other contexts.  This makes system performance
bad.

I'll think about "memcg is congested!" flag is required or not, later.
But this fix is required first.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Ying Han <yinghan@google.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-18 02:55:23 -07:00
Axel Lin
d5f33d45e4 drivers/leds/leds-lm3530.c: add MODULE_DEVICE_TABLE
Adding the necessary MODULE_DEVICE_TABLE() information allows the driver
to be automatically loaded by udev.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Shreshtha Kumar SAHU <shreshthakumar.sahu@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-18 02:55:22 -07:00
Alexandre Bounine
0bf2461fdd rapidio: fix default routing initialization
Fix switch initialization to ensure that all switches have default routing
disabled.  This guarantees that no unexpected RapidIO packets arrive to
the default port set by reset and there is no default routing destination
until it is properly configured by software.

This update also unifies handling of unmapped destinations by tsi57x, IDT
Gen1 and IDT Gen2 switches.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: <stable@kernel.org>		[2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-18 02:55:22 -07:00
Jeff Layton
11379b5e33 cifs: fix cifsConvertToUCS() for the mapchars case
As Metze pointed out, commit 84cdf74e broke mapchars option:

    Commit "cifs: fix unaligned accesses in cifsConvertToUCS"
    (84cdf74e80) does multiple steps
    in just one commit (moving the function and changing it without
    testing).

    put_unaligned_le16(temp, &target[j]); is never called for any
    codepoint the goes via the 'default' switch statement. As a result
    we put just zero (or maybe uninitialized) bytes into the target
    buffer.

His proposed patch looks correct, but doesn't apply to the current head
of the tree. This patch should also fix it.

Cc: <stable@kernel.org> # .38.x: 581ade4: cifs: clean up various nits in unicode routines (try #2)
Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-05-17 20:54:04 +00:00
Jeff Layton
221d1d7972 cifs: add fallback in is_path_accessible for old servers
The is_path_accessible check uses a QPathInfo call, which isn't
supported by ancient win9x era servers. Fall back to an older
SMBQueryInfo call if it fails with the magic error codes.

Cc: stable@kernel.org
Reported-and-Tested-by: Sandro Bonazzola <sandro.bonazzola@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-05-17 18:51:14 +00:00
Linus Torvalds
a085963a27 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tick: Clear broadcast active bit when switching to oneshot
  rtc: mc13xxx: Don't call rtc_device_register while holding lock
  rtc: rp5c01: Initialize drvdata before registering device
  rtc: pcap: Initialize drvdata before registering device
  rtc: msm6242: Initialize drvdata before registering device
  rtc: max8998: Initialize drvdata before registering device
  rtc: max8925: Initialize drvdata before registering device
  rtc: m41t80: Initialize clientdata before registering device
  rtc: ds1286: Initialize drvdata before registering device
  rtc: ep93xx: Initialize drvdata before registering device
  rtc: davinci: Initialize drvdata before registering device
  rtc: mxc: Initialize drvdata before registering device
  clocksource: Install completely before selecting
2011-05-17 08:02:04 -07:00
Borislav Petkov
14fb57dccb x86, AMD: Fix ARAT feature setting again
Trying to enable the local APIC timer on early K8 revisions
uncovers a number of other issues with it, in conjunction with
the C1E enter path on AMD. Fixing those causes much more churn
and troubles than the benefit of using that timer brings so
don't enable it on K8 at all, falling back to the original
functionality the kernel had wrt to that.

Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
Cc: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Nick Bowler <nbowler@elliptictech.com>
Cc: Joerg-Volker-Peetz <jvpeetz@web.de>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1305636919-31165-3-git-send-email-bp@amd64.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-17 15:28:34 +02:00
Borislav Petkov
328935e634 Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors"
This reverts commit e20a2d205c, as it crashes
certain boxes with specific AMD CPU models.

Moving the lower endpoint of the Erratum 400 check to accomodate
earlier K8 revisions (A-E) opens a can of worms which is simply
not worth to fix properly by tweaking the errata checking
framework:

* missing IntPenging MSR on revisions < CG cause #GP:

http://marc.info/?l=linux-kernel&m=130541471818831

* makes earlier revisions use the LAPIC timer instead of the C1E
idle routine which switches to HPET, thus not waking up in
deeper C-states:

http://lkml.org/lkml/2011/4/24/20

Therefore, leave the original boundary starting with K8-revF.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-17 15:28:33 +02:00
Jens Axboe
9937a5e2f3 scsi: remove performance regression due to async queue run
Commit c21e6beb removed our queue request_fn re-enter
protection, and defaulted to always running the queues from
kblockd to be safe. This was a known potential slow down,
but should be safe.

Unfortunately this is causing big performance regressions for
some, so we need to improve this logic. Looking into the details
of the re-enter, the real issue is on requeue of requests.

Requeue of requests upon seeing a BUSY condition from the device
ends up re-running the queue, causing traces like this:

scsi_request_fn()
        scsi_dispatch_cmd()
                scsi_queue_insert()
                        __scsi_queue_insert()
                                scsi_run_queue()
					scsi_request_fn()
						...

potentially causing the issue we want to avoid. So special
case the requeue re-run of the queue, but improve it to offload
the entire run of local queue and starved queue from a single
workqueue callback. This is a lot better than potentially
kicking off a workqueue run for each device seen.

This also fixes the issue of the local device going into recursion,
since the above mentioned commit never moved that queue run out
of line.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-17 11:04:44 +02:00
Linus Torvalds
c1d10d18c5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: Change netdev_fix_features messages loglevel
  vmxnet3: Fix inconsistent LRO state after initialization
  sfc: Fix oops in register dump after mapping change
  IPVS: fix netns if reading ip_vs_* procfs entries
  bridge: fix forwarding of IPv6
2011-05-16 18:38:08 -07:00
Linus Torvalds
477de0de44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  Revert "mmc: fix a race between card-detect rescan and clock-gate work instances"
2011-05-16 18:36:47 -07:00
Randy Dunlap
b5e6ab589d mm: fix kernel-doc warning in page_alloc.c
Fix new kernel-doc warning in mm/page_alloc.c:

  Warning(mm/page_alloc.c:2370): No description found for parameter 'nid'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-16 18:34:30 -07:00
Yinghai Lu
93d2175d3d PCI: Clear bridge resource flags if requested size is 0
During pci remove/rescan testing found:

  pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
  pci 0000:c0:03.0:   bridge window [io  0x1000-0x0fff]
  pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
  pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
  pci 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
  pci 0000:c0:03.0: Error enabling bridge (-22), continuing
  pci 0000:c0:03.0: enabling bus mastering
  pci 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
  pcieport: probe of 0000:c0:03.0 failed with error -22

This bug was caused by commit c8adf9a3e8 ("PCI: pre-allocate
additional resources to devices only after successful allocation of
essential resources.")

After that commit, pci_hotplug_io_size is changed to additional_io_size
from minium size.  So it will not go through resource_size(res) != 0
path, and will not be reset.

The root cause is: pci_bridge_check_ranges will set RESOURCE_IO flag for
pci bridge, and later if children do not need IO resource.  those bridge
resources will not need to be allocated.  but flags is still there.
that will confuse the the pci_enable_bridges later.

related code:

   static void assign_requested_resources_sorted(struct resource_list *head,
                                    struct resource_list_x *fail_head)
   {
           struct resource *res;
           struct resource_list *list;
           int idx;

           for (list = head->next; list; list = list->next) {
                   res = list->res;
                   idx = res - &list->dev->resource[0];
                   if (resource_size(res) && pci_assign_resource(list->dev, idx)) {
   ...
                           reset_resource(res);
                   }
           }
   }

At last, We have to clear the flags in pbus_size_mem/io when requested
size == 0 and !add_head.  becasue this case it will not go through
adjust_resources_sorted().

Just make size1 = size0 when !add_head. it will make flags get cleared.

At the same time when requested size == 0, add_size != 0, will still
have in head and add_list.  because we do not clear the flags for it.

After this, we will get right result:

  pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
  pci 0000:c0:03.0:   bridge window [io  disabled]
  pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
  pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
  pci 0000:c0:03.0: enabling bus mastering
  pci 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: irq 160 for MSI/MSI-X
  pcieport 0000:c0:03.0: Signaling PME through PCIe PME interrupt
  pci 0000:c4:00.0: Signaling PME through PCIe PME interrupt
  pcie_pme 0000:c0:03.0:pcie01: service driver pcie_pme loaded
  aer 0000:c0:03.0:pcie02: service driver aer loaded
  pciehp 0000:c0:03.0:pcie04: Hotplug Controller:

v3: more simple fix. also fix one typo in pbus_size_mem

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-16 18:33:35 -07:00
Thomas Gleixner
07f4beb0b5 tick: Clear broadcast active bit when switching to oneshot
The first cpu which switches from periodic to oneshot mode switches
also the broadcast device into oneshot mode. The broadcast device
serves as a backup for per cpu timers which stop in deeper
C-states. To avoid starvation of the cpus which might be in idle and
depend on broadcast mode it marks the other cpus as broadcast active
and sets the brodcast expiry value of those cpus to the next tick.

The oneshot mode broadcast bit for the other cpus is sticky and gets
only cleared when those cpus exit idle. If a cpu was not idle while
the bit got set in consequence the bit prevents that the broadcast
device is armed on behalf of that cpu when it enters idle for the
first time after it switched to oneshot mode.

In most cases that goes unnoticed as one of the other cpus has usually
a timer pending which keeps the broadcast device armed with a short
timeout. Now if the only cpu which has a short timer active has the
bit set then the broadcast device will not be armed on behalf of that
cpu and will fire way after the expected timer expiry. In the case of
Christians bug report it took ~145 seconds which is about half of the
wrap around time of HPET (the limit for that device) due to the fact
that all other cpus had no timers armed which expired before the 145
seconds timeframe.

The solution is simply to clear the broadcast active bit
unconditionally when a cpu switches to oneshot mode after the first
cpu switched the broadcast device over. It's not idle at that point
otherwise it would not be executing that code.

[ I fundamentally hate that broadcast crap. Why the heck thought some
  folks that when going into deep idle it's a brilliant concept to
  switch off the last device which brings the cpu back from that
  state? ]

Thanks to Christian for providing all the valuable debug information!

Reported-and-tested-by: Christian Hoffmann <email@christianhoffmann.info>
Cc: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-16 23:35:41 +02:00
Michał Mirosław
6f404e441d net: Change netdev_fix_features messages loglevel
Those reduced to DEBUG can possibly be triggered by unprivileged processes
and are nothing exceptional. Illegal checksum combinations can only be
caused by driver bug, so promote those messages to WARN.

Since GSO without SG will now only cause DEBUG message from
netdev_fix_features(), remove the workaround from register_netdevice().

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 15:14:21 -04:00
Thomas Jarosch
ebde6f8acb vmxnet3: Fix inconsistent LRO state after initialization
During initialization of vmxnet3, the state of LRO
gets out of sync with netdev->features.

This leads to very poor TCP performance in a IP forwarding
setup and is hitting many VMware users.

Simplified call sequence:
1. vmxnet3_declare_features() initializes "adapter->lro" to true.

2. The kernel automatically disables LRO if IP forwarding is enabled,
so vmxnet3_set_flags() gets called. This also updates netdev->features.

3. Now vmxnet3_setup_driver_shared() is called. "adapter->lro" is still
set to true and LRO gets enabled again, even though
netdev->features shows it's disabled.

Fix it by updating "adapter->lro", too.

The private vmxnet3 adapter flags are scheduled for removal
in net-next, see commit a0d2730c95
"net: vmxnet3: convert to hw_features".

Patch applies to 2.6.37 / 2.6.38 and 2.6.39-rc6.

Please CC: comments.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-16 15:05:23 -04:00