This adds the design document for the ring buffer and also
explains how it is designed to have lockless writes.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch converts the ring buffers into a completely lockless
buffer recording system. The read side still takes locks since
we still serialize readers. But the writers are the ones that
must be lockless (those can happen in NMIs).
The main change is to the "head_page" pointer. We write to the
tail, and read from the head. The "head_page" pointer in the cpu
buffer is now just a reference to where to look. The real head
page is now kept in the head_page->list->prev->next pointer.
That is, in the list head of the previous page we set flags.
The list pages are allocated with an alignment such that the least
significant bits of any pointer into the list are always zero. This
gives us room to store flags in those pointers.
bit 0: set when the page is a head page
bit 1: set when the writer is moving the page (for overwrite mode)
cmpxchg is used to update the pointer.
When the writer wraps the buffer and the tail meets the head,
in overwrite mode, the writer must move the head page forward.
It first uses cmpxchg to change the pointer flag from 1 to 2.
Once this is done, the reader on another CPU will not take the
page from the buffer.
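
The flag scheme above can be sketched in user space with C11 atomics. This
is only an illustration under assumptions, not the kernel code: the names
RB_PAGE_HEAD, RB_PAGE_UPDATE, struct page_link and head_to_update() are
chosen for this example, and atomic_compare_exchange_strong() stands in for
cmpxchg.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values kept in the two low bits of the "next" pointer. */
#define RB_PAGE_HEAD    ((uintptr_t)1)  /* bit 0: next page is the head page */
#define RB_PAGE_UPDATE  ((uintptr_t)2)  /* bit 1: a writer is moving the head */
#define RB_FLAG_MASK    ((uintptr_t)3)

/* Hypothetical list node: a pointer to the next node plus flag bits. */
struct page_link {
    _Atomic uintptr_t next;
};

/* The low two bits are only free if nodes are at least 4-byte aligned. */
_Static_assert(_Alignof(struct page_link) >= 4, "need two free low bits");

/* Strip the flag bits to recover the real pointer. */
static struct page_link *link_ptr(uintptr_t val)
{
    return (struct page_link *)(val & ~RB_FLAG_MASK);
}

/* Extract only the flag bits. */
static uintptr_t link_flags(uintptr_t val)
{
    return val & RB_FLAG_MASK;
}

/*
 * Writer side: try to switch prev->next from "head" (1) to "update" (2).
 * Returns true only if the pointer still carried the head flag.
 */
static bool head_to_update(struct page_link *prev, struct page_link *head)
{
    uintptr_t expected = (uintptr_t)head | RB_PAGE_HEAD;
    uintptr_t desired  = (uintptr_t)head | RB_PAGE_UPDATE;

    return atomic_compare_exchange_strong(&prev->next, &expected, desired);
}
```

A second call to head_to_update() on the same link fails because the flag
is no longer 1, which is the property a reader would rely on before taking
the page.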
The writers need to protect against interrupts (we don't bother with
disabling interrupts because NMIs are allowed to write too).
After the writer sets the pointer flag to 2, it takes care to
manage interrupts coming in. This is described in detail within the
comments of the code.
Changes in version 2:
- Let reader reset entries value of header page.
- Fix tail page passing commit page on reader page test.
- Always increment entries and write counter in rb_tail_page_update
- Add safety check in rb_set_commit_to_write to break out of infinite loop
- add mask in rb_is_reader_page
[ Impact: lock free writing to the ring buffer ]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
This patch changes the ring buffer data pages from using a link list
head pointer, to making each buffer page point to another buffer page
and never back to a "head".
This makes the handling of the ring buffer less complex, since the
traversing of the ring buffer pages no longer needs to account for the
head pointer.
This change also is needed to make the ring buffer lockless.
[
Changes in version 2:
- Added change that Lai Jiangshan mentioned.
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Thu, 11 Jun 2009 11:25:48 +0800
LKML-Reference: <4A30793C.6090208@cn.fujitsu.com>
I'm not sure whether these 4 lines:
bpage = list_entry(pages.next, struct buffer_page, list);
list_del_init(&bpage->list);
cpu_buffer->pages = &bpage->list;
list_splice(&pages, cpu_buffer->pages);
equal to these 2 lines:
cpu_buffer->pages = pages.next;
list_del(&pages);
If there are equivalent, I think the second one
are simpler. It may be not a really necessarily cleanup.
What I asked is: if there are equivalent, could you use these two line:
cpu_buffer->pages = pages.next;
list_del(&pages);
]
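
The equivalence asked about above can be checked with a small user-space
sketch. The helpers below are simplified stand-ins written for this
example (no poisoning, no list_entry()); they only mimic the pointer
updates of the kernel list primitives.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void init_list(struct list_head *h) { h->next = h; h->prev = h; }

/* Append n just before head, i.e. at the tail of the list. */
static void add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink e from its neighbours (like list_del(), without poisoning). */
static void del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

/* Unlink e and make it an empty list of its own (like list_del_init()). */
static void del_init(struct list_head *e) { del(e); init_list(e); }

/* Insert the entries of "list" right after "head" (like list_splice()). */
static void splice(struct list_head *list, struct list_head *head)
{
    struct list_head *first = list->next, *last = list->prev;
    struct list_head *at = head->next;

    if (first == list)      /* nothing to splice */
        return;
    first->prev = head;
    head->next = first;
    last->next = at;
    at->prev = last;
}

/* The four-line variant: returns the new cpu_buffer->pages value. */
static struct list_head *detach_long(struct list_head *pages)
{
    struct list_head *first = pages->next;

    del_init(first);
    splice(pages, first);
    return first;
}

/* The two-line variant: returns the new cpu_buffer->pages value. */
static struct list_head *detach_short(struct list_head *pages)
{
    struct list_head *first = pages->next;

    del(pages);
    return first;
}
```

For a non-empty list, both variants leave a ring of pages with no separate
head node, starting at what was the first entry.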
[ Impact: simplify the ring buffer to help make it lockless ]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Make this consistent with the unlock statement. Also fix a
minor typo in debugfs formatting
Signed-off-by: Ben Gamari <bgamari.foss@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
This is quite useful for verifying that objects are actually mapped when
they need to be.
Signed-off-by: Ben Gamari <bgamari.foss@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
This wasn't even used as far as I could tell and will only confuse
people (like me).
Signed-off-by: Ben Gamari <bgamari.foss@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
We did before, in the end -- but it was at the bottom of a long stack of
functions. Add an inline wrapper get_valid_domain_for_dev() which will
use the cached one _first_ and only make the out-of-line call if it's
not already set.
This takes the average time taken for a 1-page intel_map_sg() from 5961
cycles to 4812 cycles on my Lenovo x200s test box -- a modest 20%.
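
As a rough sketch of the idea (not the driver code), an inline wrapper can
test a cached pointer before falling back to the out-of-line lookup. The
structures, the cached_domain field and lookup_domain_slow() are
hypothetical names used only for this example.

```c
#include <stddef.h>

struct domain { int id; };

/* Hypothetical device with a per-device cache slot for its domain. */
struct device {
    struct domain *cached_domain;
};

static struct domain the_domain = { .id = 1 };
static int slow_calls;          /* counts trips down the slow path */

/* Stand-in for the long out-of-line lookup; it also fills the cache. */
static struct domain *lookup_domain_slow(struct device *dev)
{
    slow_calls++;
    dev->cached_domain = &the_domain;
    return dev->cached_domain;
}

/* Inline wrapper: use the cached domain first, fall back only on a miss. */
static inline struct domain *get_valid_domain(struct device *dev)
{
    if (dev->cached_domain)
        return dev->cached_domain;
    return lookup_domain_slow(dev);
}
```

Only the first call for a device pays for the slow lookup; later calls
return straight from the inline check.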
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Usually the CRT output gets its modes via the GPIOA port.
However, on the G4X platform we need to probe the possible
ports for DVI-I, which could be wired to GPIOD,
and then fetch the desired EDID. For example, on the DG45ID platform
we successfully fetch the EDID through the GPIOD port.
This fixes freedesktop.org bug #21084.
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
If rix is not found in mi->r[], i will become -1 after the loop. This value
is eventually used to access arrays, so we were accessing arrays with a
negative index, which is obviously not what we want to do. This patch fixes
this potential problem.
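
The failure mode can be illustrated with a minimal sketch. The struct and
function names here are assumptions for the example and do not reproduce
the driver: a backwards search leaves the index at -1 when nothing
matches, so the caller has to check it before indexing an array.

```c
struct rate { int rix; };

/* Return the index of rix in r[0..n-1], or -1 if it is not present. */
static int rix_to_ndx(const struct rate *r, int n, int rix)
{
    int i;

    for (i = n - 1; i >= 0; i--)
        if (r[i].rix == rix)
            break;
    return i;   /* i is -1 here when the loop ran off the front */
}

/* Bump the success counter for rix; refuse to index with -1. */
static int count_success(const struct rate *r, unsigned int *success,
                         int n, int rix)
{
    int ndx = rix_to_ndx(r, n, rix);

    if (ndx < 0)
        return -1;      /* rate not found: do not touch the array */
    success[ndx]++;
    return 0;
}
```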
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The code in cfg80211's cfg80211_bss_update erroneously
grabs a reference to the BSS, which means that it will
never be freed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29, 2.6.30]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix the third (I think) polarity error I accidentally
introduced in the rfkill rewrite to make wireless work
again on (certain?) HP laptops.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We're missing a Kconfig help for the iwmc3200wifi driver.
Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we reclaim the tx desc, we always assume that the
last desc is a holding desc, which is not true, and skip it.
If the tx queue is drained during a channel change, an internal
reset, etc., the last descriptor may not be the holding
descriptor, and we fail to reclaim it. This results in the
following two issues.
1. Tx stuck - We drop all the frames coming from the upper layer
due to a shortage of tx descriptors.
2. Crash - If we fail to reclaim a tx descriptor, we fail to
update the tx BA window with the seq number of the frame
associated with that desc, which, at some point, results in
the following crash due to an assert failure in ath_tx_addto_baw().
This patch fixes these two issues.
kernel BUG at ../drivers/net/wireless/ath/ath9k/xmit.c:180!
[155064.304164] invalid opcode: 0000 [#1] SMP
Call Trace:
[<fbc6d83b>] ? ath9k_tx+0xeb/0x160 [ath9k]
[<fbbc9591>] ? __ieee80211_tx+0x41/0x120 [mac80211]
[<fbbcb5ae>] ? ieee80211_master_start_xmit+0x28e/0x560 [mac80211]
[<c037e501>] ? _spin_lock_irqsave+0x31/0x40
[<c02f347b>] ? dev_hard_start_xmit+0x16b/0x1c0
[<c03058b5>] ? __qdisc_run+0x1b5/0x200
[<fbbcda5a>] ? ieee80211_select_queue+0xa/0x100 [mac80211]
[<c02f53b7>] ? dev_queue_xmit+0x2e7/0x3f0
[<fbbc9b49>] ? ieee80211_subif_start_xmit+0x369/0x7a0 [mac80211]
[<c031bc35>] ? ip_output+0x55/0xb0
[<c02e0188>] ? show_memcpy_count+0x18/0x60
[<c02eb186>] ? __kfree_skb+0x36/0x90
[<c02f2202>] ? dev_queue_xmit_nit+0xd2/0x110
[<c02f347b>] ? dev_hard_start_xmit+0x16b/0x1c0
[<c03058b5>] ? __qdisc_run+0x1b5/0x200
[<c033bca7>] ? arp_create+0x57/0x2a0
[<c02f53b7>] ? dev_queue_xmit+0x2e7/0x3f0
[<c03034a0>] ? eth_header+0x0/0xc0
[<c033b95f>] ? arp_xmit+0x5f/0x70
[<c033bf4f>] ? arp_send+0x5f/0x70
[<c033c8f5>] ? arp_solicit+0x105/0x210
[<c02fa5aa>] ? neigh_timer_handler+0x19a/0x390
[<c013bf88>] ? run_timer_softirq+0x138/0x210
[<c02fa410>] ? neigh_timer_handler+0x0/0x390
[<c02fa410>] ? neigh_timer_handler+0x0/0x390
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix condition in which radio LED did not initialize correctly, and remove
4 compilation warnings.
After the recent changes in rfkill, the radio LED used by b43/b43legacy
did not always initialize correctly.
Both b43 and b43legacy used the deprecated variable radio_enabled in
struct ieee80211_conf.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Don't forget to unlock cfg80211_mutex in one fail path of
nl80211_set_wiphy.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Adding MPU-401 support to cmi8330 driver could cause a regression (non-working
sound) on a system where there is no free IRQ for the MPU-401 device (which
is not very uncommon as this card requires two separate IRQs plus a third one
for MPU-401).
When MPU-401 PnP configuration fails (mostly because of unavailable IRQ), just
ignore MPU-401 and continue without it.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The fixed widget NIDs in patch_via.c seem wrong for some codecs,
and it resulted in the invalid capture source selection.
This patch adds the code to parse the topology instead of using
fixed numbers in order to get the right MUX widget id corresponding
to the ADCs.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The following test script triggers a deadlock on ext2 filesystem:
while true; do quotaon /dev/hda >&/dev/null; usleep $RANDOM; done &
while true; do quotaoff /dev/hda >&/dev/null; usleep $RANDOM; done &
I found there is a potential deadlock between quotaon and quotaoff (or
quotasync). Basically, all of quotactl operations need to be protected by
dqonoff_mutex. vfs_quota_off and vfs_quota_sync also call sb->s_op->quota_write
that needs to grab the i_mutex of the quota file. But in vfs_quota_on_inode
(called from quotaon operation), the current code tries to grab the i_mutex of
the quota file first before getting dqonoff_mutex.
Reverse the order in which we take locks in vfs_quota_on_inode().
Jan Kara: Changed changelog to be more readable, made lockdep happy with
I_MUTEX_QUOTA.
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
The practical values for these limits depend on the design of the
filesystem server so let userspace set them at initialization time.
Signed-off-by: Csaba Henk <csaba@gluster.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
32-bit s390 has efficient support for 64/32-bit conversions, define
KTIME_SCALAR to enable the use of the plain scalar nanosecond based
representation of ktime.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Performance counters need 64 bit atomic operations.
To keep the patch small we use the simple generic atomic64_t implementation.
The native implementation follows with the next kernel.
Fixes this build bug:
In file included from kernel/sched.c:42:
include/linux/perf_counter.h:427: error: expected specifier-qualifier-list before 'atomic64_t'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
From: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provide a __ucmpdi2() helper function on 31 bit so we don't run
into compile errors like this one again and again:
kernel/built-in.o: In function `T.689':
perf_counter.c:(.text+0x56c86): undefined reference to `__ucmpdi2'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add PERF_COUNTER_INDEX_OFFSET define to fix this build bug:
kernel/perf_counter.c: In function 'perf_counter_index':
kernel/perf_counter.c:1889: error: 'PERF_COUNTER_INDEX_OFFSET' undeclared
Same fix as for FRV since s390 doesn't support hw counters.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We always returned -EINVAL when setting of a shutdown action failed. This was
misleading, if for example the hardware did not support the shutdown action.
Now we save each shutdown action's init return code and return it when the
action is being set.
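
A minimal sketch of the approach, assuming a table of actions with an
init callback each. The table contents, the init_rc field and the helper
names are made up for this illustration and are not taken from the
s390 code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct shutdown_action {
    const char *name;
    int (*init)(void);   /* probes whether the action is usable */
    int init_rc;         /* saved return code of init() */
};

static int init_ok(void)          { return 0; }
static int init_unsupported(void) { return -EOPNOTSUPP; }

/* Hypothetical action table: one usable action, one unsupported one. */
static struct shutdown_action actions[] = {
    { "stop", init_ok, 0 },
    { "dump", init_unsupported, 0 },
};
#define N_ACTIONS (sizeof(actions) / sizeof(actions[0]))

static const struct shutdown_action *current_action;

/* Run every init() once and remember its result. */
static void shutdown_actions_init(void)
{
    for (size_t i = 0; i < N_ACTIONS; i++)
        actions[i].init_rc = actions[i].init();
}

/* Select an action; report the saved init error instead of a blanket -EINVAL. */
static int set_shutdown_action(const char *name)
{
    for (size_t i = 0; i < N_ACTIONS; i++) {
        if (strcmp(actions[i].name, name) != 0)
            continue;
        if (actions[i].init_rc)
            return actions[i].init_rc;
        current_action = &actions[i];
        return 0;
    }
    return -EINVAL;     /* unknown action name */
}
```

With this layout -EINVAL is left for names that are not known at all,
while a known but unusable action reports why its init failed.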
Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
remove loop, add some debug data and use get_sense function
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this:
drivers/s390/char/monreader.c: In function 'mon_open':
drivers/s390/char/monreader.c:323: warning: passing argument 1 of 'dev_set_drvdata' from incompatible pointer type
include/linux/device.h:457: note: expected 'struct device *' but argument is of type 'struct device **'
drivers/s390/char/monreader.c: In function 'monreader_freeze':
drivers/s390/char/monreader.c:466: warning: passing argument 1 of 'dev_get_drvdata' from incompatible pointer type
include/linux/device.h:452: note: expected 'const struct device *' but argument is of type 'struct device **'
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Define an empty static inline version of sclp_console_pm_event()
to fix the build error below for !SCLP_CONSOLE.
drivers/s390/built-in.o: In function `sclp_rw_pm_event':
sclp_rw.c:(.text+0x12f68): undefined reference to `sclp_console_pm_event'
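
The usual shape of such a stub is sketched below. The enum and its
values are defined here only to keep the example self-contained, and
CONFIG_SCLP_CONSOLE is assumed to be the macro that the SCLP_CONSOLE
option provides; this is an illustration, not the header from the patch.

```c
/* Event type assumed for this sketch. */
enum sclp_pm_event {
    SCLP_PM_EVENT_FREEZE,
    SCLP_PM_EVENT_THAW,
    SCLP_PM_EVENT_RESTORE,
};

#ifdef CONFIG_SCLP_CONSOLE
/* Real implementation lives in the console driver. */
void sclp_console_pm_event(enum sclp_pm_event event);
#else
/* Console not built: callers still compile and link, the call is a no-op. */
static inline void sclp_console_pm_event(enum sclp_pm_event event)
{
    (void)event;
}
#endif

/* Caller that no longer needs its own #ifdef around the console hook. */
static int sclp_rw_pm_event(enum sclp_pm_event event)
{
    sclp_console_pm_event(event);
    return 0;
}
```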
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the output pin is used and EAPD capability is present, turn on
the EAPD bit. This fixes the silent output problem on ASUS laptops
with VT1708S codec.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The ktime_get() functions for GENERIC_TIME=n are still located in
hrtimer.c. Move them to time/timekeeping.c where they belong.
LKML-Reference: <new-submission>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds a new argument to crypto_alloc_instance which
sets aside some space before the instance for use by algorithms
such as shash that place type-specific data before crypto_alg.
For compatibility the function has been renamed so that existing
users aren't affected.
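
The layout idea can be sketched in user space as follows. The struct
definitions and both helper names are hypothetical and much simpler than
the crypto API; the point is only that the caller asks for "head" bytes
in front of the instance and locates the instance at that offset.

```c
#include <stddef.h>
#include <stdlib.h>

struct alg { char name[64]; };
struct instance { struct alg alg; };

/* Allocate zeroed memory with "head" bytes in front of the instance. */
static void *alloc_instance_with_head(size_t head)
{
    return calloc(1, head + sizeof(struct instance));
}

/* The instance starts "head" bytes into the allocation. */
static struct instance *instance_of(void *base, size_t head)
{
    return (struct instance *)((char *)base + head);
}
```

In this sketch struct instance holds only chars, so any head size is
fine; with stricter member types the head size would have to preserve
the instance's alignment.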
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The generic ktime_get function defined in kernel/hrtimer.c is suboptimal
for GENERIC_TIME=y:
0) | ktime_get() {
0) | ktime_get_ts() {
0) | getnstimeofday() {
0) | read_tod_clock() {
0) 0.601 us | }
0) 1.938 us | }
0) | set_normalized_timespec() {
0) 0.602 us | }
0) 4.375 us | }
0) 5.523 us | }
Overall there are two read_seqbegin/read_seqretry loops and a lot of
unnecessary struct timespec calculations. ktime_get returns a nanosecond
value which is the sum of xtime, wall_to_monotonic and the nanosecond
delta from the clock source.
ktime_get can be optimized for GENERIC_TIME=y. The new version only calls
clocksource_read:
0) | ktime_get() {
0) | read_tod_clock() {
0) 0.610 us | }
0) 1.977 us | }
It uses a single read_seqbegin/read_seqretry loop and just adds everything
to a nanosecond value.
ktime_get_ts is optimized in a similar fashion.
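
The single-loop idea can be sketched in user space with a C11 sequence
counter. Everything here is an assumption for illustration: the struct
layout, the field names, and read_clock_ns(), which stands in for the
clock source and already returns nanoseconds.

```c
#include <stdatomic.h>
#include <stdint.h>

struct timekeeper {
    atomic_uint seq;                      /* odd while an update is in progress */
    _Atomic int64_t xtime_ns;
    _Atomic int64_t wall_to_monotonic_ns;
    _Atomic int64_t clock_last_ns;        /* clock reading at the last update */
};

static int64_t fake_clock_now_ns;         /* stand-in clock source value */

static int64_t read_clock_ns(void)
{
    return fake_clock_now_ns;
}

/* One retry loop: sum everything into a single nanosecond value. */
static int64_t ktime_get_ns_sketch(struct timekeeper *tk)
{
    unsigned int seq;
    int64_t ns;

    do {
        seq = atomic_load_explicit(&tk->seq, memory_order_acquire);
        ns  = atomic_load_explicit(&tk->xtime_ns, memory_order_relaxed);
        ns += atomic_load_explicit(&tk->wall_to_monotonic_ns,
                                   memory_order_relaxed);
        ns += read_clock_ns() -
              atomic_load_explicit(&tk->clock_last_ns, memory_order_relaxed);
        /* Order the data loads before re-reading the sequence counter. */
        atomic_thread_fence(memory_order_acquire);
    } while ((seq & 1) ||
             seq != atomic_load_explicit(&tk->seq, memory_order_relaxed));

    return ns;
}
```

No struct timespec is built or normalized on the way; the reader retries
only if the counter was odd or changed while the values were read.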
[ tglx: added WARN_ON(timekeeping_suspended) as in getnstimeofday() ]
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: john stultz <johnstul@us.ibm.com>
LKML-Reference: <20090707112728.3005244d@skybase>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Following the recent fix to no longer reschedule in the scan_block()
function, the system may become unresponsive with !PREEMPT. This patch
re-adds the cond_resched() call to scan_block() but conditioned by the
allow_resched parameter.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
sys_rt_tgsigqueueinfo needs to be declared in linux/syscalls.h so that
architectures defining the system call table in C can reference it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <200907071023.44008.arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This commit fixes NOR flash recovery issues observed with Spansion
S29GL512N NOR.
When NOR erases, it first fills PEBs with zeroes, then sets all bytes
to 0xFF. Filling with zeroes starts from the end of the PEB. And when
power is cut, this results in PEBs containing correct EC and VID headers
but corrupted with zeros at the end. This confuses UBI and it mistakenly
accepts these PEBs and associates them with LEBs.
Fix this issue by zeroing EC and VID magics before erasing PEBs, to
make UBI later refuse them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
My CMI8329 had OPL3 port specified in SB16 resources. But now I found out that
it was my modification of the card's PnP EEPROM a couple of years ago (can be
done using C9SETROM.EXE utility). I did it because the OPL3 port was
completely missing from PnP data. It seems to be hardwired to 0x388 on
CMI8329.
Find OPL3 port automatically by searching in WSS and SB16 resources. If not
found, assume that it's hardwired to 0x388.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch introduces the template->create function intended
to replace the existing alloc function. The intention is for
create to handle the registration directly, whereas currently
the caller of alloc has to handle the registration.
This allows type-specific code to be run prior to registration.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>