Currently, sending an interprocessor interrupt (IPI) requires building up
a message dynamically which means memory allocation. But often times, we
will want to send an IPI in low level contexts where allocation is not
possible which may lead to a panic(). So create a per-cpu static array
for the message queue and use that instead.
Further, while we have two supplemental interrupts, we are currently only
using one of them. So use the second one for the most common IPI message
of all -- smp_send_reschedule(). This avoids ugly contention for locks
which in turn would require an IPI message ...
In general, this improves SMP performance, and in some cases allows the
SMP port to work in places it wouldn't before. Such as the PREEMPT_RT
state where the slab is protected by a per-cpu spin lock. If the slab
kmalloc/kfree were to put the task to sleep, and that task was actually
the IPI handler, then the system falls down yet again.
After running some various stress tests on the system, the static limit
of 5 messages seems to work. On the off chance even this overflows, we
simply panic(), and we can review that scenario to see if the limit needs
to be increased a bit more.
Signed-off-by: Yi Li <yi.li@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Since we're breaking apart some inter-header dependencies to avoid more
circular loops, move the blackfin_core_id() definition to the func that
it is based upon.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Common code expects these to be defined for SMP ports, so add them.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The BF561 mem_map.h header has the __ASSEMBLY__/CONFIG_SMP checks out
of order which leads to build errors for assembly code that happens to
include this file.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This function takes an irq_handler_t function, but the prototype in
the header doesn't match the function definition. This is due to the
smp headers needing to avoid circular dependencies. So change the
function to take a simple pointer.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The common asm-generic non-atomic bitops.h defines test_bit() for us, but
we need to use our own version. So redirect the definition of this func
to avoid having to inline the rest of the asm-generic file.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The cpu maps are defines provided by common linux/cpumask.h, not local
variables. So stop exporting them locally and include the right header
for their definition.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The external functions are named __raw_xxx, not arch_xxx, so rename the
prototypes to match reality. This fixes some simple build errors in the
bfin_ksyms.c code which exports these helpers to modules.
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Rather than maintain Kconfig entries where people have to enter raw
numbers and hardcode lists of addresses/pins in the driver itself,
push it all to platform resources. This lets us simplify the driver,
the Kconfig, and gives board porters greater flexibility.
In the process, we need to also start supporting the early platform
interface. Not a big deal, but it causes the patch to be bigger than
a simple resource relocation.
All the Blackfin boards already have their resources updated and in
place for this change.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
These were only included because of the irq handling of the PLL funcs,
and those PLL funcs have been moved out into their own header now.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The defBF512.h header exists only to include defBF51x_base.h, and it is
the only place where defBF51x_base.h is included. So move the contents
of the defBF51x_base.h header into the defBF512.h header.
Same situation for the other def/cdef pairs.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The main asm/blackfin.h header will pull in mach/blackfin.h to get
all the fun Blackfin defines. So having any of the sub-mach headers
trying to include asm/blackfin.h makes no sense -- punt it.
The mach/blackfin.h header takes care of including the part-specific
def headers which in turn will include any other needed def file.
Similarly, it takes care of pulling in the part-specific cdef header.
So move this logic out of the blackfin.h when necessary.
Further, make sure the cdef headers do not waste time including the
def headers again.
Since all parts need the common def/cdef headers, move this logic
out of the part-specific headers and into the mach/blackfin.h file.
Finally, we need to split the BF539 def header since the BF538 does
not have MXVR and we don't want to expose those MMRs.
So now all parts should have the same behavior:
mach/blackfin.h
asm/def_LPBlackfin.h
part-specific def.h
if ! asm
asm/cdef_LPBlackfin.h
part-specific cdef.h
And the sub def/cdef headers only tail into what they need.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
We don't want the BF533 to be different in terms of its MMR headers, so
merge the FIO_FLAG helpers back into the normal place. To avoid circular
dependencies with headers, turn the inline C funcs into CPP defines. Not
like there will be any code size differences.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Since the SMP code paths tend to compile fail a lot, start a SMP defconfig
so our nightly build tools will automatically test it.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
We don't want people banging on MMRs directly. As for the ip0x board,
it shouldn't need to muck with the CS pin directly as the Blackfin SPI
bus master driver takes care of driving this.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Use the same naming convention for DMA traffic MMRs (most were legacy
anyways) so we can avoid useless ifdef trees.
Same goes for MDMA names -- this actually allows us to undo a bunch of
ifdef redirects that existed for this purpose alone.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
A bunch of arches define reads[bwl]/writes[bwl] helpers for accessing
memory mapped registers. Since the Blackfin ones aren't specific to
Blackfin code, move them to the common asm-generic/io.h for people.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Each Blackfin port has been duplicating UART structures and defines when
there really is no need for it. So start a new bfin_serial.h header to
unify all these pieces and give ourselves a fresh start.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
In order to not touch the driver file for different xtal usage,
push the clkin value to board file and calculate the register
value instead of hardcoding it.
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (77 commits)
spi/omap: Fix DMA API usage in OMAP MCSPI driver
spi/imx: correct the test on platform_get_irq() return value
spi/topcliff: Typo fix threhold to threshold
spi/dw_spi Typo change diable to disable.
spi/fsl_espi: change the read behaviour of the SPIRF
spi/mpc52xx-psc-spi: move probe/remove to proper sections
spi/dw_spi: add DMA support
spi/dw_spi: change to EXPORT_SYMBOL_GPL for exported APIs
spi/dw_spi: Fix too short timeout in spi polling loop
spi/pl022: convert running variable
spi/pl022: convert busy flag to a bool
spi/pl022: pass the returned sglen to the DMA engine
spi/pl022: map the buffers on the DMA engine
spi/topcliff_pch: Fix data transfer issue
spi/imx: remove autodetection
spi/pxa2xx: pass of_node to spi device and set a parent device
spi/pxa2xx: Modify RX-Tresh instead of busy-loop for the remaining RX bytes.
spi/pxa2xx: Add chipselect support for Sodaville
spi/pxa2xx: Consider CE4100's FIFO depth
spi/pxa2xx: Add CE4100 support
...
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
gameport: use this_cpu_read instead of lookup
x86: udelay: Use this_cpu_read to avoid address calculation
x86: Use this_cpu_inc_return for nmi counter
x86: Replace uses of current_cpu_data with this_cpu ops
x86: Use this_cpu_ops to optimize code
vmstat: User per cpu atomics to avoid interrupt disable / enable
irq_work: Use per cpu atomics instead of regular atomics
cpuops: Use cmpxchg for xchg to avoid lock semantics
x86: this_cpu_cmpxchg and this_cpu_xchg operations
percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
percpu,x86: relocate this_cpu_add_return() and friends
connector: Use this_cpu operations
xen: Use this_cpu_inc_return
taskstats: Use this_cpu_ops
random: Use this_cpu_inc_return
fs: Use this_cpu_inc_return in buffer.c
highmem: Use this_cpu_xx_return() operations
vmstat: Use this_cpu_inc_return for vm statistics
x86: Support for this_cpu_add, sub, dec, inc_return
percpu: Generic support for this_cpu_add, sub, dec, inc_return
...
Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits)
usb: don't use flush_scheduled_work()
speedtch: don't abuse struct delayed_work
media/video: don't use flush_scheduled_work()
media/video: explicitly flush request_module work
ioc4: use static work_struct for ioc4_load_modules()
init: don't call flush_scheduled_work() from do_initcalls()
s390: don't use flush_scheduled_work()
rtc: don't use flush_scheduled_work()
mmc: update workqueue usages
mfd: update workqueue usages
dvb: don't use flush_scheduled_work()
leds-wm8350: don't use flush_scheduled_work()
mISDN: don't use flush_scheduled_work()
macintosh/ams: don't use flush_scheduled_work()
vmwgfx: don't use flush_scheduled_work()
tpm: don't use flush_scheduled_work()
sonypi: don't use flush_scheduled_work()
hvsi: don't use flush_scheduled_work()
xen: don't use flush_scheduled_work()
gdrom: don't use flush_scheduled_work()
...
Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c
as per Tejun.
* 'x86-apic-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: apic: Cleanup and simplify setup_local_APIC()
x86: Further simplify mp_irq info handling
x86: Unify 3 similar ways of saving mp_irqs info
x86, ioapic: Avoid writing io_apic id if already correct
x86, x2apic: Don't map lapic addr for preenabled x2apic systems
x86, sfi: Use register_lapic_address()
x86, apic: Use register_lapic_address() in init_apic_mapping()
x86, apic: Remove early_init_lapic_mapping()
x86, apic: Unify identical register_lapic_address() functions
* 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (67 commits)
ARM: mach-shmobile: update for SMP changes.
ARM: mach-shmobile: update for GIC changes.
ARM: mach-shmobile: Fix up clkdev fallout for SH73A0.
dma: shdma: don't register the global die notifier multiple times
ARM: mach-shmobile: Rely on run-time IRQ handlers
ARM: mach-shmobile: Run-time IRQ handler for GIC
ARM: mach-shmobile: Run-time IRQ handler for INTCA
ARM: mach-shmobile: Enable CONFIG_MULTI_IRQ_HANDLER
ARM: mach-shmobile: Use shared GIC entry macros
ARM: mach-shmobile: mackerel: Add zboot support
ARM: mach-shmobile: mackerel: Add HDMI sound support
ARM: mach-shmobile: mackerel: add HDMI video support
ARM: mach-shmobile: ap4evb: fixup clk_put timing of fsib_clk
ARM: mach-shmobile: sh73a0: fix div4 table
ARM: mach-shmobile: ap4/mackerel: modify wrong comment out of USB
ARM: mach-shmobile: Mackerel VGA camera support
mmc: sh_mmcif: make DMA support by the driver unconditional
ARM: mach-shmobile: Add eMMC support through MMCIF on AG5EVM
ARM: mach-shmobile: Use pullups for AG5EVM KEYSC pins
ARM: mach-shmobile: sh73a0 GPIO pullup improvement
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (58 commits)
Input: wacom_w8001 - support pen or touch only devices
Input: wacom_w8001 - use __set_bit to set keybits
Input: bu21013_ts - fix misuse of logical operation in place of bitop
Input: i8042 - add Acer Aspire 5100 to the Dritek list
Input: wacom - add support for digitizer in Lenovo W700
Input: psmouse - disable the synaptics extension on OLPC machines
Input: psmouse - fix up Synaptics comment
Input: synaptics - ignore bogus mt packet
Input: synaptics - add multi-finger and semi-mt support
Input: synaptics - report clickpad property
input: mt: Document interface updates
Input: fix double equality sign in uevent
Input: introduce device properties
hid: egalax: Add support for Wetab (726b)
Input: include MT library as source for kerneldoc
MAINTAINERS: Update input-mt entry
hid: egalax: Add support for Samsung NB30 netbook
hid: egalax: Document the new devices in Kconfig
hid: egalax: Add support for Wetab
hid: egalax: Convert to MT slots
...
Fixed up trivial conflict in drivers/input/keyboard/Kconfig
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (36 commits)
serial: apbuart: Fixup apbuart_console_init()
TTY: Add tty ioctl to figure device node of the system console.
tty: add 'active' sysfs attribute to tty0 and console device
drivers: serial: apbuart: Handle OF failures gracefully
Serial: Avoid unbalanced IRQ wake disable during resume
tty: fix typos/errors in tty_driver.h comments
pch_uart : fix warnings for 64bit compile
8250: fix uninitialized FIFOs
ip2: fix compiler warning on ip2main_pci_tbl
specialix: fix compiler warning on specialix_pci_tbl
rocket: fix compiler warning on rocket_pci_ids
8250: add a UPIO_DWAPB32 for 32 bit accesses
8250: use container_of() instead of casting
serial: omap-serial: Add support for kernel debugger
serial: fix pch_uart kconfig & build
drivers: char: hvc: add arm JTAG DCC console support
RS485 documentation: add 16C950 UART description
serial: ifx6x60: fix memory leak
serial: ifx6x60: free IRQ on error
Serial: EG20T: add PCH_UART driver
...
Fixed up conflicts in drivers/serial/apbuart.c with evil merge that
makes the code look fairly sane (unlike either side).
* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (144 commits)
USB: add support for Dream Cheeky DL100B Webmail Notifier (1d34:0004)
USB: serial: ftdi_sio: add support for TIOCSERGETLSR
USB: ehci-mxc: Setup portsc register prior to accessing OTG viewport
USB: atmel_usba_udc: fix freeing irq in usba_udc_remove()
usb: ehci-omap: fix tll channel enable mask
usb: ohci-omap3: fix trivial typo
USB: gadget: ci13xxx: don't assume that PAGE_SIZE is 4096
USB: gadget: ci13xxx: fix complete() callback for no_interrupt rq's
USB: gadget: update ci13xxx to work with g_ether
USB: gadgets: ci13xxx: fix probing of compiled-in gadget drivers
Revert "USB: musb: pm: don't rely fully on clock support"
Revert "USB: musb: blackfin: pm: make it work"
USB: uas: Use GFP_NOIO instead of GFP_KERNEL in I/O submission path
USB: uas: Ensure we only bind to a UAS interface
USB: uas: Rename sense pipe and sense urb to status pipe and status urb
USB: uas: Use kzalloc instead of kmalloc
USB: uas: Fix up the Sense IU
usb: musb: core: kill unneeded #include's
DA8xx: assign name to MUSB IRQ resource
usb: gadget: g_ncm added
...
Manually fix up trivial conflicts in USB Kconfig changes in:
arch/arm/mach-omap2/Kconfig
arch/sh/Kconfig
drivers/usb/Kconfig
drivers/usb/host/ehci-hcd.c
and annoying chip clock data conflicts in:
arch/arm/mach-omap2/clock3xxx_data.c
arch/arm/mach-omap2/clock44xx_data.c
Conflicts:
arch/x86/include/asm/io_apic.h
Merge reason: Resolve the conflict, update to a more recent -rc base
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem that this patch aims to fix is vfsmount refcounting scalability.
We need to take a reference on the vfsmount for every successful path lookup,
which often go to the same mount point.
The fundamental difficulty is that a "simple" reference count can never be made
scalable, because any time a reference is dropped, we must check whether that
was the last reference. To do that requires communication with all other CPUs
that may have taken a reference count.
We can make refcounts more scalable in a couple of ways, involving keeping
distributed counters, and checking for the global-zero condition less
frequently.
- check the global sum once every interval (this will delay zero detection
for some interval, so it's probably a showstopper for vfsmounts).
- keep a local count and only taking the global sum when local reaches 0 (this
is difficult for vfsmounts, because we can't hold preempt off for the life of
a reference, so a counter would need to be per-thread or tied strongly to a
particular CPU which requires more locking).
- keep a local difference of increments and decrements, which allows us to sum
the total difference and hence find the refcount when summing all CPUs. Then,
keep a single integer "long" refcount for slow and long lasting references,
and only take the global sum of local counters when the long refcount is 0.
This last scheme is what I implemented here. Attached mounts and process root
and working directory references are "long" references, and everything else is
a short reference.
This allows scalable vfsmount references during path walking over mounted
subtrees and unattached (lazy umounted) mounts with processes still running
in them.
This results in one fewer atomic op in the fastpath: mntget is now just a
per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock
and non-atomic decrement in the common case. However code is otherwise bigger
and heavier, so single threaded performance is basically a wash.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.
Patched with:
git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for
permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
to take i_lock no longer need to take sb_inode_list_lock to walk the list in
the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
page lock to follow page->mapping.
The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.
In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.
The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
dget_locked was a shortcut to avoid the lazy lru manipulation when we already
held dcache_lock (lru manipulation was relatively cheap at that point).
However, how that the lru lock is an innermost one, we never hold it at any
caller, so the lock cost can now be avoided. We already have well working lazy
dcache LRU, so it should be fine to defer LRU manipulations to scan time.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Protect d_unhashed(dentry) condition with d_lock. This means keeping
DCACHE_UNHASHED bit in synch with hash manipulations.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
we start protecting many other dentry members with d_lock.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Change d_delete from a dentry deletion notification to a dentry caching
advise, more like ->drop_inode. Require it to be constant and idempotent,
and not take d_lock. This is how all existing filesystems use the callback
anyway.
This makes fine grained dentry locking of dput and dentry lru scanning
much simpler.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
* 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (243 commits)
omap2: Make OMAP2PLUS select OMAP_DM_TIMER
OMAP4: hwmod data: Fix alignment and end of line in structurefields
OMAP4: hwmod data: Move the DMA structures
OMAP4: hwmod data: Move the smartreflex structures
OMAP4: hwmod data: Fix missing SIDLE_SMART_WKUP in smartreflexsysc
arm: omap: tusb6010: add name for MUSB IRQ
arm: omap: craneboard: Add USB EHCI support
omap2+: Initialize serial port for dynamic remuxing for n8x0
omap2+: Add struct omap_board_data and use it for platform level serial init
omap2+: Allow hwmod state changes to mux pads based on the state changes
omap2+: Add support for hwmod specific muxing of devices
omap2+: Add omap_mux_get_by_name
OMAP2: PM: fix compile error when !CONFIG_SUSPEND
MAINTAINERS: OMAP: hwmod: update hwmod code, data maintainership
OMAP4: Smartreflex framework extensions
OMAP4: hwmod: Add inital data for smartreflex modules.
OMAP4: PM: Program correct init voltages for scalable VDDs
OMAP4: Adding voltage driver support
OMAP4: Register voltage PMIC parameters with the voltage layer
OMAP3: PM: Program correct init voltages for VDD1 and VDD2
...
Fix up trivial conflict in arch/arm/plat-omap/Kconfig