SYNOPSIS
size[4] Tgetattr tag[2] fid[4] request_mask[8]
size[4] Rgetattr tag[2] lstat[n]
DESCRIPTION
The getattr transaction inquires about the file identified by fid.
request_mask is a bit mask that specifies which fields of the
stat structure is the client interested in.
The reply will contain a machine-independent directory entry,
laid out as follows:
st_result_mask[8]
Bit mask that indicates which fields in the stat structure
have been populated by the server
qid.type[1]
the type of the file (directory, etc.), represented as a bit
vector corresponding to the high 8 bits of the file's mode
word.
qid.vers[4]
version number for given path
qid.path[8]
the file server's unique identification for the file
st_mode[4]
Permission and flags
st_uid[4]
User id of owner
st_gid[4]
Group ID of owner
st_nlink[8]
Number of hard links
st_rdev[8]
Device ID (if special file)
st_size[8]
Size, in bytes
st_blksize[8]
Block size for file system IO
st_blocks[8]
Number of file system blocks allocated
st_atime_sec[8]
Time of last access, seconds
st_atime_nsec[8]
Time of last access, nanoseconds
st_mtime_sec[8]
Time of last modification, seconds
st_mtime_nsec[8]
Time of last modification, nanoseconds
st_ctime_sec[8]
Time of last status change, seconds
st_ctime_nsec[8]
Time of last status change, nanoseconds
st_btime_sec[8]
Time of creation (birth) of file, seconds
st_btime_nsec[8]
Time of creation (birth) of file, nanoseconds
st_gen[8]
Inode generation
st_data_version[8]
Data version number
request_mask and result_mask bit masks contain the following bits
#define P9_STATS_MODE 0x00000001ULL
#define P9_STATS_NLINK 0x00000002ULL
#define P9_STATS_UID 0x00000004ULL
#define P9_STATS_GID 0x00000008ULL
#define P9_STATS_RDEV 0x00000010ULL
#define P9_STATS_ATIME 0x00000020ULL
#define P9_STATS_MTIME 0x00000040ULL
#define P9_STATS_CTIME 0x00000080ULL
#define P9_STATS_INO 0x00000100ULL
#define P9_STATS_SIZE 0x00000200ULL
#define P9_STATS_BLOCKS 0x00000400ULL
#define P9_STATS_BTIME 0x00000800ULL
#define P9_STATS_GEN 0x00001000ULL
#define P9_STATS_DATA_VERSION 0x00002000ULL
#define P9_STATS_BASIC 0x000007ffULL
#define P9_STATS_ALL 0x00003fffULL
This patch implements the client side of getattr implementation for
9P2000.L. It introduces a new structure p9_stat_dotl for getting
Linux stat information along with QID. The data layout is similar to
stat structure in Linux user space with the following major
differences:
inode (st_ino) is not part of data. Instead qid is.
device (st_dev) is not part of data because this doesn't make sense
on the client.
All time variables are 64 bit wide on the wire. The kernel seems to use
32 bit variables for these variables. However, some of the architectures
have used 64 bit variables and glibc exposes 64 bit variables to user
space on some architectures. Hence to be on the safer side we have made
these 64 bit in the protocol. Refer to the comments in
include/asm-generic/stat.h
There are some additional fields: st_btime_sec, st_btime_nsec, st_gen,
st_data_version apart from the bitmask, st_result_mask. The bit mask
is filled by the server to indicate which stat fields have been
populated by the server. Currently there is no clean way for the
server to obtain these additional fields, so it sends back just the
basic fields.
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>
This patch implements the kernel part of readdir() implementation for 9p2000.L
Change from V3: Instead of inode, server now sends qids for each dirent
SYNOPSIS
size[4] Treaddir tag[2] fid[4] offset[8] count[4]
size[4] Rreaddir tag[2] count[4] data[count]
DESCRIPTION
The readdir request asks the server to read the directory specified by 'fid'
at an offset specified by 'offset' and return as many dirent structures as
possible that fit into count bytes. Each dirent structure is laid out as
follows.
qid.type[1]
the type of the file (directory, etc.), represented as a bit
vector corresponding to the high 8 bits of the file's mode
word.
qid.vers[4]
version number for given path
qid.path[8]
the file server's unique identification for the file
offset[8]
offset into the next dirent.
type[1]
type of this directory entry.
name[256]
name of this directory entry.
This patch adds v9fs_dir_readdir_dotl() as the readdir() call for 9p2000.L.
This function sends P9_TREADDIR command to the server. In response the server
sends a buffer filled with dirent structures. This is different from the
existing v9fs_dir_readdir() call which receives stat structures from the server.
This results in significant speedup of readdir() on large directories.
For example, doing 'ls >/dev/null' on a directory with 10000 files on my
laptop takes 1.088 seconds with the existing code, but only takes 0.339 seconds
with the new readdir.
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Use the macros instead of hardcoding numerical constants for the
controls information bitfield.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The videobuf_dmabuf and videobuf_vmalloc_memory fields have a vmalloc
field to store the kernel virtual address of vmalloc'ed buffers. Rename
the field to vaddr.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
The fields are assigned but never used, remove them.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Those functions are only called inside videobuf-dma-sg.c, make them
static.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Instead of creating dirty wrappers around videobuf_dma_map/unmap that
create a dummy videobuf_queue structure, modify videobuf_dma_map/unmap
to take a device pointer argument and use it directly. The
videobuf_sg_dma_map/unmap then become unused and can be removed.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
These functions allocate videobuf_buffer structures only. Renaming in order
to prevent confusion with functions allocating actual video buffer memory.
Rename the functions in videobuf-core.h videobuf-dma-sg.c as well.
Signed-off-by: Pawel Osciak <p.osciak@samsung.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
incoming IR buffer now an int pointer, and not fed from userspace
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
v2: copy of buffer data from userspace done inside this plugin/driver,
keeping the actual drivers minimal, and more flexible in what we can
deliver to them later on (they may be fed from within kernelspace later
on, by an in-kernel IR encoder).
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
v2: currently unused ioctls are included, but #if 0'd out
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Some (North American) providers use a non-standard mode called
"8psk turbo fec". Since there is no flag in the driver that
would allow an application to determine whether a particular
device can handle "turbo fec", the attached patch introduces
FE_CAN_TURBO_FEC.
Since there is no flag in the SI data that would indicate
that a transponder uses "turbo fec", VDR will assume that
all 8psk transponders on DVB-S use "turbo fec".
Tested-by: Derek Kelly <user.vdr@gmail.com>
Signed-off-by: Klaus Schmidinger <Klaus.Schmidinger@tvdr.de>
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Partially convert drivers/media/video/ir-kbd-i2c.c to
not use ir-functions.c
Signed-off-by: David Härdeman <david@hardeman.nu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This is the RC6 keymap for the Windows Media Center Edition remotes
that come bundled with MCE/eHome Infrared Remote transceivers. Tested
with 3 different variants of the remote, but its possible there are
still some additional keys missing, but its simple enough to add them
in later...
This patch also adds an IR_TYPE_ALL convenience macro to make life
easier for receivers that support all IR protocols.
v2: fix an erroneous comment that referred to imon devices
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Found with makes headers_check:
include/linux/virtio_9p.h:15: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The only user of unique_tuple() get_unique_tuple() doesn't care about the
return value of unique_tuple(), so make unique_tuple() return void (nothing).
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This removes duplicate code by providing a default implementation
which is used by 3 of the 4 modules that provide these call.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
some users of nf_ct_ext_exist() know ct->ext isn't NULL. For these users, the
check for ct->ext isn't necessary, the function __nf_ct_ext_exist() can be
used instead.
the type of the return value of nf_ct_ext_exist() is changed to bool.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
By using an atomic_t for t_updates and t_outstanding credits, this
should allow us to not need to take transaction t_handle_lock in
jbd2_journal_stop().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Conflicts:
drivers/firewire/core-card.c
drivers/firewire/core-cdev.c
and forgotten #include <linux/time.h> in drivers/firewire/ohci.c
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Update lsm_audit for AppArmor specific data, and add the core routines for
AppArmor uses for auditing.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently there are a number of applications (nautilus being the main one) which
calls access() on files in order to determine how they should be displayed. It
is normal and expected that nautilus will want to see if files are executable
or if they are really read/write-able. access() should return the real
permission. SELinux policy checks are done in access() and can result in lots
of AVC denials as policy denies RWX on files which DAC allows. Currently
SELinux must dontaudit actual attempts to read/write/execute a file in
order to silence these messages (and not flood the logs.) But dontaudit rules
like that can hide real attacks. This patch addes a new common file
permission audit_access. This permission is special in that it is meaningless
and should never show up in an allow rule. Instead the only place this
permission has meaning is in a dontaudit rule like so:
dontaudit nautilus_t sbin_t:file audit_access
With such a rule if nautilus just checks access() we will still get denied and
thus userspace will still get the correct answer but we will not log the denial.
If nautilus attempted to actually perform one of the forbidden actions
(rather than just querying access(2) about it) we would still log a denial.
This type of dontaudit rule should be used sparingly, as it could be a
method for an attacker to probe the system permissions without detection.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Currently MAY_ACCESS means that filesystems must check the permissions
right then and not rely on cached results or the results of future
operations on the object. This can be because of a call to sys_access() or
because of a call to chdir() which needs to check search without relying on
any future operations inside that dir. I plan to use MAY_ACCESS for other
purposes in the security system, so I split the MAY_ACCESS and the
MAY_CHDIR cases.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
When commit be6d3e56a6 "introduce new LSM hooks
where vfsmount is available." was proposed, regarding security_path_truncate(),
only "struct file *" argument (which AppArmor wanted to use) was removed.
But length and time_attrs arguments are not used by TOMOYO nor AppArmor.
Thus, let's remove these arguments.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: James Morris <jmorris@namei.org>
Devices register mask notifier using gsi, but irqchip knows about
irqchip/pin, so conversion from irqchip/pin to gsi should be done before
looking for mask notifier to call.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Currently if guest access address that belongs to memory slot but is not
backed up by page or page is read only KVM treats it like MMIO access.
Remove that capability. It was never part of the interface and should
not be relied upon.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
For 32bit machines where the physical address width is
larger than the virtual address width the frame number types
in KVM may overflow. Fix this by changing them to u64.
[sfr: fix build on 32-bit ppc]
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This interface allows userspace to request hyperz support, it probably
needs more locking, and really reporting that you can have hyperz is racy
since someone else might get it before you do.
v2: modify so we pass 0 valued packets to let DDX/r300c keep working.
also fixed incorrect 0x4f1c reference.
v3: fixup zb_bw_cntl so older drivers keep working
v4: add locking, fixup SC_HYPERZ_EN - patch stream to disable hiz
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'drm-radeon-next' of ../drm-radeon-next: (333 commits)
drm/radeon/kms: trivial code style fixes for audio
drm/radeon: remove viewport transform from r6xx/r7xx blit emit
drm/radeon: group r6xx/r7xx newly sequential blit state
drm/radeon: reorder r6xx/r7xx blit state emit to make more regs sequential
drm/radeon: r6xx/r7xx move vport clipping to a single packet
drm/radeon: group r6xx/r7xx sequential blit state
drm/radeon: remove duplicate state emit in r6xx/r7xx blit
drm/radeon: add comments to r6xx/r7xx blit state
drm/radeon/kms/r7xx: add workaround for hw issue with HDP flush
drm/radeon/kms: remove rs4xx gart limit
drm: radeon: fix sign bug
drm/radeon/kms: check/restore sanity before doing anything else with GPU.
drm/radeon: fall back to GTT if bo creation/validation in VRAM fails.
drm/radeon/kms: add ioport register access
drm/radeon/kms: enable HDMI audio on RS600/RS690/RS740
drm/radeon/kms: track audio engine state, do not use not setup timer
drm/radeon/kms/r6xx+: add query for tile config (v2)
drm/radeon/kms: fix CS alignment checking for tiling (v2)
drm/radeon/kms: add tiling support to the cs checker for r6xx/r7xx
drm/radeon/kms: Add crtc tiling setup support for evergreen
...
sil164 transmitters are used for DVI outputs on Intel/nvidia and ATI setups.
So far only nouveau can use this driver.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Patrice Mandin <patmandin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Userspace needs this information to access tiled
buffers via the CPU.
v2: rebased on evergreen accel changes
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add support for JMB364 and 369.
Patch-originally-from: Aries Lee <arieslee@jmicron.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Some AHCI implementations may use Vendor Specific HBA[A0h, FFh]
and/or Port[70h, 7Fh] registers to 'prepare' for initialization.
For that, the platform needs memory mapped address of AHCI registers.
This patch adds the 'mmio' argument and reorders the call to
platform init function.
Signed-off-by: Jassi Brar <jassi.brar@samsung.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Drop the cpparg() macro that wraps CPP parameters. We already have
the PARAM() macro for that, no need to have several versions.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Some DIU structures will be used in platform code in
subsequent MPC5121 DIU patch, so we move this header
to be able to include it elsewhere.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
nfs_commit_inode() needs to be defined irrespectively of whether or not
we are supporting NFSv3 and NFSv4.
Allow the compiler to optimise away code in the NFSv2-only case by
converting it into an inlined stub function.
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark init_workqueues() as early_initcall() and thus it will be initialized
before smp bringup. init_workqueues() registers for the hotcpu notifier
and thus it should cope with the processors that are brought online after
the workqueues are initialized.
x86 smp bringup code uses workqueues and uses a workaround for the
cold boot process (as the workqueues are initialized post smp_init()).
Marking init_workqueues() as early_initcall() will pave the way for
cleaning up this code.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Usually the vcpu->requests bitmap is sparse, so a test_and_clear_bit() for
each request generates a large number of unneeded atomics if a bit is set.
Replace with a separate test/clear sequence. This is safe since there is
no clear_bit() outside the vcpu thread.
Signed-off-by: Avi Kivity <avi@redhat.com>
As advertised in feature-removal-schedule.txt. Equivalent support is provided
by overlapping memory regions.
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch enable guest to use XSAVE/XRSTOR instructions.
We assume that host_xcr0 would use all possible bits that OS supported.
And we loaded xcr0 in the same way we handled fpu - do it as late as we can.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch moves the declaration of of_get_address(), of_get_pci_address(),
and of_pci_address_to_resource() out of arch code and into the common
linux/of_address header file.
This patch also fixes some of the asm/prom.h ordering issues. It still
includes some header files that it ideally shouldn't be, but at least the
ordering is consistent now so that of_* overrides work.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
KVM_REQ_KICK poisons vcpu->requests by having a bit set during normal
operation. This causes the fast path check for a clear vcpu->requests
to fail all the time, triggering tons of atomic operations.
Fix by replacing KVM_REQ_KICK with a vcpu->guest_mode atomic.
Signed-off-by: Avi Kivity <avi@redhat.com>
In common cases, guest SRAO MCE will cause corresponding poisoned page
be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay
the MCE to guest OS.
But it is reported that if the poisoned page is accessed in guest
after unmapping and before MCE is relayed to guest OS, userspace will
be killed.
The reason is as follows. Because poisoned page has been un-mapped,
guest access will cause guest exit and kvm_mmu_page_fault will be
called. kvm_mmu_page_fault can not get the poisoned page for fault
address, so kernel and user space MMIO processing is tried in turn. In
user MMIO processing, poisoned page is accessed again, then userspace
is killed by force_sig_info.
To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM
and do not try kernel and user space MMIO processing for poisoned
page.
[xiao: fix warning introduced by avi]
Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Some platforms gate the pclk (APB - the bus - clock) to the peripherals
for power saving, along with the functional clock. When devices are
accessed without pclk enabled, the kernel will oops.
This gives them two options:
1. Leave all clocks on all the time.
2. Attempt to gate pclk along with the functional clock.
(With some hardware, pclk and the functional clock are gated by a single
bit in a register.)
(1) has the disadvantage that it causes increased power usage, which is
bad news for battery operated devices. (2) can lead to kernel oops if
registers are accessed without the functional clock being enabled.
So, introduce the apb_pclk signal in such a way existing drivers don't
need to be updated. Essentially, this means we guarantee that:
1. pclk will be enabled whenever the driver is bound to a device -
from probe() to remove() time.
2. pclk will also be enabled when reading the primecell IDs from the device.
In order to allow drivers to be incrementally updated to achieve greater
power savings, we provide two additional calls to allow drivers to
manage the pclk - amba_pclk_enable()/amba_pclk_disable().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A function that copies the padata cpumasks to a user buffer
is a bit error prone. The cpumask can change any time so we
can't be sure to have the right cpumask when using this function.
A user who is interested in the padata cpumasks should register
to the padata cpumask notifier chain instead. Users of
padata_get_cpumask are already updated, so we can remove it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We pass a pointer to the new padata cpumasks to the cpumask_change_notifier
chain. So users can access the cpumasks without the need of an extra
padata_get_cpumask function.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
padata_set_cpumask needs to be protected by a lock. We make
__padata_set_cpumasks unlocked and static. So this function
can be used by the exported and locked padata_set_cpumask and
padata_set_cpumasks functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We rename padata_alloc to padata_alloc_possible because this
function allocates a padata_instance and uses the cpu_possible
mask for parallel and serial workers. Also we rename __padata_alloc
to padata_alloc to avoid to export underlined functions. Underlined
functions are considered to be private to padata. Users are updated
accordingly.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for the cy8ctmg110 capacitive touchscreen used on some
embedded devices.
(Some clean up by Alan Cox)
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
See https://bugzilla.kernel.org/show_bug.cgi?id=16056
If other processes are blocked waiting for kswapd to free up some memory so
that they can make progress, then we cannot allow kswapd to block on those
processes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
As we only provide one way to set up resources now, we can remove
the resource-setup-related bitfield (except resource_setup_done).
In addition, pcmcia_state only consisted of one entry, so remove
this bitfield as well.
Suggested-by: Komuro <komurojun-mbn@nifty.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Remove some definitions which became obsolete when the central
event handler got removed.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
This is needed by NFSv4.0 servers in order to keep the number of locking
stateids at a manageable level.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.
However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
last MSI message written
- Use the new functions where appropriate
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch exports SMBIOS provided firmware instance and label of
onboard PCI devices to sysfs. New files are:
/sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
/sys/bus/pci/devices/.../index which contains the firmware device type
instance for the given device.
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It is a known issue that mmio decoding shall be disabled while doing PCI
bar sizing. Host bridge and other devices (PCI PIC) shall be excluded for
certain platforms. This patch mainly comes from Mathew Willcox's
patch in http://kerneltrap.org/mailarchive/linux-kernel/2007/9/13/258969.
A new flag bit "mmio_alway_on" is added to pci_dev with the intention that
devices with their mmio decoding cannot be disabled during BAR sizing shall
have this bit set, preferrablly in their quirks.
Without this patch, Intel Moorestown platform graphics unit will be
corrupted during bar sizing activities.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move of_register_spi_devices() call from drivers to
spi_register_master(). Also change the function to use
the struct device_node pointer from master spi device
instead of passing it as function argument.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
of_node_to_nid() is only relevant in a few architectures. Don't force
everyone to implement it anyway.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The AMBA bus should also use of_device_make_bus_id() when populating device
out of device tree data. This patch makes the function non-static, and
adds a suitable prototype in of_device.h
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Fix __task_cred()'s lockdep check by removing the following validation
condition:
lockdep_tasklist_lock_is_held()
as commit_creds() does not take the tasklist_lock, and nor do most of the
functions that call it, so this check is pointless and it can prevent
detection of the RCU lock not being held if the tasklist_lock is held.
Instead, add the following validation condition:
task->exit_state >= 0
to permit the access if the target task is dead and therefore unable to change
its own credentials.
Fix __task_cred()'s comment to:
(1) discard the bit that says that the caller must prevent the target task
from being deleted. That shouldn't need saying.
(2) Add a comment indicating the result of __task_cred() should not be passed
directly to get_cred(), but rather than get_task_cred() should be used
instead.
Also put a note into the documentation to enforce this point there too.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.
What happens is that get_task_cred() can race with commit_creds():
TASK_1 TASK_2 RCU_CLEANER
-->get_task_cred(TASK_2)
rcu_read_lock()
__cred = __task_cred(TASK_2)
-->commit_creds()
old_cred = TASK_2->real_cred
TASK_2->real_cred = ...
put_cred(old_cred)
call_rcu(old_cred)
[__cred->usage == 0]
get_cred(__cred)
[__cred->usage == 1]
rcu_read_unlock()
-->put_cred_rcu()
[__cred->usage == 1]
panic()
However, since a tasks credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.
If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.
We then change task_state() in procfs to use get_task_cred() rather than
calling get_cred() on the result of __task_cred(), as that suffers from the
same problem.
Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
tripped when it is noticed that the usage count is not zero as it ought to be,
for example:
kernel BUG at kernel/cred.c:168!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
745
RIP: 0010:[<ffffffff81069881>] [<ffffffff81069881>] __put_cred+0xc/0x45
RSP: 0018:ffff88019e7e9eb8 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
FS: 00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
Stack:
ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
<0> ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
<0> ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
Call Trace:
[<ffffffff810698cd>] put_cred+0x13/0x15
[<ffffffff81069b45>] commit_creds+0x16b/0x175
[<ffffffff8106aace>] set_current_groups+0x47/0x4e
[<ffffffff8106ac89>] sys_setgroups+0xf6/0x105
[<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 <0f> 0b eb fe 65 48 8b
04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
RIP [<ffffffff81069881>] __put_cred+0xc/0x45
RSP <ffff88019e7e9eb8>
---[ end trace df391256a100ebdd ]---
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds the DMA context programming and userspace ABI for multichannel
reception, i.e. for listening on multiple channel numbers by means of a
single DMA context.
The use case is reception of more streams than there are IR DMA units
offered by the link layer. This is already implemented by the older
ohci1394 + ieee1394 + raw1394 stack. And as discussed recently on
linux1394-devel, this feature is occasionally used in practice.
The big drawbacks of this mode are that buffer layout and interrupt
generation necessarily differ from single-channel reception: Headers
and trailers are not stripped from packets, packets are not aligned with
buffer chunks, interrupts are per buffer chunk, not per packet.
These drawbacks also cause a rather hefty code footprint to support this
rarely used OHCI-1394 feature. (367 lines added, among them 94 lines of
added userspace ABI documentation.)
This implementation enforces that a multichannel reception context may
only listen to channels to which no single-channel context on the same
link layer is presently listening to. OHCI-1394 would allow to overlay
single-channel contexts by the multi-channel context, but this would be
a departure from the present first-come-first-served policy of IR
context creation.
The implementation is heavily based on an earlier one by Jay Fenlason.
Thanks Jay.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some features require knowing the DTIM period
before associating. This implements the ability
to wait for a beacon in mac80211 before assoc
to provide this value. It is optional since
most likely not all drivers will need this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Platforms may have some external power control which need to be
controlled from board specific code. Rename the translate_vdd()
callback to vdd_handler() and pass it the power mode.
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A small number of users of IRQF_TIMER are using it for the implied no
suspend behaviour on interrupts which are not timer interrupts.
Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to
__IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: xen-devel@lists.xensource.com
Cc: linux-input@vger.kernel.org
Cc: linuxppc-dev@ozlabs.org
Cc: devicetree-discuss@lists.ozlabs.org
LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Platform parameter to enable automatic FIFO configuration when
the codec is in Mode1 or Mode7 FIFO mode.
When this mode is selected, the controls for changing
nSample (in Mode1), and UTHR (in Mode7) are not added.
The driver configures the FIFO configuration based on
the stream's period size in a way, that every burst will
read period size of data from the host.
In Mode7 we need to use a formula, which gives close enough
aproximation for the burst length from the host point
of view.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@nokia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Replace the hardwired latency definition with platform data
parameter, and simplify the nSample parameter calculation.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@nokia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
davinci: da850/omap-l138 evm: account for DEFDCDC{2,3} being tied high
regulator: tps6507x: allow driver to use DEFDCDC{2,3}_HIGH register
wm8350-regulator: fix wm8350_register_regulator error handling
ab3100: fix off-by-one value range checking for voltage selector
For some drivers it can be useful to know whether the channel they're
supposed to switch to is going to be used for short off-channel work or
scanning, or whether the hardware is expected to stay on it for a while
longer. This is important for various kinds of calibration work, which
takes longer to complete and should keep some persistent state, even if
the channel temporarily changes.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
In TPS6507x, depending on the status of DEFDCDC{2,3} pin either
DEFDCDC{2,3}_LOW or DEFDCDC{2,3}_HIGH register needs to be read or
programmed to change the output voltage.
The current driver assumes DEFDCDC{2,3} pins are always tied low
and thus operates only on DEFDCDC{2,3}_LOW register. This need
not always be the case (as is found on OMAP-L138 EVM).
Unfortunately, software cannot read the status of DEFDCDC{2,3} pins.
So, this information is passed through platform data depending on
how the board is wired.
Signed-off-by: Anuj Aggarwal <anuj.aggarwal@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
This patch (as1398b) adds runtime PM support to the SCSI layer. Only
the machanism is provided; use of it is up to the various high-level
drivers, and the patch doesn't change any of them. Except for sg --
the patch expicitly prevents a device from being runtime-suspended
while its sg device file is open.
The implementation is simplistic. In general, hosts and targets are
automatically suspended when all their children are asleep, but for
them the runtime-suspend code doesn't actually do anything. (A host's
runtime PM status is propagated up the device tree, though, so a
runtime-PM-aware lower-level driver could power down the host adapter
hardware at the appropriate times.) There are comments indicating
where a transport class might be notified or some other hooks added.
LUNs are runtime-suspended by calling the drivers' existing suspend
handlers (and likewise for runtime-resume). Somewhat arbitrarily, the
implementation delays for 100 ms before suspending an eligible LUN.
This is because there typically are occasions during bootup when the
same device file is opened and closed several times in quick
succession.
The way this all works is that the SCSI core increments a device's
PM-usage count when it is registered. If a high-level driver does
nothing then the device will not be eligible for runtime-suspend
because of the elevated usage count. If a high-level driver wants to
use runtime PM then it can call scsi_autopm_put_device() in its probe
routine to decrement the usage count and scsi_autopm_get_device() in
its remove routine to restore the original count.
Hosts, targets, and LUNs are not suspended while they are being probed
or removed, or while the error handler is running. In fact, a fairly
large part of the patch consists of code to make sure that things
aren't suspended at such times.
[jejb: fix up compile issues in PM config variations]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
We have two separate definitions for identical constants with nearly the
same name. One comes from the generic headers in scsi.h; the other is
an enum in libsas.h ... it's causing confusion about which one is
correct (fortunately they both are).
Fix this by eliminating the libsas.h duplicate
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
wait for session to come online in eh_device_reset_handler
and eh_target_reset_handler
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Incoming requests shouldn't require a local exchange if we're
just going to reply with one or two frames and don't expect
anything further. Don't allocate exchanges for such requests
until requested by the upper-layer protocol.
The sequence is always NULL for new requests, so remove
that as an argument to request handlers.
Also change the first argument to lport->tt.seq_els_rsp_send
from the sequence pointer to the received frame pointer, to
supply the exchange IDs and destination ID info.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For incoming ELS and FCP requests, we often don't require an
exchange and sequence, however, sometimes we do. For those cases,
(primarily FCP requests for targets) add a function to set up
the exchange and sequence.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add functions to fill in an FC header given a request header.
These reduces code lines in fc_lport and fc_rport and works
without an exchange/sequence assigned.
fc_fill_reply_hdr() fills a header for a final reply frame.
fc_fill_hdr() which is similar but allows specifying the
f_ctl parameter.
Add defines for F_CTL values FC_FCTL_REQ and FC_FCTL_RESP.
These can be used for most request and response sequences.
v2 of patch adds a line to copy the frame encapsulation
info from the received frame.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
To pave the way for eliminating exchanges from incoming requests,
add simple inline fc_frame_sid() and fc_frame_did() functions
which get the FC_IDs from the frame header. This can be almost
as efficient as getting them from the sequence/exchange.
Move ntohll, htonll, ntoh24 and hton24 to <scsi/fc_frame.h>
since we need them there and that's included by <scsi/libfc.h>
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The LOGO state hasn't been used in a while, except in a brief
transition to DELETE state while holding the rport mutex.
All port LOGO responses have been ignored as well as any timeout
if we don't get a response.
So this patch just removes LOGO state and simplifies the response handler.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The FC-BB-6 committee is proposing a new FIP usage model called
VN_port to VN_port mode. It allows VN_ports to discover each other
over a loss-free L2 Ethernet without any FCF or Fibre-channel fabric
services. This is point-to-multipoint. There is also a variant
of this called point-to-point which provides for making sure there
is just one pair of ports operating over the Ethernet fabric.
We add these new states: VNMP_START, _PROBE1, _PROBE2, _CLAIM, and _UP.
These usually go quickly in that sequence. After waiting a random
amount of time up to 100 ms in START, we select a pseudo-random
proposed locally-unique port ID and send out probes in states PROBE1
and PROBE2, 100 ms apart. If no probe responses are heard, we
proceed to CLAIM state 400 ms later and send a claim notification.
We wait another 400 ms to receive claim responses, which give us
a list of the other nodes on the network, including their FC-4
capabilities. After another 400 ms we go to VNMP_UP state and
should start interoperating with any of the nodes for whic we
receivec claim responses. More details are in the spec.j
Add the new mode as FIP_MODE_VN2VN. The driver must specify
explicitly that it wants to operate in this mode. There is
no automatic detection between point-to-multipoint and fabric
mode, and the local port initialization is affected, so it isn't
anticipated that there will ever be any such automatic switchover.
It may eventually be possible to have both fabric and VN2VN
modes on the same L2 network, which may be done by two separate
local VN_ports (lports).
When in VN2VN mode, FIP replaces libfc's fabric-oriented discovery
module with its own simple code that adds remote ports as they
are discovered from incoming claim notifications and responses.
These hooks are placed by fcoe_disc_init().
A linear list of discovered vn_ports is maintained under the
fcoe_ctlr struct. It is expected to be short for now, and
accessed infrequently. It is kept under RCU for lock-ordering
reasons. The lport and/or rport mutexes may be held when we
need to lookup a fcoe_vnport during an ELS send.
Change fcoe_ctlr_encaps() to lookup the destination vn_port in
the list of peers for the destination MAC address of the
FIP-encapsulated frame.
Add a new function fcoe_disc_init() to initialize just the
discovery portion of libfcoe for VN2VN mode.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The FC-BB-6 committee is proposing a new FIP usage model called
VN_port to VN_port mode. It allows VN_ports to discover each other
over a loss-free L2 Ethernet without any FCF or Fibre-channel fabric
services. This is point-to-multipoint. There is also a variant
of this called point-to-point which provides for making sure there
is just one pair of ports operating over the Ethernet fabric.
This patch defines the new message type and subtypes as well as
one new descriptor type used by VN2VN mode.
These are all still at the proposed stage and subject to change.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When an exchange is received with a FIP encapsulation, we need
to know that the response must be sent via FIP and what the original
ELS opcode was. This becomes important for VN2VN mode, where we may
receive FLOGI or LOGO from several peer VN_ports, and the LS_ACC or
LS_RJT must be sent FIP-encapsulated with the correct sub-type.
Add a field to the struct fc_frame, fr_encaps, to indicate the
encapsulation values. That term is chosen to be neutral and
LLD-agnostic in case non-FCoE/FIP LLDs might find it useful.
The frame fr_encaps is transferred from the ingress frame to the
exchange by fc_exch_recv_req(), and back to the outgoing frame
by fc_seq_send().
This is taking the last byte in the skb->cb array. If needed,
we could combine the info in sof, eof, flags, and encaps
together into one field, but it'd be better to do that if
and when its needed.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The FIP proposal for VN_port to VN_port point-to-multipoint
operation requires a FLOGI be sent to each remote port.
The FLOGI is sent with the assigned S_ID and D_IDs of the
local and remote ports. This and the response get
FIP-encapsulated for Ethernet.
Add FLOGI state to the remote port state machine.
This will be skipped if not in point-to-multipoint mode.
To reduce a little duplication between PLOGI and FLOGI
response handling, added fc_rport_login_complete(), which
handles the parameters for the rdata struct.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For VN_port to VN_port mode, the transport sets the port_id and
there's no lport FLOGI. This is similar to FC loop mode.
Add a point_to_multipoint flag that indicates the local port is in
point-to-multipoint mode. This skips FLOGI and discovery.
It also skips resetting the port_id on resets other than link down.
Add function fc_lport_set_local_id() that sets the local port_id.
This is called by libfcoe on behalf of the low-level driver
to set the port_id when the link comes up.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
There are three modes that libfcoe currently supports, and a new one
is coming. Change the fcoe_ctlr_init() interface to add the mode
desired. This should not change any functionality.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For VN_port to VN_port mode, FIP will do discovery and needs a
way to find its state from the local port or discovery structure.
It seems that any other LLD that implements its own discovery
would also need something like this.
Replace disc->lport with disc->priv, and use container_of to
find the lport. We could use disc->priv for that, but
container_of is smaller and faster.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It turns out most of the FIP work is now done from worker threads
or process context now, so there's no need to use a spin lock.
Change to use mutex instead of spin lock and delayed_work instead
of a timer.
This will make it nicer for the VN_port to VN_port feature that
will interact more with the libfc layers requiring that
spinlocks not be held.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add pre-zeroed space after the allocation for fc_rport_priv
for use by the lower-level driver.
This is primarily for VN2VN FIP mode, but could be used in
other ways someday.
The space required is specified in lport->rport_priv_size.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
To allow LLD to do lookups on rports without grabbing a mutex,
make them RCU-safe. The caller of lport->tt.rport_lookup will
have the choice of holding disc_mutex or the rcu_read_lock().
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is per FC-BB-5 Annex-D recommendation and per that
if address checking fails then drop the frame.
FIP code paths are already doing this so only needed for fcoe
frames.
The src address checking is limited to only fip mode since
this might break non-fip mode used in p2p due to used OUI
based addressing in some p2p code paths, going forward FIP
will be the only mode, therefore limited this to only FIP
mode so that it won't break non-fip p2p mode for now.
-v2
Removes FCOE packet type checking since fcoe_rcv is
registered to receive only FCoE type packets from netdev
and it is already checked by netdev.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Analyzing fcoe with sparse currently fails. This is because struct
fcoe_rcv_info contains two enum members that have been declared with
__attribute__((packed)). Apparently gcc honors this attribute while sparse
ignores it. The result is that sizeof(struct fcoe_rcv_info)
== sizeof(struct sk_buff::cb) == 48 on a 64-bit system according to gcc, but
not according to sparse. The patch below modifies the definition of
struct fcoe_rcv_info such that gcc and sparse interpret this structure
definition in the same way. The current sparse output is as follows:
$ cd linux-2.6.34
$ make C=2 M=drivers/scsi/fcoe modules
CHECK drivers/scsi/fcoe/fcoe.c
include/scsi/fc_frame.h:81:9: error: invalid bitfield width, -1.
CC [M] drivers/scsi/fcoe/fcoe.o
CHECK drivers/scsi/fcoe/libfcoe.c
include/scsi/fc_frame.h:81:9: error: invalid bitfield width, -1.
drivers/scsi/fcoe/libfcoe.c:56:37: error: invalid initializer
Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Cc: jeykholt@cisco.com
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add a new kernel API to attach a task to current task's cgroup
in all the active hierarchies.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Menage <menage@google.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
gcc 4.4.4 will complain if you use a .discard section for both text and
data ("causes a section type conflict"). Add support for ".discard.*"
sections, and use .discard.text for a dummy function in the x86
RESERVE_BRK() macro.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
This patch adds support for RX and TX DMA via the DMA API,
this is only supported when the KS8842 is accessed via timberdale.
There is no support for DMA on the generic bus interface it self,
a state machine inside the FPGA is handling RX and TX transfers to/from
buffers in the FPGA. The host CPU can do DMA to and from these buffers.
The FPGA has to handle the RX interrupts, so these must be enabled in
the ks8842 but not in the FPGA. The driver must not disable the RX interrupt
that would mean that the data transfers into the FPGA buffers would stop.
The host shall not enable TX interrupts since TX is handled by the FPGA,
the host is notified by DMA callbacks when transfers are finished.
Which DMA channels to use are added as parameters in the platform data struct.
Signed-off-by: Richard Röjfors <richard.rojfors@pelagicore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Save a few bytes of text
(allyesconfig)
$ size drivers/net/wireless/built-in.o*
text data bss dec hex filename
3924568 100548 871056 4896172 4ab5ac drivers/net/wireless/built-in.o.new
3926520 100548 871464 4898532 4abee4 drivers/net/wireless/built-in.o.old
$ size net/wireless/core.o*
text data bss dec hex filename
12843 216 3768 16827 41bb net/wireless/core.o.new
12328 216 3656 16200 3f48 net/wireless/core.o
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remote ports were restarting indefinitely after getting
rejects in PRLI.
Fix by adding a counter of restarts and limiting that with
the port login retry limit as well.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch somewhat combines two fixes to remote port handing in libfc.
The first problem was that rport work could be queued on a deleted
and freed rport. This is handled by not resetting rdata->event
ton NONE if the rdata is about to be deleted.
However, that fix led to the second problem, described by
Bhanu Gollapudi, as follows:
> Here is the sequence of events. T1 is first LOGO receive thread, T2 is
> fc_rport_work() scheduled by T1 and T3 is second LOGO receive thread and
> T4 is fc_rport_work scheduled by T3.
>
> 1. (T1)Received 1st LOGO in state Ready
> 2. (T1)Delete port & enter to RESTART state.
> 3. (T1)schdule event_work, since event is RPORT_EV_NONE.
> 4. (T1)set event = RPORT_EV_LOGO
> 5. (T1)Enter RESTART state as disc_id is set.
> 6. (T2)remember to PLOGI, and set event = RPORT_EV_NONE
> 6. (T3)Received 2nd LOGO
> 7. (T3)Delete Port & enter to RESTART state.
> 8. (T3)schedule event_work, since event is RPORT_EV_NONE.
> 9. (T3)Enter RESTART state as disc_id is set.
> 9. (T3)set event = RPORT_EV_LOGO
> 10.(T2)work restart, enter PLOGI state and issues PLOGI
> 11.(T4)Since state is not RESTART anymore, restart is not set, and the
> event is not reset to RPORT_EV_NONE. (current event is RPORT_EV_LOGO).
> 12. Now, PLOGI succeeds and fc_rport_enter_ready() will not schedule
> event_work, and hence the rport will never be created, eventually losing
> the target after dev_loss_tmo.
So, the problem here is that we were tracking the desire for
the rport be restarted by state RESTART, which was otherwise
equivalent to DELETE. A contributing factor is that we dropped
the lock between steps 6 and 10 in thread T2, which allows the
state to change, and we didn't completely re-evaluate then.
This is hopefully corrected by the following minor redesign:
Simplify the rport restart logic by making the decision to
restart after deleting the transport rport. That decision
is based on a new STARTED flag that indicates fc_rport_login()
has been called and fc_rport_logoff() has not been called
since then. This replaces the need for the RESTART state.
Only restart if the rdata is still in DELETED state
and only if it still has the STARTED flag set.
Also now, since we clear the event code much later in the
work thread, allow for the possibility that the rport may
have become READY again via incoming PLOGI, and if so,
queue another event to handle that.
In the problem scenario, the second LOGO received will
cause the LOGO event to occur again.
Reported-by: Bhanu Gollapudi <bprakash@broadcom.com>
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Resubmitting after incorporating Joe's review comment.
Unsolicited PRLO request is now handled by sending LS_ACC,
and then relogin to the remote port if an N-port login
session exists for that remote port.
Note that this patch should be applied on top of Joe Eykholt's
"Fix remote port restart problem" patch.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Some old comments in fc_fcoe.h say TBD long after the
standard has been passed by T11. Clean them up.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
s2io: fixing DBG_PRINT() macro
ath9k: fix dma direction for map/unmap in ath_rx_tasklet
net: dev_forward_skb should call nf_reset
net sched: fix race in mirred device removal
tun: avoid BUG, dump packet on GSO errors
bonding: set device in RLB ARP packet handler
wimax/i2400m: Add PID & VID for Intel WiMAX 6250
ipv6: Don't add routes to ipv6 disabled interfaces.
net: Fix skb_copy_expand() handling of ->csum_start
net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.c
macvtap: Limit packet queue length
ixgbe/igb: catch invalid VF settings
bnx2x: Advance a module version
bnx2x: Protect statistics ramrod and sequence number
bnx2x: Protect a SM state change
wireless: use netif_rx_ni in ieee80211_send_layer2_update
Filesystems with unwritten extent support must not complete an AIO request
until the transaction to convert the extent has been commited. That means
the aio_complete calls needs to be moved into the ->end_io callback so
that the filesystem can control when to call it exactly.
This makes a bit of a mess out of dio_complete and the ->end_io callback
prototype even more complicated.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
__GFP_NOFAIL is going away, so add our own retry loop. Also add
jbd2__journal_start() and jbd2__journal_restart() which take a gfp
mask, so that file systems can optionally (re)start transaction
handles using GFP_KERNEL. If they do this, then they need to be
prepared to handle receiving an PTR_ERR(-ENOMEM) error, and be ready
to reflect that error up to userspace.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
To properly handle clocksources that change frequencies
at the clocksource->enable() point, this patch adds
a method that will update the clocksource's mult/shift and
max_idle_ns values.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-12-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch makes xtime and wall_to_monotonic static, as planned in
Documentation/feature-removal-schedule.txt. This will allow for
further cleanups to the timekeeping core.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-10-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Provides an accessor function to replace hrtimer.c's
direct access of wall_to_monotonic.
This will allow wall_to_monotonic to be made static as
planned in Documentation/feature-removal-schedule.txt
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-9-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
update_vsyscall() did not provide the wall_to_monotoinc offset,
so arch specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
After accidentally misusing timespec_add_safe, I wanted to make sure
we don't accidently trip over that issue again, so I created a simple
timespec_add() function which we can use to replace the instances
of timespec_add_safe() that don't want the overflow detection.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-3-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Implementation of the ST-Ericsson baudrate extension in the PL011
block. In this modified variant it is possible to change the
sampling factor from 16 to 8, and thanks to this we can get higher
baudrates while still using the same peripheral clock.
Also replace the simple division to determine the baud divisor
with DIV_ROUND_CLOSEST() rather than a simple integer division.
Cc: Alessandro Rubini <rubini@unipv.it>
Cc: Jerzy Kasenberg <jerzy.kasenberg@tieto.com>
Signed-off-by: Marcin Mielczarczyk <marcin.mielczarczyk@tieto.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In the ST-Ericsson version of the PL011 the TX and RX have different
control registers.
Cc: Alessandro Rubini <rubini@unipv.it>
Signed-off-by: Marcin Mielczarczyk <marcin.mielczarczyk@tieto.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When a pagetable is about to be destroyed, we notify Xen so that the
hypervisor can clear the related shadow pagetable.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Add a xen_emul_unplug command line option to the kernel to unplug
xen emulated disks and nics.
Set the default value of xen_emul_unplug depending on whether or
not the Xen PV frontends and the Xen platform PCI driver have
been compiled for this kernel (modules or built-in are both OK).
The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM
even if the host platform doesn't support unplug.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Use xen_vcpuop_clockevent instead of hpet and APIC timers as main
clockevent device on all vcpus, use the xen wallclock time as wallclock
instead of rtc and use xen_clocksource as clocksource.
The pv clock algorithm needs to work correctly for the xen_clocksource
and xen wallclock to be usable, only modern Xen versions offer a
reliable pv clock in HVM guests (XENFEAT_hvm_safe_pvclock).
Using the hpet as clocksource means a VMEXIT every time we read/write to
the hpet mmio addresses, pvclock give us a better rating without
VMEXITs. Same goes for the xen wallclock and xen_vcpuop_clockevent
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Filesystems with unwritten extent support must not complete an AIO request
until the transaction to convert the extent has been commited. That means
the aio_complete calls needs to be moved into the ->end_io callback so
that the filesystem can control when to call it exactly.
This makes a bit of a mess out of dio_complete and the ->end_io callback
prototype even more complicated.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Alex Elder <aelder@sgi.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI / Sleep: Allow the NVS saving to be skipped during suspend to RAM
ACPI: create "processor.bm_check_disable" boot param
ACPI: skip checking BM_STS if the BIOS doesn't ask for it
ACPI: fix unused function warning
ACPI: processor: fix processor_physically_present on UP
ACPI video: fix string mismatch for Sony SR290 laptop
ACPI battery: don't invoke power_supply_changed twice when battery is hot-added
ACPI: handle systems which asynchoronously enable ACPI mode
This patch allows exporting GPIO pins not used by the keypad itself
to be accessible from elsewhere.
Signed-off-by: Xiaolong Chen <xiao-long.chen@motorola.com>
Signed-off-by: Yuanbo Ye <yuan-bo.ye@motorola.com>
Signed-off-by: Tao Hu <taohu@motorola.com>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
This inserts sanity check that refuses to mount a filesystem with
unsupported block size.
Previously, kernel code of nilfs was looking only limitation of
devices though mkfs.nilfs2 limits the range of block sizes; there was
no check that prevents rec_len overflow with larger block sizes.
With this change, block sizes larger than 64KB or smaller than 1KB
will get rejected explicitly by kernel.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
With 64KB blocksize, a directory entry can have size 64KB which does
not fit into 16 bits we have for entry length. So this patch stores
0xffff instead and converts value when read from / written to disk.
Nilfs derives its directory implementation from ext2 filesystem, and
this draws upon the corresponding change on ext2.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Move frags[] at the end of struct skb_shared_info, and make
pskb_expand_head() copy only the used part of it instead of whole array.
This should avoid kmemcheck warnings and speedup pskb_expand_head() as
well, avoiding a lot of cache misses.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes hang when target device of mirred packet classifier
action is removed.
If a mirror or redirection action is configured to cause packets
to go to another device, the classifier holds a ref count, but was assuming
the adminstrator cleaned up all redirections before removing. The fix
is to add a notifier and cleanup during unregister.
The new list is implicitly protected by RTNL mutex because
it is held during filter add/delete as well as notifier.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add addr_assign_type to struct net_device and expose it via sysfs.
This new attribute has the purpose of giving user-space the ability to
distinguish between different assignment types of MAC addresses.
For example user-space can treat NICs with randomly generated MAC
addresses differently than NICs that have permanent (locally assigned)
MAC addresses.
For the former udev could write a persistent net rule by matching the
device path instead of the MAC address.
There's also the case of devices that 'steal' MAC addresses from slave
devices. In which it is also be beneficial for user-space to be aware
of the fact.
This patch also introduces a helper function to assist adoption of
drivers that generate MAC addresses randomly.
Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2a6b69765a
(ACPI: Store NVS state even when entering suspend to RAM) caused the
ACPI suspend code save the NVS area during suspend and restore it
during resume unconditionally, although it is known that some systems
need to use acpi_sleep=s4_nonvs for hibernation to work. To allow
the affected systems to avoid saving and restoring the NVS area
during suspend to RAM and resume, introduce kernel command line
option acpi_sleep=nonvs and make acpi_sleep=s4_nonvs work as its
alias temporarily (add acpi_sleep=s4_nonvs to the feature removal
file).
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16396 .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: tomas m <tmezzadra@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This list used was by only two platforms with all other platforms defining an
own list of valid bus id's to pass to of_platform_bus_probe. This patch:
i) copies the default list to the two platforms that depended on it (powerpc)
ii) remove the usage of of_default_bus_ids in of_platform_bus_probe
iii) removes the definition of the list from all architectures that defined it
Passing a NULL 'matches' parameter to of_platform_bus_probe is still valid; the
function returns no error in that case as the NULL value is equivalent to an
empty list.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
[grant.likely@secretlab.ca: added __initdata annotations, warn on and return error on missing match table, and fix whitespace errors]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
of_device is currently just an #define alias to platform_device until it
gets removed entirely. This patch removes references to it from the
include directories and the core drivers/of code.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
It is mostly unused now. Sparc has a few defines left in it, but they
can be moved to other headers. Removing this header means that new
architectures adding CONFIG_OF support don't need to also add this
header file.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Only thing left in it is of_instantiate_rtc() which can be moved to
asm/prom.h on PowerPC and is unused in microblaze.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Both of_bus_type and of_platform_bus_type are just #define aliases
for the platform bus. This patch removes all references to them and
switches to the of_register_platform_driver()/of_unregister_platform_driver()
API for registering.
Subsequent patches will convert each user of of_register_platform_driver()
into plain platform_drivers without the of_platform_driver shim. At which
point the of_register_platform_driver()/of_unregister_platform_driver()
functions can be removed.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
of_platform_bus was being used in the same manner as the platform_bus.
The only difference being that of_platform_bus devices are generated
from data in the device tree, and platform_bus devices are usually
statically allocated in platform code. Having them separate causes
the problem of device drivers having to be registered twice if it
was possible for the same device to appear on either bus.
This patch removes of_platform_bus_type and registers all of_platform
bus devices and drivers on the platform bus instead. A previous patch
made the of_device structure an alias for the platform_device structure,
and a shim is used to adapt of_platform_drivers to the platform bus.
After all of of_platform_bus drivers are converted to be normal platform
drivers, the shim code can be removed.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available
perf annotate: Fix handling of goto labels that are valid hex numbers
tracing: Properly align linker defined symbols
perf symbols: Fix directory descriptor leaking
perf: Fix various display bugs with parent filtering
usleep[_range] are finer precision implementations of msleep
and are designed to be drop-in replacements for udelay where
a precise sleep / busy-wait is unnecessary. They also allow
an easy interface to specify slack when a precise (ish)
wakeup is unnecessary to help minimize wakeups
Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org>
Cc: akinobu.mita@gmail.com
Cc: sboyd@codeaurora.org
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <4C44CDD2.1070708@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We should copy the initial value to userspace for iptables-save and
to allow removal of specific quota rules.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In both the ieee1394 stack and the firewire stack, the core treats
kernelspace drivers better than userspace drivers when it comes to
CSR address range allocation: The former may request a register to be
placed automatically at a free spot anywhere inside a specified address
range. The latter may only request a register at a fixed offset.
Hence, userspace drivers which do not require a fixed offset potentially
need to implement a retry loop with incremented offset in each retry
until the kernel does not fail allocation with EBUSY. This awkward
procedure is not fundamentally necessary as the core already provides a
superior allocation API to kernelspace drivers.
Therefore change the ioctl() ABI by addition of a region_end member in
the existing struct fw_cdev_allocate. Userspace and kernelspace APIs
work the same way now.
There is a small cost to pay by clients though: If client source code
is required to compile with older kernel headers too, then any use of
the new member fw_cdev_allocate.region_end needs to be enclosed by
#ifdef/#endif directives. However, any client program that seriously
wants to use address range allocations will require a kernel of cdev ABI
version >= 4 at runtime and a linux/firewire-cdev.h header of >= 4
anyway. This is because v4 brings FW_CDEV_EVENT_REQUEST2. The only
client program in which build-time compatibility with struct
fw_cdev_allocate as found in older kernel headers makes sense is
libraw1394.
(libraw1394 uses the older broken FW_CDEV_EVENT_REQUEST to implement a
makeshift, incorrect transaction responder that does at least work
somewhat in many simple scenarios, relying on guesswork by libraw1394
and by libraw1394 based applications. Plus, address range allocation
and transaction responder is only one of many features that libraw1394
needs to provide, and these other features need to work with kernel and
kernel-headers as old as possible. Any new linux/firewire-cdev.h based
client that implements a transaction responder should never attempt to
do it like libraw1394; instead it should make a header and kernel of v4
or later a hard requirement.)
While we are at it, update the struct fw_cdev_allocate documentation to
better reflect the recent fw_cdev_event_request2 ABI addition.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This extends the FW_CDEV_IOC_SEND_PHY_PACKET ioctl() for /dev/fw* to be
useful for ping time measurements. One application for it would be gap
count optimization in userspace that is based on ping times rather than
hop count. (The latter is implemented in firewire-core itself but is
not applicable to beta PHYs that act as repeater.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Add an FW_CDEV_IOC_RECEIVE_PHY_PACKETS ioctl() and
FW_CDEV_EVENT_PHY_PACKET_RECEIVED poll()/read() event for /dev/fw*.
This can be used to get information from remote PHYs by remote access
PHY packets.
This is also the 2nd half of the functionality (the receive part) to
support a userspace implementation of a VersaPHY transaction layer.
Safety considerations:
- PHY packets are generally broadcasts, hence some kind of elevated
privileges should be required of a process to be able to listen in
on PHY packets. This implementation assumes that a process that is
allowed to open the /dev/fw* of a local node does have this
privilege.
There was an inconclusive discussion about introducing POSIX
capabilities as a means to check for user privileges for these
kinds of operations.
Other limitations:
- PHY packet reception may be switched on by ioctl() but cannot be
switched off again. It would be trivial to provide an off switch,
but this is not worth the code. The client should simply close()
the fd then, or just ignore further events.
- For sake of simplicity of API and kernel-side implementation, no
filter per packet content is provided.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Add an FW_CDEV_IOC_SEND_PHY_PACKET ioctl() for /dev/fw* which can be
used to implement bus management related functionality in userspace.
This is also half of the functionality (the transmit part) that is
needed to support a userspace implementation of a VersaPHY transaction
layer.
Safety considerations:
- PHY packets are generally broadcasts and may have interesting
effects on PHYs and the bus, e.g. make asynchronous arbitration
impossible due to too low gap count. Hence some kind of elevated
privileges should be required of a process to be able to send
PHY packets. This implementation assumes that a process that is
allowed to open the /dev/fw* of a local node does have this
privilege.
There was an inconclusive discussion about introducing POSIX
capabilities as a means to check for user privileges for these
kinds of operations.
- The kernel does not check integrity of the supplied packet data.
That would be far too much code, considering the many kinds of
PHY packets. A process which got the privilege to send these
packets is trusted to do it correctly.
Just like with the other "send packet" ioctls, a non-blocking API is
chosen; i.e. the ioctl may return even before AT DMA started. After
transmission, an event for poll()/read() is enqueued. Most users are
going to need a blocking API, but a blocking userspace wrapper is easy
to implement, and the second of the two existing libraw1394 calls
raw1394_phy_packet_write() and raw1394_start_phy_packet_write() can be
better supported that way.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Response events:
- are generated on more occasions than their documentation claimed.
CSR allocation:
- An already occupied CSR can be determined from errno==EBUSY.
Bus resets:
- Note that FW_CDEV_IOC_INITIATE_BUS_RESET is nonblocking and that the
client is not required to observe a grace period since kernels
2.6.36+ will enforce it now (commit 02d37bed).
- The possible values of fw_cdev_initiate_bus_reset.type are listed in
the kerneldoc comment already.
- Clarify that an application that uses FW_CDEV_IOC_ADD_DESCRIPTOR and
FW_CDEV_IOC_REMOVE_DESCRIPTOR does not have to issue a bus reset.
Isochronous I/O contexts:
- At most one can be created per open file descriptor.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
core-transaction.c transmit_complete_callback() and close_transaction()
expect packet callback status to be an ACK or RCODE, and ACKs get
translated to RCODEs for transaction callbacks.
An old comment on the packet callback API (been there from the initial
submission of the stack) and the dummy_driver implementation of
send_request/send_response deviated from this as they also included
-ERRNO in the range of status values.
Let's narrow status values down to ACK and RCODE to prevent surprises.
RCODE_CANCELLED is chosen as the dummy_driver's RCODE as its meaning of
"transaction timed out" comes closest to what happens when a transaction
coincides with card removal.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
In some situations a CPU match permits a better spreading of
connections, or select targets only for a given cpu.
With Remote Packet Steering or multiqueue NIC and appropriate IRQ
affinities, we can distribute trafic on available cpus, per session.
(all RX packets for a given flow is handled by a given cpu)
Some legacy applications being not SMP friendly, one way to scale a
server is to run multiple copies of them.
Instead of randomly choosing an instance, we can use the cpu number as a
key so that softirq handler for a whole instance is running on a single
cpu, maximizing cache effects in TCP/UDP stacks.
Using NAT for example, a four ways machine might run four copies of
server application, using a separate listening port for each instance,
but still presenting an unique external port :
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
-j REDIRECT --to-port 8080
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
-j REDIRECT --to-port 8081
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
-j REDIRECT --to-port 8082
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
-j REDIRECT --to-port 8083
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Quota code never touches file data. It just modifies i_blocks + i_bytes
of inodes and inode flags of quota files. So use mark_inode_dirty_sync
instead of mark_inode_dirty.
Signed-off-by: Jan Kara <jack@suse.cz>
Use nf_conntrack/nf_nat code to do the packet mangling and the TCP
sequence adjusting. The function 'ip_vs_skb_replace' is now dead
code, so it is removed.
To SNAT FTP, use something like:
% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
--vport 21 -j SNAT --to-source 192.168.10.10
and for the data connections in passive mode:
% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
--vportctl 21 -j SNAT --to-source 192.168.10.10
using '-m state --state RELATED' would also works.
Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and
nf_nat_ftp are loaded.
[ up-port and minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This implements the kernel-space side of the netfilter matcher xt_ipvs.
[ minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
[ Patrick: added xt_ipvs.h to Kbuild ]
Signed-off-by: Patrick McHardy <kaber@trash.net>
The .data..init_task output section was missing
a load offset causing a popwerpc target to fail to boot.
Sean MacLennan tracked it down to the definition of
INIT_TASK_DATA_SECTION().
There are only two users of INIT_TASK_DATA_SECTION()
in the kernel today: cris and popwerpc.
cris do not support relocatable kernels and is thus not
impacted by this change.
Fix INIT_TASK_DATA_SECTION() to specify load offset like
all other output sections.
Reported-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds three new fields to nilfs_super_block structure, compatible
feature set, readonly-compatible feature set, and incompatible feature
set in order to prepare for future disk format modifications.
The role of these fields conforms to those of ext3 or other
filesystems. Most important flags are the incompatible feature set;
it is used to refuse to mount the filesystem which sets an
incompatible feature the kernel doesn't know about.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
This inserts comments indicating hexadecimal offset in declaration of
nilfs_super_block structure so that people can know offset of its
fields without counting from the head.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Suspend/resume requires few different things on HVM: the suspend
hypercall is different; we don't need to save/restore memory related
settings; except the shared info page and the callback mechanism.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Add the xen pci platform device driver that is responsible
for initializing the grant table and xenbus in PV on HVM mode.
Few changes to xenbus and grant table are necessary to allow the delayed
initialization in HVM mode.
Grant table needs few additional modifications to work in HVM mode.
The Xen PCI platform device raises an irq every time an event has been
delivered to us. However these interrupts are only delivered to vcpu 0.
The Xen PCI platform interrupt handler calls xen_hvm_evtchn_do_upcall
that is a little wrapper around __xen_evtchn_do_upcall, the traditional
Xen upcall handler, the very same used with traditional PV guests.
When running on HVM the event channel upcall is never called while in
progress because it is a normal Linux irq handler (and we cannot switch
the irq chip wholesale to the Xen PV ones as we are running QEMU and
might have passed in PCI devices), therefore we cannot be sure that
evtchn_upcall_pending is 0 when returning.
For this reason if evtchn_upcall_pending is set by Xen we need to loop
again on the event channels set pending otherwise we might loose some
event channel deliveries.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Set the callback to receive evtchns from Xen, using the
callback vector delivery mechanism.
The traditional way for receiving event channel notifications from Xen
is via the interrupts from the platform PCI device.
The callback vector is a newer alternative that allow us to receive
notifications on any vcpu and doesn't need any PCI support: we allocate
a vector exclusively to receive events, in the vector handler we don't
need to interact with the vlapic, therefore we avoid a VMEXIT.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Workqueue can now handle high concurrency. Convert drm_crtc_helper to
use system_nrt_wq instead of slow-work. The conversion is mostly
straight forward. One difference is that drm_helper_hpd_irq_event()
no longer blocks and can be called from any context.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Make fscache operation to use only workqueue instead of combination of
workqueue and slow-work. FSCACHE_OP_SLOW is dropped and
FSCACHE_OP_FAST is renamed to FSCACHE_OP_ASYNC and uses newly added
fscache_op_wq workqueue to execute op->processor().
fscache_operation_init_slow() is dropped and fscache_operation_init()
now takes @processor argument directly.
* Unbound workqueue is used.
* fscache_retrieval_work() is no longer necessary as OP_ASYNC now does
the equivalent thing.
* sysctl fscache.operation_max_active added to control concurrency.
The default value is nr_cpus clamped between 2 and
WQ_UNBOUND_MAX_ACTIVE.
* debugfs support is dropped for now. Tracing API based debug
facility is planned to be added.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
Make fscache object state transition callbacks use workqueue instead
of slow-work. New dedicated unbound CPU workqueue fscache_object_wq
is created. get/put callbacks are renamed and modified to take
@object and called directly from the enqueue wrapper and the work
function. While at it, make all open coded instances of get/put to
use fscache_get/put_object().
* Unbound workqueue is used.
* work_busy() output is printed instead of slow-work flags in object
debugging outputs. They mean basically the same thing bit-for-bit.
* sysctl fscache.object_max_active added to control concurrency. The
default value is nr_cpus clamped between 4 and
WQ_UNBOUND_MAX_ACTIVE.
* slow_work_sleep_till_thread_needed() is replaced with fscache
private implementation fscache_object_sleep_till_congested() which
waits on fscache_object_wq congestion.
* debugfs support is dropped for now. Tracing API based debug
facility is planned to be added.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
It turns out that there is a bit in the _CST for Intel FFH C3
that tells the OS if we should be checking BM_STS or not.
Linux has been unconditionally checking BM_STS.
If the chip-set is configured to enable BM_STS,
it can retard or completely prevent entry into
deep C-states -- as illustrated by turbostat:
http://userweb.kernel.org/~lenb/acpi/utils/pmtools/turbostat/
ref: Intel Processor Vendor-Specific ACPI Interface Specification
table 4 "_CST FFH GAS Field Encoding"
Bit 1: Set to 1 if OSPM should use Bus Master avoidance for this C-state
https://bugzilla.kernel.org/show_bug.cgi?id=15886
Signed-off-by: Len Brown <len.brown@intel.com>
Add a new rt attribute, RTA_MARK, and use it in
rt_fill_info()/inet_rtm_getroute() to support following commands :
ip route get 192.168.20.110 mark NUMBER
ip route get 192.168.20.108 from 192.168.20.110 iif eth1 mark NUMBER
ip route list cache [192.168.20.110] mark NUMBER
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once a work starts execution, its data contains the cpu number it was
on instead of pointing to cwq. This is added by commit 7a22ad75
(workqueue: carry cpu number in work data once execution starts) to
reliably determine the work was last on even if the workqueue itself
was destroyed inbetween.
Whether data points to a cwq or contains a cpu number was
distinguished by comparing the value against PAGE_OFFSET. The
assumption was that a cpu number should be below PAGE_OFFSET while a
pointer to cwq should be above it. However, on architectures which
use separate address spaces for user and kernel spaces, this doesn't
hold as PAGE_OFFSET is zero.
Fix it by using an explicit flag, WORK_STRUCT_CWQ, to mark what the
data field contains. If the flag is set, it's pointing to a cwq;
otherwise, it contains a cpu number.
Reported on s390 and microblaze during linux-next testing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Reported-by: Michal Simek <michal.simek@petalogix.com>
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tested-by: Michal Simek <monstr@monstr.eu>
Mark Wagner reported OOM symptoms when sending UDP traffic over
a macvtap link to a kvm receiver.
This appears to be caused by the fact that macvtap packet queues
are unlimited in length. This means that if the receiver can't
keep up with the rate of flow, then we will hit OOM. Of course
it gets worse if the OOM killer then decides to kill the receiver.
This patch imposes a cap on the packet queue length, in the same
way as the tuntap driver, using the device TX queue length.
Please note that macvtap currently has no way of giving congestion
notification, that means the software device TX queue cannot be
used and packets will always be dropped once the macvtap driver
queue fills up.
This shouldn't be a great problem for the scenario where macvtap
is used to feed a kvm receiver, as the traffic is most likely
external in origin so congestion notification can't be applied
anyway.
Of course, if anybody decides to complain about guest-to-guest
UDP packet loss down the track, then we may have to revisit this.
Incidentally, this patch also fixes a real memory leak when
macvtap_get_queue fails.
Chris Wright noticed that for this patch to work, we need a
non-zero TX queue length. This patch includes his work to change
the default macvtap TX queue length to 500.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
sysrq,kdb: Use __handle_sysrq() for kdb's sysrq function
debug_core,kdb: fix kgdb_connected bit set in the wrong place
Fix merge regression from external kdb to upstream kdb
repair gdbstub to match the gdbserial protocol specification
kdb: break out of kdb_ll() when command is terminated
This core is found on some Freescale SoCs and also some Coldfire
SoCs. Support for Coldfire is missing though at the moment as
they have an older revision of the core which does not have RX FIFO
support.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
and fix the broken case if a core's frequency depends on others.
trace_power_frequency was only implemented in a rather ungeneric
way in acpi-cpufreq driver's target() function only.
-> Move the call to trace_power_frequency to
cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE
notifier is triggered.
This will support power frequency tracing by all cpufreq
drivers.
trace_power_frequency did not trace frequency changes correctly
when the userspace governor was used or when CPU cores'
frequency depend on each other.
-> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu
which gets switched automatically fixes this.
Robert Schoene provided some important fixes on top of my
initial quick shot version which are integrated in this patch:
- Forgot some changes in power_end trace (TP_printk/variable names)
- Variable dummy in power_end must now be cpu_id
- Use static 64 bit variable instead of unsigned int for cpu_id
[akpm@linux-foundation.org: build fix]
Signed-off-by: Thomas Renninger <trenn@suse.de>
Cc: davej@codemonkey.org.uk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Acked-by: Arjan van de Ven <arjan@infradead.org>
Cc: Robert Schoene <robert.schoene@tu-dresden.de>
Tested-by: Robert Schoene <robert.schoene@tu-dresden.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The kdb code should not toggle the sysrq state in case an end user
wants to try and resume the normal kernel execution.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Network code uses the __packed macro instead of __attribute__((packed)).
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove IRDA_PACKED macro, which maps to __attribute__((packed)). IRDA is
one of the last users of __attribute__((packet)). Networking code uses
__packed now.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>