Enable QP creation with a given source QP number.
The created QP will use the source QPN as its wire QP number.
This comes as a pre-patch for downstream patches in this series to
allow user space applications to accelerate traffic which is typically
handled by IPoIB ULP.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When creating address handle from multicast GID, set MAC according to
the appropriate formula instead of searching for it in the GID table:
- For IPv4 multicast GID use ip_eth_mc_map().
- For IPv6 multicast GID use ipv6_eth_mc_map().
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding
IPv4 address is encoded into the GID according to the following rule:
GID= :ffff:<IPv4 address>
Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it
zeroed and change rdma_is_multicast_addr() to consider the new logic.
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RoCE Annex (A16.9.10/11) declares that during attach (detach) QP to a
multicast group, if the QP is associated with a RoCE port, the
multicast group MLID is unused and is ignored.
During attach or detach multicast, when the QP is associated with a
port, it is enough to check the port's link layer and validate the
LID only if it is Infiniband. Otherwise, avoid validating the
multicast LID.
Fixes: 8561eae60f ("IB/core: For multicast functions, verify that LIDs are multicast LIDs")
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add debugfs interface for monitor the number of delay drop timeout
events and the number of existing dropless RQs in the system.
In addition add debugfs interface for configuring the global timeout value
which is used in the SET_DELAY_DROP command.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RQs that were configured for "delay drop" will prevent packet drops
when their WQEs are depleted.
Marking an RQ to be drop-less is done by setting delay_drop_en in RQ
context using CREATE_RQ command.
Since this feature is globally activated/deactivated by using the
SET_DELAY_DROP command on all the marked RQs, we activated/deactivated
it according to the number of RQs with 'delay_drop' enabled.
When timeout is expired, then the feature is deactivated. Therefore
the driver handles the delay drop timeout event and reactivate it.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When delay drop timeout is expired, the firmware raises
general notification event of DELAY_DROP_TIMEOUT subtype.
In addition the feature is disable so the driver have to
reactivate the timeout.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add support to SET_DELAY_DROP command.
This command will be used in downstream patches for delay packet drop.
The timeout value should be indicated by delay_drop_timeout field.
Packet processing will be delayed till timeout value passed or until
more WQEs are posted.
Setting this value to 0 disables the feature.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Work queue which is created with IB_WQ_FLAGS_DELAY_DROP won't
cause packet drops when there aren't receive WQEs, but will wait until
posting of receive WQEs or for some period of time that the device
was configured with.
It includes:
* Add a new creation flag to enable delay drop functionality in a WQ.
* A new capability was introduced - IB_RAW_PACKET_CAP_DELAY_DROP, which
is the device's ability to delay packet drops when there aren't receive
WQEs.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When a user sets port_guid, node_guid or policy of an IB virtual
function, save this information in "struct mlx5_vf_context".
This information will be restored later when pci_resume is called.
To make sure this works, one can use aer-inject to generate PCI
errors on mlx5 devices and verify if relevant fields are restored
after PCI resume.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds debug control parameters for congestion control which
can be read or written through debugfs. They are for reaction point and
notification point nodes.
These control parameters are as below:
+------------------------------+-----------------------------------------+
| Name | Description |
|------------------------------+-----------------------------------------|
|rp_clamp_tgt_rate | When set target rate is updated to |
| | current rate |
|------------------------------+-----------------------------------------|
|rp_clamp_tgt_rate_ati | When set update target rate based on |
| | timer as well |
|------------------------------+-----------------------------------------|
|rp_time_reset | time between rate increase if no |
| | CNP is received unit in usec |
|------------------------------+-----------------------------------------|
|rp_byte_reset | Number of bytes between rate inease if |
| | no CNP is received |
|------------------------------+-----------------------------------------|
|rp_threshold | Threshold for reaction point rate |
| | control |
|------------------------------+-----------------------------------------|
|rp_ai_rate | Rate for target rate, unit in Mbps |
|------------------------------+-----------------------------------------|
|rp_hai_rate | Rate for hyper increase state |
| | unit in Mbps |
|------------------------------+-----------------------------------------|
|rp_min_dec_fac | Minimum factor by which the current |
| | transmit rate can be changed when |
| | processing a CNP, unit is percerntage |
|------------------------------+-----------------------------------------|
|rp_min_rate | Minimum value for rate limit, |
| | unit in Mbps |
|------------------------------+-----------------------------------------|
|rp_rate_to_set_on_first_cnp | Rate that is set when first CNP is |
| | received, unit is Mbps |
|------------------------------+-----------------------------------------|
|rp_dce_tcp_g | Used to calculate alpha |
|------------------------------+-----------------------------------------|
|rp_dce_tcp_rtt | Time between updates of alpha value, |
| | unit is usec |
|------------------------------+-----------------------------------------|
|rp_rate_reduce_monitor_period | Minimum time between consecutive rate |
| | reductions |
|------------------------------+-----------------------------------------|
|rp_initial_alpha_value | Initial value of alpha |
|------------------------------+-----------------------------------------|
|rp_gd | When CNP is received, flow rate is |
| | reduced based on gd, rp_gd is given as |
| | log2(rp_gd) |
|------------------------------+-----------------------------------------|
|np_cnp_dscp | dscp code point for generated cnp |
|------------------------------+-----------------------------------------|
|np_cnp_prio_mode | 802.1p priority for generated cnp |
|------------------------------+-----------------------------------------|
|np_cnp_prio | cnp priority mode |
+------------------------------+-----------------------------------------+
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The old logic ignored link state. This led to missing IB events like
when link goes down on the switch while admin state is up or to redundant
events like when admin state goes up while link is down.
To fix that, probe the port state on NETDEV events and compare to last
known state to decide if IB events needs to be dispatched.
FIxes: 5ec8c83e3a ("IB/mlx5: Port events in RoCE now rely on netdev events")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Before running the ethtool's loopback selftest, we need
to make sure that the local loopback is enabled.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently, unicast/multicast loopback raw ethernet
(non-RDMA) packets are sent back to the vport.
A unicast loopback packet is the packet with destination
MAC address the same as the source MAC address.
For multicast, the destination MAC address is in the
vport's multicast filter list.
Moreover, the local loopback is not needed if
there is one or none user space context.
After this patch, the raw ethernet unicast and multicast
local loopback are disabled by default. When there is more
than one user space context, the local loopback is enabled.
Note that when local loopback is disabled, raw ethernet
packets are not looped back to the vport and are forwarded
to the next routing level (eswitch, or multihost switch,
or out to the wire depending on the configuration).
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Further testing showed that the fix introduced in 7dfb8be11b ("btrfs:
Round down values which are written for total_bytes_size") was
insufficient and it could still lead to discrepancies between the
total_bytes in the super block and the device total bytes. So this patch
also ensures that the difference between old/new sizes when
shrinking/growing is also rounded down. This ensure that we won't be
subtracting/adding a non-sectorsize multiples to the superblock/device
total sizees.
Fixes: 7dfb8be11b ("btrfs: Round down values which are written for total_bytes_size")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If a lot of metadata is reserved for outstanding delayed allocations, we
rely on shrink_delalloc() to reclaim metadata space in order to fulfill
reservation tickets. However, shrink_delalloc() has a shortcut where if
it determines that space can be overcommitted, it will stop early. This
made sense before the ticketed enospc system, but now it means that
shrink_delalloc() will often not reclaim enough space to fulfill any
tickets, leading to an early ENOSPC. (Reservation tickets don't care
about being able to overcommit, they need every byte accounted for.)
Fix it by getting rid of the shortcut so that shrink_delalloc() reclaims
all of the metadata it is supposed to. This fixes early ENOSPCs we were
seeing when doing a btrfs receive to populate a new filesystem, as well
as early ENOSPCs Christoph saw when doing a big cp -r onto Btrfs.
Fixes: 957780eb27 ("Btrfs: introduce ticketed enospc infrastructure")
Tested-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Cc: stable@vger.kernel.org
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If we have a block group that is all of the following:
1) uncached in memory
2) is read-only
3) has a disk cache state that indicates we need to recreate the cache
AND the file system has enough free space fragmentation such that the
request for an extent of a given size can't be honored;
AND have a single CPU core;
AND it's the block group with the highest starting offset such that
there are no opportunities (like reading from disk) for the loop to
yield the CPU;
We can end up with a lockup.
The root cause is simple. Once we're in the position that we've read in
all of the other block groups directly and none of those block groups
can honor the request, there are no more opportunities to sleep. We end
up trying to start a caching thread which never gets run if we only have
one core. This *should* present as a hung task waiting on the caching
thread to make some progress, but it doesn't. Instead, it degrades into
a busy loop because of the placement of the read-only check.
During the first pass through the loop, block_group->cached will be set
to BTRFS_CACHE_STARTED and have_caching_bg will be set. Then we hit the
read-only check and short circuit the loop. We're not yet in
LOOP_CACHING_WAIT, so we skip that loop back before going through the
loop again for other raid groups.
Then we move to LOOP_CACHING_WAIT state.
During the this pass through the loop, ->cached will still be
BTRFS_CACHE_STARTED, which means it's not cached, so we'll enter
cache_block_group, do a lot of nothing, and return, and also set
have_caching_bg again. Then we hit the read-only check and short circuit
the loop. The same thing happens as before except now we DO trigger
the LOOP_CACHING_WAIT && have_caching_bg check and loop back up to the
top. We do this forever.
There are two fixes in this patch since they address the same underlying
bug.
The first is to add a cond_resched to the end of the loop to ensure
that the caching thread always has an opportunity to run. This will
fix the soft lockup issue, but find_free_extent will still loop doing
nothing until the thread has completed.
The second is to move the read-only check to the top of the loop. We're
never going to return an allocation within a read-only block group so
we may as well skip it early. The check for ->cached == BTRFS_CACHE_ERROR
would cause the same problem except that BTRFS_CACHE_ERROR is considered
a "done" state and we won't re-set have_caching_bg again.
Many thanks to Stephan Kulow <coolo@suse.de> for his excellent help in
the testing process.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely. In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.
The cause is that the uc_regspace[] is laid out statically based on
the kernel config, but the decision of whether to save/restore the
iWMMXt registers must be a runtime decision.
To minimise breakage of software that may assume a fixed layout,
this patch emits a dummy block of the same size as iwmmxt_sigframe,
for non-iWMMXt threads. However, the magic and size of this block
are now filled in to help parsers skip over it. A new DUMMY_MAGIC
is defined for this purpose.
It is probably legitimate (if non-portable) for userspace to
manufacture its own sigframe for sigreturn, and there is no obvious
reason why userspace should be required to insert a DUMMY_MAGIC
block when running on non-iWMMXt hardware, when omitting it has
worked just fine forever in other configurations. So in this case,
sigreturn does not require this block to be present.
Reported-by: Edmund Grimley-Evans <Edmund.Grimley-Evans@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user
accessors on their arguments pointing to the user signal frame.
There does not be appear to be a bug here, but this omission is
inconsistent with the crunch and vfp sigframe access functions.
This patch adds the annotations, for consistency.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
This patch allows driver to post send and receive
requests on QPs which are in error state.
Instead of flushing the QP in the context of polling
error CQEs, the QPs will be added to a flush list
maintained per CQ. QP state is moved to error.
QP is added to flush list if the user moves it
to error state using modify_qp also. After polling the HW
CQ in poll_cq routine, this flush list is traversed
and driver completes work requests on each QP in the flush
list, till the budget expires. The QP is moved out of
flush list during QP destroy or during modify_QP to RESET.
When ULPs post Work Requests while QP is in error state,
driver will store the ULP data and then increment the
QP producer s/w index, without ringing doorbell. It then
schedules a worker to invoke the CQ handler since the
interrupts wont be generated from the HW for this request.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Current implementation does not program vlan header insertion
in RoCE packet if no vlan is configured. Firmware does not add
prority when there is no vlan tag in the packet. Modify the code
to insert vlan header when PFC is enabled on the interface.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Logic of retrieving netdev speed from net_device and translating it to
IB speed is implemented in rxe, in usnic and in bnxt drivers.
Define new function which merges all.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Christian Benvenuti <benve@cisco.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RoCEv2 is the preferred RDMA protocol for Ethernet link layer because
of its advantages over RoCEv1. For better user experience make it the
default choice for RDMA_CM connections if device/port support it.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Its better to use __func__ to print functions name instead of writing
the name in the print statement.
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Use DEVICE_ATTR RO() macro and rename the show function accordingly.
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
We can use pfn_to_page() in realmode for other configs. Hence remove the
CONFIG_FLATMEM ifdef.
Fixes: 8e0861fa3c ("powerpc: Prepare to support kernel handling of IOMMU map/unmap")
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Also fix up the #endif comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The MAX17211 monitor a single cell pack. The MAX17215 monitor and
balance a 2S or 3S pack or monitor a multiple-series cell pack.
Both device use 1-Wire interfce.
Signed-off-by: Alex A. Mihaylov <minimumlaw@rambler.ru>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
2673 400 0 3073 c01 power/supply/pcf50633-charger.o
File size After adding 'const':
text data bss dec hex filename
2737 336 0 3073 bed power/supply/pcf50633-charger.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
of_irq_get() may return any negative error number as well as 0 on failure,
while the driver only checks for -EPROBE_DEFER, blithely continuing with
the call to devm_request_irq() -- that function expects *unsigned int*,
so would probably fail anyway when a large IRQ number resulting from a
conversion of a negative error number is passed to it... This, however,
is incorrect behavior -- error number is not IRQ number.
Check for 'irq <= 0' instead and return -ENXIO from probe if of_irq_get()
returned 0.
Fixes: a09209acd6 ("power: supply: act8945a_charger: Add status change update support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
When CONFIG_OMAP_USB2 is set to 'm' and the charger driver is built-in,
we get this link failure:
drivers/power/supply/cpcap-charger.o: In function `cpcap_charger_probe':
cpcap-charger.c:(.text+0x48c): undefined reference to `omap_usb2_set_comparator'
drivers/power/supply/cpcap-charger.o: In function `cpcap_charger_remove':
cpcap-charger.c:(.text+0x774): undefined reference to `omap_usb2_set_comparator'
This adds a dependency to prevent that problem, while still allowing
compile-testing with the OMAP_USB2 driver completely disabled.
Fixes: 0c9888e3c1 ("power: supply: cpcap-charger: Add minimal CPCAP PMIC battery charger")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
The capacity mode bit is bit 15. Currently it is written as
default initialized enum and never shifted. This leads to
a behaviour where the BATTERY_MODE is not correctly
recognized and set again.
This commit initializes the enum accordingly.
Signed-off-by: Michael Heinemann <committed@heine.so>
Tested-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
commit b047e1cce8 ("ASoC: ac97: Support multi-platform AC'97")
modified hac_soc_platform_probe(), but "int ret" was missed.
This patch adds missing "int ret", otherwise, we will get
linux/sound/soc/sh/hac.c: In function 'hac_soc_platform_probe':
linux/sound/soc/sh/hac.c:318: error: 'ret' undeclared (first use in this function)
linux/sound/soc/sh/hac.c:318: error: (Each undeclared identifier is reported only once
linux/sound/soc/sh/hac.c:318: error: for each function it appears in.)
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
MONOVOL Playback Volume is a playback volume control on "MONOVOL MIX".
Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Use memdup_user() helper instead of open-coding to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
All cases initialise rv, and if they didn't that would be a bug. By
dropping the initialisation we give the compiler the chance to catch
those bugs for us.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This is helpful for 'nft monitor' to track which process caused a given
change to the ruleset.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch removes duplicate rcu_read_lock().
1. IPVS part:
According to Julian Anastasov's mention, contexts of ipvs are described
at: http://marc.info/?l=netfilter-devel&m=149562884514072&w=2, in summary:
- packet RX/TX: does not need locks because packets come from hooks.
- sync msg RX: backup server uses RCU locks while registering new
connections.
- ip_vs_ctl.c: configuration get/set, RCU locks needed.
- xt_ipvs.c: It is a netfilter match, running from hook context.
As result, rcu_read_lock and rcu_read_unlock can be removed from:
- ip_vs_core.c: all
- ip_vs_ctl.c:
- only from ip_vs_has_real_service
- ip_vs_ftp.c: all
- ip_vs_proto_sctp.c: all
- ip_vs_proto_tcp.c: all
- ip_vs_proto_udp.c: all
- ip_vs_xmit.c: all (contains only packet processing)
2. Netfilter part:
There are three types of functions that are guaranteed the rcu_read_lock().
First, as result, functions are only called by nf_hook():
- nf_conntrack_broadcast_help(), pptp_expectfn(), set_expected_rtp_rtcp().
- tcpmss_reverse_mtu(), tproxy_laddr4(), tproxy_laddr6().
- match_lookup_rt6(), check_hlist(), hashlimit_mt_common().
- xt_osf_match_packet().
Second, functions that caller already held the rcu_read_lock().
- destroy_conntrack(), ctnetlink_conntrack_event().
- ctnl_timeout_find_get(), nfqnl_nf_hook_drop().
Third, functions that are mixed with type1 and type2.
These functions are called by nf_hook() also these are called by
ordinary functions that already held the rcu_read_lock():
- __ctnetlink_glue_build(), ctnetlink_expect_event().
- ctnetlink_proto_size().
Applied files are below:
- nf_conntrack_broadcast.c, nf_conntrack_core.c, nf_conntrack_netlink.c.
- nf_conntrack_pptp.c, nf_conntrack_sip.c, nfnetlink_cttimeout.c.
- nfnetlink_queue.c, xt_TCPMSS.c, xt_TPROXY.c, xt_addrtype.c.
- xt_connlimit.c, xt_hashlimit.c, xt_osf.c
Detailed calltrace can be found at:
http://marc.info/?l=netfilter-devel&m=149667610710350&w=2
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Use memdup_user_nul() helper instead of open-coding to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
External IRQ0 (index 48) has the same capabilities as the other IRQ1-7
and is handled by the same register IPIC_SEPNR. When this register is
not specified for "ack" in "ipic_info", you cannot configure this IRQ
as IRQ_TYPE_EDGE_FALLING. This oversight was probably due to the
non-contiguous hwirq numbering of IRQ0 in the IPIC.
Signed-off-by: Jurgen Schindele <schindele@nentec.de>
[scottwood: Cleaned up commit message and posted as a proper patch]
Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>