Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-08-18
This series contains updates to igb, e100, e1000e and ixgbe.
Shota Suzuki provides a fix for a possible overflow in
igb_set_interrupt_capability() which leads to an oops. When changing the
number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same
manner as when initializing the igb driver.
Vasily Averin provides a fix for a missing rtnl_unlock() for when we
error out due to not being able to allocate memory for our queues.
Stefan Assman provides a couple of fixes for igb/igbvf. First changes
the igb driver in probe to simply call igb_enable_sriov() instead of
igb_sriov_reinit() since we are starting from scratch. Then in igbvf,
fix the driver where it does not clear the buffer_info->dma in all
cases after calling dma_unmap_single(), which was found by changing the
MTU twice.
Richard Cochran implements the periodic output function using the
programmable clock outputs available in i210 when possible, falling
back to the target time for longer periods.
Todd adds support for the Marvell PHY 1512 which is required for i354
devices. Then updates igb to make sure SR-IOV init uses the correct
number of queues, since recent changes could result in the PF holding
onto all of the queues.
Alex Williamson provides a fix in the case where a guest OS does not
support hot-unplug, so disable SR-IOV prior to unregister_netdev() to
avoid the problem.
Jia-Ju Bai provides several patches, first knocks some collecting dust
off an old e100 driver to add a check to avoid a null pointer
dereference. Then cleans up a possible resource leak by releasing the
skb buffer allocated when the e100_xmit_prepare() runs into an issue
in the DMA mapping. In igb, add a missing rtnl_unlock() for when we
error out due to igb_sriov_reinit() in the igb_init_interrupt_scheme().
Provides a e1000e fix, based on suggestions from Alex Duyck to move
head/tail register writing to e1000_configure_tx/rx() to avoid a
possible null pointer dereference (similar to igb driver). Lastly,
fix a possible memory leak in igb_probe(), where the memory shadow_vfta
allocated by kcalloc in igb_sw_init() is not freed.
Mark simplifies port-specific macros for ixgbe by eliminating explicit
comparisons with 0 and enclose formal parameters in parens to eliminate
the risk of an operator precedence issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
You can't use kstrtoul() with an int or it causes memory corruption.
Also j should be unsigned or we have underflow bugs.
I considered changing "j" to unsigned long but everything fits in a u32.
Fixes: 8e3d04fd7d ('cxgb4: Add MPS tracing support')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/built-in.o: In function `.vnic_wq_devcmd2_alloc':
(.text+0x49fe40): multiple definition of `.vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.text+0xb4318): first defined here
drivers/net/built-in.o:(.opd+0x2af00): multiple definition of `vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.opd+0xad70): first defined here
drivers/net/built-in.o: In function `.vnic_wq_init_start':
(.text+0x49f9c0): multiple definition of `.vnic_wq_init_start'
drivers/scsi/built-in.o:(.text+0xb3b58): first defined here
drivers/net/built-in.o:(.opd+0x2ae88): multiple definition of `vnic_wq_init_start'
drivers/scsi/built-in.o:(.opd+0xace0): first defined here
Rename these to 'enic_*' to avoid the conflict with the functiosn of
the same name in the snic scsi driver.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest FW submission added some vxlan offload capabilities to our device.
This patch adds the ability to connect to the vxlan NDOs and configure
the UDP port associated with it in the HW.
The device would now be capable of performing RSS according to the
inner headers of the vxlan packets.
Signed-off-by: Rajesh Borundia <Rajesh.Borundia@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simplify port-specific macros by eliminating explicit comparison
with 0. More importantly, enclose formal parameter in parens to
eliminate the risk of an operator precedence surprise.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Recent changes to igb_probe_vfs() could lead to the PF holding onto all
of the queues. Reorder igb_probe_vfs() to be before
gb_init_queue_configuration() and add some more error checking.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In error handling code of igb_probe, the memory adapter->shadow_vfta
allocated by kcalloc in igb_sw_init is not freed. So when register_netdev
or igb_init_i2c is failed, a memory leak will occur.
This patch adds kfree to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When e1000e_setup_rx_resources is failed in e1000_open,
e1000e_free_tx_resources in "err_setup_rx" segment is executed.
"writel(0, tx_ring->head)" statement in e1000_clean_tx_ring
in e1000e_free_tx_resources will cause a null poonter dereference(crash),
because "tx_ring->head" is only assigned in e1000_configure_tx
in e1000_configure, but it is after e1000e_setup_rx_resources.
This patch moves head/tail register writing to e1000_configure_tx/rx,
which can fix this problem. It is inspired by igb_configure_tx_ring
in the igb driver.
Specially, thank Alexander Duyck for his valuable suggestion.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock
acquired by rtnl_lock() is not released, which causes a deadlock.
This patch adds rtnl_unlock() in error handling to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When pci_dma_mapping_error in e100_xmit_prepare is failed, the skb buffer
allocated by netdev_alloc_skb_ip_align in e100_rx_alloc_skb is not
released, which causes a possible resource leak.
This patch adds error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver lacks the check of nic->cbs_pool after pci_pool_create
in e100_probe. When this function is failed, a null pointer dereference
occurs when pci_pool_alloc uses nic->cbs_pool in e100_alloc_cbs.
This patch adds a check and related error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci. In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM. Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown. Disabling SR-IOV prior to unregister_netdev() avoids this
issue.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for Marvell PHY 1512 (required for I354).
Submitted by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In addition to interrupt driven target time output events, the i210
also has two programmable clock outputs. These clocks support periods
between 16 nanoseconds and 140 milliseconds. This patch implements
the periodic output function using the clock outputs when possible,
falling back to the target time for longer periods.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During driver probing the following code path is triggered.
igb_probe
->igb_sw_init
->igb_probe_vfs
->igb_pci_enable_sriov
->igb_sriov_reinit
Doing the SR-IOV re-init is not necessary during probing since we're
starting from scratch. Here we can call igb_enable_sriov() right away.
Running igb_sriov_reinit() during igb_probe() also seems to cause
occasional packet loss on some onboard 82576 NICs. Reproduced on
Dell and HP servers with onboard 82576 NICs.
Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
set if adapter->rss_queues exceeds half of max_rss_queues in
igb_init_queue_configuration().
On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
queues exceeds half of max_combined in igb_set_channels() when changing
the number of queues by "ethtool -L".
In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
of adapter->msix_entries[], an overflow can occur in
igb_set_interrupt_capability(), which in turn leads to an oops.
Fix this problem as follows:
- When changing the number of queues by "ethtool -L", set
IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
- When increasing the size of q_vector, reallocate it appropriately.
(With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)
Another possible way to fix this problem is to cap the queues at its
initial number, which is the number of the initial online cpus. But this
is not the optimal way because we cannot increase queues when another
cpu becomes online.
Note that before commit cd14ef54d2 ("igb: Change to use statically
allocated array for MSIx entries"), this problem did not cause oops
but just made the number of queues become 1 because of entering msi_only
mode in igb_set_interrupt_capability().
Fixes: 907b783579 ("igb: Add ethtool support to configure number of channels")
CC: stable <stable@vger.kernel.org>
Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: sparse: incorrect type in assignment (different address spaces)
drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: expected void *res
drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: got void [noderef] <asn:2>*
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: sparse: incorrect type in argument 1 (different base types)
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: expected restricted __sum16 [usertype] n
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: got restricted __be16 [usertype] check_sum
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only for packets with first ethertype set to IPv4/6 for now.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only rx/tx pause settings.
Autoneg setting is currently not supported.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Port speed settings are applied by the device only upon
port admin status transition from DOWN to UP.
So we enforce this transition regardless of the port's
current operation state (which may be occasionally DOWN if
for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status)
register set/query access functions.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Change the maximum LRO session size from 16KB to 64KB
- Reduce the LRO session timeout from 512us to 32us in
order to reduce the TCP latency of non-LRO'ed flows.
- Fix skb_shinfo(skb)->gso_size and set skb_shinfo(skb)->gso_type.
- Fix a bug accessing un-initialized mdev pointer.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We un-intentionally limited the minimum rings size too much.
TX minimum ring size reduced from 128 to 64.
RX minimum ring size reduced from 128 to 2.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirection table size was defined by a variable that
was actually assigned a constant value.
Since we do not have any forseen intension to make it configurable
we simply made it a constant.
We also limit the number of channels such that the RSS indirection
table could always populate all RX rings.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to generate a unique key per TIR.
Generating a single key per netdev and copying it to all
its TIRs.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devcmd is an interface for driver to communicate with fw/adaptor. It
involves writing data to hardware registers and waiting for the result.
This mechanism does not scale well. The queuing of "no wait" devcmds is
done in firmware memory rather than on the host. Firmware memory is a
rather more scarce and valuable resource than host memory. A devcmd storm
from one vf can disrupt the service on other pf/vf. The lack of flow
control allows for possible denial of server from one VM to another.
Devcmd2 uses work queue to post the devcmds, just like tx work queue. This
allows better flow control.
Initialize devcmd2, if fails we fall back to devcmd1.
Also change the driver version.
Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devcmd resources to vnic_res_type. Add data types used by devcmd.
Signed-off-by: N V V Satyanarayana Reddy <nalreddy@cisco.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pr_info does not give any details about the interface involved. This patch
uses netdev_info for printing the message. Use dev_info where netdev is not
ready.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the structure definitions are in .c file to make them private to
that file. This patch moves the struct definition to .h file, So that their
definitions are accessible from other files.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0b50dc4fc9 ("Convert smsc911x to use ACPI as well as DT") makes
the call to smsc911x_probe_config() unconditional, and no longer fails if
there is no device node. device_get_phy_mode() is called unconditionally,
and if there is no phy node configured returns an error code. This error
code is assigned to phy_interface, and interpreted elsewhere in the code
as valid phy mode. This in turn causes qemu to crash when running a
variant of realview_pb_defconfig.
qemu: hardware error: lan9118_read: Bad reg 0x86
Fixes: 0b50dc4fc9 ("Convert smsc911x to use ACPI as well as DT")
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware tells driver in case bandwidth configuration for
a specific function exists, but [regretably] the same field has different
meanings depending on the multi-function mode - it can either be
a percentile value or an actual speed.
For newer multi-function modes current logic is incorrect -
driver understands values as actual speeds instead of percentages,
causing the resulting chip configuration to be incorrect.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle TRACE_PKT, stack can sniff them on the first port
Add debubfs enrty to configure tracing for offload traffic like iWARP
& iSCSI for debugging purpose.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rocker driver tracks arp_tbl neighs to resolve IPv4 route nexthops. The
driver uses NETEVENT_NEIGH_UPDATE for neigh adds and updates, but there is
no event when the neigh is removed from the device (such as when the device
goes admin down). This patches hooks ndo_neigh_destroy so the driver can
know when a neigh is removed from the device. In response, the driver will
purge the neigh entry from its internal tbl.
I didn't find an in-tree users of ndo_neigh_destroy, so I'm not sure if
this ndo is vestigial or if there are out-of-tree users. In any case, it
does what I need here. An alternative design would be to generate
NETEVENT_NEIGH_UPDATE event when neigh is being destroyed, setting state to
NUD_NONE so driver knows neigh entry is dead.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On sucessful probe, driver prints the switch ID. This patch changes the
format of the printed ID to match what's used in sysfs phys_switch_id node.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ACPI bindings for the smsc911x driver. Convert the DT specific calls
to nonspecific device* calls, This allows the driver to work
with both ACPI and DT configurations. Ethernet should now work when using
ACPI on ARM Juno.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Am335x Errata [1] Advisory 1.0.9, The CPSW C0_TX_PEND and
C0_RX_PEND interrupt outputs provide a single transmit interrupt
that combines transmit channel interrupts TXPEND[7:0] and a
single receive interrupt that combines receive channel interrupts
RXPEND[7:0]. The TXPEND[0] and RXPEND[0] interrupt outputs are
connected to the ARM Cortex-A8 interrupt controller (INTC) rather
than the C0_TX_PEND and C0_RX_PEND interrupt outputs. So even
though CPSW interrupt is cleared by writing appropriate values to
EOI register the interrupt is not cleared in IRQ controller. So
interrupt is still pending and CPU is struck in ISR, the
workaround is to disable the interrupts in ARM irq controller.
[1] http://www.ti.com/lit/er/sprz360f/sprz360f.pdf
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/cavium/Kconfig
The cavium conflict was overlapping dependency
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to use the IS_ERR_VALUE() macro for checking
the return value from pm_runtime_* functions.
Just do a simple negative test instead.
The semantic patch that makes this change is available
in scripts/coccinelle/api/pm_runtime.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add debugfs support to dump tid info like stid, sftid, tids, atid and
hwtids
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For T4 adapter, offloaded servers tid for IPv4 connections are
allocated from filter region. So add a new field for server filter tid if
server tid is allocated from filter region.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the tid info, differentiate from which region the TID is allocated
from. It can be from TCAM region or HASH region.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding more details to sge qinfo for debugging purpose.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This BQL implementation is mostly derived from its related driver, alx.
Tested on AR8131 (rev c0) [1969:1063]. Saturated a 100mbps link with 5
concurrent runs of netperf. Ping latency dropped from 14ms to 3ms.
Signed-off-by: Ron Angeles <ronangeles@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current filer rule optimization is broken in several ways:
(1) Can perform reads/writes beyond end of allocated tables.
(gianfar_ethtool.c:1326).
(2) It breaks badly for rules with more than 2 specifiers
(e.g. matching ip, port, tos).
Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
04: TOS == 00000003 AND Q:00 ctrl:0000008a prop:00000003
05: DIA == 0a000003 AND Q:11 ctrl:0000448c prop:0a000003
06: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
07: TOS == 00000002 AND Q:00 ctrl:0000008a prop:00000002
08: DIA == 0a000002 AND Q:09 ctrl:0000248c prop:0a000002
09: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
0a: DPT == 00000001 AND Q:00 ctrl:0000008e prop:00000001
0b: TOS == 00000001 CLE Q:01 ctrl:0000060a prop:00000001
ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000
(Entire cluster gets AND-ed together).
(3) We observed that the masking rules it generates do not
play well with clustering on P2020. Only first rule
of the cluster would ever fire. Given that optimizer
relies heavily on masking this is very hard to fix.
Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
04: DIA == 0a000003 Q:11 ctrl:0000440c prop:0a000003
05: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
06: DIA == 0a000002 Q:09 ctrl:0000240c prop:0a000002
07: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
08: DPT == 00000001 CLE Q:01 ctrl:0000060e prop:00000001
ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000
Which looks correct according to the spec but only the first
(eth id 252)/last added rule for 10.0.0.3 will ever trigger.
As if filer did not treat the AND CLE as cluster start but
also kept AND-ing the rules. We found no errata covering this.
The fact that nobody noticed (2) or (3) makes me think
that this feature is not very widely used and we should just
remove it.
Reported-by: Aleksander Dutkowski <adutkowski@gmail.com>
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At a cost of one line let's make sure .count is correct
when calling gfar_process_filer_changes().
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAX_FILER_IDX is the last usable index. Using less-than
will already guarantee that one entry for catch-all rule
will be left, no need to subtract 1 here.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>