In metadata mode, the vxlan interface is not supposed to use the fdb control
plane but an external one (openvswitch or static routes). With the current
code, packets may leak into the fdb handling code which usually causes them
to be dropped anyway but may have strange side effects.
Just drop the packets directly when in metadata mode if the destination data
are not correctly provided on egress.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Failure of kzalloc should cause the enclosing function
to return -ENOMEM, not -ENODEV.
Additionally, removed the following checkpatch warnings:
ERROR: spaces required around that '==' (ctx:VxV)
ERROR: space required before the open parenthesis '('
CHECK: Comparison to NULL could be written "!lp"
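A minimal userspace sketch of the return-code convention, for
illustration only. calloc() stands in for kzalloc(), and
struct priv_sketch and probe_sketch() are hypothetical names, not the
driver's own:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical private data; the real driver structure differs. */
struct priv_sketch {
	int irq;
};

/*
 * Returns 0 on success and stores the allocation in *out.
 * An allocation failure is reported as -ENOMEM, not -ENODEV:
 * the device may well exist, we simply ran out of memory.
 */
static int probe_sketch(size_t count, struct priv_sketch **out)
{
	struct priv_sketch *lp;

	lp = calloc(count, sizeof(*lp));	/* stands in for kzalloc() */
	if (!lp)
		return -ENOMEM;

	*out = lp;
	return 0;
}
```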
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ether_setup sets IFF_TX_SKB_SHARING but this is not supported by
geneve as it modifies the skb on xmit.
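Roughly, the setup path has to clear the flag again after the generic
Ethernet setup has set it. A userspace sketch under stated assumptions:
the struct, the bit values and the function names below are
hypothetical and not taken from the kernel headers:

```c
#include <stdint.h>

/* Bit values chosen for illustration only. */
#define SKETCH_IFF_TX_SKB_SHARING (1u << 0)
#define SKETCH_IFF_OTHER          (1u << 1)

/* Hypothetical stand-in for the priv_flags field of struct net_device. */
struct netdev_sketch {
	uint32_t priv_flags;
};

/* Stand-in for ether_setup(): turns the sharing flag on by default. */
static void ether_setup_sketch(struct netdev_sketch *dev)
{
	dev->priv_flags |= SKETCH_IFF_TX_SKB_SHARING | SKETCH_IFF_OTHER;
}

/* Tunnel setup: the xmit path modifies the skb, so opt out of sharing. */
static void tunnel_setup_sketch(struct netdev_sketch *dev)
{
	ether_setup_sketch(dev);
	dev->priv_flags &= ~SKETCH_IFF_TX_SKB_SHARING;
}
```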
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ether_setup sets IFF_TX_SKB_SHARING but this is not supported by vxlan
as it modifies the skb on xmit.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calculate the maximum MTU taking into account the size of headers
involved in GENEVE encapsulation, as for other tunnel types.
Changes in v3:
- Correct comment style
Changes in v2:
- Conform more closely to ip_tunnel_change_mtu
- Exclude GENEVE options from max MTU calculation
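The arithmetic can be sketched as follows. The header sizes are the
usual protocol constants, but the way they are combined here, and all
names, are assumptions of this sketch rather than the driver code:

```c
#include <errno.h>

/* Sizes in bytes. */
#define SK_IP_MAX_MTU   65535	/* largest IPv4 packet */
#define SK_IPV4_HLEN    20	/* IPv4 header without options */
#define SK_UDP_HLEN     8
#define SK_GENEVE_HLEN  8	/* base GENEVE header, options excluded */
#define SK_ETH_HLEN     14	/* inner Ethernet header */
#define SK_MIN_MTU      68	/* minimum IPv4 MTU */

/* Largest inner MTU that still fits once every header is added. */
static int geneve_max_mtu_sketch(void)
{
	return SK_IP_MAX_MTU - SK_IPV4_HLEN - SK_UDP_HLEN -
	       SK_GENEVE_HLEN - SK_ETH_HLEN;
}

/* Returns 0 if new_mtu is acceptable, -EINVAL otherwise. */
static int geneve_check_mtu_sketch(int new_mtu)
{
	if (new_mtu < SK_MIN_MTU || new_mtu > geneve_max_mtu_sketch())
		return -EINVAL;
	return 0;
}
```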
Signed-off-by: David Wragg <david@weave.works>
Acked-by: Jesse Gross <jesse@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PVID is toggled off on a port member in a VLAN filtering bridge or
the PVID VLAN is deleted, make the port drop untagged packets. Reverse
the operation when PVID is toggled back on.
Set the PVID back to the default (1), when leaving the bridge so that
untagged traffic will be directed to the CPU.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When VLAN filtering is enabled on a bridge and PVID is deleted from a
bridge port, then untagged frames are not allowed to ingress into the
bridge from this port.
Add the Switch Port Acceptable Frame Types (SPAFT) register, which
configures the frame admittance of the port.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For error handling, dma_alloc_coherent's return value
needs to be checked, not its handle argument.
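A userspace sketch of the pattern, for illustration only.
alloc_coherent_sketch() is a hypothetical stand-in for
dma_alloc_coherent() and only mimics its "returns NULL on failure"
contract:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t dma_addr_sketch_t;

/*
 * Hypothetical allocator: returns the CPU address, or NULL on failure,
 * and writes a fake bus address through 'handle' only on success.
 */
static void *alloc_coherent_sketch(size_t size, dma_addr_sketch_t *handle)
{
	void *cpu = size ? calloc(1, size) : NULL;

	if (cpu)
		*handle = (dma_addr_sketch_t)(uintptr_t)cpu;
	return cpu;
}

static int ring_init_sketch(size_t size, void **ring,
			    dma_addr_sketch_t *ring_dma)
{
	*ring = alloc_coherent_sketch(size, ring_dma);
	if (!*ring)	/* check the return value, not ring_dma */
		return -ENOMEM;
	return 0;
}
```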
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Counting rx packets for every CQE_RX in CQ irq handler is incorrect.
Synchronization is missing when multiple queues are receiving packets
simultaneously. As with the transmit packet stats, use HW stats here.
Also removed unused 'cqe_type' parameter in nicvf_rcv_pkt_handler().
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For secondary Qsets 'hw_tso' is not getting set as probe() returns
much earlier. Fixed it by moving silicon revision check.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an interface is assigned more than 8 queues and the logical interface
is toggled, i.e. down & up, additional queues or qsets are not initialized
as the secondary qset count is being set to zero while tearing down.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the Marvell 88E1510, marvell_of_reg_init was called too late, in the
config_aneg function.
Since commit 113c74d83e ("net: phy: turn carrier off on phy attach"),
this led to the link not coming up at boot anymore, due to the phy
state machine being stuck at waiting for interrupts (off by default on
the 88E1510).
For seven other Marvell PHYs, marvell_of_reg_init was not called at all.
Add a generic marvell_config_init function, which in turn calls
marvell_of_reg_init.
PHYs that already have a specific config_init function with a call to
marvell_of_reg_init are left untouched. The generic marvell_config_init
function is called for all the others, to get consistent behavior across
all Marvell PHYs.
Fixes: 113c74d83e ("net: phy: turn carrier off on phy attach")
Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop reference on the relay_po socket when __pppoe_xmit() succeeds.
This is already handled correctly in the error path.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a VLAN device leaves a bridge its STP state is set to DISABLED,
which causes the hardware to discard any packets coming through the port
with this VLAN.
Fix that by setting STP state to FORWARDING when the device leaves its
bridge and allow traffic to be directed to CPU.
Fixes: 26f0e7fb15 ("mlxsw: spectrum: Add support for VLAN devices bridging")
Reported-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MLXSW_PORT_MAX_PORTS represents the maximum number of local ports, which
is 65 for both ASICs (SwitchX-2 and Spectrum) supported by this driver.
Fixes: 93c1edb27f ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The cpsw-phy-sel driver supports only MII, RMII, and RGMII PHY modes,
and silently handled any other values as if MII was specified. In a
case where the PHY mode was incorrectly specified, or a bug elsewhere,
there would be no indication of a problem. If MII was the correct mode,
then this will go unnoticed, otherwise the symptom will be a failure
to transmit/receive data over the RMII/RGMII link.
Add a dev_warn() to make this condition obvious and provide a
breadcrumb to follow.
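The idea, sketched in userspace for illustration only. fprintf()
stands in for dev_warn(), and the enum, the register encodings and the
warning counter are hypothetical:

```c
#include <stdio.h>

enum phy_mode_sketch {
	SK_MODE_MII,
	SK_MODE_RMII,
	SK_MODE_RGMII,
	SK_MODE_SGMII,	/* an example of a mode the selector cannot handle */
};

/* Hypothetical register encodings. */
#define SK_SEL_MII   0
#define SK_SEL_RMII  1
#define SK_SEL_RGMII 2

static int warnings_sketch;	/* lets a caller observe that a warning fired */

static int phy_sel_sketch(enum phy_mode_sketch mode)
{
	switch (mode) {
	case SK_MODE_MII:
		return SK_SEL_MII;
	case SK_MODE_RMII:
		return SK_SEL_RMII;
	case SK_MODE_RGMII:
		return SK_SEL_RGMII;
	default:
		break;
	}

	/* Keep the old fallback to MII, but no longer silently. */
	fprintf(stderr, "unsupported PHY mode %d, defaulting to MII\n",
		(int)mode);
	warnings_sketch++;
	return SK_SEL_MII;
}
```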
Cc: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
genphy_config_init() masked out pause flags set in phy driver structure.
Pause flags need to be preserved in phydev->supported &
phydev->advertising.
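A sketch of the masking problem, assuming a hypothetical bit layout and
hypothetical names. It only illustrates why the pause bits have to be
part of the mask; it is not the phylib code:

```c
#include <stdint.h>

/* Hypothetical feature bits. */
#define SK_SUP_10_100     (1u << 0)
#define SK_SUP_1000       (1u << 1)
#define SK_SUP_PAUSE      (1u << 2)
#define SK_SUP_ASYM_PAUSE (1u << 3)

/*
 * driver_supported: what the PHY driver structure declared.
 * abilities: link modes probed from the PHY; in this sketch they never
 * contain pause bits, so the pause bits are added to the mask
 * explicitly instead of being masked out.
 */
static uint32_t config_init_sketch(uint32_t driver_supported,
				   uint32_t abilities)
{
	uint32_t features = abilities | SK_SUP_PAUSE | SK_SUP_ASYM_PAUSE;

	return driver_supported & features;
}
```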
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's forbidden to manually change dev->features at run time. Currently, this is
done in the driver to make sure that GSO_UDP_TUNNEL is advertized only when
VXLAN tunnel is set. However, since the stack actually does features intersection
with hw_enc_features, we can safely revert to advertizing features early when
registering the netdevice.
Fixes: f4a1edd561 ('net/mlx4_en: Advertize encapsulation offloads [...]')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
problem description:
The current code sets UAR page size equal to system page size.
The ConnectX-3 and ConnectX-3 Pro HWs require minimum 128 UAR pages.
The mlx4 kernel drivers are not loaded if there are fewer than 128 UAR pages.
solution:
Always set UAR page to 4KB. This allows more UAR pages if the OS
has PAGE_SIZE larger than 4KB. For example, PowerPC kernels use a 64KB
system page size; with a 4MB uar region, there are 4MB/2/64KB = 32
uars (half for uar, half for blueflame). This does not meet the minimum 128
UAR pages requirement. With a 4KB UAR page, there are 4MB/2/4KB = 512 uars,
which meets the minimum requirement.
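The arithmetic above as a small sketch. Function and macro names are
hypothetical; the 4MB region, the half/half split and the 128 minimum
come from the description above:

```c
#include <stdint.h>

#define SK_MIN_UARS 128u

/* Half of the region is used for uar pages, half for blueflame. */
static uint64_t usable_uars_sketch(uint64_t region_bytes,
				   uint64_t uar_page_bytes)
{
	return region_bytes / 2 / uar_page_bytes;
}

/* Non-zero when the region yields enough uars for the HW to load. */
static int meets_min_uars_sketch(uint64_t region_bytes,
				 uint64_t uar_page_bytes)
{
	return usable_uars_sketch(region_bytes, uar_page_bytes) >= SK_MIN_UARS;
}
```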
Note that only the code in mlx4_core that deals with firmware knows that the
uar page size is 4KB. Code that deals with the usr page in cq and qp context
(mlx4_ib, mlx4_en and part of mlx4_core) still assumes
that the uar page size equals the system page size.
Note that with this implementation, on a 64KB system page size kernel, there
are 16 uars per system page but only one uar is used. The other 15
uars are ignored because of the above assumption.
Regarding SR-IOV, mlx4_core in hypervisor will set the uar page size
to 4KB and mlx4_core code in virtual OS will obtain the uar page size from
firmware.
Regarding backward compatibility in SR-IOV, if hypervisor has this new code,
the virtual OS must be updated. If hypervisor has old code, and the virtual
OS has this new code, the new code will be backward compatible with the
old code. If the uar size is big enough, this new code in the VF continues to
work with a 64KB uar page size (on a PowerPC kernel). If the uar size does not
meet the 128 uars requirement, this new code is not loaded in the VF and prints
the same error message as the old code in the hypervisor.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PCI channel could go offline during reset due to EEH. Don't BUG_ON in
this case; the error is recoverable.
Fixes: f6bc11e426 ('net/mlx4_core: Enhance the catas flow to support device reset')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error flow in procedure handle_existing_counter() is wrong.
The procedure should exit after encountering the error, not continue
as if everything is OK.
Fixes: 68230242cd ('net/mlx4_core: Add port attribute when tracking counters')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, the shift value used for time-stamping was constant and didn't
depend on the HW chip frequency. Change that to take the frequency into account
and calculate the maximal value in cycles per wraparound of ten seconds. This
time slot was chosen since it gives a good accuracy in time synchronization.
Algorithm for shift value calculation:
* Round up the maximal value in cycles to nearest power of two
* Calculate maximal multiplier by division of all 64 bits set
to above result
* Then, invert the function clocksource_khz2mult() to get the shift from
maximal mult value
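The three steps above, sketched in userspace. This is an illustration
under stated assumptions, not the driver code: all names are
hypothetical, the frequency is taken in MHz, and the inversion relies
on clocksource_khz2mult() computing mult = (10^6 << shift) / khz:

```c
#include <stdint.h>

#define SK_WRAP_AROUND_SEC 10u

/* Smallest power of two that is >= v (returns 1 for v == 0). */
static uint64_t roundup_pow2_sketch(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/* Floor of log2(v); returns 0 for v == 0. */
static unsigned int ilog2_sketch(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

static unsigned int freq_to_shift_sketch(uint32_t freq_mhz)
{
	uint64_t freq_khz = (uint64_t)freq_mhz * 1000;
	/* cycles counted during one wraparound period of ten seconds */
	uint64_t max_cycles = freq_khz * 1000 * SK_WRAP_AROUND_SEC;
	uint64_t rounded = roundup_pow2_sketch(max_cycles);
	/* largest multiplier whose product with 'rounded' fits in 64 bits */
	uint64_t max_mult = UINT64_MAX / rounded;

	/* invert mult = (10^6 << shift) / khz to recover the shift */
	return ilog2_sketch(max_mult * freq_khz / 1000000);
}
```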
Fixes: ec693d4701 ('net/mlx4_en: Add HW timestamping (TS) support')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RdropOvflw counts overrun of HW buffer, therefore should
be used for rx_fifo_errors only.
Currently the RdropOvflw counter is mistakenly also added to
rx_missed_errors and rx_over_errors, which makes the device's
total dropped packets accounting show wrong results.
Fix that. Use it for rx_fifo_errors only.
Fixes: c27a02cd94 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC')
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell 88E6240 has been tested successfully without further
changes. Add entry to the table of supported devices.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current default tx ring size of 512 causes an extra page to be
allocated for the tx ring with only 1 entry in it. Reduce it to
511. The default rx ring size is also reduced to 511 to use less
memory by default.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tx push is supported for small packets to reduce DMA latency. The
following bugs are fixed in this patch:
1. Fix the definition of the push BD which is different from the DMA BD.
2. The push buffer has to be zero padded to the next 64-bit word boundary
or tx checksum won't be correct.
3. Increase the tx push packet threshold to 164 bytes (192 bytes with the BD)
so that small tunneled packets are within the threshold.
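Item 2 can be sketched as follows. This is a userspace illustration
with hypothetical names, not the driver's push path:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copies len bytes into the push buffer and zero-fills up to the next
 * 64-bit (8-byte) boundary. Returns the padded length, or 0 if the
 * padded data would not fit in cap bytes.
 */
static size_t push_copy_padded_sketch(uint8_t *dst, size_t cap,
				      const uint8_t *src, size_t len)
{
	size_t padded = (len + 7) & ~(size_t)7;

	if (padded > cap)
		return 0;

	memcpy(dst, src, len);
	memset(dst + len, 0, padded - len);	/* stale bytes would skew the checksum */
	return padded;
}
```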
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20G is not supported by production hardware and only the 40GbaseCR4 standard
is supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up bnxt_probe_phy() to cleanly separate 2 code blocks for autoneg
on and off. Autoneg flow control is possible only if autoneg is enabled.
In bnxt_get_settings(), Pause and Asym_Pause are always supported.
Only the advertisement bits change depending on the ethtool -A setting
in auto mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Determine autoneg on|off setting from link_info->autoneg. Using the
firmware returned setting can be misleading if autoneg is changed and
there hasn't been a phy update from the firmware.
2. If autoneg is disabled, link_info->autoneg should be set to 0 to
indicate both speed and flow control autoneg are disabled.
3. To enable autoneg flow control, speed autoneg must be enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EMAC could be disabled while some sk_buffs were still
in use. Those buffers were then lost to Linux.
To reproduce, run on the device while Ethernet traffic is active:
ifconfig eth0 down
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EMAC resets its internal tx ring pointer to zero at startup.
txbd_curr and txbd_dirty can be different from zero.
That causes Ethernet transfers to hang (no packets transmitted).
In order to reproduce, run on device:
ifconfig eth0 down
ifconfig eth0 up
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently our netdevice ops is one static global variable which
is referenced by all mlx5e netdevice instances. This can be
problematic when different driver instances do not share the same
HW capabilities (e.g. SRIOV PF and VFs probed to the host).
Now we have two constant global netdevice ops variables, one
for basic netdevice ops and the other with extended SRIOV ops.
On netdevice construction we choose the one suitable for the
current device capabilities.
Fixes: 66e49dedad ("net/mlx5e: Add support for SR-IOV ndos")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently mlx5e_select_queue is redundant since num_tc is always 1.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is presently a race condition between the bonding periodic
link monitor and the updating of a slave's speed and duplex. The former
occurs on a periodic basis, and the latter in response to a driver's
calling of netif_carrier_on.
It is possible for the periodic monitor to run between the
driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
event that causes bonding to update the slave's speed and duplex. This
manifests most notably as a report that a slave is up and "0 Mbps full
duplex" after enslavement, but in principle could report an incorrect
speed and duplex after any link up event if the device comes up with a
different speed or duplex. This affects the 802.3ad aggregator
selection, as the speed and duplex are selection criteria.
This is fixed by updating the speed and duplex in the periodic
monitor, prior to using that information.
This was done historically in bonding, but the call to
bond_update_speed_duplex was removed in commit 876254ae27 ("bonding:
don't call update_speed_duplex() under spinlocks"), as it might sleep
under lock. Later, the locking was changed to only hold RTNL, and so
after commit 876254ae27 ("bonding: don't call update_speed_duplex()
under spinlocks") this call is again safe.
Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Fixes: 876254ae27 ("bonding: don't call update_speed_duplex() under spinlocks")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The am79c961a.c driver fails to build with clang because of an
unusual inline assembly construct:
drivers/net/ethernet/amd/am79c961a.c:53:7: error: invalid % escape in inline assembly string
"str%?h %1, [%2] @ NET_RAP\n\t"
The same change was done a decade ago in arch/arm as of
6a39dd6222 ("[ARM] 3759/2: Remove uses of %?"), but apparently
some drivers were missed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The smc91x driver doesn't honor the probe deferral mechanism when the
interrupt source is not yet available, such as one provided by a gpio
controller not probed.
Fix this by propagating the platform_get_irq() error code as the probe
return value.
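A userspace sketch of the propagation, for illustration only.
get_irq_sketch() is a hypothetical stand-in for platform_get_irq(), and
SK_EPROBE_DEFER is defined locally because the kernel-internal
EPROBE_DEFER code is not available in userspace <errno.h>:

```c
/* Kernel-internal errno value, defined locally for this sketch. */
#define SK_EPROBE_DEFER 517

/* Stand-in for platform_get_irq(): irq number or negative errno. */
static int get_irq_sketch(int controller_ready)
{
	return controller_ready ? 42 : -SK_EPROBE_DEFER;
}

/* Returns 0 and fills *irq_out, or the error from the irq lookup. */
static int probe_irq_sketch(int controller_ready, int *irq_out)
{
	int irq = get_irq_sketch(controller_ready);

	if (irq < 0)
		return irq;	/* propagate, so the probe can be retried later */

	*irq_out = irq;
	return 0;
}
```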
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the two wildcard entries: they serve no purpose and will match way too
many devices, some of them being covered by the driver in
drivers/net/phy/broadcom.c. Remove the now unused bcm7xxx_dummy_config_init()
function, which would otherwise produce a warning.
Fixes: b560a58c45 ("net: phy: add Broadcom BCM7xxx internal PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we were wrongly advertising gigabit features for these 10/100 only
Ethernet PHYs, bcm7xxx_config_init(), which is supposed to apply the workaround,
would not have run since the check would be true. Now that we have fixed the
PHY features, remove that check since there is no reason for it anymore.
Fixes: e18556ee3b ("net: phy: bcm7xxx: do not use PHY_BRCM_100MBPS_WAR")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY entries for BCM7425/29/35 declare the 40nm Ethernet PHY as being
10/100/1000 capable, while this is just a 10/100 capable PHY device. Fix that.
Fixes: d068b02cfd ("net: phy: add BCM7425 and BCM7429 PHYs")
Fixes: 9458ceab49 ("net: phy: bcm7xxx: Add entry for BCM7435")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The clear and set masks in the call to phy_set_clr_bits() called from
bcm7xxx_config_init() are inverted. We need to fix this by swapping the two
arguments, that is, set 0 bits, but clear the shadow mode 2 enable bit.
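The effect of swapping the masks, sketched on a plain register value.
The helper below is a hypothetical model with a (value, set, clear)
argument order, and the bit position of the mode 2 enable bit is made
up for the example:

```c
#include <stdint.h>

#define SK_MODE_2_ENABLE (1u << 2)	/* hypothetical bit position */

/* Clears the bits in clr_mask, then sets the bits in set_mask. */
static uint16_t set_clr_bits_sketch(uint16_t val, uint16_t set_mask,
				    uint16_t clr_mask)
{
	val &= (uint16_t)~clr_mask;
	val |= set_mask;
	return val;
}

/* Inverted masks (the bug): the enable bit ends up set. */
static uint16_t mode2_buggy_sketch(uint16_t val)
{
	return set_clr_bits_sketch(val, SK_MODE_2_ENABLE, 0);
}

/* Correct order: set nothing, clear the enable bit. */
static uint16_t mode2_fixed_sketch(uint16_t val)
{
	return set_clr_bits_sketch(val, 0, SK_MODE_2_ENABLE);
}
```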
Fixes: b560a58c45 ("net: phy: add Broadcom BCM7xxx internal PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When adding support for the R-Car gen3 gPTP active in configuration mode,
some call sites of ravb_ptp_{init|stop}() were missed due to an oversight.
Add checks for the R-Car gen2 SoCs around these...
Fixes: f5d7837f96 ("ravb: ptp: Add CONFIG mode support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When adding support for the R-Car gen3 gPTP active in configuration mode,
the code setting the CCC.CSEL field was duplicated due to an oversight.
For R-Car gen2 it's just redundant and for R-Car gen3 the write at this
time is probably ignored due to CCC.GAC bit being already set...
Fixes: f5d7837f96 ("ravb: ptp: Add CONFIG mode support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My analysis in the below mail applies, although the second part is
unnecessary because i isn't used in arithmetic operations here:
https://marc.info/?l=openbsd-tech&m=145377854103866&w=2
Thanks for your time.
Signed-off-by: Michael McConville <mmcco@mykolab.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BRIDGE_VLAN_FILTERING automatically adds a newly bridged port to the
VLAN with the bridge's default_pvid.
The mv88e6xxx driver currently reserves VLANs 4000+ for unbridged ports
isolation. When a port joins a bridge, it leaves its reserved VLAN. When
a port leaves a bridge, it rejoins its reserved VLAN.
But if the VLAN filtering is disabled, or if this hardware VLAN is
already in use, the bridged port ends up with no default VLAN, and the
communication with the CPU is thus broken.
To fix this, make a port join its reserved VLAN once on setup, never
leave it, and restore its PVID after another one was eventually used.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
smatch detected a suspicious looking bitop condition:
drivers/net/ethernet/cavium/liquidio/lio_main.c:2529
handle_timestamp() warn: suspicious bitop condition
(skb_shinfo(skb)->tx_flags | SKBTX_IN_PROGRESS) is always non-zero,
so the logic is definitely not correct. Use & to mask the correct
bit.
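The difference between the two operators, as a small sketch. The flag
value and the function names are hypothetical:

```c
#include <stdint.h>

#define SK_TX_IN_PROGRESS (1u << 2)	/* hypothetical flag value */

/* OR-ing in a non-zero constant can never yield zero: always true. */
static int in_progress_buggy_sketch(uint8_t tx_flags)
{
	return (tx_flags | SK_TX_IN_PROGRESS) != 0;
}

/* AND masks out everything but the bit of interest. */
static int in_progress_fixed_sketch(uint8_t tx_flags)
{
	return (tx_flags & SK_TX_IN_PROGRESS) != 0;
}
```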
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>