Merge tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Mostly driver fixes.

  Current release - regressions:

   - Revert "net: Add a second bind table hashed by port and address",
     needs more work

   - amd-xgbe: use platform_irq_count(), static setup of IRQ resources
     had been removed from DT core

   - dts: at91: ksz9477_evb: add phy-mode to fix port/phy validation

  Current release - new code bugs:

   - hns3: modify the ring param print info

  Previous releases - always broken:

   - axienet: make the 64b addressable DMA depend on 64b architectures

   - iavf: fix issue with MAC address of VF shown as zero

   - ice: fix PTP TX timestamp offset calculation

   - usb: ax88179_178a needs FLAG_SEND_ZLP

  Misc:

   - document some net.sctp.* sysctls"

* tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
  net: axienet: add missing error return code in axienet_probe()
  Revert "net: Add a second bind table hashed by port and address"
  net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg
  net: usb: ax88179_178a needs FLAG_SEND_ZLP
  MAINTAINERS: add include/dt-bindings/net to NETWORKING DRIVERS
  ARM: dts: at91: ksz9477_evb: fix port/phy validation
  net: bgmac: Fix an erroneous kfree() in bgmac_remove()
  ice: Fix memory corruption in VF driver
  ice: Fix queue config fail handling
  ice: Sync VLAN filtering features for DVM
  ice: Fix PTP TX timestamp offset calculation
  mlxsw: spectrum_cnt: Reorder counter pools
  docs: networking: phy: Fix a typo
  amd-xgbe: Use platform_irq_count()
  octeontx2-vf: Add support for adaptive interrupt coalescing
  xilinx:  Fix build on x86.
  net: axienet: Use iowrite64 to write all 64b descriptor pointers
  net: axienet: make the 64b addresable DMA depends on 64b archectures
  net: hns3: fix tm port shapping of fibre port is incorrect after driver initialization
  net: hns3: fix PF rss size initialization bug
  ...
Linus Torvalds 2022-06-16 11:51:32 -07:00
commit 48a23ec6ff
36 changed files with 431 additions and 750 deletions

@ -2925,6 +2925,43 @@ plpmtud_probe_interval - INTEGER
Default: 0
reconf_enable - BOOLEAN
Enable or disable extension of Stream Reconfiguration functionality
specified in RFC6525. This extension provides the ability to "reset"
a stream, and it includes the Parameters of "Outgoing/Incoming SSN
Reset", "SSN/TSN Reset" and "Add Outgoing/Incoming Streams".
- 1: Enable extension.
- 0: Disable extension.
Default: 0
intl_enable - BOOLEAN
Enable or disable extension of User Message Interleaving functionality
specified in RFC8260. This extension allows the interleaving of user
messages sent on different streams. With this feature enabled, I-DATA
chunks will replace DATA chunks to carry user messages if also supported
by the peer. Note that to use this feature, one needs to set this option
to 1 and also needs to set socket options SCTP_FRAGMENT_INTERLEAVE to 2
and SCTP_INTERLEAVING_SUPPORTED to 1.
- 1: Enable extension.
- 0: Disable extension.
Default: 0
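
(For illustration only, not part of the patch: a minimal sketch of the
socket-side setup described above, assuming a one-to-one SCTP socket,
lksctp headers that define these options, and net.sctp.intl_enable
already set to 1 via sysctl.)

#include <netinet/sctp.h>

/* Hypothetical helper: opt an SCTP socket in to user message
 * interleaving, per the intl_enable description above.
 */
static int sctp_enable_interleaving(int sd)
{
	int frag_interleave = 2;	/* SCTP_FRAGMENT_INTERLEAVE level 2 */
	struct sctp_assoc_value intl = {
		.assoc_id = 0,		/* 0: apply to future associations */
		.assoc_value = 1,	/* enable interleaving */
	};

	if (setsockopt(sd, IPPROTO_SCTP, SCTP_FRAGMENT_INTERLEAVE,
		       &frag_interleave, sizeof(frag_interleave)) < 0)
		return -1;
	return setsockopt(sd, IPPROTO_SCTP, SCTP_INTERLEAVING_SUPPORTED,
			  &intl, sizeof(intl));
}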
ecn_enable - BOOLEAN
Control use of Explicit Congestion Notification (ECN) by SCTP.
Like in TCP, ECN is used only when both ends of the SCTP connection
indicate support for it. This feature is useful in avoiding losses
due to congestion by allowing supporting routers to signal congestion
before having to drop packets.
- 1: Enable ECN.
- 0: Disable ECN.
Default: 1
``/proc/sys/net/core/*``
========================

@ -104,7 +104,7 @@ Whenever possible, use the PHY side RGMII delay for these reasons:
* PHY device drivers in PHYLIB being reusable by nature, being able to
configure correctly a specified delay enables more designs with similar delay
requirements to be operate correctly
requirements to be operated correctly
For cases where the PHY is not capable of providing this delay, but the
Ethernet MAC driver is capable of doing so, the correct phy_interface_t value
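
(The hunk above is cut off mid-sentence; for reference, the standard
phylib convention for these values is sketched below — general PHY
library behavior, not something this patch changes.)

/* RGMII phy_interface_t values and who inserts the delay:
 *   PHY_INTERFACE_MODE_RGMII      - no internal delays; PCB traces or
 *                                   the MAC must provide them
 *   PHY_INTERFACE_MODE_RGMII_ID   - PHY adds both RX and TX delays
 *   PHY_INTERFACE_MODE_RGMII_RXID - PHY adds the RX delay only
 *   PHY_INTERFACE_MODE_RGMII_TXID - PHY adds the TX delay only
 */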

@ -13800,6 +13800,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
F: Documentation/devicetree/bindings/net/
F: drivers/connector/
F: drivers/net/
F: include/dt-bindings/net/
F: include/linux/etherdevice.h
F: include/linux/fcdevice.h
F: include/linux/fddidevice.h

@ -120,26 +120,31 @@
port@0 {
reg = <0>;
label = "lan1";
phy-mode = "internal";
};
port@1 {
reg = <1>;
label = "lan2";
phy-mode = "internal";
};
port@2 {
reg = <2>;
label = "lan3";
phy-mode = "internal";
};
port@3 {
reg = <3>;
label = "lan4";
phy-mode = "internal";
};
port@4 {
reg = <4>;
label = "lan5";
phy-mode = "internal";
};
port@5 {

@ -338,7 +338,7 @@ static int xgbe_platform_probe(struct platform_device *pdev)
* the PHY resources listed last
*/
phy_memnum = xgbe_resource_count(pdev, IORESOURCE_MEM) - 3;
phy_irqnum = xgbe_resource_count(pdev, IORESOURCE_IRQ) - 1;
phy_irqnum = platform_irq_count(pdev) - 1;
dma_irqnum = 1;
dma_irqend = phy_irqnum;
} else {
@ -348,7 +348,7 @@ static int xgbe_platform_probe(struct platform_device *pdev)
phy_memnum = 0;
phy_irqnum = 0;
dma_irqnum = 1;
dma_irqend = xgbe_resource_count(pdev, IORESOURCE_IRQ);
dma_irqend = platform_irq_count(pdev);
}
/* Obtain the mmio areas for the device */

@ -332,7 +332,6 @@ static void bgmac_remove(struct bcma_device *core)
bcma_mdio_mii_unregister(bgmac->mii_bus);
bgmac_enet_remove(bgmac);
bcma_set_drvdata(core, NULL);
kfree(bgmac);
}
static struct bcma_driver bgmac_bcma_driver = {

@ -769,6 +769,7 @@ struct hnae3_tc_info {
u8 prio_tc[HNAE3_MAX_USER_PRIO]; /* TC indexed by prio */
u16 tqp_count[HNAE3_MAX_TC];
u16 tqp_offset[HNAE3_MAX_TC];
u8 max_tc; /* Total number of TCs */
u8 num_tc; /* Total number of enabled TCs */
bool mqprio_active;
};

@ -1129,7 +1129,7 @@ hns3_is_ringparam_changed(struct net_device *ndev,
if (old_ringparam->tx_desc_num == new_ringparam->tx_desc_num &&
old_ringparam->rx_desc_num == new_ringparam->rx_desc_num &&
old_ringparam->rx_buf_len == new_ringparam->rx_buf_len) {
netdev_info(ndev, "ringparam not changed\n");
netdev_info(ndev, "descriptor number and rx buffer length not changed\n");
return false;
}

@ -3268,7 +3268,7 @@ static int hclge_tp_port_init(struct hclge_dev *hdev)
static int hclge_update_port_info(struct hclge_dev *hdev)
{
struct hclge_mac *mac = &hdev->hw.mac;
int speed = HCLGE_MAC_SPEED_UNKNOWN;
int speed;
int ret;
/* get the port info from SFP cmd if not copper port */
@ -3279,10 +3279,13 @@ static int hclge_update_port_info(struct hclge_dev *hdev)
if (!hdev->support_sfp_query)
return 0;
if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V2)
if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V2) {
speed = mac->speed;
ret = hclge_get_sfp_info(hdev, mac);
else
} else {
speed = HCLGE_MAC_SPEED_UNKNOWN;
ret = hclge_get_sfp_speed(hdev, &speed);
}
if (ret == -EOPNOTSUPP) {
hdev->support_sfp_query = false;
@ -3294,6 +3297,8 @@ static int hclge_update_port_info(struct hclge_dev *hdev)
if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V2) {
if (mac->speed_type == QUERY_ACTIVE_SPEED) {
hclge_update_port_capability(hdev, mac);
if (mac->speed != speed)
(void)hclge_tm_port_shaper_cfg(hdev);
return 0;
}
return hclge_cfg_mac_speed_dup(hdev, mac->speed,
@ -3376,6 +3381,12 @@ static int hclge_set_vf_link_state(struct hnae3_handle *handle, int vf,
link_state_old = vport->vf_info.link_state;
vport->vf_info.link_state = link_state;
/* return success directly if the VF is unalive, VF will
* query link state itself when it starts work.
*/
if (!test_bit(HCLGE_VPORT_STATE_ALIVE, &vport->state))
return 0;
ret = hclge_push_vf_link_status(vport);
if (ret) {
vport->vf_info.link_state = link_state_old;
@ -10117,6 +10128,7 @@ static int hclge_modify_port_base_vlan_tag(struct hclge_vport *vport,
if (ret)
return ret;
vport->port_base_vlan_cfg.tbl_sta = false;
/* remove old VLAN tag */
if (old_info->vlan_tag == 0)
ret = hclge_set_vf_vlan_common(hdev, vport->vport_id,

@ -282,8 +282,8 @@ static int hclge_tm_pg_to_pri_map_cfg(struct hclge_dev *hdev,
return hclge_cmd_send(&hdev->hw, &desc, 1);
}
static int hclge_tm_qs_to_pri_map_cfg(struct hclge_dev *hdev,
u16 qs_id, u8 pri)
static int hclge_tm_qs_to_pri_map_cfg(struct hclge_dev *hdev, u16 qs_id, u8 pri,
bool link_vld)
{
struct hclge_qs_to_pri_link_cmd *map;
struct hclge_desc desc;
@ -294,7 +294,7 @@ static int hclge_tm_qs_to_pri_map_cfg(struct hclge_dev *hdev,
map->qs_id = cpu_to_le16(qs_id);
map->priority = pri;
map->link_vld = HCLGE_TM_QS_PRI_LINK_VLD_MSK;
map->link_vld = link_vld ? HCLGE_TM_QS_PRI_LINK_VLD_MSK : 0;
return hclge_cmd_send(&hdev->hw, &desc, 1);
}
@ -420,7 +420,7 @@ static int hclge_tm_pg_shapping_cfg(struct hclge_dev *hdev,
return hclge_cmd_send(&hdev->hw, &desc, 1);
}
static int hclge_tm_port_shaper_cfg(struct hclge_dev *hdev)
int hclge_tm_port_shaper_cfg(struct hclge_dev *hdev)
{
struct hclge_port_shapping_cmd *shap_cfg_cmd;
struct hclge_shaper_ir_para ir_para;
@ -642,11 +642,13 @@ static void hclge_tm_update_kinfo_rss_size(struct hclge_vport *vport)
* one tc for VF for simplicity. VF's vport_id is non zero.
*/
if (vport->vport_id) {
kinfo->tc_info.max_tc = 1;
kinfo->tc_info.num_tc = 1;
vport->qs_offset = HNAE3_MAX_TC +
vport->vport_id - HCLGE_VF_VPORT_START_NUM;
vport_max_rss_size = hdev->vf_rss_size_max;
} else {
kinfo->tc_info.max_tc = hdev->tc_max;
kinfo->tc_info.num_tc =
min_t(u16, vport->alloc_tqps, hdev->tm_info.num_tc);
vport->qs_offset = 0;
@ -679,7 +681,9 @@ static void hclge_tm_vport_tc_info_update(struct hclge_vport *vport)
kinfo->num_tqps = hclge_vport_get_tqp_num(vport);
vport->dwrr = 100; /* 100 percent as init */
vport->bw_limit = hdev->tm_info.pg_info[0].bw_limit;
hdev->rss_cfg.rss_size = kinfo->rss_size;
if (vport->vport_id == PF_VPORT_ID)
hdev->rss_cfg.rss_size = kinfo->rss_size;
/* when enable mqprio, the tc_info has been updated. */
if (kinfo->tc_info.mqprio_active)
@ -714,14 +718,22 @@ static void hclge_tm_vport_info_update(struct hclge_dev *hdev)
static void hclge_tm_tc_info_init(struct hclge_dev *hdev)
{
u8 i;
u8 i, tc_sch_mode;
u32 bw_limit;
for (i = 0; i < hdev->tc_max; i++) {
if (i < hdev->tm_info.num_tc) {
tc_sch_mode = HCLGE_SCH_MODE_DWRR;
bw_limit = hdev->tm_info.pg_info[0].bw_limit;
} else {
tc_sch_mode = HCLGE_SCH_MODE_SP;
bw_limit = 0;
}
for (i = 0; i < hdev->tm_info.num_tc; i++) {
hdev->tm_info.tc_info[i].tc_id = i;
hdev->tm_info.tc_info[i].tc_sch_mode = HCLGE_SCH_MODE_DWRR;
hdev->tm_info.tc_info[i].tc_sch_mode = tc_sch_mode;
hdev->tm_info.tc_info[i].pgid = 0;
hdev->tm_info.tc_info[i].bw_limit =
hdev->tm_info.pg_info[0].bw_limit;
hdev->tm_info.tc_info[i].bw_limit = bw_limit;
}
for (i = 0; i < HNAE3_MAX_USER_PRIO; i++)
@ -926,10 +938,13 @@ static int hclge_tm_pri_q_qs_cfg_tc_base(struct hclge_dev *hdev)
for (k = 0; k < hdev->num_alloc_vport; k++) {
struct hnae3_knic_private_info *kinfo = &vport[k].nic.kinfo;
for (i = 0; i < kinfo->tc_info.num_tc; i++) {
for (i = 0; i < kinfo->tc_info.max_tc; i++) {
u8 pri = i < kinfo->tc_info.num_tc ? i : 0;
bool link_vld = i < kinfo->tc_info.num_tc;
ret = hclge_tm_qs_to_pri_map_cfg(hdev,
vport[k].qs_offset + i,
i);
pri, link_vld);
if (ret)
return ret;
}
@ -949,7 +964,7 @@ static int hclge_tm_pri_q_qs_cfg_vnet_base(struct hclge_dev *hdev)
for (i = 0; i < HNAE3_MAX_TC; i++) {
ret = hclge_tm_qs_to_pri_map_cfg(hdev,
vport[k].qs_offset + i,
k);
k, true);
if (ret)
return ret;
}
@ -989,33 +1004,39 @@ static int hclge_tm_pri_tc_base_shaper_cfg(struct hclge_dev *hdev)
{
u32 max_tm_rate = hdev->ae_dev->dev_specs.max_tm_rate;
struct hclge_shaper_ir_para ir_para;
u32 shaper_para;
u32 shaper_para_c, shaper_para_p;
int ret;
u32 i;
for (i = 0; i < hdev->tm_info.num_tc; i++) {
for (i = 0; i < hdev->tc_max; i++) {
u32 rate = hdev->tm_info.tc_info[i].bw_limit;
ret = hclge_shaper_para_calc(rate, HCLGE_SHAPER_LVL_PRI,
&ir_para, max_tm_rate);
if (ret)
return ret;
if (rate) {
ret = hclge_shaper_para_calc(rate, HCLGE_SHAPER_LVL_PRI,
&ir_para, max_tm_rate);
if (ret)
return ret;
shaper_para_c = hclge_tm_get_shapping_para(0, 0, 0,
HCLGE_SHAPER_BS_U_DEF,
HCLGE_SHAPER_BS_S_DEF);
shaper_para_p = hclge_tm_get_shapping_para(ir_para.ir_b,
ir_para.ir_u,
ir_para.ir_s,
HCLGE_SHAPER_BS_U_DEF,
HCLGE_SHAPER_BS_S_DEF);
} else {
shaper_para_c = 0;
shaper_para_p = 0;
}
shaper_para = hclge_tm_get_shapping_para(0, 0, 0,
HCLGE_SHAPER_BS_U_DEF,
HCLGE_SHAPER_BS_S_DEF);
ret = hclge_tm_pri_shapping_cfg(hdev, HCLGE_TM_SHAP_C_BUCKET, i,
shaper_para, rate);
shaper_para_c, rate);
if (ret)
return ret;
shaper_para = hclge_tm_get_shapping_para(ir_para.ir_b,
ir_para.ir_u,
ir_para.ir_s,
HCLGE_SHAPER_BS_U_DEF,
HCLGE_SHAPER_BS_S_DEF);
ret = hclge_tm_pri_shapping_cfg(hdev, HCLGE_TM_SHAP_P_BUCKET, i,
shaper_para, rate);
shaper_para_p, rate);
if (ret)
return ret;
}
@ -1125,7 +1146,7 @@ static int hclge_tm_pri_tc_base_dwrr_cfg(struct hclge_dev *hdev)
int ret;
u32 i, k;
for (i = 0; i < hdev->tm_info.num_tc; i++) {
for (i = 0; i < hdev->tc_max; i++) {
pg_info =
&hdev->tm_info.pg_info[hdev->tm_info.tc_info[i].pgid];
dwrr = pg_info->tc_dwrr[i];
@ -1135,9 +1156,15 @@ static int hclge_tm_pri_tc_base_dwrr_cfg(struct hclge_dev *hdev)
return ret;
for (k = 0; k < hdev->num_alloc_vport; k++) {
struct hnae3_knic_private_info *kinfo = &vport[k].nic.kinfo;
if (i >= kinfo->tc_info.max_tc)
continue;
dwrr = i < kinfo->tc_info.num_tc ? vport[k].dwrr : 0;
ret = hclge_tm_qs_weight_cfg(
hdev, vport[k].qs_offset + i,
vport[k].dwrr);
dwrr);
if (ret)
return ret;
}
@ -1303,6 +1330,7 @@ static int hclge_tm_schd_mode_tc_base_cfg(struct hclge_dev *hdev, u8 pri_id)
{
struct hclge_vport *vport = hdev->vport;
int ret;
u8 mode;
u16 i;
ret = hclge_tm_pri_schd_mode_cfg(hdev, pri_id);
@ -1310,9 +1338,16 @@ static int hclge_tm_schd_mode_tc_base_cfg(struct hclge_dev *hdev, u8 pri_id)
return ret;
for (i = 0; i < hdev->num_alloc_vport; i++) {
struct hnae3_knic_private_info *kinfo = &vport[i].nic.kinfo;
if (pri_id >= kinfo->tc_info.max_tc)
continue;
mode = pri_id < kinfo->tc_info.num_tc ? HCLGE_SCH_MODE_DWRR :
HCLGE_SCH_MODE_SP;
ret = hclge_tm_qs_schd_mode_cfg(hdev,
vport[i].qs_offset + pri_id,
HCLGE_SCH_MODE_DWRR);
mode);
if (ret)
return ret;
}
@ -1353,7 +1388,7 @@ static int hclge_tm_lvl34_schd_mode_cfg(struct hclge_dev *hdev)
u8 i;
if (hdev->tx_sch_mode == HCLGE_FLAG_TC_BASE_SCH_MODE) {
for (i = 0; i < hdev->tm_info.num_tc; i++) {
for (i = 0; i < hdev->tc_max; i++) {
ret = hclge_tm_schd_mode_tc_base_cfg(hdev, i);
if (ret)
return ret;

@ -237,6 +237,7 @@ int hclge_pause_addr_cfg(struct hclge_dev *hdev, const u8 *mac_addr);
void hclge_pfc_rx_stats_get(struct hclge_dev *hdev, u64 *stats);
void hclge_pfc_tx_stats_get(struct hclge_dev *hdev, u64 *stats);
int hclge_tm_qs_shaper_cfg(struct hclge_vport *vport, int max_tx_rate);
int hclge_tm_port_shaper_cfg(struct hclge_dev *hdev);
int hclge_tm_get_qset_num(struct hclge_dev *hdev, u16 *qset_num);
int hclge_tm_get_pri_num(struct hclge_dev *hdev, u8 *pri_num);
int hclge_tm_get_qset_map_pri(struct hclge_dev *hdev, u16 qset_id, u8 *priority,

@ -2586,15 +2586,16 @@ static void i40e_diag_test(struct net_device *netdev,
set_bit(__I40E_TESTING, pf->state);
if (test_bit(__I40E_RESET_RECOVERY_PENDING, pf->state) ||
test_bit(__I40E_RESET_INTR_RECEIVED, pf->state)) {
dev_warn(&pf->pdev->dev,
"Cannot start offline testing when PF is in reset state.\n");
goto skip_ol_tests;
}
if (i40e_active_vfs(pf) || i40e_active_vmdqs(pf)) {
dev_warn(&pf->pdev->dev,
"Please take active VFs and Netqueues offline and restart the adapter before running NIC diagnostics\n");
data[I40E_ETH_TEST_REG] = 1;
data[I40E_ETH_TEST_EEPROM] = 1;
data[I40E_ETH_TEST_INTR] = 1;
data[I40E_ETH_TEST_LINK] = 1;
eth_test->flags |= ETH_TEST_FL_FAILED;
clear_bit(__I40E_TESTING, pf->state);
goto skip_ol_tests;
}
@ -2641,9 +2642,17 @@ static void i40e_diag_test(struct net_device *netdev,
data[I40E_ETH_TEST_INTR] = 0;
}
skip_ol_tests:
netif_info(pf, drv, netdev, "testing finished\n");
return;
skip_ol_tests:
data[I40E_ETH_TEST_REG] = 1;
data[I40E_ETH_TEST_EEPROM] = 1;
data[I40E_ETH_TEST_INTR] = 1;
data[I40E_ETH_TEST_LINK] = 1;
eth_test->flags |= ETH_TEST_FL_FAILED;
clear_bit(__I40E_TESTING, pf->state);
netif_info(pf, drv, netdev, "testing failed\n");
}
static void i40e_get_wol(struct net_device *netdev,

@ -8542,6 +8542,11 @@ static int i40e_configure_clsflower(struct i40e_vsi *vsi,
return -EOPNOTSUPP;
}
if (!tc) {
dev_err(&pf->pdev->dev, "Unable to add filter because of invalid destination");
return -EINVAL;
}
if (test_bit(__I40E_RESET_RECOVERY_PENDING, pf->state) ||
test_bit(__I40E_RESET_INTR_RECEIVED, pf->state))
return -EBUSY;

@ -2282,7 +2282,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
}
if (vf->adq_enabled) {
for (i = 0; i < I40E_MAX_VF_VSI; i++)
for (i = 0; i < vf->num_tc; i++)
num_qps_all += vf->ch[i].num_qps;
if (num_qps_all != qci->num_queue_pairs) {
aq_ret = I40E_ERR_PARAM;

@ -984,7 +984,7 @@ struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter,
list_add_tail(&f->list, &adapter->mac_filter_list);
f->add = true;
f->is_new_mac = true;
f->is_primary = false;
f->is_primary = ether_addr_equal(macaddr, adapter->hw.mac.addr);
adapter->aq_required |= IAVF_FLAG_AQ_ADD_MAC_FILTER;
} else {
f->remove = false;

@ -5763,25 +5763,38 @@ static netdev_features_t
ice_fix_features(struct net_device *netdev, netdev_features_t features)
{
struct ice_netdev_priv *np = netdev_priv(netdev);
netdev_features_t supported_vlan_filtering;
netdev_features_t requested_vlan_filtering;
struct ice_vsi *vsi = np->vsi;
netdev_features_t req_vlan_fltr, cur_vlan_fltr;
bool cur_ctag, cur_stag, req_ctag, req_stag;
requested_vlan_filtering = features & NETIF_VLAN_FILTERING_FEATURES;
cur_vlan_fltr = netdev->features & NETIF_VLAN_FILTERING_FEATURES;
cur_ctag = cur_vlan_fltr & NETIF_F_HW_VLAN_CTAG_FILTER;
cur_stag = cur_vlan_fltr & NETIF_F_HW_VLAN_STAG_FILTER;
/* make sure supported_vlan_filtering works for both SVM and DVM */
supported_vlan_filtering = NETIF_F_HW_VLAN_CTAG_FILTER;
if (ice_is_dvm_ena(&vsi->back->hw))
supported_vlan_filtering |= NETIF_F_HW_VLAN_STAG_FILTER;
req_vlan_fltr = features & NETIF_VLAN_FILTERING_FEATURES;
req_ctag = req_vlan_fltr & NETIF_F_HW_VLAN_CTAG_FILTER;
req_stag = req_vlan_fltr & NETIF_F_HW_VLAN_STAG_FILTER;
if (requested_vlan_filtering &&
requested_vlan_filtering != supported_vlan_filtering) {
if (requested_vlan_filtering & NETIF_F_HW_VLAN_CTAG_FILTER) {
netdev_warn(netdev, "cannot support requested VLAN filtering settings, enabling all supported VLAN filtering settings\n");
features |= supported_vlan_filtering;
if (req_vlan_fltr != cur_vlan_fltr) {
if (ice_is_dvm_ena(&np->vsi->back->hw)) {
if (req_ctag && req_stag) {
features |= NETIF_VLAN_FILTERING_FEATURES;
} else if (!req_ctag && !req_stag) {
features &= ~NETIF_VLAN_FILTERING_FEATURES;
} else if ((!cur_ctag && req_ctag && !cur_stag) ||
(!cur_stag && req_stag && !cur_ctag)) {
features |= NETIF_VLAN_FILTERING_FEATURES;
netdev_warn(netdev, "802.1Q and 802.1ad VLAN filtering must be either both on or both off. VLAN filtering has been enabled for both types.\n");
} else if ((cur_ctag && !req_ctag && cur_stag) ||
(cur_stag && !req_stag && cur_ctag)) {
features &= ~NETIF_VLAN_FILTERING_FEATURES;
netdev_warn(netdev, "802.1Q and 802.1ad VLAN filtering must be either both on or both off. VLAN filtering has been disabled for both types.\n");
}
} else {
netdev_warn(netdev, "cannot support requested VLAN filtering settings, clearing all supported VLAN filtering settings\n");
features &= ~supported_vlan_filtering;
if (req_vlan_fltr & NETIF_F_HW_VLAN_STAG_FILTER)
netdev_warn(netdev, "cannot support requested 802.1ad filtering setting in SVM mode\n");
if (req_vlan_fltr & NETIF_F_HW_VLAN_CTAG_FILTER)
features |= NETIF_F_HW_VLAN_CTAG_FILTER;
}
}

@ -2271,7 +2271,7 @@ static int
ice_ptp_init_tx_e822(struct ice_pf *pf, struct ice_ptp_tx *tx, u8 port)
{
tx->quad = port / ICE_PORTS_PER_QUAD;
tx->quad_offset = tx->quad * INDEX_PER_PORT;
tx->quad_offset = (port % ICE_PORTS_PER_QUAD) * INDEX_PER_PORT;
tx->len = INDEX_PER_PORT;
return ice_ptp_alloc_tx_tracker(tx);
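
(A worked example of why the old formula collided, assuming
ICE_PORTS_PER_QUAD == 4 and INDEX_PER_PORT == 16 as on E822 parts —
values inferred from the driver, not shown in this hunk.)

/* PHY port 6 on an E822 device:
 *
 *   quad            = 6 / 4                    = 1
 *   old quad_offset = quad * INDEX_PER_PORT    = 1 * 16 = 16
 *   new quad_offset = (6 % 4) * INDEX_PER_PORT = 2 * 16 = 32
 *
 * The old formula gave every port in a quad the same offset
 * (ports 4-7 all mapped to 16), so their timestamp indexes
 * collided; the fix spaces the four ports of a quad apart.
 */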

@ -49,6 +49,37 @@ struct ice_perout_channel {
* To allow multiple ports to access the shared register block independently,
* the blocks are split up so that indexes are assigned to each port based on
* hardware logical port number.
*
* The timestamp blocks are handled differently for E810- and E822-based
* devices. In E810 devices, each port has its own block of timestamps, while in
* E822 there is a need to logically break the block of registers into smaller
* chunks based on the port number to avoid collisions.
*
* Example for port 5 in E810:
* +--------+--------+--------+--------+--------+--------+--------+--------+
* |register|register|register|register|register|register|register|register|
* | block | block | block | block | block | block | block | block |
* | for | for | for | for | for | for | for | for |
* | port 0 | port 1 | port 2 | port 3 | port 4 | port 5 | port 6 | port 7 |
* +--------+--------+--------+--------+--------+--------+--------+--------+
* ^^
* ||
* |--- quad offset is always 0
* ---- quad number
*
* Example for port 5 in E822:
* +-----------------------------+-----------------------------+
* | register block for quad 0 | register block for quad 1 |
* |+------+------+------+------+|+------+------+------+------+|
* ||port 0|port 1|port 2|port 3|||port 0|port 1|port 2|port 3||
* |+------+------+------+------+|+------+------+------+------+|
* +-----------------------------+-------^---------------------+
* ^ |
* | --- quad offset*
* ---- quad number
*
* * PHY port 5 is port 1 in quad 1
*
*/
/**

@ -504,6 +504,11 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
}
if (ice_is_vf_disabled(vf)) {
vsi = ice_get_vf_vsi(vf);
if (WARN_ON(!vsi))
return -EINVAL;
ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, vf->vf_id);
ice_vsi_stop_all_rx_rings(vsi);
dev_dbg(dev, "VF is already disabled, there is no need for resetting it, telling VM, all is fine %d\n",
vf->vf_id);
return 0;

@ -1569,35 +1569,27 @@ error_param:
*/
static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
{
enum virtchnl_status_code v_ret = VIRTCHNL_STATUS_SUCCESS;
struct virtchnl_vsi_queue_config_info *qci =
(struct virtchnl_vsi_queue_config_info *)msg;
struct virtchnl_queue_pair_info *qpi;
struct ice_pf *pf = vf->pf;
struct ice_vsi *vsi;
int i, q_idx;
int i = -1, q_idx;
if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states))
goto error_param;
}
if (!ice_vc_isvalid_vsi_id(vf, qci->vsi_id)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
if (!ice_vc_isvalid_vsi_id(vf, qci->vsi_id))
goto error_param;
}
vsi = ice_get_vf_vsi(vf);
if (!vsi) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
if (!vsi)
goto error_param;
}
if (qci->num_queue_pairs > ICE_MAX_RSS_QS_PER_VF ||
qci->num_queue_pairs > min_t(u16, vsi->alloc_txq, vsi->alloc_rxq)) {
dev_err(ice_pf_to_dev(pf), "VF-%d requesting more than supported number of queues: %d\n",
vf->vf_id, min_t(u16, vsi->alloc_txq, vsi->alloc_rxq));
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
goto error_param;
}
@ -1610,7 +1602,6 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
!ice_vc_isvalid_ring_len(qpi->txq.ring_len) ||
!ice_vc_isvalid_ring_len(qpi->rxq.ring_len) ||
!ice_vc_isvalid_q_id(vf, qci->vsi_id, qpi->txq.queue_id)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
goto error_param;
}
@ -1620,7 +1611,6 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
* for selected "vsi"
*/
if (q_idx >= vsi->alloc_txq || q_idx >= vsi->alloc_rxq) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
goto error_param;
}
@ -1630,14 +1620,13 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
vsi->tx_rings[i]->count = qpi->txq.ring_len;
/* Disable any existing queue first */
if (ice_vf_vsi_dis_single_txq(vf, vsi, q_idx)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
if (ice_vf_vsi_dis_single_txq(vf, vsi, q_idx))
goto error_param;
}
/* Configure a queue with the requested settings */
if (ice_vsi_cfg_single_txq(vsi, vsi->tx_rings, q_idx)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
dev_warn(ice_pf_to_dev(pf), "VF-%d failed to configure TX queue %d\n",
vf->vf_id, i);
goto error_param;
}
}
@ -1651,17 +1640,13 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
if (qpi->rxq.databuffer_size != 0 &&
(qpi->rxq.databuffer_size > ((16 * 1024) - 128) ||
qpi->rxq.databuffer_size < 1024)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
qpi->rxq.databuffer_size < 1024))
goto error_param;
}
vsi->rx_buf_len = qpi->rxq.databuffer_size;
vsi->rx_rings[i]->rx_buf_len = vsi->rx_buf_len;
if (qpi->rxq.max_pkt_size > max_frame_size ||
qpi->rxq.max_pkt_size < 64) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
qpi->rxq.max_pkt_size < 64)
goto error_param;
}
vsi->max_frame = qpi->rxq.max_pkt_size;
/* add space for the port VLAN since the VF driver is
@ -1672,16 +1657,30 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
vsi->max_frame += VLAN_HLEN;
if (ice_vsi_cfg_single_rxq(vsi, q_idx)) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
dev_warn(ice_pf_to_dev(pf), "VF-%d failed to configure RX queue %d\n",
vf->vf_id, i);
goto error_param;
}
}
}
error_param:
/* send the response to the VF */
return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES, v_ret,
NULL, 0);
return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES,
VIRTCHNL_STATUS_SUCCESS, NULL, 0);
error_param:
/* disable whatever we can */
for (; i >= 0; i--) {
if (ice_vsi_ctrl_one_rx_ring(vsi, false, i, true))
dev_err(ice_pf_to_dev(pf), "VF-%d could not disable RX queue %d\n",
vf->vf_id, i);
if (ice_vf_vsi_dis_single_txq(vf, vsi, i))
dev_err(ice_pf_to_dev(pf), "VF-%d could not disable TX queue %d\n",
vf->vf_id, i);
}
/* send the response to the VF */
return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES,
VIRTCHNL_STATUS_ERR_PARAM, NULL, 0);
}
/**

@ -1390,7 +1390,8 @@ static int otx2vf_get_link_ksettings(struct net_device *netdev,
static const struct ethtool_ops otx2vf_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
ETHTOOL_COALESCE_MAX_FRAMES,
ETHTOOL_COALESCE_MAX_FRAMES |
ETHTOOL_COALESCE_USE_ADAPTIVE,
.supported_ring_params = ETHTOOL_RING_USE_RX_BUF_LEN |
ETHTOOL_RING_USE_CQE_SIZE,
.get_link = otx2_get_link,

@ -8,8 +8,8 @@
#include "spectrum.h"
enum mlxsw_sp_counter_sub_pool_id {
MLXSW_SP_COUNTER_SUB_POOL_FLOW,
MLXSW_SP_COUNTER_SUB_POOL_RIF,
MLXSW_SP_COUNTER_SUB_POOL_FLOW,
};
int mlxsw_sp_counter_alloc(struct mlxsw_sp *mlxsw_sp,

@ -547,6 +547,57 @@ static inline void axienet_iow(struct axienet_local *lp, off_t offset,
iowrite32(value, lp->regs + offset);
}
/**
* axienet_dma_out32 - Memory mapped Axi DMA register write.
* @lp: Pointer to axienet local structure
* @reg: Address offset from the base address of the Axi DMA core
* @value: Value to be written into the Axi DMA register
*
* This function writes the desired value into the corresponding Axi DMA
* register.
*/
static inline void axienet_dma_out32(struct axienet_local *lp,
off_t reg, u32 value)
{
iowrite32(value, lp->dma_regs + reg);
}
#if defined(CONFIG_64BIT) && defined(iowrite64)
/**
* axienet_dma_out64 - Memory mapped Axi DMA register write.
* @lp: Pointer to axienet local structure
* @reg: Address offset from the base address of the Axi DMA core
* @value: Value to be written into the Axi DMA register
*
* This function writes the desired value into the corresponding Axi DMA
* register.
*/
static inline void axienet_dma_out64(struct axienet_local *lp,
off_t reg, u64 value)
{
iowrite64(value, lp->dma_regs + reg);
}
static inline void axienet_dma_out_addr(struct axienet_local *lp, off_t reg,
dma_addr_t addr)
{
if (lp->features & XAE_FEATURE_DMA_64BIT)
axienet_dma_out64(lp, reg, addr);
else
axienet_dma_out32(lp, reg, lower_32_bits(addr));
}
#else /* CONFIG_64BIT */
static inline void axienet_dma_out_addr(struct axienet_local *lp, off_t reg,
dma_addr_t addr)
{
axienet_dma_out32(lp, reg, lower_32_bits(addr));
}
#endif /* CONFIG_64BIT */
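
(Restating the before/after in one place — both halves appear in the
diffs here, nothing new is assumed:)

/* old helper, removed in the next file: two separate 32-bit writes */
axienet_dma_out32(lp, reg,     lower_32_bits(addr));
axienet_dma_out32(lp, reg + 4, upper_32_bits(addr));

/* new path: one 64-bit write where the architecture has iowrite64;
 * otherwise the #else branch above falls back to 32-bit-only DMA.
 */
axienet_dma_out64(lp, reg, addr);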
/* Function prototypes visible in xilinx_axienet_mdio.c for other files */
int axienet_mdio_enable(struct axienet_local *lp);
void axienet_mdio_disable(struct axienet_local *lp);

@ -133,30 +133,6 @@ static inline u32 axienet_dma_in32(struct axienet_local *lp, off_t reg)
return ioread32(lp->dma_regs + reg);
}
/**
* axienet_dma_out32 - Memory mapped Axi DMA register write.
* @lp: Pointer to axienet local structure
* @reg: Address offset from the base address of the Axi DMA core
* @value: Value to be written into the Axi DMA register
*
* This function writes the desired value into the corresponding Axi DMA
* register.
*/
static inline void axienet_dma_out32(struct axienet_local *lp,
off_t reg, u32 value)
{
iowrite32(value, lp->dma_regs + reg);
}
static void axienet_dma_out_addr(struct axienet_local *lp, off_t reg,
dma_addr_t addr)
{
axienet_dma_out32(lp, reg, lower_32_bits(addr));
if (lp->features & XAE_FEATURE_DMA_64BIT)
axienet_dma_out32(lp, reg + 4, upper_32_bits(addr));
}
static void desc_set_phys_addr(struct axienet_local *lp, dma_addr_t addr,
struct axidma_bd *desc)
{
@ -2061,6 +2037,11 @@ static int axienet_probe(struct platform_device *pdev)
iowrite32(0x0, desc);
}
}
if (!IS_ENABLED(CONFIG_64BIT) && lp->features & XAE_FEATURE_DMA_64BIT) {
dev_err(&pdev->dev, "64-bit addressable DMA is not compatible with 32-bit architecture\n");
ret = -EINVAL;
goto cleanup_clk;
}
ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(addr_width));
if (ret) {

@ -1750,7 +1750,7 @@ static const struct driver_info ax88179_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1763,7 +1763,7 @@ static const struct driver_info ax88178a_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1776,7 +1776,7 @@ static const struct driver_info cypress_GX3_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1789,7 +1789,7 @@ static const struct driver_info dlink_dub1312_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1802,7 +1802,7 @@ static const struct driver_info sitecom_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1815,7 +1815,7 @@ static const struct driver_info samsung_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1828,7 +1828,7 @@ static const struct driver_info lenovo_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1841,7 +1841,7 @@ static const struct driver_info belkin_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1854,7 +1854,7 @@ static const struct driver_info toshiba_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1867,7 +1867,7 @@ static const struct driver_info mct_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1880,7 +1880,7 @@ static const struct driver_info at_umc2000_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1893,7 +1893,7 @@ static const struct driver_info at_umc200_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@ -1906,7 +1906,7 @@ static const struct driver_info at_umc2000sp_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
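
(Background on the flag, from general usbnet/USB framing behavior
rather than this diff: when a TX frame's length is an exact multiple
of the bulk endpoint's max packet size, usbnet either pads the frame
by one byte or, with FLAG_SEND_ZLP, asks the host controller to send a
trailing zero-length packet so the device can detect the end of the
transfer; this chipset apparently needs the ZLP form. A simplified
paraphrase of the core's decision, not copied from this patch:)

/* usbnet TX framing, simplified sketch */
if (length % dev->maxpacket == 0) {
	if (!(info->flags & FLAG_SEND_ZLP)) {
		/* append a single zero pad byte to the frame */
		length++;
	} else {
		/* let the HCD terminate with a zero-length packet */
		urb->transfer_flags |= URB_ZERO_PACKET;
	}
}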

@ -25,7 +25,6 @@
#undef INET_CSK_CLEAR_TIMERS
struct inet_bind_bucket;
struct inet_bind2_bucket;
struct tcp_congestion_ops;
/*
@ -58,7 +57,6 @@ struct inet_connection_sock_af_ops {
*
* @icsk_accept_queue: FIFO of established children
* @icsk_bind_hash: Bind node
* @icsk_bind2_hash: Bind node in the bhash2 table
* @icsk_timeout: Timeout
* @icsk_retransmit_timer: Resend (no ack)
* @icsk_rto: Retransmit timeout
@ -85,7 +83,6 @@ struct inet_connection_sock {
struct inet_sock icsk_inet;
struct request_sock_queue icsk_accept_queue;
struct inet_bind_bucket *icsk_bind_hash;
struct inet_bind2_bucket *icsk_bind2_hash;
unsigned long icsk_timeout;
struct timer_list icsk_retransmit_timer;
struct timer_list icsk_delack_timer;

@ -90,32 +90,11 @@ struct inet_bind_bucket {
struct hlist_head owners;
};
struct inet_bind2_bucket {
possible_net_t ib_net;
int l3mdev;
unsigned short port;
union {
#if IS_ENABLED(CONFIG_IPV6)
struct in6_addr v6_rcv_saddr;
#endif
__be32 rcv_saddr;
};
/* Node in the inet2_bind_hashbucket chain */
struct hlist_node node;
/* List of sockets hashed to this bucket */
struct hlist_head owners;
};
static inline struct net *ib_net(struct inet_bind_bucket *ib)
{
return read_pnet(&ib->ib_net);
}
static inline struct net *ib2_net(struct inet_bind2_bucket *ib)
{
return read_pnet(&ib->ib_net);
}
#define inet_bind_bucket_for_each(tb, head) \
hlist_for_each_entry(tb, head, node)
@ -124,15 +103,6 @@ struct inet_bind_hashbucket {
struct hlist_head chain;
};
/* This is synchronized using the inet_bind_hashbucket's spinlock.
* Instead of having separate spinlocks, the inet_bind2_hashbucket can share
* the inet_bind_hashbucket's given that in every case where the bhash2 table
* is useful, a lookup in the bhash table also occurs.
*/
struct inet_bind2_hashbucket {
struct hlist_head chain;
};
/* Sockets can be hashed in established or listening table.
* We must use different 'nulls' end-of-chain value for all hash buckets :
* A socket might transition from ESTABLISH to LISTEN state without
@ -164,12 +134,6 @@ struct inet_hashinfo {
*/
struct kmem_cache *bind_bucket_cachep;
struct inet_bind_hashbucket *bhash;
/* The 2nd binding table hashed by port and address.
* This is used primarily for expediting the resolution of bind
* conflicts.
*/
struct kmem_cache *bind2_bucket_cachep;
struct inet_bind2_hashbucket *bhash2;
unsigned int bhash_size;
/* The 2nd listener table hashed by local port and address */
@ -229,36 +193,6 @@ inet_bind_bucket_create(struct kmem_cache *cachep, struct net *net,
void inet_bind_bucket_destroy(struct kmem_cache *cachep,
struct inet_bind_bucket *tb);
static inline bool check_bind_bucket_match(struct inet_bind_bucket *tb,
struct net *net,
const unsigned short port,
int l3mdev)
{
return net_eq(ib_net(tb), net) && tb->port == port &&
tb->l3mdev == l3mdev;
}
struct inet_bind2_bucket *
inet_bind2_bucket_create(struct kmem_cache *cachep, struct net *net,
struct inet_bind2_hashbucket *head,
const unsigned short port, int l3mdev,
const struct sock *sk);
void inet_bind2_bucket_destroy(struct kmem_cache *cachep,
struct inet_bind2_bucket *tb);
struct inet_bind2_bucket *
inet_bind2_bucket_find(struct inet_hashinfo *hinfo, struct net *net,
const unsigned short port, int l3mdev,
struct sock *sk,
struct inet_bind2_hashbucket **head);
bool check_bind2_bucket_match_nulladdr(struct inet_bind2_bucket *tb,
struct net *net,
const unsigned short port,
int l3mdev,
const struct sock *sk);
static inline u32 inet_bhashfn(const struct net *net, const __u16 lport,
const u32 bhash_size)
{
@ -266,7 +200,7 @@ static inline u32 inet_bhashfn(const struct net *net, const __u16 lport,
}
void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb,
struct inet_bind2_bucket *tb2, const unsigned short snum);
const unsigned short snum);
/* Caller must disable local BH processing. */
int __inet_inherit_port(const struct sock *sk, struct sock *child);

@ -348,7 +348,6 @@ struct sk_filter;
* @sk_txtime_report_errors: set report errors mode for SO_TXTIME
* @sk_txtime_unused: unused txtime flags
* @ns_tracker: tracker for netns reference
* @sk_bind2_node: bind node in the bhash2 table
*/
struct sock {
/*
@ -538,7 +537,6 @@ struct sock {
#endif
struct rcu_head sk_rcu;
netns_tracker ns_tracker;
struct hlist_node sk_bind2_node;
};
enum sk_pacing {
@ -819,16 +817,6 @@ static inline void sk_add_bind_node(struct sock *sk,
hlist_add_head(&sk->sk_bind_node, list);
}
static inline void __sk_del_bind2_node(struct sock *sk)
{
__hlist_del(&sk->sk_bind2_node);
}
static inline void sk_add_bind2_node(struct sock *sk, struct hlist_head *list)
{
hlist_add_head(&sk->sk_bind2_node, list);
}
#define sk_for_each(__sk, list) \
hlist_for_each_entry(__sk, list, sk_node)
#define sk_for_each_rcu(__sk, list) \
@ -846,8 +834,6 @@ static inline void sk_add_bind2_node(struct sock *sk, struct hlist_head *list)
hlist_for_each_entry_safe(__sk, tmp, list, sk_node)
#define sk_for_each_bound(__sk, list) \
hlist_for_each_entry(__sk, list, sk_bind_node)
#define sk_for_each_bound_bhash2(__sk, list) \
hlist_for_each_entry(__sk, list, sk_bind2_node)
/**
* sk_for_each_entry_offset_rcu - iterate over a list at a given struct offset

@ -1661,9 +1661,12 @@ static int ax25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
int flags)
{
struct sock *sk = sock->sk;
struct sk_buff *skb;
struct sk_buff *skb, *last;
struct sk_buff_head *sk_queue;
int copied;
int err = 0;
int off = 0;
long timeo;
lock_sock(sk);
/*
@ -1675,10 +1678,29 @@ static int ax25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
goto out;
}
/* Now we can treat all alike */
skb = skb_recv_datagram(sk, flags, &err);
if (skb == NULL)
goto out;
/* We need support for non-blocking reads. */
sk_queue = &sk->sk_receive_queue;
skb = __skb_try_recv_datagram(sk, sk_queue, flags, &off, &err, &last);
/* If no packet is available, release_sock(sk) and try again. */
if (!skb) {
if (err != -EAGAIN)
goto out;
release_sock(sk);
timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
while (timeo && !__skb_wait_for_more_packets(sk, sk_queue, &err,
&timeo, last)) {
skb = __skb_try_recv_datagram(sk, sk_queue, flags, &off,
&err, &last);
if (skb)
break;
if (err != -EAGAIN)
goto done;
}
if (!skb)
goto done;
lock_sock(sk);
}
if (!sk_to_ax25(sk)->pidincl)
skb_pull(skb, 1); /* Remove PID */
@ -1725,6 +1747,7 @@ static int ax25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
out:
release_sock(sk);
done:
return err;
}

@ -1120,12 +1120,6 @@ static int __init dccp_init(void)
SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL);
if (!dccp_hashinfo.bind_bucket_cachep)
goto out_free_hashinfo2;
dccp_hashinfo.bind2_bucket_cachep =
kmem_cache_create("dccp_bind2_bucket",
sizeof(struct inet_bind2_bucket), 0,
SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL);
if (!dccp_hashinfo.bind2_bucket_cachep)
goto out_free_bind_bucket_cachep;
/*
* Size and allocate the main established and bind bucket
@ -1156,7 +1150,7 @@ static int __init dccp_init(void)
if (!dccp_hashinfo.ehash) {
DCCP_CRIT("Failed to allocate DCCP established hash table");
goto out_free_bind2_bucket_cachep;
goto out_free_bind_bucket_cachep;
}
for (i = 0; i <= dccp_hashinfo.ehash_mask; i++)
@ -1182,23 +1176,14 @@ static int __init dccp_init(void)
goto out_free_dccp_locks;
}
dccp_hashinfo.bhash2 = (struct inet_bind2_hashbucket *)
__get_free_pages(GFP_ATOMIC | __GFP_NOWARN, bhash_order);
if (!dccp_hashinfo.bhash2) {
DCCP_CRIT("Failed to allocate DCCP bind2 hash table");
goto out_free_dccp_bhash;
}
for (i = 0; i < dccp_hashinfo.bhash_size; i++) {
spin_lock_init(&dccp_hashinfo.bhash[i].lock);
INIT_HLIST_HEAD(&dccp_hashinfo.bhash[i].chain);
INIT_HLIST_HEAD(&dccp_hashinfo.bhash2[i].chain);
}
rc = dccp_mib_init();
if (rc)
goto out_free_dccp_bhash2;
goto out_free_dccp_bhash;
rc = dccp_ackvec_init();
if (rc)
@ -1222,38 +1207,30 @@ out_ackvec_exit:
dccp_ackvec_exit();
out_free_dccp_mib:
dccp_mib_exit();
out_free_dccp_bhash2:
free_pages((unsigned long)dccp_hashinfo.bhash2, bhash_order);
out_free_dccp_bhash:
free_pages((unsigned long)dccp_hashinfo.bhash, bhash_order);
out_free_dccp_locks:
inet_ehash_locks_free(&dccp_hashinfo);
out_free_dccp_ehash:
free_pages((unsigned long)dccp_hashinfo.ehash, ehash_order);
out_free_bind2_bucket_cachep:
kmem_cache_destroy(dccp_hashinfo.bind2_bucket_cachep);
out_free_bind_bucket_cachep:
kmem_cache_destroy(dccp_hashinfo.bind_bucket_cachep);
out_free_hashinfo2:
inet_hashinfo2_free_mod(&dccp_hashinfo);
out_fail:
dccp_hashinfo.bhash = NULL;
dccp_hashinfo.bhash2 = NULL;
dccp_hashinfo.ehash = NULL;
dccp_hashinfo.bind_bucket_cachep = NULL;
dccp_hashinfo.bind2_bucket_cachep = NULL;
return rc;
}
static void __exit dccp_fini(void)
{
int bhash_order = get_order(dccp_hashinfo.bhash_size *
sizeof(struct inet_bind_hashbucket));
ccid_cleanup_builtins();
dccp_mib_exit();
free_pages((unsigned long)dccp_hashinfo.bhash, bhash_order);
free_pages((unsigned long)dccp_hashinfo.bhash2, bhash_order);
free_pages((unsigned long)dccp_hashinfo.bhash,
get_order(dccp_hashinfo.bhash_size *
sizeof(struct inet_bind_hashbucket)));
free_pages((unsigned long)dccp_hashinfo.ehash,
get_order((dccp_hashinfo.ehash_mask + 1) *
sizeof(struct inet_ehash_bucket)));

@ -117,32 +117,6 @@ bool inet_rcv_saddr_any(const struct sock *sk)
return !sk->sk_rcv_saddr;
}
static bool use_bhash2_on_bind(const struct sock *sk)
{
#if IS_ENABLED(CONFIG_IPV6)
int addr_type;
if (sk->sk_family == AF_INET6) {
addr_type = ipv6_addr_type(&sk->sk_v6_rcv_saddr);
return addr_type != IPV6_ADDR_ANY &&
addr_type != IPV6_ADDR_MAPPED;
}
#endif
return sk->sk_rcv_saddr != htonl(INADDR_ANY);
}
static u32 get_bhash2_nulladdr_hash(const struct sock *sk, struct net *net,
int port)
{
#if IS_ENABLED(CONFIG_IPV6)
struct in6_addr nulladdr = {};
if (sk->sk_family == AF_INET6)
return ipv6_portaddr_hash(net, &nulladdr, port);
#endif
return ipv4_portaddr_hash(net, 0, port);
}
void inet_get_local_port_range(struct net *net, int *low, int *high)
{
unsigned int seq;
@ -156,71 +130,16 @@ void inet_get_local_port_range(struct net *net, int *low, int *high)
}
EXPORT_SYMBOL(inet_get_local_port_range);
static bool bind_conflict_exist(const struct sock *sk, struct sock *sk2,
kuid_t sk_uid, bool relax,
bool reuseport_cb_ok, bool reuseport_ok)
{
int bound_dev_if2;
if (sk == sk2)
return false;
bound_dev_if2 = READ_ONCE(sk2->sk_bound_dev_if);
if (!sk->sk_bound_dev_if || !bound_dev_if2 ||
sk->sk_bound_dev_if == bound_dev_if2) {
if (sk->sk_reuse && sk2->sk_reuse &&
sk2->sk_state != TCP_LISTEN) {
if (!relax || (!reuseport_ok && sk->sk_reuseport &&
sk2->sk_reuseport && reuseport_cb_ok &&
(sk2->sk_state == TCP_TIME_WAIT ||
uid_eq(sk_uid, sock_i_uid(sk2)))))
return true;
} else if (!reuseport_ok || !sk->sk_reuseport ||
!sk2->sk_reuseport || !reuseport_cb_ok ||
(sk2->sk_state != TCP_TIME_WAIT &&
!uid_eq(sk_uid, sock_i_uid(sk2)))) {
return true;
}
}
return false;
}
static bool check_bhash2_conflict(const struct sock *sk,
struct inet_bind2_bucket *tb2, kuid_t sk_uid,
bool relax, bool reuseport_cb_ok,
bool reuseport_ok)
{
struct sock *sk2;
sk_for_each_bound_bhash2(sk2, &tb2->owners) {
if (sk->sk_family == AF_INET && ipv6_only_sock(sk2))
continue;
if (bind_conflict_exist(sk, sk2, sk_uid, relax,
reuseport_cb_ok, reuseport_ok))
return true;
}
return false;
}
/* This should be called only when the corresponding inet_bind_bucket spinlock
* is held
*/
static int inet_csk_bind_conflict(const struct sock *sk, int port,
struct inet_bind_bucket *tb,
struct inet_bind2_bucket *tb2, /* may be null */
static int inet_csk_bind_conflict(const struct sock *sk,
const struct inet_bind_bucket *tb,
bool relax, bool reuseport_ok)
{
struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
kuid_t uid = sock_i_uid((struct sock *)sk);
struct sock_reuseport *reuseport_cb;
struct inet_bind2_hashbucket *head2;
bool reuseport_cb_ok;
struct sock *sk2;
struct net *net;
int l3mdev;
u32 hash;
bool reuseport_cb_ok;
bool reuse = sk->sk_reuse;
bool reuseport = !!sk->sk_reuseport;
struct sock_reuseport *reuseport_cb;
kuid_t uid = sock_i_uid((struct sock *)sk);
rcu_read_lock();
reuseport_cb = rcu_dereference(sk->sk_reuseport_cb);
@ -231,42 +150,40 @@ static int inet_csk_bind_conflict(const struct sock *sk, int port,
/*
* Unlike other sk lookup places we do not check
* for sk_net here, since _all_ the socks listed
* in tb->owners and tb2->owners list belong
* to the same net
* in tb->owners list belong to the same net - the
* one this bucket belongs to.
*/
if (!use_bhash2_on_bind(sk)) {
sk_for_each_bound(sk2, &tb->owners)
if (bind_conflict_exist(sk, sk2, uid, relax,
reuseport_cb_ok, reuseport_ok) &&
inet_rcv_saddr_equal(sk, sk2, true))
return true;
sk_for_each_bound(sk2, &tb->owners) {
int bound_dev_if2;
return false;
if (sk == sk2)
continue;
bound_dev_if2 = READ_ONCE(sk2->sk_bound_dev_if);
if ((!sk->sk_bound_dev_if ||
!bound_dev_if2 ||
sk->sk_bound_dev_if == bound_dev_if2)) {
if (reuse && sk2->sk_reuse &&
sk2->sk_state != TCP_LISTEN) {
if ((!relax ||
(!reuseport_ok &&
reuseport && sk2->sk_reuseport &&
reuseport_cb_ok &&
(sk2->sk_state == TCP_TIME_WAIT ||
uid_eq(uid, sock_i_uid(sk2))))) &&
inet_rcv_saddr_equal(sk, sk2, true))
break;
} else if (!reuseport_ok ||
!reuseport || !sk2->sk_reuseport ||
!reuseport_cb_ok ||
(sk2->sk_state != TCP_TIME_WAIT &&
!uid_eq(uid, sock_i_uid(sk2)))) {
if (inet_rcv_saddr_equal(sk, sk2, true))
break;
}
}
}
if (tb2 && check_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok,
reuseport_ok))
return true;
net = sock_net(sk);
/* check there's no conflict with an existing IPV6_ADDR_ANY (if ipv6) or
* INADDR_ANY (if ipv4) socket.
*/
hash = get_bhash2_nulladdr_hash(sk, net, port);
head2 = &hinfo->bhash2[hash & (hinfo->bhash_size - 1)];
l3mdev = inet_sk_bound_l3mdev(sk);
inet_bind_bucket_for_each(tb2, &head2->chain)
if (check_bind2_bucket_match_nulladdr(tb2, net, port, l3mdev, sk))
break;
if (tb2 && check_bhash2_conflict(sk, tb2, uid, relax, reuseport_cb_ok,
reuseport_ok))
return true;
return false;
return sk2 != NULL;
}
/*
@ -274,20 +191,16 @@ static int inet_csk_bind_conflict(const struct sock *sk, int port,
* inet_bind_hashbucket lock held.
*/
static struct inet_bind_hashbucket *
inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret,
struct inet_bind2_bucket **tb2_ret,
struct inet_bind2_hashbucket **head2_ret, int *port_ret)
inet_csk_find_open_port(struct sock *sk, struct inet_bind_bucket **tb_ret, int *port_ret)
{
struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
struct inet_bind2_hashbucket *head2;
int port = 0;
struct inet_bind_hashbucket *head;
struct net *net = sock_net(sk);
bool relax = false;
int i, low, high, attempt_half;
struct inet_bind2_bucket *tb2;
struct inet_bind_bucket *tb;
u32 remaining, offset;
bool relax = false;
int port = 0;
int l3mdev;
l3mdev = inet_sk_bound_l3mdev(sk);
@ -326,12 +239,10 @@ other_parity_scan:
head = &hinfo->bhash[inet_bhashfn(net, port,
hinfo->bhash_size)];
spin_lock_bh(&head->lock);
tb2 = inet_bind2_bucket_find(hinfo, net, port, l3mdev, sk,
&head2);
inet_bind_bucket_for_each(tb, &head->chain)
if (check_bind_bucket_match(tb, net, port, l3mdev)) {
if (!inet_csk_bind_conflict(sk, port, tb, tb2,
relax, false))
if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev &&
tb->port == port) {
if (!inet_csk_bind_conflict(sk, tb, relax, false))
goto success;
goto next_port;
}
@ -361,8 +272,6 @@ next_port:
success:
*port_ret = port;
*tb_ret = tb;
*tb2_ret = tb2;
*head2_ret = head2;
return head;
}
@ -458,81 +367,54 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
{
bool reuse = sk->sk_reuse && sk->sk_state != TCP_LISTEN;
struct inet_hashinfo *hinfo = sk->sk_prot->h.hashinfo;
bool bhash_created = false, bhash2_created = false;
struct inet_bind2_bucket *tb2 = NULL;
struct inet_bind2_hashbucket *head2;
struct inet_bind_bucket *tb = NULL;
int ret = 1, port = snum;
struct inet_bind_hashbucket *head;
struct net *net = sock_net(sk);
int ret = 1, port = snum;
bool found_port = false;
struct inet_bind_bucket *tb = NULL;
int l3mdev;
l3mdev = inet_sk_bound_l3mdev(sk);
if (!port) {
head = inet_csk_find_open_port(sk, &tb, &tb2, &head2, &port);
head = inet_csk_find_open_port(sk, &tb, &port);
if (!head)
return ret;
if (tb && tb2)
goto success;
found_port = true;
} else {
head = &hinfo->bhash[inet_bhashfn(net, port,
hinfo->bhash_size)];
spin_lock_bh(&head->lock);
inet_bind_bucket_for_each(tb, &head->chain)
if (check_bind_bucket_match(tb, net, port, l3mdev))
break;
tb2 = inet_bind2_bucket_find(hinfo, net, port, l3mdev, sk,
&head2);
}
if (!tb) {
tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep, net,
head, port, l3mdev);
if (!tb)
goto fail_unlock;
bhash_created = true;
goto tb_not_found;
goto success;
}
if (!tb2) {
tb2 = inet_bind2_bucket_create(hinfo->bind2_bucket_cachep,
net, head2, port, l3mdev, sk);
if (!tb2)
goto fail_unlock;
bhash2_created = true;
}
/* If we had to find an open port, we already checked for conflicts */
if (!found_port && !hlist_empty(&tb->owners)) {
head = &hinfo->bhash[inet_bhashfn(net, port,
hinfo->bhash_size)];
spin_lock_bh(&head->lock);
inet_bind_bucket_for_each(tb, &head->chain)
if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev &&
tb->port == port)
goto tb_found;
tb_not_found:
tb = inet_bind_bucket_create(hinfo->bind_bucket_cachep,
net, head, port, l3mdev);
if (!tb)
goto fail_unlock;
tb_found:
if (!hlist_empty(&tb->owners)) {
if (sk->sk_reuse == SK_FORCE_REUSE)
goto success;
if ((tb->fastreuse > 0 && reuse) ||
sk_reuseport_match(tb, sk))
goto success;
if (inet_csk_bind_conflict(sk, port, tb, tb2, true, true))
if (inet_csk_bind_conflict(sk, tb, true, true))
goto fail_unlock;
}
success:
inet_csk_update_fastreuse(tb, sk);
if (!inet_csk(sk)->icsk_bind_hash)
inet_bind_hash(sk, tb, tb2, port);
inet_bind_hash(sk, tb, port);
WARN_ON(inet_csk(sk)->icsk_bind_hash != tb);
WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2);
ret = 0;
fail_unlock:
if (ret) {
if (bhash_created)
inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb);
if (bhash2_created)
inet_bind2_bucket_destroy(hinfo->bind2_bucket_cachep,
tb2);
}
spin_unlock_bh(&head->lock);
return ret;
}
@ -1079,7 +961,6 @@ struct sock *inet_csk_clone_lock(const struct sock *sk,
inet_sk_set_state(newsk, TCP_SYN_RECV);
newicsk->icsk_bind_hash = NULL;
newicsk->icsk_bind2_hash = NULL;
inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;

@ -81,41 +81,6 @@ struct inet_bind_bucket *inet_bind_bucket_create(struct kmem_cache *cachep,
return tb;
}
struct inet_bind2_bucket *inet_bind2_bucket_create(struct kmem_cache *cachep,
struct net *net,
struct inet_bind2_hashbucket *head,
const unsigned short port,
int l3mdev,
const struct sock *sk)
{
struct inet_bind2_bucket *tb = kmem_cache_alloc(cachep, GFP_ATOMIC);
if (tb) {
write_pnet(&tb->ib_net, net);
tb->l3mdev = l3mdev;
tb->port = port;
#if IS_ENABLED(CONFIG_IPV6)
if (sk->sk_family == AF_INET6)
tb->v6_rcv_saddr = sk->sk_v6_rcv_saddr;
else
#endif
tb->rcv_saddr = sk->sk_rcv_saddr;
INIT_HLIST_HEAD(&tb->owners);
hlist_add_head(&tb->node, &head->chain);
}
return tb;
}
static bool bind2_bucket_addr_match(struct inet_bind2_bucket *tb2, struct sock *sk)
{
#if IS_ENABLED(CONFIG_IPV6)
if (sk->sk_family == AF_INET6)
return ipv6_addr_equal(&tb2->v6_rcv_saddr,
&sk->sk_v6_rcv_saddr);
#endif
return tb2->rcv_saddr == sk->sk_rcv_saddr;
}
/*
* Caller must hold hashbucket lock for this tb with local BH disabled
*/
@ -127,25 +92,12 @@ void inet_bind_bucket_destroy(struct kmem_cache *cachep, struct inet_bind_bucket
}
}
/* Caller must hold the lock for the corresponding hashbucket in the bhash table
* with local BH disabled
*/
void inet_bind2_bucket_destroy(struct kmem_cache *cachep, struct inet_bind2_bucket *tb)
{
if (hlist_empty(&tb->owners)) {
__hlist_del(&tb->node);
kmem_cache_free(cachep, tb);
}
}
void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb,
struct inet_bind2_bucket *tb2, const unsigned short snum)
const unsigned short snum)
{
inet_sk(sk)->inet_num = snum;
sk_add_bind_node(sk, &tb->owners);
inet_csk(sk)->icsk_bind_hash = tb;
sk_add_bind2_node(sk, &tb2->owners);
inet_csk(sk)->icsk_bind2_hash = tb2;
}
/*
@ -157,7 +109,6 @@ static void __inet_put_port(struct sock *sk)
const int bhash = inet_bhashfn(sock_net(sk), inet_sk(sk)->inet_num,
hashinfo->bhash_size);
struct inet_bind_hashbucket *head = &hashinfo->bhash[bhash];
struct inet_bind2_bucket *tb2;
struct inet_bind_bucket *tb;
spin_lock(&head->lock);
@ -166,13 +117,6 @@ static void __inet_put_port(struct sock *sk)
inet_csk(sk)->icsk_bind_hash = NULL;
inet_sk(sk)->inet_num = 0;
inet_bind_bucket_destroy(hashinfo->bind_bucket_cachep, tb);
if (inet_csk(sk)->icsk_bind2_hash) {
tb2 = inet_csk(sk)->icsk_bind2_hash;
__sk_del_bind2_node(sk);
inet_csk(sk)->icsk_bind2_hash = NULL;
inet_bind2_bucket_destroy(hashinfo->bind2_bucket_cachep, tb2);
}
spin_unlock(&head->lock);
}
@ -189,19 +133,14 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child)
struct inet_hashinfo *table = sk->sk_prot->h.hashinfo;
unsigned short port = inet_sk(child)->inet_num;
const int bhash = inet_bhashfn(sock_net(sk), port,
table->bhash_size);
table->bhash_size);
struct inet_bind_hashbucket *head = &table->bhash[bhash];
struct inet_bind2_hashbucket *head_bhash2;
bool created_inet_bind_bucket = false;
struct net *net = sock_net(sk);
struct inet_bind2_bucket *tb2;
struct inet_bind_bucket *tb;
int l3mdev;
spin_lock(&head->lock);
tb = inet_csk(sk)->icsk_bind_hash;
tb2 = inet_csk(sk)->icsk_bind2_hash;
if (unlikely(!tb || !tb2)) {
if (unlikely(!tb)) {
spin_unlock(&head->lock);
return -ENOENT;
}
@@ -214,45 +153,25 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child)
 		 * as that of the child socket. We have to look up or
 		 * create a new bind bucket for the child here. */
 		inet_bind_bucket_for_each(tb, &head->chain) {
-			if (check_bind_bucket_match(tb, net, port, l3mdev))
+			if (net_eq(ib_net(tb), sock_net(sk)) &&
+			    tb->l3mdev == l3mdev && tb->port == port)
 				break;
 		}
 		if (!tb) {
 			tb = inet_bind_bucket_create(table->bind_bucket_cachep,
-						     net, head, port, l3mdev);
+						     sock_net(sk), head, port,
+						     l3mdev);
 			if (!tb) {
 				spin_unlock(&head->lock);
 				return -ENOMEM;
 			}
-			created_inet_bind_bucket = true;
 		}
 		inet_csk_update_fastreuse(tb, child);
-
-		goto bhash2_find;
-	} else if (!bind2_bucket_addr_match(tb2, child)) {
-		l3mdev = inet_sk_bound_l3mdev(sk);
-
-bhash2_find:
-		tb2 = inet_bind2_bucket_find(table, net, port, l3mdev, child,
-					     &head_bhash2);
-		if (!tb2) {
-			tb2 = inet_bind2_bucket_create(table->bind2_bucket_cachep,
-						       net, head_bhash2, port,
-						       l3mdev, child);
-			if (!tb2)
-				goto error;
-		}
 	}
-	inet_bind_hash(child, tb, tb2, port);
+	inet_bind_hash(child, tb, port);
 	spin_unlock(&head->lock);
 
 	return 0;
-
-error:
-	if (created_inet_bind_bucket)
-		inet_bind_bucket_destroy(table->bind_bucket_cachep, tb);
-	spin_unlock(&head->lock);
-	return -ENOMEM;
 }
 EXPORT_SYMBOL_GPL(__inet_inherit_port);
 
@@ -756,76 +675,6 @@ void inet_unhash(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(inet_unhash);
 
-static bool check_bind2_bucket_match(struct inet_bind2_bucket *tb,
-				     struct net *net, unsigned short port,
-				     int l3mdev, struct sock *sk)
-{
-#if IS_ENABLED(CONFIG_IPV6)
-	if (sk->sk_family == AF_INET6)
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev &&
-			ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
-	else
-#endif
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev && tb->rcv_saddr == sk->sk_rcv_saddr;
-}
-
-bool check_bind2_bucket_match_nulladdr(struct inet_bind2_bucket *tb,
-				       struct net *net, const unsigned short port,
-				       int l3mdev, const struct sock *sk)
-{
-#if IS_ENABLED(CONFIG_IPV6)
-	struct in6_addr nulladdr = {};
-
-	if (sk->sk_family == AF_INET6)
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev &&
-			ipv6_addr_equal(&tb->v6_rcv_saddr, &nulladdr);
-	else
-#endif
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev && tb->rcv_saddr == 0;
-}
-
-static struct inet_bind2_hashbucket *
-inet_bhashfn_portaddr(struct inet_hashinfo *hinfo, const struct sock *sk,
-		      const struct net *net, unsigned short port)
-{
-	u32 hash;
-
-#if IS_ENABLED(CONFIG_IPV6)
-	if (sk->sk_family == AF_INET6)
-		hash = ipv6_portaddr_hash(net, &sk->sk_v6_rcv_saddr, port);
-	else
-#endif
-		hash = ipv4_portaddr_hash(net, sk->sk_rcv_saddr, port);
-	return &hinfo->bhash2[hash & (hinfo->bhash_size - 1)];
-}
-
-/* This should only be called when the spinlock for the socket's corresponding
- * bind_hashbucket is held
- */
-struct inet_bind2_bucket *
-inet_bind2_bucket_find(struct inet_hashinfo *hinfo, struct net *net,
-		       const unsigned short port, int l3mdev, struct sock *sk,
-		       struct inet_bind2_hashbucket **head)
-{
-	struct inet_bind2_bucket *bhash2 = NULL;
-	struct inet_bind2_hashbucket *h;
-
-	h = inet_bhashfn_portaddr(hinfo, sk, net, port);
-	inet_bind_bucket_for_each(bhash2, &h->chain) {
-		if (check_bind2_bucket_match(bhash2, net, port, l3mdev, sk))
-			break;
-	}
-
-	if (head)
-		*head = h;
-
-	return bhash2;
-}
-
 /* RFC 6056 3.3.4. Algorithm 4: Double-Hash Port Selection Algorithm
  * Note that we use 32bit integers (vs RFC 'short integers')
  * because 2^16 is not a multiple of num_ephemeral and this
@@ -846,13 +695,10 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 {
 	struct inet_hashinfo *hinfo = death_row->hashinfo;
 	struct inet_timewait_sock *tw = NULL;
-	struct inet_bind2_hashbucket *head2;
 	struct inet_bind_hashbucket *head;
 	int port = inet_sk(sk)->inet_num;
 	struct net *net = sock_net(sk);
-	struct inet_bind2_bucket *tb2;
 	struct inet_bind_bucket *tb;
-	bool tb_created = false;
 	u32 remaining, offset;
 	int ret, i, low, high;
 	int l3mdev;
@@ -909,7 +755,8 @@ other_parity_scan:
 		 * the established check is already unique enough.
 		 */
 		inet_bind_bucket_for_each(tb, &head->chain) {
-			if (check_bind_bucket_match(tb, net, port, l3mdev)) {
+			if (net_eq(ib_net(tb), net) && tb->l3mdev == l3mdev &&
+			    tb->port == port) {
 				if (tb->fastreuse >= 0 ||
 				    tb->fastreuseport >= 0)
 					goto next_port;
@@ -927,7 +774,6 @@ other_parity_scan:
 			spin_unlock_bh(&head->lock);
 			return -ENOMEM;
 		}
-		tb_created = true;
 		tb->fastreuse = -1;
 		tb->fastreuseport = -1;
 		goto ok;
@@ -943,17 +789,6 @@ next_port:
 	return -EADDRNOTAVAIL;
 
 ok:
-	/* Find the corresponding tb2 bucket since we need to
-	 * add the socket to the bhash2 table as well
-	 */
-	tb2 = inet_bind2_bucket_find(hinfo, net, port, l3mdev, sk, &head2);
-	if (!tb2) {
-		tb2 = inet_bind2_bucket_create(hinfo->bind2_bucket_cachep, net,
-					       head2, port, l3mdev, sk);
-		if (!tb2)
-			goto error;
-	}
-
 	/* Here we want to add a little bit of randomness to the next source
 	 * port that will be chosen. We use a max() with a random here so that
 	 * on low contention the randomness is maximal and on high contention
@@ -963,7 +798,7 @@ ok:
 	WRITE_ONCE(table_perturb[index], READ_ONCE(table_perturb[index]) + i + 2);
 
 	/* Head lock still held and bh's disabled */
-	inet_bind_hash(sk, tb, tb2, port);
+	inet_bind_hash(sk, tb, port);
 	if (sk_unhashed(sk)) {
 		inet_sk(sk)->inet_sport = htons(port);
 		inet_ehash_nolisten(sk, (struct sock *)tw, NULL);
@@ -975,12 +810,6 @@ ok:
 		inet_twsk_deschedule_put(tw);
 	local_bh_enable();
 	return 0;
-
-error:
-	if (tb_created)
-		inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb);
-	spin_unlock_bh(&head->lock);
-	return -ENOMEM;
 }
 
 /*

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c

@@ -4604,12 +4604,6 @@ void __init tcp_init(void)
 				  SLAB_HWCACHE_ALIGN | SLAB_PANIC |
 				  SLAB_ACCOUNT,
 				  NULL);
-	tcp_hashinfo.bind2_bucket_cachep =
-		kmem_cache_create("tcp_bind2_bucket",
-				  sizeof(struct inet_bind2_bucket), 0,
-				  SLAB_HWCACHE_ALIGN | SLAB_PANIC |
-				  SLAB_ACCOUNT,
-				  NULL);
 
 	/* Size and allocate the main established and bind bucket
 	 * hash tables.
@@ -4632,9 +4626,8 @@ void __init tcp_init(void)
 	if (inet_ehash_locks_alloc(&tcp_hashinfo))
 		panic("TCP: failed to alloc ehash_locks");
 	tcp_hashinfo.bhash =
-		alloc_large_system_hash("TCP bind bhash tables",
-					sizeof(struct inet_bind_hashbucket) +
-					sizeof(struct inet_bind2_hashbucket),
+		alloc_large_system_hash("TCP bind",
+					sizeof(struct inet_bind_hashbucket),
 					tcp_hashinfo.ehash_mask + 1,
 					17, /* one slot per 128 KB of memory */
 					0,
@@ -4643,12 +4636,9 @@ void __init tcp_init(void)
 					0,
 					64 * 1024);
 	tcp_hashinfo.bhash_size = 1U << tcp_hashinfo.bhash_size;
-	tcp_hashinfo.bhash2 =
-		(struct inet_bind2_hashbucket *)(tcp_hashinfo.bhash + tcp_hashinfo.bhash_size);
 	for (i = 0; i < tcp_hashinfo.bhash_size; i++) {
 		spin_lock_init(&tcp_hashinfo.bhash[i].lock);
 		INIT_HLIST_HEAD(&tcp_hashinfo.bhash[i].chain);
-		INIT_HLIST_HEAD(&tcp_hashinfo.bhash2[i].chain);
 	}

diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore

@@ -37,4 +37,3 @@ gro
 ioam6_parser
 toeplitz
 cmsg_sender
-bind_bhash_test

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile

@@ -59,7 +59,6 @@ TEST_GEN_FILES += toeplitz
 TEST_GEN_FILES += cmsg_sender
 TEST_GEN_FILES += stress_reuseport_listen
 TEST_PROGS += test_vxlan_vnifiltering.sh
-TEST_GEN_FILES += bind_bhash_test
 
 TEST_FILES := settings
@@ -70,5 +69,4 @@ include bpf/Makefile
 $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
 $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread
-$(OUTPUT)/bind_bhash_test: LDLIBS += -lpthread
 $(OUTPUT)/tcp_inq: LDLIBS += -lpthread

diff --git a/tools/testing/selftests/net/bind_bhash_test.c b/tools/testing/selftests/net/bind_bhash_test.c
deleted file mode 100644
--- a/tools/testing/selftests/net/bind_bhash_test.c
+++ /dev/null

@@ -1,119 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * This times how long it takes to bind to a port when the port already
- * has multiple sockets in its bhash table.
- *
- * In the setup(), we populate the port's bhash table with
- * MAX_THREADS * MAX_CONNECTIONS number of entries.
- */
-
-#include <unistd.h>
-#include <stdio.h>
-#include <netdb.h>
-#include <pthread.h>
-
-#define MAX_THREADS 600
-#define MAX_CONNECTIONS 40
-
-static const char *bind_addr = "::1";
-static const char *port;
-
-static int fd_array[MAX_THREADS][MAX_CONNECTIONS];
-
-static int bind_socket(int opt, const char *addr)
-{
-	struct addrinfo *res, hint = {};
-	int sock_fd, reuse = 1, err;
-
-	sock_fd = socket(AF_INET6, SOCK_STREAM, 0);
-	if (sock_fd < 0) {
-		perror("socket fd err");
-		return -1;
-	}
-
-	hint.ai_family = AF_INET6;
-	hint.ai_socktype = SOCK_STREAM;
-
-	err = getaddrinfo(addr, port, &hint, &res);
-	if (err) {
-		perror("getaddrinfo failed");
-		return -1;
-	}
-
-	if (opt) {
-		err = setsockopt(sock_fd, SOL_SOCKET, opt, &reuse, sizeof(reuse));
-		if (err) {
-			perror("setsockopt failed");
-			return -1;
-		}
-	}
-
-	err = bind(sock_fd, res->ai_addr, res->ai_addrlen);
-	if (err) {
-		perror("failed to bind to port");
-		return -1;
-	}
-
-	return sock_fd;
-}
-
-static void *setup(void *arg)
-{
-	int sock_fd, i;
-	int *array = (int *)arg;
-
-	for (i = 0; i < MAX_CONNECTIONS; i++) {
-		sock_fd = bind_socket(SO_REUSEADDR | SO_REUSEPORT, bind_addr);
-		if (sock_fd < 0)
-			return NULL;
-		array[i] = sock_fd;
-	}
-
-	return NULL;
-}
-
-int main(int argc, const char *argv[])
-{
-	int listener_fd, sock_fd, i, j;
-	pthread_t tid[MAX_THREADS];
-	clock_t begin, end;
-
-	if (argc != 2) {
-		printf("Usage: listener <port>\n");
-		return -1;
-	}
-
-	port = argv[1];
-
-	listener_fd = bind_socket(SO_REUSEADDR | SO_REUSEPORT, bind_addr);
-	if (listen(listener_fd, 100) < 0) {
-		perror("listen failed");
-		return -1;
-	}
-
-	/* Set up threads to populate the bhash table entry for the port */
-	for (i = 0; i < MAX_THREADS; i++)
-		pthread_create(&tid[i], NULL, setup, fd_array[i]);
-
-	for (i = 0; i < MAX_THREADS; i++)
-		pthread_join(tid[i], NULL);
-
-	begin = clock();
-
-	/* Bind to the same port on a different address */
-	sock_fd = bind_socket(0, "2001:0db8:0:f101::1");
-
-	end = clock();
-
-	printf("time spent = %f\n", (double)(end - begin) / CLOCKS_PER_SEC);
-
-	/* clean up */
-	close(sock_fd);
-	close(listener_fd);
-
-	for (i = 0; i < MAX_THREADS; i++) {
-		for (j = 0; j < MAX_CONNECTIONS; j++)
-			close(fd_array[i][j]);
-	}
-
-	return 0;
-}
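
For reference, with the bhash2 table reverted, a bind bucket conflict is again detected purely on the (network namespace, l3mdev, port) triple, as the restored checks in the __inet_inherit_port and __inet_hash_connect hunks above show. The following stand-alone C sketch illustrates only that matching rule; it uses simplified stand-in types and names (the real struct inet_bind_bucket, ib_net() and net_eq() are kernel-internal and not reproduced here):

/* Illustrative sketch only -- simplified stand-ins for kernel types. */
#include <stdbool.h>
#include <stdio.h>

struct bind_bucket {          /* stand-in for struct inet_bind_bucket */
	int netns_id;         /* stand-in for the ib_net()/net_eq() check */
	int l3mdev;
	unsigned short port;
};

/* After the revert, a bucket matches when namespace, L3 master device,
 * and port all agree; the per-address (bhash2) dimension is gone.
 */
static bool bucket_matches(const struct bind_bucket *tb,
			   int netns_id, int l3mdev, unsigned short port)
{
	return tb->netns_id == netns_id &&
	       tb->l3mdev == l3mdev &&
	       tb->port == port;
}

int main(void)
{
	struct bind_bucket tb = { .netns_id = 1, .l3mdev = 0, .port = 8080 };

	/* Same namespace/l3mdev/port -> conflict candidate, regardless of
	 * which local address is bound -- the case the reverted bhash2
	 * table had tried to disambiguate. */
	printf("match: %d\n", bucket_matches(&tb, 1, 0, 8080));
	return 0;
}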