From aac2f1bf14d07c8f13048915f39df4a527350c9a Mon Sep 17 00:00:00 2001 From: Jacob Keller Date: Thu, 21 Aug 2014 06:17:59 +0000 Subject: [PATCH] ixgbe: limit combined total of macvlan and SR-IOV VFs Hardware has a limited number of pools available (64). Previously, no checks were in place to limit the number of accelerated macvlan devices based on the number of pools. Normally this would be ok, because there was already a limit for these well below the number of available pools. However, SR-IOV uses the very same pools. Therefor, we need to ensure that the total number of pools (number of VFs plus the number of non-VF pools in use for accelerated macvlans) does not exceed the number of pools available in hardware. This patch resolves a kernel NULL pointer dereference caused by the following commands: $modprobe ixgbe max_vfs=63 $ethtool -K eth2 l2-fwd-offload on $ip link add link eth2 macvlan0 type macvlan $ip link set dev macvlan0 up [ 992.950080] BUG: unable to handle kernel NULL pointer dereference at 0000000000000056 [ 992.951109] IP: [] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.951684] PGD 22a80e067 PUD 232e9b067 PMD 0 [ 992.952389] Oops: 0000 [#1] SMP [ 992.953014] Modules linked in: nfsd lockd nfs_acl exportfs auth_rpcgss oid_registry sunrpc bridge stp llc vhost_net macvtap macvlan vhost tun kvm_intel kvm ioatdma ixgbe mdio igb dca [ 992.956042] CPU: 2 PID: 11928 Comm: ifconfig Not tainted 3.16.0-rc6-net-next-07-29-2014-FCoE+ #1 [ 992.956915] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014 [ 992.957791] task: ffff8804341c0000 ti: ffff8801d7dc8000 task.ti: ffff8801d7dc8000 [ 992.958660] RIP: 0010:[] [] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.959613] RSP: 0018:ffff8801d7dcbbb8 EFLAGS: 00010286 [ 992.960093] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 [ 992.960575] RDX: ffff880232eb7000 RSI: 0000000000000000 RDI: ffff88022dc05800 [ 992.961059] RBP: ffff8801d7dcbbd8 R08: 0000000000000000 R09: 0000000000000000 [ 992.961541] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88022ec20980 [ 992.962023] R13: ffff880232eb7000 R14: 0000000000000001 R15: 0000000000000001 [ 992.962508] FS: 00007fab264887a0(0000) GS:ffff880237640000(0000) knlGS:0000000000000000 [ 992.963378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 992.963858] CR2: 0000000000000056 CR3: 000000022a939000 CR4: 00000000001427e0 [ 992.964340] Stack: [ 992.964806] ffff88022ec28840 ffff88022ec20980 ffff88022dc05800 ffff880232eb7000 [ 992.965976] ffff8801d7dcbc28 ffffffffa003bae8 ffff8801d7dcbbe8 0000000000000400 [ 992.967147] 000000000000000d ffff88022ec20980 ffff88022ec20000 ffff88022dc05800 [ 992.968319] Call Trace: [ 992.968795] [] ixgbe_fwd_ring_up+0x88/0x280 [ixgbe] [ 992.969284] [] ixgbe_fwd_add+0x173/0x220 [ixgbe] [ 992.969767] [] macvlan_open+0x1bc/0x230 [macvlan] [ 992.970256] [] __dev_open+0xd7/0x150 [ 992.970735] [] __dev_change_flags+0xa7/0x170 [ 992.971220] [] dev_change_flags+0x2b/0x70 [ 992.971703] [] devinet_ioctl+0x602/0x6d0 [ 992.972184] [] inet_ioctl+0x78/0x90 [ 992.972666] [] sock_do_ioctl+0x2b/0x70 [ 992.973146] [] sock_ioctl+0x6d/0x260 [ 992.973627] [] do_vfs_ioctl+0x84/0x540 [ 992.974109] [] ? final_putname+0x21/0x50 [ 992.974593] [] ? sysret_check+0x22/0x5d [ 992.975073] [] SyS_ioctl+0x91/0xa0 [ 992.975550] [] system_call_fastpath+0x16/0x1b [ 992.976026] Code: ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 f3 4c 89 6d f8 4c 8b a7 08 02 00 00 <44> 0f b6 6e 56 44 03 af 14 02 00 00 4c 89 e7 e8 5e f2 ff ff be [ 992.982261] RIP [] ixgbe_disable_fwd_ring+0x1e/0xf0 [ixgbe] [ 992.983212] RSP [ 992.983681] CR2: 0000000000000056 [ 992.984248] ---[ end trace 9f54802b5cc3638b ]--- Cc: John Fastabend Signed-off-by: Jacob Keller Tested-by: Phil Schmitt Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 8 ++++++++ drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 14 ++++++++------ 2 files changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index bc3eff7bbedc..5a3efd9f9de0 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -7840,9 +7840,17 @@ static void *ixgbe_fwd_add(struct net_device *pdev, struct net_device *vdev) { struct ixgbe_fwd_adapter *fwd_adapter = NULL; struct ixgbe_adapter *adapter = netdev_priv(pdev); + int used_pools = adapter->num_vfs + adapter->num_rx_pools; unsigned int limit; int pool, err; + /* Hardware has a limited number of available pools. Each VF, and the + * PF require a pool. Check to ensure we don't attempt to use more + * then the available number of pools. + */ + if (used_pools >= IXGBE_MAX_VF_FUNCTIONS) + return ERR_PTR(-EINVAL); + #ifdef CONFIG_RPS if (vdev->num_rx_queues != vdev->num_tx_queues) { netdev_info(pdev, "%s: Only supports a single queue count for TX and RX\n", diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c index c14d4d89672f..706fc69aa0c5 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c @@ -250,13 +250,15 @@ static int ixgbe_pci_sriov_enable(struct pci_dev *dev, int num_vfs) if (err) return err; - /* While the SR-IOV capability structure reports total VFs to be - * 64 we limit the actual number that can be allocated to 63 so - * that some transmit/receive resources can be reserved to the - * PF. The PCI bus driver already checks for other values out of - * range. + /* While the SR-IOV capability structure reports total VFs to be 64, + * we have to limit the actual number allocated based on two factors. + * First, we reserve some transmit/receive resources for the PF. + * Second, VMDQ also uses the same pools that SR-IOV does. We need to + * account for this, so that we don't accidentally allocate more VFs + * than we have available pools. The PCI bus driver already checks for + * other values out of range. */ - if (num_vfs > IXGBE_MAX_VFS_DRV_LIMIT) + if ((num_vfs + adapter->num_rx_pools) > IXGBE_MAX_VF_FUNCTIONS) return -EPERM; adapter->num_vfs = num_vfs;