linux

Author	SHA1	Message	Date
Paul Durrant	c2d09fde72	xen-netback: use hash value from the frontend My recent patch to include/xen/interface/io/netif.h defines a new extra info type that can be used to pass hash values between backend and guest frontend. This patch adds code to xen-netback to use the value in a hash extra info fragment passed from the guest frontend in a transmit-side (i.e. netback receive side) packet to set the skb hash accordingly. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
Paul Durrant	f07f989338	xen-netback: pass hash value to the frontend My recent patch to include/xen/interface/io/netif.h defines a new extra info type that can be used to pass hash values between backend and guest frontend. This patch adds code to xen-netback to pass hash values calculated for guest receive-side packets (i.e. netback transmit side) to the frontend. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
Paul Durrant	40d8abdee8	xen-netback: add control protocol implementation My recent patch to include/xen/interface/io/netif.h defines a new shared ring (in addition to the rx and tx rings) for passing control messages from a VM frontend driver to a backend driver. A previous patch added the necessary boilerplate for mapping the control ring from the frontend, should it be created. This patch adds implementations for each of the defined protocol messages. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
Paul Durrant	4e15ee2cb4	xen-netback: add control ring boilerplate My recent patch to include/xen/interface/io/netif.h defines a new shared ring (in addition to the rx and tx rings) for passing control messages from a VM frontend driver to a backend driver. This patch adds the necessary code to xen-netback to map this new shared ring, should it be created by a frontend, but does not add implementations for any of the defined protocol messages. These are added in a subsequent patch for clarity. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-16 13:35:56 -04:00
Paul Durrant	72eec92acc	xen-netback: fix extra_info handling in xenvif_tx_err() Patch `562abd39` "xen-netback: support multiple extra info fragments passed from frontend" contained a mistake which can result in an in- correct number of responses being generated when handling errors encountered when processing packets containing extra info fragments. This patch fixes the problem. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reported-by: Jan Beulich <JBeulich@suse.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-13 01:58:57 -04:00
Paul Durrant	562abd39a1	xen-netback: support multiple extra info fragments passed from frontend The code does not currently support a frontend passing multiple extra info fragments to the backend in a tx request. The xenvif_get_extras() function handles multiple extra_info fragments but make_tx_response() assumes there is only ever a single extra info fragment. This patch modifies xenvif_get_extras() to pass back a count of extra info fragments, which is then passed to make_tx_response() (after possibly being stashed in pending_tx_info for deferred responses). Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-13 22:08:01 -04:00
David Vrabel	99a2dea50d	xen-netback: use skb to determine number of required guest Rx requests Using the MTU or GSO size to determine the number of required guest Rx requests for an skb was subtly broken since these value may change at runtime. After `1650d5455b` (xen-netback: always fully coalesce guest Rx packets) we always fully pack a packet into its guest Rx slots. Calculating the number of required slots from the packet length is then easy. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-15 15:13:18 -05:00
David Vrabel	68a33bfd84	xen-netback: use RING_COPY_REQUEST() throughout Instead of open-coding memcpy()s and directly accessing Tx and Rx requests, use the new RING_COPY_REQUEST() that ensures the local copy is correct. This is more than is strictly necessary for guest Rx requests since only the id and gref fields are used and it is harmless if the frontend modifies these. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:00:28 -05:00
David Vrabel	0f589967a7	xen-netback: don't use last request to determine minimum Tx credit The last from guest transmitted request gives no indication about the minimum amount of credit that the guest might need to send a packet since the last packet might have been a small one. Instead allow for the worst case 128 KiB packet. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:00:23 -05:00
Julien Grall	d0089e8a0e	net/xen-netback: Make it running on 64KB page granularity The PV network protocol is using 4KB page granularity. The goal of this patch is to allow a Linux using 64KB page granularity working as a network backend on a non-modified Xen. It's only necessary to adapt the ring size and break skb data in small chunk of 4KB. The rest of the code is relying on the grant table code. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:41 +01:00
Julien Grall	a0f2e80fcd	net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop The skb doesn't change within the function. Therefore it's only necessary to check if we need GSO once at the beginning. Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:32 +01:00
Linus Torvalds	06ab838c20	xen: MFN/GFN/BFN terminology changes for 4.3-rc0 - Use the correct GFN/BFN terms more consistently. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJV8VRMAAoJEFxbo/MsZsTRiGQH/i/jrAJUJfrFC2PINaA2gDwe O0dlrkCiSgAYChGmxxxXZQSPM5Po5+EbT/dLjZ/uvSooeorM9RYY/mFo7ut/qLep 4pyQUuwGtebWGBZTrj9sygUVXVhgJnyoZxskNUbhj9zvP7hb9++IiI78mzne6cpj lCh/7Z2dgpfRcKlNRu+qpzP79Uc7OqIfDK+IZLrQKlXa7IQDJTQYoRjbKpfCtmMV BEG3kN9ESx5tLzYiAfxvaxVXl9WQFEoktqe9V8IgOQlVRLgJ2DQWS6vmraGrokWM 3HDOCHtRCXlPhu1Vnrp0R9OgqWbz8FJnmVAndXT8r3Nsjjmd0aLwhJx7YAReO/4= =JDia -----END PGP SIGNATURE----- Merge tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen terminology fixes from David Vrabel: "Use the correct GFN/BFN terms more consistently" * tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn xen/privcmd: Further s/MFN/GFN/ clean-up hvc/xen: Further s/MFN/GFN clean-up video/xen-fbfront: Further s/MFN/GFN clean-up xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn xen: Use correctly the Xen memory terminologies arm/xen: implement correctly pfn_to_mfn xen: Make clear that swiotlb and biomerge are dealing with DMA address	2015-09-10 16:21:11 -07:00
Wei Liu	4c82ac3c37	xen-netback: respect user provided max_queues Originally that parameter was always reset to num_online_cpus during module initialisation, which renders it useless. The fix is to only set max_queues to num_online_cpus when user has not provided a value. Reported-by: Johnny Strom <johnny.strom@linuxsolutions.fi> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-10 10:11:48 -07:00
David Vrabel	1d5d485239	xen-netback: require fewer guest Rx slots when not using GSO Commit `f48da8b14d` (xen-netback: fix unlimited guest Rx internal queue and carrier flapping) introduced a regression. The PV frontend in IPXE only places 4 requests on the guest Rx ring. Since netback required at least (MAX_SKB_FRAGS + 1) slots, IPXE could not receive any packets. a) If GSO is not enabled on the VIF, fewer guest Rx slots are required for the largest possible packet. Calculate the required slots based on the maximum GSO size or the MTU. This calculation of the number of required slots relies on `1650d5455b` (xen-netback: always fully coalesce guest Rx packets) which present in 4.0-rc1 and later. b) Reduce the Rx stall detection to checking for at least one available Rx request. This is fine since we're predominately concerned with detecting interfaces which are down and thus have zero available Rx requests. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-09 12:34:35 -07:00
Julien Grall	0df4f266b3	xen: Use correctly the Xen memory terminologies Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is meant, I suspect this is because the first support for Xen was for PV. This resulted in some misimplementation of helpers on ARM and confused developers about the expected behavior. For instance, with pfn_to_mfn, we expect to get an MFN based on the name. Although, if we look at the implementation on x86, it's returning a GFN. For clarity and avoid new confusion, replace any reference to mfn with gfn in any helpers used by PV drivers. The x86 code will still keep some reference of pfn_to_mfn which may be used by all kind of guests No changes as been made in the hypercall field, even though they may be invalid, in order to keep the same as the defintion in xen repo. Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a name to close to the KVM function gfn_to_page. Take also the opportunity to simplify simple construction such as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up will come in follow-up patches. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 18:03:49 +01:00
Paul Durrant	210c34dcd8	xen-netback: add support for multicast control Xen's PV network protocol includes messages to add/remove ethernet multicast addresses to/from a filter list in the backend. This allows the frontend to request the backend only forward multicast packets which are of interest thus preventing unnecessary noise on the shared ring. The canonical netif header in git://xenbits.xen.org/xen.git specifies the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal necessary changes have been pulled into include/xen/interface/io/netif.h. To prevent the frontend from extending the multicast filter list arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries. This limit is not specified by the protocol and so may change in future. If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD sent by the frontend will be failed with NETIF_RSP_ERROR. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-09-02 11:45:00 -07:00
Ross Lagerwall	57b229063a	xen/netback: Wake dealloc thread after completing zerocopy work Waking the dealloc thread before decrementing inflight_packets is racy because it means the thread may go to sleep before inflight_packets is decremented. If kthread_stop() has already been called, the dealloc thread may wait forever with nothing to wake it. Instead, wake the thread only after decrementing inflight_packets. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-06 23:42:48 -07:00
Ross Lagerwall	2475b22526	xen-netback: Allocate fraglist early to avoid complex rollback Determine if a fraglist is needed in the tx path, and allocate it if necessary before setting up the copy and map operations. Otherwise, undoing the copy and map operations is tricky. This fixes a use-after-free: if allocating the fraglist failed, the copy and map operations that had been set up were still executed, writing over the data area of a freed skb. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-03 22:23:03 -07:00
Dan Carpenter	50c2e4dd67	net/xen-netback: off by one in BUG_ON() condition The > should be >=. I also added spaces around the '-' operations so the code is a little more consistent and matches the condition better. Fixes: `f53c3fe8da` ('xen-netback: Introduce TX grant mapping') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-14 15:40:52 -07:00
Linus Torvalds	7adf12b87f	xen: features and cleanups for 4.2-rc0 - Add "make xenconfig" to assist in generating configs for Xen guests. - Preparatory cleanups necessary for supporting 64 KiB pages in ARM guests. - Automatically use hvc0 as the default console in ARM guests. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJVkpoqAAoJEFxbo/MsZsTRu3IH/2AMPx2i65hoSqfHtGf3sz/z XNfcidVmOElFVXGaW83m0tBWMemT5LpOGRfiq5sIo8xt/8xD2vozEkl/3kkf3RrX EmZDw3E8vmstBdBTjWdovVhNenRc0m0pB5daS7wUdo9cETq1ag1L3BHTB3fEBApO 74V6qAfnhnq+snqWhRD3XAk3LKI0nWuWaV+5HsmxDtnunGhuRLGVs7mwxZGg56sM mILA0eApGPdwyVVpuDe0SwV52V8E/iuVOWTcomGEN2+cRWffG5+QpHxQA8bOtF6O KfqldiNXOY/idM+5+oSm9hespmdWbyzsFqmTYz0LvQvxE8eEZtHHB3gIcHkE8QU= =danz -----END PGP SIGNATURE----- Merge tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Xen features and cleanups for 4.2-rc0: - add "make xenconfig" to assist in generating configs for Xen guests - preparatory cleanups necessary for supporting 64 KiB pages in ARM guests - automatically use hvc0 as the default console in ARM guests" * tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: block/xen-blkback: s/nr_pages/nr_segs/ block/xen-blkfront: Remove invalid comment block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS arm/xen: Drop duplicate define mfn_to_virt xen/grant-table: Remove unused macro SPP xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring xen: Include xen/page.h rather than asm/xen/page.h kconfig: add xenconfig defconfig helper kconfig: clarify kvmconfig is for kvm xen/pcifront: Remove usage of struct timeval xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON() hvc_xen: avoid uninitialized variable warning xenbus: avoid uninitialized variable warning xen/arm: allow console=hvc0 to be omitted for guests arm,arm64/xen: move Xen initialization earlier arm/xen: Correctly check if the event channel interrupt is present	2015-07-01 11:53:46 -07:00
Julien Grall	68946159da	net/xen-netback: Don't mix hexa and decimal with 0x in the printf format Append 0x to all %x in order to avoid while reading when there is other decimal value in the log. Also replace some of the hexadecimal print to decimal to uniformize the format with netfront. Signed-off-by: Julien Grall <julien.grall@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: netdev@vger.kernel.org Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-21 09:40:41 -07:00
Julien Grall	44f0764cfe	net/xen-netback: Remove unused code in xenvif_rx_action The variables old_req_cons and ring_slots_used are assigned but never used since commit `1650d5455b` "xen-netback: always fully coalesce guest Rx packets". Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-21 09:40:40 -07:00
Julien Grall	a9fd60e268	xen: Include xen/page.h rather than asm/xen/page.h Using xen/page.h will be necessary later for using common xen page helpers. As xen/page.h already include asm/xen/page.h, always use the later. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: netdev@vger.kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-06-17 16:14:18 +01:00
David S. Miller	dda922c831	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/amd-xgbe-phy.c drivers/net/wireless/iwlwifi/Kconfig include/net/mac80211.h iwlwifi/Kconfig and mac80211.h were both trivial overlapping changes. The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and the bug fix that happened on the 'net' side is already integrated into the rest of the amd-xgbe driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 22:51:30 -07:00
Ian Campbell	dc5e7a811d	xen: netback: fix printf format string warning drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’: drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=] (txreq.offset&~PAGE_MASK) + txreq.size); ^ PAGE_MASK's type can vary by arch, so a cast is needed. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> ---- v2: Cast to unsigned long, since PAGE_MASK can vary by arch. Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 12:03:04 -07:00
Shailendra Verma	c489dbb189	net:xen-netback - Change 1 to true for bool type variable. The variable separate_tx_rx_irq is bool type so assigning true instead of 1. Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-25 16:34:35 -04:00
Linus Torvalds	497a5df7bf	xen: features and fixes for 4.1-rc0 - Use a single source list of hypercalls, generating other tables etc. at build time. - Add a "Xen PV" APIC driver to support >255 VCPUs in PV guests. - Significant performance improve to guest save/restore/migration. - scsiback/front save/restore support. - Infrastructure for multi-page xenbus rings. - Misc fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJVL6OZAAoJEFxbo/MsZsTRs3YH/2AycBHs129DNnc2OLmOklBz AdD43k+FOfZlv0YU80WmPVmOpGGHGB5Pqkix2KtnvPYmtx3pb/5ikhDwSTWZpqBl Qq6/RgsRjYZ8VMKqrMTkJMrJWHQYbg8lgsP5810nsFBn/Qdbxms+WBqpMkFVo3b2 rvUZj8QijMJPS3qr55DklVaOlXV4+sTAytTdCiubVnaB/agM2jjRflp/lnJrhtTg yc4NTrIlD1RsMV/lNh92upBP/pCm6Bs0zQ2H1v3hkdhBBmaO0IVXpSheYhfDOHfo 9v209n137N7X86CGWImFk6m2b+EfiFnLFir07zKSA+iZwkYKn75znSdPfj0KCc0= =bxTm -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen features and fixes from David Vrabel: - use a single source list of hypercalls, generating other tables etc. at build time. - add a "Xen PV" APIC driver to support >255 VCPUs in PV guests. - significant performance improve to guest save/restore/migration. - scsiback/front save/restore support. - infrastructure for multi-page xenbus rings. - misc fixes. * tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pci: Try harder to get PXM information for Xen xenbus_client: Extend interface to support multi-page ring xen-pciback: also support disabling of bus-mastering and memory-write-invalidate xen: support suspend/resume in pvscsi frontend xen: scsiback: add LUN of restored domain xen-scsiback: define a pr_fmt macro with xen-pvscsi xen/mce: fix up xen_late_init_mcelog() error handling xen/privcmd: improve performance of MMAPBATCH_V2 xen: unify foreign GFN map/unmap for auto-xlated physmap guests x86/xen/apic: WARN with details. x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs xen/pciback: Don't print scary messages when unsupported by hypervisor. xen: use generated hypercall symbols in arch/x86/xen/xen-head.S xen: use generated hypervisor symbols in arch/x86/xen/trace.c xen: synchronize include/xen/interface/xen.h with xen xen: build infrastructure for generating hypercall depending symbols xen: balloon: Use static attribute groups for sysfs entries xen: pcpu: Use static attribute groups for sysfs entry	2015-04-16 14:01:03 -05:00
Wei Liu	ccc9d90a9a	xenbus_client: Extend interface to support multi-page ring Originally Xen PV drivers only use single-page ring to pass along information. This might limit the throughput between frontend and backend. The patch extends Xenbus driver to support multi-page ring, which in general should improve throughput if ring is the bottleneck. Changes to various frontend / backend to adapt to the new interface are also included. Affected Xen drivers: * blkfront/back * netfront/back * pcifront/back * scsifront/back * vtpmfront The interface is documented, as before, in xenbus_client.c. Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Cc: Konrad Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-04-15 10:56:47 +01:00
David S. Miller	0fa74a4be4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 18:51:09 -04:00
Palik, Imre	edafc132ba	xen-netback: making the bandwidth limiter runtime settable With the current netback, the bandwidth limiter's parameters are only settable during vif setup time. This patch register a watch on them, and thus makes them runtime changeable. When the watch fires, the timer is reset. The timer's mutex is used for fencing the change. Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:55:15 -04:00
David Vrabel	c8a4d29988	xen-netback: notify immediately after pushing Tx response. This fixes a performance regression introduced by `7fbb9d8415` (xen-netback: release pending index before pushing Tx responses) Moving the notify outside of the spin locks means it can be delayed a long time (if the dealloc thread is descheduled or there is an interrupt or softirq). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Zoltan Kiss <zoltan.kiss@linaro.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:30:44 -04:00
David Vrabel	b0c21badf1	xen-netback: refactor xenvif_handle_frag_list() When handling a from-guest frag list, xenvif_handle_frag_list() replaces the frags before calling the destructor to clean up the original (foreign) frags. Whilst this is safe (the destructor doesn't actually use the frags), it looks odd. Reorder the function to be less confusing. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 14:58:18 -05:00
David Vrabel	49d9991a18	xen-netback: unref frags when handling a from-guest skb with a frag list Every time a VIF is destroyed up to 256 pages may be leaked if packets with more than MAX_SKB_FRAGS frags were transmitted from the guest. Even worse, if another user of ballooned pages allocated one of these ballooned pages it would not handle the unexpectedly >1 page count (e.g., gntdev would deadlock when unmapping a grant because the page count would never reach 1). When handling a from-guest skb with a frag list, unref the frags before releasing them so they are freed correctly when the VIF is destroyed. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-05 14:58:17 -05:00
David Vrabel	7fbb9d8415	xen-netback: release pending index before pushing Tx responses If the pending indexes are released /after/ pushing the Tx response then a stale pending index may be used if a new Tx request is immediately pushed by the frontend. The may cause various WARNINGs or BUGs if the stale pending index is actually still in use. Fix this by releasing the pending index before pushing the Tx response. The full barrier for the pending ring update is not required since the the Tx response push already has a suitable write barrier. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-24 16:24:22 -05:00
Linus Torvalds	c5ce28df0e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) More iov_iter conversion work from Al Viro. [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was wrong, and this pull actually adds an extra commit on top of the branch I'm pulling to fix that up, so that the pre-merge state is ok. - Linus ] 2) Various optimizations to the ipv4 forwarding information base trie lookup implementation. From Alexander Duyck. 3) Remove sock_iocb altogether, from CHristoph Hellwig. 4) Allow congestion control algorithm selection via routing metrics. From Daniel Borkmann. 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet. 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet. 7) Add xmit_more support to r8169, e1000, and e1000e drivers. From Florian Westphal. 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross. 9) Add BPF packet actions to packet scheduler, from Jiri Pirko. 10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer. 11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman Kwok. 12) More sanely handle out-of-window dupacks, which can result in serious ACK storms. From Neal Cardwell. 13) Various rhashtable bug fixes and enhancements, from Herbert Xu, Patrick McHardy, and Thomas Graf. 14) Support xmit_more in be2net, from Sathya Perla. 15) Group Policy extensions for vxlan, from Thomas Graf. 16) Remove Checksum Offload support for vxlan, from Tom Herbert. 17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From Vlad Yasevich. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits) crypto: fix af_alg_make_sg() conversion to iov_iter ipv4: Namespecify TCP PMTU mechanism i40e: Fix for stats init function call in Rx setup tcp: don't include Fast Open option in SYN-ACK on pure SYN-data openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set ipv6: Make __ipv6_select_ident static ipv6: Fix fragment id assignment on LE arches. bridge: Fix inability to add non-vlan fdb entry net: Mellanox: Delete unnecessary checks before the function call "vunmap" cxgb4: Add support in cxgb4 to get expansion rom version via ethtool ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version net: dsa: Remove redundant phy_attach() IB/mlx4: Reset flow support for IB kernel ULPs IB/mlx4: Always use the correct port for mirrored multicast attachments net/bonding: Fix potential bad memory access during bonding events tipc: remove tipc_snprintf tipc: nl compat add noop and remove legacy nl framework tipc: convert legacy nl stats show to nl compat tipc: convert legacy nl net id get to nl compat tipc: convert legacy nl net id set to nl compat ...	2015-02-10 20:01:30 -08:00
Linus Torvalds	bdccc4edeb	xen: features and fixes for 3.20-rc0 - Reworked handling for foreign (grant mapped) pages to simplify the code, enable a number of additional use cases and fix a number of long-standing bugs. - Prefer the TSC over the Xen PV clock when dom0 (and the TSC is stable). - Assorted other cleanup and minor bug fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJU2JC+AAoJEFxbo/MsZsTRIvAH/1lgQ0EQlxaZtEFWY8cJBzxY dXaTMfyGQOddGYDCW0r42hhXJHeX7DWXSERSD3aW9DZOn/eYdneHq9gWRD4uPrGn hEFQ26J4jZWR5riGXaja0LqI2gJKLZ6BhHIQciLEbY+jw4ynkNBLNRPFehuwrCsZ WdBwJkyvXC3RErekncRl/aNhxdi4p1P6qeiaW/mo3UcSO/CFSKybOLwT65iePazg XuY9UiTn2+qcRkm/tjx8K9heHK8SBEGNWuoTcWYF1to8mwwUfKIAc4NO2UBDXJI+ rp7Z2lVFdII15JsQ08ATh3t7xDrMWLzCX/y4jCzmF3DBXLbSWdHCQMgI7TWt5pE= =PyJK -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.20-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen features and fixes from David Vrabel: - Reworked handling for foreign (grant mapped) pages to simplify the code, enable a number of additional use cases and fix a number of long-standing bugs. - Prefer the TSC over the Xen PV clock when dom0 (and the TSC is stable). - Assorted other cleanup and minor bug fixes. * tag 'stable/for-linus-3.20-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (25 commits) xen/manage: Fix USB interaction issues when resuming xenbus: Add proper handling of XS_ERROR from Xenbus for transactions. xen/gntdev: provide find_special_page VMA operation xen/gntdev: mark userspace PTEs as special on x86 PV guests xen-blkback: safely unmap grants in case they are still in use xen/gntdev: safely unmap grants in case they are still in use xen/gntdev: convert priv->lock to a mutex xen/grant-table: add a mechanism to safely unmap pages that are in use xen-netback: use foreign page information from the pages themselves xen: mark grant mapped pages as foreign xen/grant-table: add helpers for allocating pages x86/xen: require ballooned pages for grant maps xen: remove scratch frames for ballooned pages and m2p override xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() mm: add 'foreign' alias for the 'pinned' page flag mm: provide a find_special_page vma operation x86/xen: cleanup arch/x86/xen/mmu.c x86/xen: add some __init annotations in arch/x86/xen/mmu.c x86/xen: add some __init and static annotations in arch/x86/xen/setup.c x86/xen: use correct types for addresses in arch/x86/xen/setup.c ...	2015-02-10 13:56:56 -08:00
David S. Miller	6e03f896b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 14:33:28 -08:00
David Vrabel	42b5212fee	xen-netback: stop the guest rx thread after a fatal error After commit `e9d8b2c296` (xen-netback: disable rogue vif in kthread context), a fatal (protocol) error would leave the guest Rx thread spinning, wasting CPU time. Commit `ecf08d2dbb` (xen-netback: reintroduce guest Rx stall detection) made this even worse by removing a cond_resched() from this path. Since a fatal error is non-recoverable, just allow the guest Rx thread to exit. This requires taking additional refs to the task so the thread exiting early is handled safely. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Julien Grall <julien.grall@linaro.org> Tested-by: Julien Grall <julien.grall@linaro.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:39:04 -08:00
Jennifer Herbert	c2677a6fc4	xen-netback: use foreign page information from the pages themselves Use the foreign page flag in netback to get the domid and grant ref needed for the grant copy. This signficiantly simplifies the netback code and makes netback work with foreign pages from other backends (e.g., blkback). This allows blkback to use iSCSI disks provided by domUs running on the same host. Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 14:03:13 +00:00
Jennifer Herbert	0ae65f49af	x86/xen: require ballooned pages for grant maps Ballooned pages are always used for grant maps which means the original frame does not need to be saved in page->index nor restored after the grant unmap. This allows the workaround in netback for the conflicting use of the (unionized) page->index and page->pfmemalloc to be removed. Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 14:03:11 +00:00
David Vrabel	1650d5455b	xen-netback: always fully coalesce guest Rx packets Always fully coalesce guest Rx packets into the minimum number of ring slots. Reducing the number of slots per packet has significant performance benefits when receiving off-host traffic. Results from XenServer's performance benchmarks: Baseline Full coalesce Interhost VM receive 7.2 Gb/s 11 Gb/s Interhost aggregate 24 Gb/s 24 Gb/s Intrahost single stream 14 Gb/s 14 Gb/s Intrahost aggregate 34 Gb/s 34 Gb/s However, this can increase the number of grant ops per packet which decreases performance of backend (dom0) to VM traffic (by ~10%) /unless/ grant copy has been optimized for adjacent ops with the same source or destination (see "grant-table: defer releasing pages acquired in a grant copy"[1] expected in Xen 4.6). [1] http://lists.xen.org/archives/html/xen-devel/2015-01/msg01118.html Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-23 18:01:58 -08:00
David Vrabel	26c0e10258	xen-netback: support frontends without feature-rx-notify again Commit `bc96f648df` (xen-netback: make feature-rx-notify mandatory) incorrectly assumed that there were no frontends in use that did not support this feature. But the frontend driver in MiniOS does not and since this is used by (qemu) stubdoms, these stopped working. Netback sort of works as-is in this mode except: - If there are no Rx requests and the internal Rx queue fills, only the drain timeout will wake the thread. The default drain timeout of 10 s would give unacceptable pauses. - If an Rx stall was detected and the internal Rx queue is drained, then the Rx thread would never wake. Handle these two cases (when feature-rx-notify is disabled) by: - Reducing the drain timeout to 30 ms. - Disabling Rx stall detection. Reported-by: John <jw@nuclearfallout.net> Tested-by: John <jw@nuclearfallout.net> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-18 12:49:49 -05:00
Malcolm Crossley	7e5d775395	xen-netback: remove unconditional __pskb_pull_tail() in guest Tx path Unconditionally pulling 128 bytes into the linear area is not required for: - security: Every protocol demux starts with pskb_may_pull() to pull frag data into the linear area, if necessary, before looking at headers. - performance: Netback has already grant copied up-to 128 bytes from the first slot of a packet into the linear area. The first slot normally contain all the IPv4/IPv6 and TCP/UDP headers. The unconditional pull would often copy frag data unnecessarily. This is a performance problem when running on a version of Xen where grant unmap avoids TLB flushes for pages which are not accessed. TLB flushes can now be avoided for > 99% of unmaps (it was 0% before). Grant unmap TLB flush avoidance will be available in a future version of Xen (probably 4.6). Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 14:40:18 -05:00
David S. Miller	55b42b5ca2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/marvell.c Simple overlapping changes in drivers/net/phy/marvell.c Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-01 14:53:27 -04:00
Zoltan Kiss	44cc8ed17e	xen-netback: Remove __GFP_COLD This flag is unnecessary, it came from some old code. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 15:59:37 -04:00
David Vrabel	ecf08d2dbb	xen-netback: reintroduce guest Rx stall detection If a frontend not receiving packets it is useful to detect this and turn off the carrier so packets are dropped early instead of being queued and drained when they expire. A to-guest queue is stalled if it doesn't have enough free slots for a an extended period of time (default 60 s). If at least one queue is stalled, the carrier is turned off (in the expectation that the other queues will soon stall as well). The carrier is only turned on once all queues are ready. When the frontend connects, all the queues start in the stalled state and only become ready once the frontend queues enough Rx requests. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 14:15:20 -04:00
David Vrabel	f48da8b14d	xen-netback: fix unlimited guest Rx internal queue and carrier flapping Netback needs to discard old to-guest skb's (guest Rx queue drain) and it needs detect guest Rx stalls (to disable the carrier so packets are discarded earlier), but the current implementation is very broken. 1. The check in hard_start_xmit of the slot availability did not consider the number of packets that were already in the guest Rx queue. This could allow the queue to grow without bound. The guest stops consuming packets and the ring was allowed to fill leaving S slot free. Netback queues a packet requiring more than S slots (ensuring that the ring stays with S slots free). Netback queue indefinately packets provided that then require S or fewer slots. 2. The Rx stall detection is not triggered in this case since the (host) Tx queue is not stopped. 3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback will consider this an Rx purge event which may result in it taking the carrier down unnecessarily. It also considers a queue with only 1 slot free as unstalled (even though the next packet might not fit in this). The internal guest Rx queue is limited by a byte length (to 512 Kib, enough for half the ring). The (host) Tx queue is stopped and started based on this limit. This sets an upper bound on the amount of memory used by packets on the internal queue. This allows the estimatation of the number of slots for an skb to be removed (it wasn't a very good estimate anyway). Instead, the guest Rx thread just waits for enough free slots for a maximum sized packet. skbs queued on the internal queue have an 'expires' time (set to the current time plus the drain timeout). The guest Rx thread will detect when the skb at the head of the queue has expired and discard expired skbs. This sets a clear upper bound on the length of time an skb can be queued for. For a guest being destroyed the maximum time needed to wait for all the packets it sent to be dropped is still the drain timeout (10 s) since it will not be sending new packets. Rx stall detection is reintroduced in a later commit. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 14:15:20 -04:00
Wei Liu	a64bd93452	xen-netback: don't stop dealloc kthread too early Reference count the number of packets in host stack, so that we don't stop the deallocation thread too early. If not, we can end up with xenvif_free permanently waiting for deallocation thread to unmap grefs. Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-13 20:07:44 -07:00
Zoltan Kiss	743b0a92b9	xen-netback: Fix vif->disable handling In the patch called "xen-netback: Turn off the carrier if the guest is not able to receive" new branches were introduced to this if statement, risking that a queue with non-zero id can reenable the disabled interface. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-07 16:02:56 -07:00
Zoltan Kiss	f34a4cf9c9	xen-netback: Turn off the carrier if the guest is not able to receive Currently when the guest is not able to receive more packets, qdisc layer starts a timer, and when it goes off, qdisc is started again to deliver a packet again. This is a very slow way to drain the queues, consumes unnecessary resources and slows down other guests shutdown. This patch change the behaviour by turning the carrier off when that timer fires, so all the packets are freed up which were stucked waiting for that vif. Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to signal the thread that either the timeout happened or an RX interrupt arrived, so the thread can check what it should do. It also disables NAPI, so the guest can't transmit, but leaves the interrupts on, so it can resurrect. Only the queues which brought down the interface can enable it again, the bit QUEUE_STATUS_RX_STALLED makes sure of that. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-05 16:04:46 -07:00

1 2 3 4

158 Commits