linux

mirror of https://github.com/torvalds/linux.git synced 2024-10-31 17:21:49 +00:00

Author	SHA1	Message	Date
Herbert Xu	d23e43658a	tun: Fix device unregister race It is currently possible for an asynchronous device unregister to cause the same tun device to be unregistered twice. This is because the unregister in tun_chr_close only checks whether __tun_get(tfile) != NULL. This however has nothing to do with whether the device has already been unregistered. All it tells you is whether __tun_detach has been called. This patch fixes this by using the most obvious thing to test whether the device has been unregistered. It also moves __tun_detach outside of rtnl_unlock since nothing that it does requires that lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-05 18:03:18 -07:00
Kay Sievers	d405640539	Driver Core: misc: add nodename support for misc devices. This adds support for misc devices to report their requested nodename to userspace. It also updates a number of misc drivers to provide the needed subdirectory and device name to be used for them. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-06-15 21:30:25 -07:00
Eric W. Biederman	f0a4d0e5b5	tun: Fix unregister race It is possible for tun_chr_close to race with dellink on the a tun device. In which case if __tun_get runs before dellink but dellink runs before tun_chr_close calls unregister_netdevice we will attempt to unregister the netdevice after it is already gone. The two cases are already serialized on the rtnl_lock, so I have gone for the cheap simple fix of moving rtnl_lock to cover __tun_get in tun_chr_close. Eliminating the possibility of the tun device being unregistered between __tun_get and unregister_netdevice in tun_chr_close. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Tested-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-08 00:44:31 -07:00
Sridhar Samudrala	6f536f4039	tun: Fix copy/paste error in tun_get_user Use the right structure while incrementing the offset in tun_get_user. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-08 00:27:28 -07:00
Herbert Xu	4909122fb8	tun: Optimise handling of bogus gso->hdr_len As all current versions of virtio_net generate a value for the header length that's too small, we should optimise this so that we don't copy it twice. This can be done by ensuring that it is at least as large as the place where we'll write the checksum. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-08 00:20:01 -07:00
Herbert Xu	c722c625db	tun: Only wake up writers When I added socket accounting to tun I inadvertently introduced spurious wake-up events that kills qemu performance. The problem occurs when qemu polls on the tun fd for read, and then transmits packets. For each packet transmitted, we will wake up qemu even if it only cares about read events. Now this affects all sockets, but it is only a new problem for tun. So this patch tries to fix it for tun first and we can then look at the problem in general. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 21:45:55 -07:00
David Woodhouse	980c9e8cee	tun: add tun_flags, owner, group attributes in sysfs This patch adds three attribute files in /sys/class/net/$dev/ for tun devices; allowing userspace to obtain the information which TUNGETIFF offers, and more, but without having to attach to the device in question (which may not be possible if it's in use). It also fixes a bug which has been present in the TUNGETIFF ioctl since its inception, where it would never set IFF_TUN or IFF_TAP according to the device type. (Look carefully at the code which I remove from tun_get_iff() and how the new tun_flags() helper is subtly different). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-09 22:54:21 -07:00
David Woodhouse	f85ba78068	tun: add IFF_TUN_EXCL flag to avoid opening a persistent device. When creating a certain types of VPN, NetworkManager will first attempt to find an available tun device by iterating through 'vpn%d' until it finds one that isn't already busy. Then it'll set that to be persistent and owned by the otherwise unprivileged user that the VPN dæmon itself runs as. There's a race condition here -- during the period where the vpn%d device is created and we're waiting for the VPN dæmon to actually connect and use it, if we try to create _another_ device we could end up re-using the same one -- because trying to open it again doesn't get -EBUSY as it would while it's _actually_ busy. So solve this, we add an IFF_TUN_EXCL flag which causes tun_set_iff() to fail if it would be opening an existing persistent tundevice -- so that we can make sure we're getting an entirely _new_ device. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-27 03:23:54 -07:00
Michael S. Tsirkin	6f26c9a755	tun: fix tun_chr_aio_write so that aio works aio_write gets const struct iovec * but tun_chr_aio_write casts this to struct iovec * and modifies the iovec. As a result, attempts to use io_submit to send packets to a tun device fail with weird errors such as EINVAL. Since tun is the only user of skb_copy_datagram_from_iovec, we can fix this simply by changing the later so that it does not touch the iovec passed to it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-21 05:42:46 -07:00
Michael S. Tsirkin	43b39dcdbd	tun: fix tun_chr_aio_read so that aio works aio_read gets const struct iovec * but tun_chr_aio_read casts this to struct iovec * and modifies the iovec. As a result, attempts to use io_submit to get packets from a tun device fail with weird errors such as EINVAL. Fix by using the new skb_copy_datagram_const_iovec. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-21 05:42:45 -07:00
Herbert Xu	c40af84a67	tun: Fix sk_sleep races when attaching/detaching As the sk_sleep wait queue actually lives in tfile, which may be detached from the tun device, bad things will happen when we use sk_sleep after detaching. Since the tun device is the persistent data structure here (when requested by the user), it makes much more sense to have the wait queue live there. There is no reason to have it in tfile at all since the only time we can wait is if we have a tun attached. In fact we already have a wait queue in tun_struct, so we might as well use it. Reported-by: Eric W. Biederman <ebiederm@xmission.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-20 03:01:48 -07:00
Herbert Xu	9c3fea6ab0	tun: Only free a netdev when all tun descriptors are closed The commit `c70f182940` ("tun: Fix races between tun_net_close and free_netdev") fixed a race where an asynchronous deletion of a tun device can hose a poll(2) on a tun fd attached to that device. However, this came at the cost of moving the tun wait queue into the tun file data structure. The problem with this is that it imposes restrictions on when and where the tun device can access the wait queue since the tun file may change at any time due to detaching and reattaching. In particular, now that we need to use the wait queue on the receive path it becomes difficult to properly synchronise this with the detachment of the tun device. This patch solves the original race in a different way. Since the race is only because the underlying memory gets freed, we can prevent it simply by ensuring that we don't do that until all tun descriptors ever attached to the device (even if they have since be detached because they may still be sitting in poll) have been closed. This is done by using reference counting the attached tun file descriptors. The refcount in tun->sk has been reappropriated for this purpose since it was already being used for that, albeit from the opposite angle. Note that we no longer zero tfile->tun since tun_get will return NULL anyway after the refcount on tfile hits zero. Instead it represents whether this device has ever been attached to a device. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-20 03:01:47 -07:00
Herbert Xu	0eca93bcf7	tun: Fix crash with non-GSO users When I made the tun driver use non-linear packets as the preferred option, it broke non-GSO users because they would end up allocating a completely non-linear packet, which triggers a crash when we call eth_type_trans. This patch reverts non-GSO users to using linear packets and adds a check to ensure that GSO users can't cause crashes in the same way. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-14 02:09:43 -07:00
Herbert Xu	ab46d77966	tun: Fix merge error When forward-porting the tun accounting patch I managed to break the send path compltely by dropping the tun_get call. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-14 20:46:39 -08:00
David S. Miller	0ecc103aec	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/gianfar.c	2009-02-09 23:22:21 -08:00
Alex Williamson	cfbf84fcbc	tun: Fix unicast filter overflow Tap devices can make use of a small MAC filter set via the TUNSETTXFILTER ioctl. The filter has a set of exact matches plus a hash for imperfect filtering of additional multicast addresses. The current code is unbalanced, adding unicast addresses to the multicast hash, but only checking the hash against multicast addresses. This results in the filter dropping unicast addresses that overflow the exact filter. The fix is simply to disable the filter by leaving count set to zero if we find non-multicast addresses after the exact match table is filled. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-08 17:49:17 -08:00
Herbert Xu	33dccbb050	tun: Limit amount of queued packets per device Unlike a normal socket path, the tuntap device send path does not have any accounting. This means that the user-space sender may be able to pin down arbitrary amounts of kernel memory by continuing to send data to an end-point that is congested. Even when this isn't an issue because of limited queueing at most end points, this can also be a problem because its only response to congestion is packet loss. That is, when those local queues at the end-point fills up, the tuntap device will start wasting system time because it will continue to send data there which simply gets dropped straight away. Of course one could argue that everybody should do congestion control end-to-end, unfortunately there are people in this world still hooked on UDP, and they don't appear to be going away anywhere fast. In fact, we've always helped them by performing accounting in our UDP code, the sole purpose of which is to provide congestion feedback other than through packet loss. This patch attempts to apply the same bandaid to the tuntap device. It creates a pseudo-socket object which is used to account our packets just as a normal socket does for UDP. Of course things are a little complex because we're actually reinjecting traffic back into the stack rather than out of the stack. The stack complexities however should have been resolved by preceding patches. So this one can simply start using skb_set_owner_w. For now the accounting is essentially disabled by default for backwards compatibility. In particular, we set the cap to INT_MAX. This is so that existing applications don't get confused by the sudden arrival EAGAIN errors. In future we may wish (or be forced to) do this by default. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-05 21:25:32 -08:00
Michael Tokarev	1bded710a5	tun: Check supplemental groups in TUN/TAP driver. Michael Tokarev wrote: [] > 2, and this is the main one: How about supplementary groups? > > Here I have a valid usage case: a group of testers running various > versions of windows using KVM (kernel virtual machine), 1 at a time, > to test some software. kvm is set up to use bridge with a tap device > (there should be a way to connect to the machine). Anyone on that group > has to be able to start/stop the virtual machines. > > My first attempt - pretty obvious when I saw -g option of tunctl - is > to add group ownership for the tun device and add a supplementary group > to each user (their primary group should be different). But that fails, > since kernel only checks for egid, not any other group ids. > > What's the reasoning to not allow supplementary groups and to only check > for egid? Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-02 23:34:56 -08:00
Harvey Harrison	09640e6365	net: replace uses of __constant_{endian} Base versions handle constant folding now. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-01 00:45:17 -08:00
Eric W. Biederman	f019a7a594	tun: Implement ip link del tunXXX This greatly simplifies testing to verify I have fixed the problems with a tun device disappearing when the tun file descriptor is still held open. Further it allows removal network namespace operations for the tun driver. Reducing the network namespace handling in the driver to the minimum. i.e. When we are creating a tun device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:02:16 -08:00
Eric W. Biederman	aec191aa2a	tun: There is no longer any need to deny changing network namespaces With the awkward case between free_netdev and dev_chr_close fixed there is no longer any need to limit tun and tap devices to the network namespace they were created in. So remove the NETIF_F_NETNS_LOCAL flag on the network device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:47 -08:00
Eric W. Biederman	c70f182940	tun: Fix races between tun_net_close and free_netdev. The tun code does not cope gracefully if the network device goes away before the tun file descriptor is closed. It looks like we can trigger this with rmmod, and moving tun devices between network namespaces will allow this to be triggered when network namespaces exit. To fix this I introduce an intermediate data structure tun_file which holds a count of users and a pointer to the struct tun_struct. tun_get increments that reference count if it is greater than 0. tun_put decrements that reference count and detaches from the network device if the count is 0. While we have a file attached to the network device I hold a reference to the network device keeping it from going away completely. When a network device is unregistered I decrement the count of the attached tun_file and if that was the last user I detach the tun_file, and all processes on read_wait are woken up to ensure they do not sleep indefinitely. As some of those sleeps happen with the count on the tun device elevated waking up the read waiters ensures that tun_file will be detached in a timely manner. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:46 -08:00
Eric W. Biederman	b2430de37e	tun: Move read_wait into tun_file The poll interface requires that the waitqueue exist while the struct file is open. In the rare case when a tun device disappears before the tun file closes we fail to provide this property, so move read_wait. This is safe now that tun_net_xmit is atomic with tun_detach. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:46 -08:00
Eric W. Biederman	38231b7a8d	tun: Make tun_net_xmit atomic wrt tun_attach && tun_detach Currently this small race allows for a packet to be received when we detach from an tun device and still be enqueued. Not especially important but not what the code is trying to do. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:45 -08:00
Eric W. Biederman	36b50bab53	tun: Grab the netns in open. Grabbing namespaces in open, and putting them in close always seems to be the cleanest approach with the fewest surprises. So now that we have tun_file so we have somepleace to put the network namespace, let's grab the network namespace on file open and put on file close. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:45 -08:00
Eric W. Biederman	631ab46b79	tun: Introduce tun_file Currently the tun code suffers from only having a single word of data that exists for the entire life of the tun file descriptor. This results in peculiar holding of references to the network namespace as well as races between free_netdevice and tun_chr_close. Fix this by introducing tun_file which will hold the per file state. For the moment it still holds just a single word so the differences are all logic changes with no changes in semantics. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:44 -08:00
Eric W. Biederman	eac9e90265	tun: Use POLLERR not EBADF in tun_chr_poll EBADF is meaningless in the context of a poll mask so use POLLERR instead. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:44 -08:00
Eric W. Biederman	a7385ba211	tun: Fix races in tun_set_iff It is possible for two different tasks with access to the same file descriptor to call tun_set_iff on it at the same time and race to attach to a tap device. Prevent this by placing all of the logic to attach to a file descriptor in one function and testing the file descriptor to be certain it is not already attached to another tun device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:43 -08:00
Eric W. Biederman	74a3e5a71c	tun: Remove unnecessary tun_get_by_name Currently the tun driver keeps a private list of tun devices for what appears to be a small gain in performance when reconnecting a file descriptor to an existing tun or tap device. So simplify the code by removing it. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-21 16:00:43 -08:00
Gerrit Renker	745417e206	tun: Eliminate sparse signedness warning register_pernet_gen_device() expects 'int', found via sparse. CHECK drivers/net/tun.c drivers/net/tun.c:1245:36: warning: incorrect type in argument 1 (different signedness) drivers/net/tun.c:1245:36: expected int id drivers/net/tun.c:1245:36: got unsigned int static [toplevel] *<noident> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-01-04 17:14:46 -08:00
Kusanagi Kouichi	7a0a9608e4	tun: Fix SIOCSIFHWADDR error. Set proper operations. Signed-off-by: Kusanagi Kouichi <slash@ma.neweb.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-29 18:23:28 -08:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Stephen Hemminger	008298231a	netdev: add more functions to netdevice ops This patch moves neigh_setup and hard_start_xmit into the network device ops structure. For bisection, fix all the previously converted drivers as well. Bonding driver took the biggest hit on this. Added a prefetch of the hard_start_xmit in the fast path to try and reduce any impact this would have. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-20 20:14:53 -08:00
Stephen Hemminger	758e43b74c	tun: convert to net_device_ops Convert the TUN/TAP tunnel driver to net_device_ops. Split the ops in two, and retain compatability. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-19 22:42:47 -08:00
David Howells	86a264abe5	CRED: Wrap current->cred and a few other accessors Wrap current->cred and a few other accessors to hide their actual implementation. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:18 +11:00
David Howells	ee9785ada3	CRED: Wrap task credential accesses in the network device drivers Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: netdev@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:38:43 +11:00
David S. Miller	9eeda9abd1	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath5k/base.c net/8021q/vlan_core.c	2008-11-06 22:43:03 -08:00
David S. Miller	babcda74e9	drivers/net: Kill now superfluous ->last_rx stores. The generic packet receive code takes care of setting netdev->last_rx when necessary, for the sake of the bonding ARP monitor. Drivers need not do it any more. Some cases had to be skipped over because the drivers were making use of the ->last_rx value themselves. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 21:11:17 -08:00
Al Viro	233e70f422	saner FASYNC handling on file close As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that does forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-01 09:49:46 -07:00
Johannes Berg	e174961ca1	net: convert print_mac to %pM This converts pretty much everything to print_mac. There were a few things that had conflicts which I have just dropped for now, no harm done. I've built an allyesconfig with this and looked at the files that weren't built very carefully, but it's a huge patch. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-27 17:06:18 -07:00
Rusty Russell	f42157cb56	tun: fallback if skb_alloc() fails on big packets skb_alloc produces linear packets (using kmalloc()). That can fail, so should we fall back to making paged skbs. My original version of this patch always allocate paged skbs for big packets. But that made performance drop from 8.4 seconds to 8.8 seconds on 1G lguest->Host TCP xmit. So now we only do that as a fallback. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-15 19:52:31 -07:00
Mark McLoughlin	e3b9955697	tun: TUNGETIFF interface to query name and flags Add a TUNGETIFF interface so that userspace can query a tun/tap descriptor for its name and flags. This is needed because it is common for one app to create a tap interface, exec another app and pass it the file descriptor for the interface. Without TUNGETIFF the spawned app has no way of detecting wheter the interface has e.g. IFF_VNET_HDR set. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-15 19:52:19 -07:00
Harvey Harrison	c0e5a8c21b	net: tun.c fix cast Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-07-22 17:54:17 -04:00
David S. Miller	49997d7515	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/powerpc/booting-without-of.txt drivers/atm/Makefile drivers/net/fs_enet/fs_enet-main.c drivers/pci/pci-acpi.c net/8021q/vlan.c net/iucv/iucv.c	2008-07-18 02:39:39 -07:00
Max Krasnyansky	f271b2cc78	tun: Fix/rewrite packet filtering logic Please see the following thread to get some context on this http://marc.info/?l=linux-netdev&m=121564433018903&w=2 Basically the issue is that current multi-cast filtering stuff in the TUN/TAP driver is seriously broken. Original patch went in without proper review and ACK. It was broken and confusing to start with and subsequent patches broke it completely. To give you an idea of what's broken here are some of the issues: - Very confusing comments throughout the code that imply that the character device is a network interface in its own right, and that packets are passed between the two nics. Which is completely wrong. - Wrong set of ioctls is used for setting up filters. They look like shortcuts for manipulating state of the tun/tap network interface but in reality manipulate the state of the TX filter. - ioctls that were originally used for setting address of the the TX filter got "fixed" and now set the address of the network interface itself. Which made filter totaly useless. - Filtering is done too late. Instead of filtering early on, to avoid unnecessary wakeups, filtering is done in the read() call. The list goes on and on :) So the patch cleans all that up. It introduces simple and clean interface for setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering before enqueuing the packets. TX filtering is useful in the scenarios where TAP is part of a bridge, in which case it gets all broadcast, multicast and potentially other packets when the bridge is learning. So for example Ethernet tunnelling app may want to setup TX filters to avoid tunnelling multicast traffic. QEMU and other hypervisors can push RX filtering that is currently done in the guest into the host context therefore saving wakeups and unnecessary data transfer. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-14 22:18:19 -07:00
David S. Miller	2aec609fb4	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/netfilter/nf_conntrack_proto_tcp.c	2008-07-14 20:23:54 -07:00
Jonathan Corbet	2fceef397f	Merge commit 'v2.6.26' into bkl-removal	2008-07-14 15:29:34 -06:00
Max Krasnyansky	e35259a953	tun: Persistent devices can get stuck in xoff state The scenario goes like this. App stops reading from tun/tap. TX queue gets full and driver does netif_stop_queue(). App closes fd and TX queue gets flushed as part of the cleanup. Next time the app opens tun/tap and starts reading from it but the xoff state is not cleared. We're stuck. Normally xoff state is cleared when netdev is brought up. But in the case of persistent devices this happens only during initial setup. The fix is trivial. If device is already up when an app opens it we clear xoff state and that gets things moving again. Signed-off-by: Max Krasnyansky <maxk@qualcomm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-10 16:59:11 -07:00
Rusty Russell	f43798c276	tun: Allow GSO using virtio_net_hdr Add a IFF_VNET_HDR flag. This uses the same ABI as virtio_net (ie. prepending struct virtio_net_hdr to packets) to indicate GSO and checksum information. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-03 03:48:02 -07:00
Rusty Russell	5228ddc98f	tun: TUNSETFEATURES to set gso features. ethtool is useful for setting (some) device fields, but it's root-only. Finer feature control is available through a tun-specific ioctl. (Includes Mark McLoughlin <markmc@redhat.com>'s fix to hold rtnl sem). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-03 03:46:16 -07:00
Rusty Russell	07240fd090	tun: Interface to query tun/tap features. The problem with introducing checksum offload and gso to tun is they need to set dev->features to enable GSO and/or checksumming, which is supposed to be done before register_netdevice(), ie. as part of TUNSETIFF. Unfortunately, TUNSETIFF has always just ignored flags it doesn't understand, so there's no good way of detecting whether the kernel supports new IFF_ flags. This patch implements a TUNGETFEATURES ioctl which returns all the valid IFF flags. It could be extended later to include other features. Here's an example program which uses it: #include <linux/if_tun.h> #include <sys/types.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <fcntl.h> #include <err.h> #include <stdio.h> static struct { unsigned int flag; const char *name; } known_flags[] = { { IFF_TUN, "TUN" }, { IFF_TAP, "TAP" }, { IFF_NO_PI, "NO_PI" }, { IFF_ONE_QUEUE, "ONE_QUEUE" }, }; int main() { unsigned int features, i; int netfd = open("/dev/net/tun", O_RDWR); if (netfd < 0) err(1, "Opening /dev/net/tun"); if (ioctl(netfd, TUNGETFEATURES, &features) != 0) { printf("Kernel does not support TUNGETFEATURES, guessing\n"); features = (IFF_TUN\|IFF_TAP\|IFF_NO_PI\|IFF_ONE_QUEUE); } printf("Available features are: "); for (i = 0; i < sizeof(known_flags)/sizeof(known_flags[0]); i++) { if (features & known_flags[i].flag) { features &= ~known_flags[i].flag; printf("%s ", known_flags[i].name); } } if (features) printf("(UNKNOWN %#x)", features); printf("\n"); return 0; } Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-03 03:45:32 -07:00
Jonathan Corbet	9d31952257	tun: fasync BKL pushdown Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-07-02 15:06:27 -06:00
Arnd Bergmann	fd3e05b6c8	net-tun: BKL pushdown Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2008-07-02 15:06:23 -06:00
Ang Way Chuang	f09f7ee20c	tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set By default, tun.c running in TUN_TUN_DEV mode will set the protocol of packet to IPv4 if TUN_NO_PI is set. My program failed to work when I assumed that the driver will check the first nibble of packet, determine IP version and set the appropriate protocol. Signed-off-by: Ang Way Chuang <wcang@nav6.org> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-17 21:10:33 -07:00
David S. Miller	9edb74cc6c	tun: Multicast handling in tun_chr_ioctl() needs proper locking. Since these operations don't go through the normal device calls, we have to ensure we synchronize with those paths. Noticed by Alan Cox. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-24 03:44:43 -07:00
David S. Miller	48abfe05cd	tun: Fix minor race in TUNSETLINK ioctl handling. Noticed by Alan Cox. The IFF_UP test is a bit racey, because other entities outside of this driver's ioctl handler can modify that state, even though this ioctl handler runs under lock_kernel(). Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-23 19:37:58 -07:00
Pavel Emelyanov	fc54c65853	[TUN]: Allow to register tun devices in namespace. This is basically means that a net is set for a new device, but actually also involves two more steps: 1. mark the tun device as "local", i.e. do not allow for it to move across namespaces. This is done so, since tun device is most often associated to some file (and thus to some process) and moving the device alone is not valid while keeping the file and the process outside. The need in ability to move a detached persistent device is to be investigated later. 2. get the tun device's net when tun becomes attached and put one when it becomes detached. This is needed to handle the case when a task owning the tun dies, but a files lives for some more time - in this case we must not allow for net to be freed, since its exit hook will spoil that file's private data by unregistering the tun from under tun_chr_close. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 00:41:53 -07:00
Pavel Emelyanov	d647a591da	[TUN]: Make the tun_dev_list per-net. Remove the static tun_dev_list and replace its occurrences in driver with per-net one. It is used in two places - in tun_set_iff and tun_cleanup. In the first case it's legal to use current net_ns. In the cleanup call - move the loop, that unregisters all devices in net exit hook. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 00:41:16 -07:00
Pavel Emelyanov	79d1760491	[TUN]: Introduce the tun_net structure and init/exit net ops. This is the first step in making tuntap devices work in net namespaces. The structure mentioned is pointed by generic net pointer with tun_net_id id, and tun driver fills one on its load. It will contain only the tun devices list. So declare this structure and introduce net init and exit hooks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-16 00:40:46 -07:00
Rusty Russell	e01bf1c833	net: check for underlength tap writes If the user gives a packet under 14 bytes, we'll end up reading off the end of the skb (not oopsing, just reading off the end). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyanskiy <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-12 18:49:30 -07:00
Rusty Russell	14daa02139	net: make struct tun_struct private to tun.c There's no reason for this to be in the header, and it just hurts recompile time. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Max Krasnyanskiy <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-12 18:48:58 -07:00
Kim B. Heino	401023710d	[TUN]: Fix RTNL-locking in tun/tap driver Current tun/tap driver sets also net device's hw address when asked to change character device's hw address. This is a good idea, but it misses RTLN-locking, resulting following error message in 2.6.25-rc3's inetdev_event() function: RTNL: assertion failed at net/ipv4/devinet.c (1050) Attached patch fixes this problem. Signed-off-by: Kim B. Heino <Kim.Heino@bluegiga.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-29 12:26:21 -08:00
Nathaniel Filardo	a26af1e08a	tun: impossible to deassert IFF_ONE_QUEUE or IFF_NO_PI From: "Nathaniel Filardo" <nwfilardo@gmail.com> Taken from http://bugzilla.kernel.org/show_bug.cgi?id=9806 The TUN/TAP driver only permits one-way transitions of IFF_NO_PI or IFF_ONE_QUEUE during the lifetime of a tap/tun interface. Note that tun_set_iff contains 541 if (ifr->ifr_flags & IFF_NO_PI) 542 tun->flags \|= TUN_NO_PI; 543 544 if (ifr->ifr_flags & IFF_ONE_QUEUE) 545 tun->flags \|= TUN_ONE_QUEUE; This is easily fixed by adding else branches which clear these bits. Steps to reproduce: This is easily reproduced by setting an interface persistant using tunctl then attempting to open it as IFF_TAP or IFF_TUN, without asserting the IFF_NO_PI flag. The ioctl() will succeed and the ifr.flags word is not modified, but the interface remains in IFF_NO_PI mode (as it was set by tunctl). Acked-by: Maxim Krasnyansky <maxk@qualcomm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-05 03:05:07 -08:00
Al Viro	a3edb08311	annotate tun Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-28 15:07:57 -08:00
Akinobu Mita	52427c9d11	[TUN]: Use iov_length() Use iov_length() instead of tun's homemade iov_total(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:34 -08:00
Toyo Abe	c6e991de4b	[TUNTAP]: Fix wrong debug message. This is a trivial fix of debug message. When a persist flag is set, the message should say "enabled". Signed-off-by: Toyo Abe <tabe@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-26 19:36:34 -08:00
Joe Perches	0795af5729	[NET]: Introduce and use print_mac() and DECLARE_MAC_BUF() This is nicer than the MAC_FMT stuff. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:42 -07:00
Ed Swierk	4885a50476	[TAP]: Configurable interface MTU. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:19 -07:00
Jeff Garzik	09f75cd7bf	[NET] drivers/net: statistics cleanup #1 -- save memory and shrink code We now have struct net_device_stats embedded in struct net_device, and the default ->get_stats() hook does the obvious thing for us. Run through drivers/net/* and remove the driver-local storage of statistics, and driver-local ->get_stats() hook where applicable. This was just the low-hanging fruit in drivers/net; plenty more drivers remain to be updated. [ Resolved conflicts with napi_struct changes and fix sunqe build regression... -DaveM ] Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:16 -07:00
Ralf Baechle	10d024c1b2	[NET]: Nuke SET_MODULE_OWNER macro. It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:13 -07:00
Eric W. Biederman	881d966b48	[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:10 -07:00
Guido Guenther	8c644623fe	[NET]: Allow group ownership of TUN/TAP devices. Introduce a new syscall TUNSETGROUP for group ownership setting of tap devices. The user now is allowed to send packages if either his euid or his egid matches the one specified via tunctl (via -u or -g respecitvely). If both, gid and uid, are set via tunctl, both have to match. Signed-off-by: Guido Guenther <agx@sigxcpu.org> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:42 -07:00
Brian Braunstein	36226a8ded	[NET] tun/tap: fixed hw address handling Fixed tun/tap driver's handling of hw addresses. The hw address is stored in both the net_device.dev_addr and tun.dev_addr fields. These fields were not kept synchronized, and in fact weren't even initialized to the same value. Now during both init and when performing SIOCSIFHWADDR on the tun device these values are both updated. However, if SIOCSIFHWADDR is performed on the net device directly (for instance, setting the hw address using ifconfig), the tun device does not get updated. Perhaps the tun.dev_addr field should be removed completely at some point, as it is redundant and net_device.dev_addr can be used anywhere it is used. Signed-off-by: Brian Braunstein <linuxkernel@bristyle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-26 01:00:55 -07:00
Arnaldo Carvalho de Melo	d626f62b11	[SK_BUFF]: Introduce skb_copy_from_linear_data{_offset} To clearly state the intent of copying from linear sk_buffs, _offset being a overly long variant but interesting for the sake of saving some bytes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-04-25 22:28:23 -07:00
Arnaldo Carvalho de Melo	459a98ed88	[SK_BUFF]: Introduce skb_reset_mac_header(skb) For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:32 -07:00
Arnaldo Carvalho de Melo	4c13eb6657	[ETH]: Make eth_type_trans set skb->dev like the other *_type_trans One less thing for drivers writers to worry about. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:30 -07:00
Arjan van de Ven	d54b1fdb1d	[PATCH] mark struct file_operations const 5 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:45 -08:00
Eric W. Biederman	609d7fa956	[PATCH] file: modify struct fown_struct to use a struct pid File handles can be requested to send sigio and sigurg to processes. By tracking the destination processes using struct pid instead of pid_t we make the interface safe from all potential pid wrap around problems. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-02 07:57:14 -07:00
Badari Pulavarty	ee0b3e671b	[PATCH] Remove readv/writev methods and use aio_read/aio_write instead This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:28 -07:00
Jeff Garzik	7282d491ec	drivers/net: const-ify ethtool_ops declarations Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-09-13 14:30:00 -04:00
Jeff Garzik	6aa20a2235	drivers/net: Trim trailing whitespace Signed-off-by: Jeff Garzik <jeff@garzik.org>	2006-09-13 13:24:59 -04:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Greg Kroah-Hartman	96192ff1a9	[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed Also fixes all drivers that set this field. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-06-26 12:25:08 -07:00
David Woodhouse	ca6bb5d7ab	[NET]: Require CAP_NET_ADMIN to create tuntap devices. The tuntap driver allows an admin to create persistent devices and assign ownership of them to individual users. Unfortunately, relaxing the permissions on the /dev/net/tun device node so that they can actually use those devices will _also_ allow those users to create arbitrary new devices of their own. This patch corrects that, and adjusts the recommended permissions for the device node accordingly. Signed-off-By: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:44 -07:00
Dave Jones	8f22757ee8	[TUN]: Fix leak in tun_get_user() We're leaking an skb in a failure path in this function. Coverity #632 Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-11 18:49:13 -08:00
Mike Kershaw	ff4cc3ac93	[TUNTAP]: Allow setting the linktype of the tap device from userspace Currently tun/tap only supports the EN10MB ARP type. For use with wireless and other networking types it should be possible to set the ARP type via an ioctl. Patch v2: Included check that the tap interface is down before changing the link type out from underneath it Signed-off-by: Mike Kershaw <dragorn@kismetwireless.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-01 17:40:05 -07:00
David S. Miller	b03efcfb21	[NET]: Transform skb_queue_len() binary tests into skb_queue_empty() This is part of the grand scheme to eliminate the qlen member of skb_queue_head, and subsequently remove the 'list' member of sk_buff. Most users of skb_queue_len() want to know if the queue is empty or not, and that's trivially done with skb_queue_empty() which doesn't use the skb_queue_head->qlen member and instead uses the queue list emptyness as the test. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-08 14:57:23 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

1 2 3

138 Commits