linux

Author	SHA1	Message	Date
Christoph Hellwig	f8b12e513b	virtio_blk: revert QUEUE_FLAG_VIRT addition It seems like the addition of QUEUE_FLAG_VIRT caueses major performance regressions for Fedora users: https://bugzilla.redhat.com/show_bug.cgi?id=509383 https://bugzilla.redhat.com/show_bug.cgi?id=505695 while I can't reproduce those extreme regressions myself I think the flag is wrong. Rationale: QUEUE_FLAG_VIRT expands to QUEUE_FLAG_NONROT which casus the queue unplugged immediately. This is not a good behaviour for at least qemu and kvm where we do have significant overhead for every I/O operations. Even with all the latested speeups (native AIO, MSI support, zero copy) we can only get native speed for up to 128kb I/O requests we already are down to 66% of native performance for 4kb requests even on my laptop running the Intel X25-M SSD for which the QUEUE_FLAG_NONROT was designed. If we ever get virtio-blk overhead low enough that this flag makes sense it should only be set based on a feature flag set by the host. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-10-22 16:39:26 +10:30
Linus Torvalds	1f0918d03f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: lguest: don't force VIRTIO_F_NOTIFY_ON_EMPTY lguest: cleanup for map_switcher() lguest: use PGDIR_SHIFT for PAE code to allow different PAGE_OFFSET lguest: use set_pte/set_pmd uniformly for real page table entries lguest: move panic notifier registration to its expected place. virtio_blk: add support for cache flush virtio: add virtio IDs file virtio: get rid of redundant VIRTIO_ID_9P definition virtio: make add_buf return capacity remaining virtio_pci: minor MSI-X cleanups	2009-09-23 09:23:45 -07:00
Christoph Hellwig	f1b0ef0626	virtio_blk: add support for cache flush Recent qemu has added a VIRTIO_BLK_F_FLUSH flag to advertise that the virtual disk has a volatile write cache that needs to be flushed. In case we see this feature implement tell the Linux block layer about the fact and use the new VIRTIO_BLK_T_FLUSH to flush the cache when required. This allows for an correct and simple implementation of write barriers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-23 22:26:36 +09:30
Fernando Luis Vazquez Cao	3ca4f5ca73	virtio: add virtio IDs file Virtio IDs are spread all over the tree which makes assigning new IDs bothersome. Putting them together should make the process less error-prone. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-23 22:26:32 +09:30
Rusty Russell	3c1b27d504	virtio: make add_buf return capacity remaining This API change means that virtio_net can tell how much capacity remains for buffers. It's necessarily fuzzy, since VIRTIO_RING_F_INDIRECT_DESC means we can fit any number of descriptors in one, if we can kmalloc. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dinesh Subhraveti <dineshs@us.ibm.com>	2009-09-23 22:26:31 +09:30
Alexey Dobriyan	83d5cde47d	const: make block_device_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Linus Torvalds	bb184d11ff	Merge branch 'tj-block-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc * 'tj-block-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc: virtio_blk: mark virtio_blk with __refdata to kill spurious section mismatch block: sysfs fix mismatched queue_var_{store,show} in 64bit kernel ataflop: adjust NULL test block: fix failfast merge testing in elv_rq_merge_ok() z2ram: Small cleanup for z2ram.c	2009-07-22 10:06:33 -07:00
Rakib Mullick	4fbfff7607	virtio_blk: mark virtio_blk with __refdata to kill spurious section mismatch The variable virtio_blk references the function virtblk_probe() (which is in .devinit section) and also references the function virtblk_remove() ( which is in .devexit section). So, virtio_blk simultaneously refers .devinit and .devexit section. To avoid this messup, we mark virtio_blk as __refdata. We were warned by the following warning: LD drivers/block/built-in.o WARNING: drivers/block/built-in.o(.data+0xc8dc): Section mismatch in reference from the variable virtio_blk to the function .devinit.text:virtblk_probe() The variable virtio_blk references the function __devinit virtblk_probe() If the reference is valid then annotate the variable with __init* or __refdata (see linux/init.h) or name the variable: driver, _template, _timer, _sht, _ops, _probe, _probe_one, _console, WARNING: drivers/block/built-in.o(.data+0xc8e0): Section mismatch in reference from the variable virtio_blk to the function .devexit.text:virtblk_remove() The variable virtio_blk references the function __devexit virtblk_remove() If the reference is valid then annotate the variable with __exit* (see linux/init.h) or name the variable: driver, _template, _timer, _sht, _ops, _probe, _probe_one, _console, Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2009-07-19 10:46:48 +09:00
Christoph Hellwig	d9ecdea7ed	virtio_blk: ioctl return value fix Block driver ioctl methods must return ENOTTY and not -ENOIOCTLCMD if they expect the block layer to handle generic ioctls. This triggered a BLKROSET failure in xfsqa #200. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-07-17 21:47:46 +09:30
Christoph Hellwig	4eff3cae9c	virtio_blk: don't bounce highmem requests By default a block driver bounces highmem requests, but virtio-blk is perfectly fine with any request that fit into it's 64 bit addressing scheme, mapped in the kernel virtual space or not. Besides improving performance on highmem systems this also makes the reproducible oops in __bounce_end_io go away (but hiding the real cause). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-07-17 21:47:46 +09:30
Mike Frysinger	98e9444474	virtio_blk: add missing __dev{init,exit} markings The remove member of the virtio_driver structure uses __devexit_p(), so the remove function itself should be marked with __devexit. And where there be __devexit on the remove, so is there __devinit on the probe. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-06-12 22:16:39 +09:30
Michael S. Tsirkin	d2a7ddda9f	virtio: find_vqs/del_vqs virtio operations This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations, and updates all drivers. This is needed for MSI support, because MSI needs to know the total number of vectors upfront. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)	2009-06-12 22:16:36 +09:30
Rusty Russell	9499f5e7ed	virtio: add names to virtqueue struct, mapping from devices to queues. Add a linked list of all virtqueues for a virtio device: this helps for debugging and is also needed for upcoming interface change. Also, add a "name" field for clearer debug messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-06-12 22:16:36 +09:30
john cooper	1d589bb16b	Add serial number support for virtio_blk, V4a This patch extracts the opaque data from pci i/o region 0 via the added VIRTIO_BLK_F_IDENTIFY field. By convention this data takes the form of that returned by an ATA IDENTIFY DEVICE command, however the driver (except for structure size) makes no interpretation of the data. The structure data is copied wholesale to userspace via a HDIO_GET_IDENTITY ioctl command (eg: hdparm -i <dev>). Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-09 14:41:40 +02:00
Martin K. Petersen	e1defc4ff0	block: Do away with the notion of hardsect_size Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:54 +02:00
Jens Axboe	f831cc0349	virtio_blk: get rid of unused variable drivers/block/virtio_blk.c: In function 'blk_done': drivers/block/virtio_blk.c:53: warning: unused variable 'nr_bytes' Leftover from commit `1cde26f928` Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-18 14:44:45 +02:00
Hannes Reinecke	1cde26f928	virtio_blk: SG_IO passthru support Add support for SG_IO passthru to virtio_blk. We add the scsi command block after the normal outhdr, and the scsi inhdr with full status information aswell as the sense buffer before the regular inhdr. [hch: forward ported, added the VIRTIO_BLK_F_SCSI flags, some comments and tested the whole beast] [axboe: updated to use ->resid and not dual-path the byte count] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ checkpatch.pl tweak) Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-18 14:41:30 +02:00
Christoph Hellwig	6c3b46f745	virtio_blk: don't blindly derefence req->rq_disk request->rq_disk is only set for FS requests or BLOCK_PC requests originating from the generic block layer scsi ioctls. It's not set for requests origination from other soures or internal cache flush commands implemented by the patch I'll send after this. So instead of using it to get at the private data in do_virtblk_request setup queue->queuedata and use it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-18 14:38:28 +02:00
Tejun Heo	9934c8c045	block: implement and enforce request peek/start/fetch Till now block layer allowed two separate modes of request execution. A request is always acquired from the request queue via elv_next_request(). After that, drivers are free to either dequeue it or process it without dequeueing. Dequeue allows elv_next_request() to return the next request so that multiple requests can be in flight. Executing requests without dequeueing has its merits mostly in allowing drivers for simpler devices which can't do sg to deal with segments only without considering request boundary. However, the benefit this brings is dubious and declining while the cost of the API ambiguity is increasing. Segment based drivers are usually for very old or limited devices and as converting to dequeueing model isn't difficult, it doesn't justify the API overhead it puts on block layer and its more modern users. Previous patches converted all block low level drivers to dequeueing model. This patch completes the API transition by... * renaming elv_next_request() to blk_peek_request() * renaming blkdev_dequeue_request() to blk_start_request() * adding blk_fetch_request() which is combination of peek and start * disallowing completion of queued (not started) requests * applying new API to all LLDs Renamings are for consistency and to break out of tree code so that it's apparent that out of tree drivers need updating. [ Impact: block request issue API cleanup, no functional change ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Mike Miller <mike.miller@hp.com> Cc: unsik Kim <donari75@gmail.com> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Laurent Vivier <Laurent@lvivier.info> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Stefan Weinhuber <wein@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:52:18 +02:00
Tejun Heo	83096ebf12	block: convert to pos and nr_sectors accessors With recent cleanups, there is no place where low level driver directly manipulates request fields. This means that the 'hard' request fields always equal the !hard fields. Convert all rq->sectors, nr_sectors and current_nr_sectors references to accessors. While at it, drop superflous blk_rq_pos() < 0 test in swim.c. [ Impact: use pos and nr_sectors accessors ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Mike Miller <mike.miller@hp.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Dario Ballabio <ballabio_dario@emc.com> Cc: David S. Miller <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: unsik Kim <donari75@gmail.com> Cc: Laurent Vivier <Laurent@lvivier.info> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:54 +02:00
Tejun Heo	40cbbb781d	block: implement and use [__]blk_end_request_all() There are many [__]blk_end_request() call sites which call it with full request length and expect full completion. Many of them ensure that the request actually completes by doing BUG_ON() the return value, which is awkward and error-prone. This patch adds [__]blk_end_request_all() which takes @rq and @error and fully completes the request. BUG_ON() is added to to ensure that this actually happens. Most conversions are simple but there are a few noteworthy ones. * cdrom/viocd: viocd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/block/dasd: dasd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/char/tape_block: tapeblock_end_request() replaced with direct calls to blk_end_request_all(). [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Miller <mike.miller@hp.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-04-28 07:37:35 +02:00
Linus Torvalds	ab70537c32	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: lguest: struct device - replace bus_id with dev_name() lguest: move the initial guest page table creation code to the host kvm-s390: implement config_changed for virtio on s390 virtio_console: support console resizing virtio: add PCI device release() function virtio_blk: fix type warning virtio: block: dynamic maximum segments virtio: set max_segment_size and max_sectors to infinite. virtio: avoid implicit use of Linux page size in balloon interface virtio: hand virtio ring alignment as argument to vring_new_virtqueue virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci. virtio: rename 'pagesize' arg to vring_init/vring_size virtio: Don't use PAGE_SIZE in virtio_pci.c virtio: struct device - replace bus_id with dev_name(), dev_set_name() virtio-pci queue allocation not page-aligned	2008-12-30 17:37:25 -08:00
Randy Dunlap	b194aee956	virtio_blk: fix type warning Fix parameter type warning: linux-next-20081126/drivers/block/virtio_blk.c:307: warning: large integer implicitly truncated to unsigned type Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:06 +10:30
Rusty Russell	0864b79a15	virtio: block: dynamic maximum segments Enhance the driver to handle whatever maximum segment number the host tells us to handle. Do to this, we need to allocate the scatterlist dynamically. We set max_phys_segments and max_hw_segments to the same value (1 if the host doesn't tell us, since that's safest and all known hosts do tell us). Note that kmalloc'ing the structure for large sg_elems might be problematic: the fix for this is sg_table, but that requires more work. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:05 +10:30
Rusty Russell	4b7f7e2049	virtio: set max_segment_size and max_sectors to infinite. Setting max_segment_size allows more than 64k per sg element, unless the host specified a limit. Setting max_sectors indicates that our max_hw_segments is the only limit. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:26:05 +10:30
Fernando Luis Vázquez Cao	7d116b626b	virtio_blk: set queue paravirt flag As a paravirt front-end driver, virtio_blk is not a rotational device so we want do avoid idling in AS/CFQ. Tell the block layer about this. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:41 +01:00
Al Viro	4e10985298	[PATCH] switch virtio_blk Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:48:09 -04:00
Al Viro	d4430d62fa	[PATCH] beginning of methods conversion To keep the size of changesets sane we split the switch by drivers; to keep the damn thing bisectable we do the following: 1) rename the affected methods, add ones with correct prototypes, make (few) callers handle both. That's this changeset. 2) for each driver convert to new methods. ALL drivers are converted in this series. 3) kill the old (renamed) methods. Note that it _is_ a flagday; all in-tree drivers are converted and by the end of this series no trace of old methods remain. The only reason why we do that this way is to keep the damn thing bisectable and allow per-driver debugging if anything goes wrong. New methods: open(bdev, mode) release(disk, mode) ioctl(bdev, mode, cmd, arg) /* Called without BKL / compat_ioctl(bdev, mode, cmd, arg) locked_ioctl(bdev, mode, cmd, arg) / Called with BKL, legacy */ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:32 -04:00
Al Viro	74f3c8aff3	[PATCH] switch scsi_cmd_ioctl() to passing fmode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:14 -04:00
Kiyoshi Ueda	8316982ac0	virtio_blk: change to use __blk_end_request() This patch converts virtio_blk to use __blk_end_request() directly so that end_{queued\|dequeued}_request() can be removed. Related 'uptodate' argument is converted to 'error'. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:20 +02:00
Fernando Luis Vázquez Cao	766ca4428d	virtio_blk: use a wrapper function to access io context information of IO requests struct request has an ioprio member but it is never updated because currently bios do not hold io context information. The implication of this is that virtio_blk ends up passing useless information to the backend driver. That said, some IO schedulers such as CFQ do store io context information in struct request, but use private members for that, which means that that information cannot be directly accessed in a IO scheduler-independent way. This patch adds a function to obtain the ioprio of a request. We should avoid accessing ioprio directly and use this function instead, so that its users do not have to care about future changes in block layer structures or what the currently active IO controller is. This patch does not introduce any functional changes but paves the way for future clean-ups and enhancements. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:02 +02:00
Christian Borntraeger	066f4d82a6	virtio_blk: check for hardsector size from host Currently virtio_blk assumes a 512 byte hard sector size. This can cause trouble / performance issues if the backing has a different block size (like a file on an ext3 file system formatted with 4k block size or a dasd). Lets add a feature flag that tells the guest to use a different hard sector size than 512 byte. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:05 +10:00
Christian Borntraeger	3ef5360954	virtio_blk: allow read-only disks Hello Rusty, sometimes it is useful to share a disk (e.g. usr). To avoid file system corruption, the disk should be mounted read-only in that case. This patch adds a new feature flag, that allows the host to specify, if the disk should be considered read-only. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-30 15:09:44 +10:00
Chris Lalancette	ac9d463afb	Fix crash in virtio_blk during modprobe ; rmmod ; modprobe Fix a modprobe virtio_blk ; rmmod virtio_blk ; modprobe virtio_blk crash; this was basically because we weren't doing "del_gendisk()" in the remove path. Signed-off-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (moved del_gendisk up)	2008-05-30 15:09:41 +10:00
Ryan Harper	48e4043d45	virtio: add virtio disk geometry feature Rather than faking up some geometry, allow the backend to push the disk geometry via virtio pci config option. Keep the old geo code around for compatibility. Signed-off-by: Ryan Harper <ryanh@us.ibm.com> Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified to single struct)	2008-05-02 21:50:51 +10:00
Rusty Russell	c45a6816c1	virtio: explicit advertisement of driver features A recent proposed feature addition to the virtio block driver revealed some flaws in the API: in particular, we assume that feature negotiation is complete once a driver's probe function returns. There is nothing in the API to require this, however, and even I didn't notice when it was violated. So instead, we require the driver to specify what features it supports in a table, we can then move the feature negotiation into the virtio core. The intersection of device and driver features are presented in a new 'features' bitmap in the struct virtio_device. Note that this highlights the difference between Linux unsigned-long bitmaps where each unsigned long is in native endian, and a straight-forward little-endian array of bytes. Drivers can still remove feature bits in their probe routine if they really have to. API changes: - dev->config->feature() no longer gets and acks a feature. - drivers should advertise their features in the 'feature_table' field - use virtio_has_feature() for extra sanity when checking feature bits Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-02 21:50:50 +10:00
Rusty Russell	72e61eb40b	virtio: change config to guest endian. A recent proposed feature addition to the virtio block driver revealed some flaws in the API, in particular how easy it is to break big endian machines. The virtio config space was originally chosen to be little-endian, because we thought the config might be part of the PCI config space for virtio_pci. It's actually a separate mmio region, so that argument holds little water; as only x86 is currently using the virtio mechanism, we can change this (but must do so now, before the impending s390 merge). API changes: - __virtio_config_val() just becomes a striaght vdev->config_get() call. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-02 21:50:50 +10:00
Marcelo Tosatti	2e895e4c23	virtio-blk: fix remove oops Do not unregister the major at device remove, since there might be another device instances around. (qemu) pci_del 0 11 (qemu) ACPI: PCI interrupt for device 0000:00:0b.0 disabled (qemu) pci_del 0 10 (qemu) ------------[ cut here ]------------ WARNING: at block/genhd.c:126 unregister_blkdev+0x74/0x9e() ACPI: PCI interrupt for device 0000:00:0a.0 disabled Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-02 21:50:46 +10:00
Rusty Russell	cb38fa23c1	virtio: de-structify virtio_block status byte Ron Minnich points out that a struct containing a char is not always sizeof(char); simplest to remove the structure to avoid confusion. Cc: "ron minnich" <rminnich@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-05-02 21:50:45 +10:00
Jeremy Katz	c483934670	virtio: Fix sysfs bits to have proper block symlink Fix up so that the virtio_blk devices in sysfs link correctly to their block device. This then allows them to be detected by hal, etc Signed-off-by: Jeremy Katz <katzj@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-03-17 22:58:15 +11:00
Christian Borntraeger	d50ed907dc	virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzz Am Freitag, 1. Februar 2008 schrieb Christian Borntraeger: > Right. I will fix that with an additional patch. This patch goes on top of the minor number patch. Please let me know if you want a merged patch: Currently virtio_blk creates the disk name combinging "vd" with 'a'++. This will give strange names after vdz. I have implemented names up to vdzzz - inspired by the sd.c code. That should be sufficient for now. There is one driver in the kernel (driver/s390/block/dasd_genhd.c) that implements names from dasda-dasdzzzz allowing even more disks. Maybe a janitor can come up with a common implementation usable for all kind of block device drivers. I have tested this patch with 100 disks - seems to work. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-02-04 23:50:11 +11:00
Christian Borntraeger	4f3bf19c6e	virtio_blk: Dont waste major numbers Rusty, currently virtio_blk uses one major number per device. While this works quite well on most systems it is wasteful and will exhaust major numbers on larger installations. This patch allocates a major number on init and will use 16 minor numbers for each disk. That will allow ~64k virtio_blk disks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-02-04 23:50:10 +11:00
Christian Borntraeger	135da0b037	virtio_blk: provide getgeo Rusty, I currently try to make my guest boot from an virtio root device without having an external kernel. Some of the tools that I tried expect HDIO_GETGEO to work. The most interesting value is likely the geo.start value to get the offset of a partition. This value is filled by block/ioctl.c if fops->getgeo is set. This patch also fills in some standard values for heads, sectors and cylinders. Makes sense? Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-02-04 23:50:09 +11:00
Rusty Russell	6e5aa7efb2	virtio: reset function A reset function solves three problems: 1) It allows us to renegotiate features, eg. if we want to upgrade a guest driver without rebooting the guest. 2) It gives us a clean way of shutting down virtqueues: after a reset, we know that the buffers won't be used by the host, and 3) It helps the guest recover from messed-up drivers. So we remove the ->shutdown hook, and the only way we now remove feature bits is via reset. We leave it to the driver to do the reset before it deletes queues: the balloon driver, for example, needs to chat to the host in its remove function. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-02-04 23:50:03 +11:00
Rusty Russell	18445c4d50	virtio: explicit enable_cb/disable_cb rather than callback return. It seems that virtio_net wants to disable callbacks (interrupts) before calling netif_rx_schedule(), so we can't use the return value to do so. Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback now returns void, rather than a boolean. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-02-04 23:49:58 +11:00
Rusty Russell	a586d4f601	virtio: simplify config mechanism. Previously we used a type/len pair within the config space, but this seems overkill. We now simply define a structure which represents the layout in the config space: the config space can now only be extended at the end. The main driver-visible changes: 1) We indicate what fields are present with an explicit feature bit. 2) Virtqueues are explicitly numbered, and not in the config space. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-02-04 23:49:57 +11:00
Rusty Russell	74b2553f1d	virtio: fix module/device unloading The virtio code never hooked through the ->remove callback. Although noone supports device removal at the moment, this code is already needed for module unloading. This of course also revealed bugs in virtio_blk, virtio_net and lguest unloading paths. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-11-19 11:20:42 +11:00
Jens Axboe	3d1266c704	SG: audit of drivers that use blk_rq_map_sg() They need to properly init the sg table, or blk_rq_map_sg() will complain if CONFIG_DEBUG_SG is set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 13:21:21 +02:00
Rusty Russell	e467cde238	Block driver using virtio. The block driver uses scatter-gather lists with sg[0] being the request information (struct virtio_blk_outhdr) with the type, sector and inbuf id. The next N sg entries are the bio itself, then the last sg is the status byte. Whether the N entries are in or out depends on whether it's a read or a write. We accept the normal (SCSI) ioctls: they get handed through to the other side which can then handle it or reply that it's unsupported. It's not clear that this actually works in general, since I don't know if blk_pc_request() requests have an accurate rq_data_dir(). Although we try to reply -ENOTTY on unsupported commands, ioctl(fd, CDROMEJECT) returns success to userspace. This needs a separate patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <jens.axboe@oracle.com>	2007-10-23 15:49:54 +10:00

1 2 3

149 Commits