linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-23 11:21:33 +00:00

Author	SHA1	Message	Date
Roger Pau Monne	80bfa2f6e2	xen-blkif: drop struct blkif_request_segment_aligned This was wrongly introduced in commit `402b27f9`, the only difference between blkif_request_segment_aligned and blkif_request_segment is that the former has a named padding, while both share the same memory layout. Also correct a few minor glitches in the description, including for it to no longer assume PAGE_SIZE == 4096. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> [Description fix by Jan Beulich] Signed-off-by: Jan Beulich <jbeulich@suse.com> Reported-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Matt Rushton <mrushton@amazon.com> Cc: Matt Wilson <msw@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-02-07 13:03:53 -05:00
Linus Torvalds	f568849eda	Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_ve series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains: - Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups. - Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer. - Header export fix from CaiZhiyong. - Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and splitting more efficient. Related fixes to immutable bio_vecs: - dm-cache immutable fixup from Mike Snitzer. - btrfs immutable fixup from Muthu Kumar. - bio-integrity fix from Nic Bellinger, which is also going to stable" * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits) xtensa: fixup simdisk driver to work with immutable bio_vecs block/blk-mq-cpu.c: use hotcpu_notifier() blk-mq: for_each_* macro correctness block: Fix memory leak in rw_copy_check_uvector() handling bio-integrity: Fix bio_integrity_verify segment start bug block: remove unrelated header files and export symbol blk-mq: uses page->list incorrectly blk-mq: use __smp_call_function_single directly btrfs: fix missing increment of bi_remaining Revert "block: Warn and free bio if bi_end_io is not set" block: Warn and free bio if bi_end_io is not set blk-mq: fix initializing request's start time block: blk-mq: don't export blk_mq_free_queue() block: blk-mq: make blk_sync_queue support mq block: blk-mq: support draining mq queue dm cache: increment bi_remaining when bi_end_io is restored block: fixup for generic bio chaining block: Really silence spurious compiler warnings block: Silence spurious compiler warnings block: Kill bio_pair_split() ...	2014-01-30 11:19:05 -08:00
Konrad Rzeszutek Wilk	51c71a3bba	xen/pvhvm: If xen_platform_pci=0 is set don't blow up (v4). The user has the option of disabling the platform driver: 00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01) which is used to unplug the emulated drivers (IDE, Realtek 8169, etc) and allow the PV drivers to take over. If the user wishes to disable that they can set: xen_platform_pci=0 (in the guest config file) or xen_emul_unplug=never (on the Linux command line) except it does not work properly. The PV drivers still try to load and since the Xen platform driver is not run - and it has not initialized the grant tables, most of the PV drivers stumble upon: input: Xen Virtual Keyboard as /devices/virtual/input/input5 input: Xen Virtual Pointer as /devices/virtual/input/input6M ------------[ cut here ]------------ kernel BUG at /home/konrad/ssd/konrad/linux/drivers/xen/grant-table.c:1206! invalid opcode: 0000 [#1] SMP Modules linked in: xen_kbdfront(+) xenfs xen_privcmd CPU: 6 PID: 1389 Comm: modprobe Not tainted 3.13.0-rc1upstream-00021-ga6c892b-dirty #1 Hardware name: Xen HVM domU, BIOS 4.4-unstable 11/26/2013 RIP: 0010:[<ffffffff813ddc40>] [<ffffffff813ddc40>] get_free_entries+0x2e0/0x300 Call Trace: [<ffffffff8150d9a3>] ? evdev_connect+0x1e3/0x240 [<ffffffff813ddd0e>] gnttab_grant_foreign_access+0x2e/0x70 [<ffffffffa0010081>] xenkbd_connect_backend+0x41/0x290 [xen_kbdfront] [<ffffffffa0010a12>] xenkbd_probe+0x2f2/0x324 [xen_kbdfront] [<ffffffff813e5757>] xenbus_dev_probe+0x77/0x130 [<ffffffff813e7217>] xenbus_frontend_dev_probe+0x47/0x50 [<ffffffff8145e9a9>] driver_probe_device+0x89/0x230 [<ffffffff8145ebeb>] __driver_attach+0x9b/0xa0 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230 [<ffffffff8145cf1c>] bus_for_each_dev+0x8c/0xb0 [<ffffffff8145e7d9>] driver_attach+0x19/0x20 [<ffffffff8145e260>] bus_add_driver+0x1a0/0x220 [<ffffffff8145f1ff>] driver_register+0x5f/0xf0 [<ffffffff813e55c5>] xenbus_register_driver_common+0x15/0x20 [<ffffffff813e76b3>] xenbus_register_frontend+0x23/0x40 [<ffffffffa0015000>] ? 0xffffffffa0014fff [<ffffffffa001502b>] xenkbd_init+0x2b/0x1000 [xen_kbdfront] [<ffffffff81002049>] do_one_initcall+0x49/0x170 .. snip.. which is hardly nice. This patch fixes this by having each PV driver check for: - if running in PV, then it is fine to execute (as that is their native environment). - if running in HVM, check if user wanted 'xen_emul_unplug=never', in which case bail out and don't load any PV drivers. - if running in HVM, and if PCI device 5853:0001 (xen_platform_pci) does not exist, then bail out and not load PV drivers. - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=ide-disks', then bail out for all PV devices _except_ the block one. Ditto for the network one ('nics'). - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=unnecessary' then load block PV driver, and also setup the legacy IDE paths. In (v3) make it actually load PV drivers. Reported-by: Sander Eikelenboom <linux@eikelenboom.it Reported-by: Anthony PERARD <anthony.perard@citrix.com> Reported-and-Tested-by: Fabio Fantoni <fabio.fantoni@m2r.biz> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v2: Add extra logic to handle the myrid ways 'xen_emul_unplug' can be used per Ian and Stefano suggestion] [v3: Make the unnecessary case work properly] [v4: s/disks/ide-disks/ spotted by Fabio] Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> [for PCI parts] CC: stable@vger.kernel.org	2014-01-03 14:54:18 -05:00
Jens Axboe	b28bc9b38c	Linux 3.13-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJSwLfoAAoJEHm+PkMAQRiGi6QH/1U1B7lmHChDTw3jj1lfm9gA 189Si4QJlnxFWCKHvKEL+pcaVuACU+aMGI8+KyMYK4/JfuWVjjj5fr/SvyHH2/8m LdSK8aHMhJ46uBS4WJ/l6v46qQa5e2vn8RKSBAyKm/h4vpt+hd6zJdoFrFai4th7 k/TAwOAEHI5uzexUChwLlUBRTvbq4U8QUvDu+DeifC8cT63CGaaJ4qVzjOZrx1an eP6UXZrKDASZs7RU950i7xnFVDQu4PsjlZi25udsbeiKcZJgPqGgXz5ULf8ZH8RQ YCi1JOnTJRGGjyIOyLj7pyB01h7XiSM2+eMQ0S7g54F2s7gCJ58c2UwQX45vRWU= =/4/R -----END PGP SIGNATURE----- Merge tag 'v3.13-rc6' into for-3.14/core Needed to bring blk-mq uptodate, since changes have been going in since for-3.14/core was established. Fixup merge issues related to the immutable biovec changes. Signed-off-by: Jens Axboe <axboe@kernel.dk> Conflicts: block/blk-flush.c fs/btrfs/check-integrity.c fs/btrfs/extent_io.c fs/btrfs/scrub.c fs/logfs/dev_bdev.c	2013-12-31 09:51:02 -07:00
Felipe Pena	2f089cb89d	block: xen-blkfront: Fix possible NULL ptr dereference In the blkif_release function the bdget_disk() call might returns a NULL ptr which might be dereferenced on bdev->bd_openers checking Signed-off-by: Felipe Pena <felipensp@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v2: Added WARN per Roger's suggestion]	2013-11-26 11:24:01 -05:00
Tim Gardner	427bfe07e6	xen-blkfront: Silence pfn maybe-uninitialized warning pfn cannot actually be used unless (!info->feature_persistent), nor is pfn accessed in get_grant() unless (!info->feature_persistent), but silence this warning anyway. gcc-4.8 drivers/block/xen-blkfront.c: In function 'do_blkif_request': drivers/block/xen-blkfront.c:508:20: warning: 'pfn' may be used uninitialized in this function [-Wmaybe-uninitialized] gnt_list_entry = get_grant(&gref_head, pfn, info); ^ drivers/block/xen-blkfront.c:492:19: note: 'pfn' was declared here unsigned long pfn; Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>	2013-11-26 11:23:53 -05:00
Kent Overstreet	4f024f3797	block: Abstract out bvec iterator Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: xfs@oss.sgi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchand@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Peng Tao <tao.peng@emc.com> Cc: Andy Adamson <andros@netapp.com> Cc: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Mel Gorman <mgorman@suse.de>6	2013-11-23 22:33:47 -08:00
Roger Pau Monne	bfe11d6de1	xen-blkfront: restore the non-persistent data path When persistent grants were added they were always used, even if the backend doesn't have this feature (there's no harm in always using the same set of pages). This restores the old data path when the backend doesn't have persistent grants, removing the burden of doing a memcpy when it is not actually needed. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Felipe Franciosi <felipe.franciosi@citrix.com> Cc: Felipe Franciosi <felipe.franciosi@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v2: Fix up whitespace issues]	2013-11-08 09:10:30 -07:00
Roger Pau Monne	c47206e25f	xen-blkfront: improve aproximation of required grants per request Improve the calculation of required grants to process a request by using nr_phys_segments instead of always assuming a request is going to use all posible segments. nr_phys_segments contains the number of scatter-gather DMA addr+len pairs, which is basically what we put at every granted page. for_each_sg iterates over the DMA addr+len pairs and uses a grant page for each of them. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:10:27 -07:00
Roger Pau Monne	fbe363c476	xen-blkfront: revoke foreign access for grants not mapped by the backend There's no need to keep the foreign access in a grant if it is not persistently mapped by the backend. This allows us to free grants that are not mapped by the backend, thus preventing blkfront from hoarding all grants. The main effect of this is that blkfront will only persistently map the same grants as the backend, and it will always try to use grants that are already mapped by the backend. Also the number of persistent grants in blkfront is the same as in blkback (and is controlled by the value in blkback). Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Matt Wilson <msw@amazon.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:10:27 -07:00
Kent Overstreet	6678d83f18	block: Consolidate duplicated bio_trim() implementations Someone cut and pasted md's md_trim_bio() into xen-blkfront.c. Come on, we should know better than this. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2013-11-08 09:02:31 -07:00
Jens Axboe	f35546e072	Merge branch 'stable/for-jens-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.11/drivers Konrad writes: It has the 'feature-max-indirect-segments' implemented in both backend and frontend. The current problem with the backend and frontend is that the segment size is limited to 11 pages. It means we can at most squeeze in 44kB per request. The ring can hold 32 (next power of two below 36) requests, meaning we can do 1.4M of outstanding requests. Nowadays that is not enough. The problem in the past was addressed in two ways - but neither one went upstream. The first solution to this proposed by Justin from Spectralogic was to negotiate the segment size. This means that the ‘struct blkif_sring_entry’ is now a variable size. It can expand from 112 bytes (cover 11 pages of data - 44kB) to 1580 bytes (256 pages of data - so 1MB). It is a simple extension by just making the array in the request expand from 11 to a variable size negotiated. But it had limits: this extension still limits the number of segments per request to 255 (as the total number must be specified in the request, which only has an 8-bit field for that purpose). The other solution (from Intel - Ronghui) was to create one extra ring that only has the ‘struct blkif_request_segment’ in them. The ‘struct blkif_request’ would be changed to have an index in said ‘segment ring’. There is only one segment ring. This means that the size of the initial ring is still the same. The requests would point to the segment and enumerate out how many of the indexes it wants to use. The limit is of course the size of the segment. If one assumes a one-page segment this means we can in one request cover ~4MB. Those patches were posted as RFC and the author never followed up on the ideas on changing it to be a bit more flexible. There is yet another mechanism that could be employed (which these patches implement) - and it borrows from VirtIO protocol. And that is the ‘indirect descriptors’. This very similar to what Intel suggests, but with a twist. The twist is to negotiate how many of these 'segment' pages (aka indirect descriptor pages) we want to support (in reality we negotiate how many entries in the segment we want to cover, and we module the number if it is bigger than the segment size). This means that with the existing 36 slots in the ring (single page) we can cover: 32 slots * each blkif_request_indirect covers: 512 * 4096 ~= 64M. Since we ample space in the blkif_request_indirect to span more than one indirect page, that number (64M) can be also multiplied by eight = 512MB. Roger Pau Monne took the idea and implemented them in these patches. They work great and the corner cases (migration between backends with and without this extension) work nicely. The backend has a limit right now off how many indirect entries it can handle: one indirect page, and at maximum 256 entries (out of 512 - so 50% of the page is used). That comes out to 32 slots * 256 entries in a indirect page * 1 indirect page per request * 4096 = 32MB. This is a conservative number that can change in the future. Right now it strikes a good balance between giving excellent performance, memory usage in the backend, and balancing the needs of many guests. In the patchset there is also the split of the blkback structure to be per-VBD. This means that the spinlock contention we had with many guests trying to do I/O and all the blkback threads hitting the same lock has been eliminated. Also there are bug-fixes to deal with oddly sized sectors, insane amounts on th ring, and also a security fix (posted earlier).	2013-06-28 16:01:14 +02:00
Roger Pau Monne	294caaf29c	xen-blkfront: set blk_queue_max_hw_sectors correctly Now that indirect segments are enabled blk_queue_max_hw_sectors must be set to match the maximum number of sectors we can handle in a request. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Felipe Franciosi <felipe.franciosi@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-21 15:58:55 -04:00
Stefan Bader	7c4d7d710f	xen/blkback: Use physical sector size for setup Currently xen-blkback passes the logical sector size over xenbus and xen-blkfront sets up the paravirt disk with that logical block size. But newer drives usually have the logical sector size set to 512 for compatibility reasons and would show the actual sector size only in physical sector size. This results in the device being partitioned and accessed in dom0 with the correct sector size, but the guest thinks 512 bytes is the correct block size. And that results in poor performance. To fix this, blkback gets modified to pass also physical-sector-size over xenbus and blkfront to use both values to set up the paravirt disk. I did not just change the passed in sector-size because I am not sure having a bigger logical sector size than the physical one is valid (and that would happen if a newer dom0 kernel hits an older domU kernel). Also this way a domU set up before should still be accessible (just some tools might detect the unaligned setup). [v2: Make xenbus write failure non-fatal] [v3: Use xenbus_scanf instead of xenbus_gather] [v4: Rebased against segment changes] Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-07 17:05:53 -04:00
Konrad Rzeszutek Wilk	2d5dc3ba85	xen-blkfront: Introduce a 'max' module parameter to alter the amount of indirect segments. The max module parameter (by default 32) is the maximum number of segments that the frontend will negotiate with the backend for indirect descriptors. Higher value means more potential throughput but more memory usage. The backend picks the minimum of the frontend and its default backend value. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-04 15:58:35 -04:00
Roger Pau Monne	b7649158a0	xen-blkfront: use a different scatterlist for each request In blkif_queue_request blkfront iterates over the scatterlist in order to set the segments of the request, and in blkif_completion blkfront iterates over the raw request, which makes it hard to know the exact position of the source and destination memory positions. This can be solved by allocating a scatterlist for each request, that will be keep until the request is finished, allowing us to copy the data back to the original memory without having to iterate over the raw request. Oracle-Bug: 16660413 - LARGE ASYNCHRONOUS READS APPEAR BROKEN ON 2.6.39-400 CC: stable@vger.kernel.org Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-and-Tested-by: Anne Milicia <anne.milicia@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-08 08:46:51 -04:00
Al Viro	db2a144bed	block_device_operations->release() should return void The value passed is 0 in all but "it can never happen" cases (and those only in a couple of drivers) and it would've been lost on the way out anyway, even if something tried to pass something meaningful. Just don't bother. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-05-07 02:16:21 -04:00
Roger Pau Monne	402b27f9f2	xen-block: implement indirect descriptors Indirect descriptors introduce a new block operation (BLKIF_OP_INDIRECT) that passes grant references instead of segments in the request. This grant references are filled with arrays of blkif_request_segment_aligned, this way we can send more segments in a request. The proposed implementation sets the maximum number of indirect grefs (frames filled with blkif_request_segment_aligned) to 256 in the backend and 32 in the frontend. The value in the frontend has been chosen experimentally, and the backend value has been set to a sane value that allows expanding the maximum number of indirect descriptors in the frontend if needed. The migration code has changed from the previous implementation, in which we simply remapped the segments on the shared ring. Now the maximum number of segments allowed in a request can change depending on the backend, so we have to requeue all the requests in the ring and in the queue and split the bios in them if they are bigger than the new maximum number of segments. [v2: Fixed minor comments by Konrad. [v1: Added padding to make the indirect request 64bit aligned. Added some BUGs, comments; fixed number of indirect pages in blkif_get_x86_{32/64}_req. Added description about the indirect operation in blkif.h] Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> [v3: Fixed spaces and tabs mix ups] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-04-18 14:16:00 -04:00
Roger Pau Monne	b1173e316b	xen-blkfront: remove frame list from blk_shadow We already have the frame (pfn of the grant page) stored inside struct grant, so there's no need to keep an aditional list of mapped frames for a specific request. This reduces memory usage in blkfront. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xen.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-03-19 12:50:07 -04:00
Roger Pau Monne	9c1e050cae	xen-blkfront: pre-allocate pages for requests This prevents us from having to call alloc_page while we are preparing the request. Since blkfront was calling alloc_page with a spinlock held we used GFP_ATOMIC, which can fail if we are requesting a lot of pages since it is using the emergency memory pools. Allocating all the pages at init prevents us from having to call alloc_page, thus preventing possible failures. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xen.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-03-19 12:50:06 -04:00
Roger Pau Monne	155b7edb51	xen-blkfront: switch from llist to list The git commit `f84adf4921` (xen-blkfront: drop the use of llist_for_each_entry_safe) was a stop-gate to fix a GCC4.1 bug. The appropiate way is to actually use an list instead of using an llist. As such this patch replaces the usage of llist with an list. Since we always manipulate the list while holding the io_lock, there's no need for additional locking (llist used previously is safe to use concurrently without additional locking). Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> CC: stable@vger.kernel.org [v1: Redid the git commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-03-19 10:27:56 -04:00
Mihnea Dobrescu-Balaur	29d0b218c8	xen-blkfront: replace kmalloc and then memcpy with kmemdup The benefits are: * code is cleaner * kmemdup adds additional debugging info useful for tracking the real place where memory was allocated (CONFIG_DEBUG_SLAB). Signed-off-by: Mihnea Dobrescu-Balaur <mihneadb@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-03-18 16:31:31 -04:00
Konrad Rzeszutek Wilk	f84adf4921	xen-blkfront: drop the use of llist_for_each_entry_safe Replace llist_for_each_entry_safe with a while loop. llist_for_each_entry_safe can trigger a bug in GCC 4.1, so it's best to remove it and use a while loop and do the deletion manually. Specifically this bug can be triggered by hot-unplugging a disk, either by doing xm block-detach or by save/restore cycle. BUG: unable to handle kernel paging request at fffffffffffffff0 IP: [<ffffffffa0047223>] blkif_free+0x63/0x130 [xen_blkfront] The crash call trace is: ... bad_area_nosemaphore+0x13/0x20 do_page_fault+0x25e/0x4b0 page_fault+0x25/0x30 ? blkif_free+0x63/0x130 [xen_blkfront] blkfront_resume+0x46/0xa0 [xen_blkfront] xenbus_dev_resume+0x6c/0x140 pm_op+0x192/0x1b0 device_resume+0x82/0x1e0 dpm_resume+0xc9/0x1a0 dpm_resume_end+0x15/0x30 do_suspend+0x117/0x1e0 When drilling down to the assembler code, on newer GCC it does .L29: cmpq $-16, %r12 #, persistent_gnt check je .L30 #, out of the loop .L25: ... code in the loop testq %r13, %r13 # n je .L29 #, back to the top of the loop cmpq $-16, %r12 #, persistent_gnt check movq 16(%r12), %r13 # <variable>.node.next, n jne .L25 #, back to the top of the loop .L30: While on GCC 4.1, it is: L78: ... code in the loop testq %r13, %r13 # n je .L78 #, back to the top of the loop movq 16(%rbx), %r13 # <variable>.node.next, n jmp .L78 #, back to the top of the loop Which basically means that the exit loop condition instead of being: &(pos)->member != NULL; is: ; which makes the loop unbound. Since xen-blkfront is the only user of the llist_for_each_entry_safe macro remove it from llist.h. Orabug: 16263164 CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-02-19 15:17:08 -05:00
Roger Pau Monne	d62f691858	xen-blkfront: handle bvecs with partial data Currently blkfront fails to handle cases in blkif_completion like the following: 1st loop in rq_for_each_segment * bv_offset: 3584 * bv_len: 512 * offset += bv_len * i: 0 2nd loop: * bv_offset: 0 * bv_len: 512 * i: 0 In the second loop i should be 1, since we assume we only wanted to read a part of the previous page. This patches fixes this cases where only a part of the shared page is read, and blkif_completion assumes that if the bv_offset of a bvec is less than the previous bv_offset plus the bv_size we have to switch to the next shared page. Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: linux-kernel@vger.kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-12-17 21:56:03 -05:00
Roger Pau Monne	ebb351cf78	llist/xen-blkfront: implement safe version of llist_for_each_entry Implement a safe version of llist_for_each_entry, and use it in blkif_free. Previously grants where freed while iterating the list, which lead to dereferences when trying to fetch the next item. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> [v2: Move the llist_for_each_entry_safe in llist.h] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-12-17 21:55:56 -05:00
Roger Pau Monne	07c540a0b5	xen-blkfront: free allocated page Free the page allocated for the persistent grant. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-11-26 14:58:11 -05:00
Roger Pau Monne	cb5bd4d19b	xen/blkback: persistent-grants fixes This patch contains fixes for persistent grants implementation v2: * handle == 0 is a valid handle, so initialize grants in blkback setting the handle to BLKBACK_INVALID_HANDLE instead of 0. Reported by Konrad Rzeszutek Wilk. * new_map is a boolean, use "true" or "false" instead of 1 and 0. Reported by Konrad Rzeszutek Wilk. * blkfront announces the persistent-grants feature as feature-persistent-grants, use feature-persistent instead which is consistent with blkback and the public Xen headers. * Add a consistency check in blkfront to make sure we don't try to access segments that have not been set. Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Roger Pau Monne <roger.pau@citrix.com> [v1: The new_map int->bool had already been changed] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-11-04 10:35:40 -05:00
Roger Pau Monne	0a8704a51f	xen/blkback: Persistent grant maps for xen blk drivers This patch implements persistent grants for the xen-blk{front,back} mechanism. The effect of this change is to reduce the number of unmap operations performed, since they cause a (costly) TLB shootdown. This allows the I/O performance to scale better when a large number of VMs are performing I/O. Previously, the blkfront driver was supplied a bvec[] from the request queue. This was granted to dom0; dom0 performed the I/O and wrote directly into the grant-mapped memory and unmapped it; blkfront then removed foreign access for that grant. The cost of unmapping scales badly with the number of CPUs in Dom0. An experiment showed that when Dom0 has 24 VCPUs, and guests are performing parallel I/O to a ramdisk, the IPIs from performing unmap's is a bottleneck at 5 guests (at which point 650,000 IOPS are being performed in total). If more than 5 guests are used, the performance declines. By 10 guests, only 400,000 IOPS are being performed. This patch improves performance by only unmapping when the connection between blkfront and back is broken. On startup blkfront notifies blkback that it is using persistent grants, and blkback will do the same. If blkback is not capable of persistent mapping, blkfront will still use the same grants, since it is compatible with the previous protocol, and simplifies the code complexity in blkfront. To perform a read, in persistent mode, blkfront uses a separate pool of pages that it maps to dom0. When a request comes in, blkfront transmutes the request so that blkback will write into one of these free pages. Blkback keeps note of which grefs it has already mapped. When a new ring request comes to blkback, it looks to see if it has already mapped that page. If so, it will not map it again. If the page hasn't been previously mapped, it is mapped now, and a record is kept of this mapping. Blkback proceeds as usual. When blkfront is notified that blkback has completed a request, it memcpy's from the shared memory, into the bvec supplied. A record that the {gref, page} tuple is mapped, and not inflight is kept. Writes are similar, except that the memcpy is peformed from the supplied bvecs, into the shared pages, before the request is put onto the ring. Blkback stores a mapping of grefs=>{page mapped to by gref} in a red-black tree. As the grefs are not known apriori, and provide no guarantees on their ordering, we have to perform a search through this tree to find the page, for every gref we receive. This operation takes O(log n) time in the worst case. In blkfront grants are stored using a single linked list. The maximum number of grants that blkback will persistenly map is currently set to RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST, to prevent a malicios guest from attempting a DoS, by supplying fresh grefs, causing the Dom0 kernel to map excessively. If a guest is using persistent grants and exceeds the maximum number of grants to map persistenly the newly passed grefs will be mapped and unmaped. Using this approach, we can have requests that mix persistent and non-persistent grants, and we need to handle them correctly. This allows us to set the maximum number of persistent grants to a lower value than RING_SIZE * BLKIF_MAX_SEGMENTS_PER_REQUEST, although setting it will lead to unpredictable performance. In writing this patch, the question arrises as to if the additional cost of performing memcpys in the guest (to/from the pool of granted pages) outweigh the gains of not performing TLB shootdowns. The answer to that question is `no'. There appears to be very little, if any additional cost to the guest of using persistent grants. There is perhaps a small saving, from the reduced number of hypercalls performed in granting, and ending foreign access. Signed-off-by: Oliver Chick <oliver.chick@citrix.com> Signed-off-by: Roger Pau Monne <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v1: Fixed up the misuse of bool as int]	2012-10-30 09:50:04 -04:00
Tejun Heo	43829731dd	workqueue: deprecate flush[_delayed]_work_sync() flush[_delayed]_work_sync() are now spurious. Mark them deprecated and convert all users to flush[_delayed]_work(). If you're cc'd and wondering what's going on: Now all workqueues are non-reentrant and the regular flushes guarantee that the work item is not pending or running on any CPU on return, so there's no reason to use the sync flushes at all and they're going away. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mattia Dongili <malattia@linux.it> Cc: Kent Yoder <key@linux.vnet.ibm.com> Cc: David Airlie <airlied@linux.ie> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: Bryan Wu <bryan.wu@canonical.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: Anton Vorontsov <cbou@mail.ru> Cc: Sangbeom Kim <sbkim73@samsung.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Avi Kivity <avi@redhat.com>	2012-08-20 14:51:24 -07:00
Linus Torvalds	3e9a97082f	This patch series contains a major revamp of how we collect entropy from interrupts for /dev/random and /dev/urandom. The goal is to addresses weaknesses discussed in the paper "Mining your Ps and Qs: Detection of Widespread Weak Keys in Network Devices", by Nadia Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman, which will be published in the Proceedings of the 21st Usenix Security Symposium, August 2012. (See https://factorable.net for more information and an extended version of the paper.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJQF/0DAAoJENNvdpvBGATwIowQAOep9QKtLrBvb2lwIRVmeiy8 lRf7V/tYZnz4FePbR0W92JQfKYkCV8yyOO0bmeRzWL3v4m+lRwDTSyA1DDyQMoH+ LOMzvDKSLJMSXTXdSOIr1WYACphViCR/9CrbMBCKSkYfZLJ1MdaEDxT3rcpTGD0T 6iknUweiSkHHhkerU5yQL7FKzD5kYUe0hsF47w7QVlHRHJsW2fsZqkFoh+RpnhNw 03u+djxNGBo9qV81vZ9D1b0vA9uRlEjoWOOEG2XE4M2iq6TUySueA72dQnCwunfi 3kG/u1Swv2dgq6aRrP3H7zdwhYSourGxziu3jNhEKwKEohrxYY7xjNX3RVeTqP67 AzlKsOTWpRLIDrzjSLlb8VxRQiZewu8Unex3e1G+eo20sbcIObHGrxNp7K00zZvd QZiMHhOwItwFTe4lBO+XbqH2JKbL9/uJmwh5EipMpQTraKO9E6N3CJiUHjzBLo2K iGDZxRMKf4gVJRwDxbbP6D70JPVu8ZJ09XVIpsXQ3Z1xNqaMF0QdCmP3ty56q1o0 NvkSXxPKrijZs8Sk0rVDqnJ3ll8PuDnXMv5eDtL42VT818I5WxESn9djjwEanGv0 TYxbFub/NRxmPEE5B2Js5FBpqsLf5f282OSMeS/5WLBbnHJR1OoPoAhGVpHvxntC bi5FC1OolqhvzVIdsqgt =u7KM -----END PGP SIGNATURE----- Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull random subsystem patches from Ted Ts'o: "This patch series contains a major revamp of how we collect entropy from interrupts for /dev/random and /dev/urandom. The goal is to addresses weaknesses discussed in the paper "Mining your Ps and Qs: Detection of Widespread Weak Keys in Network Devices", by Nadia Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman, which will be published in the Proceedings of the 21st Usenix Security Symposium, August 2012. (See https://factorable.net for more information and an extended version of the paper.)" Fix up trivial conflicts due to nearby changes in drivers/{mfd/ab3100-core.c, usb/gadget/omap_udc.c} * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: (33 commits) random: mix in architectural randomness in extract_buf() dmi: Feed DMI table to /dev/random driver random: Add comment to random_initialize() random: final removal of IRQF_SAMPLE_RANDOM um: remove IRQF_SAMPLE_RANDOM which is now a no-op sparc/ldc: remove IRQF_SAMPLE_RANDOM which is now a no-op [ARM] pxa: remove IRQF_SAMPLE_RANDOM which is now a no-op board-palmz71: remove IRQF_SAMPLE_RANDOM which is now a no-op isp1301_omap: remove IRQF_SAMPLE_RANDOM which is now a no-op pxa25x_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op omap_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op goku_udc: remove IRQF_SAMPLE_RANDOM which was commented out uartlite: remove IRQF_SAMPLE_RANDOM which is now a no-op drivers: hv: remove IRQF_SAMPLE_RANDOM which is now a no-op xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-op n2_crypto: remove IRQF_SAMPLE_RANDOM which is now a no-op pda_power: remove IRQF_SAMPLE_RANDOM which is now a no-op i2c-pmcmsp: remove IRQF_SAMPLE_RANDOM which is now a no-op input/serio/hp_sdc.c: remove IRQF_SAMPLE_RANDOM which is now a no-op mfd: remove IRQF_SAMPLE_RANDOM which is now a no-op ...	2012-07-31 19:07:42 -07:00
Theodore Ts'o	89c30f161c	xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-op With the changes in the random tree, IRQF_SAMPLE_RANDOM is now a no-op; interrupt randomness is now collected unconditionally in a very low-overhead fashion; see commit `775f4b297b`. The IRQF_SAMPLE_RANDOM flag was scheduled to be removed in 2009 on the feature-removal-schedule, so this patch is preparation for the final removal of this flag. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org>	2012-07-19 10:39:25 -04:00
Konrad Rzeszutek Wilk	6878c32e5c	xen/blkfront: Add WARN to deal with misbehaving backends. Part of the ring structure is the 'id' field which is under control of the frontend. The frontend stamps it with "some" value (this some in this implementation being a value less than BLK_RING_SIZE), and when it gets a response expects said value to be in the response structure. We have a check for the id field when spolling new requests but not when de-spolling responses. We also add an extra check in add_id_to_freelist to make sure that the 'struct request' was not NULL - as we cannot pass a NULL to __blk_end_request_all, otherwise that crashes (and all the operations that the response is dealing with end up with __blk_end_request_all). Lastly we also print the name of the operation that failed. [v1: s/BUG/WARN/ suggested by Stefano] [v2: Add extra check in add_id_to_freelist] [v3: Redid op_name per Jan's suggestion] [v4: add const * and add WARN on failure returns] Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-06-12 08:29:04 -04:00
Jan Beulich	8605067fb9	xen-blkfront: module exit handling adjustments The blkdev major must be released upon exit, or else the module can't attach to devices using the same majors upon being loaded again. Also avoid leaking the minor tracking bitmap. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-05-11 16:11:54 -04:00
Jan Beulich	e77c78c022	xen-blkfront: properly name all devices - devices beyond xvdzz didn't get proper names assigned at all - extended devices with minors not representable within the kernel's major/minor bit split spilled into foreign majors Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-05-11 16:11:52 -04:00
Linus Torvalds	c104f1fa1e	Merge branch 'for-3.4/drivers' of git://git.kernel.dk/linux-block Pull block driver bits from Jens Axboe: - A series of fixes for mtip32xx. Most from Asai at Micron, but also one from Greg, getting rid of the dependency on PCIE_HOTPLUG. - A few bug fixes for xen-blkfront, and blkback. - A virtio-blk fix for Vivek, making resize actually work. - Two fixes from Stephen, making larger transfers possible on cciss. This is needed for tape drive support. * 'for-3.4/drivers' of git://git.kernel.dk/linux-block: block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy mtip32xx: dump tagmap on failure mtip32xx: fix handling of commands in various scenarios mtip32xx: Shorten macro names mtip32xx: misc changes mtip32xx: Add new sysfs entry 'status' mtip32xx: make setting comp_time as common mtip32xx: Add new bitwise flag 'dd_flag' mtip32xx: fix error handling in mtip_init() virtio-blk: Call revalidate_disk() upon online disk resize xen/blkback: Make optional features be really optional. xen/blkback: Squash the discard support for 'file' and 'phy' type. mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port() cciss: Fix scsi tape io with more than 255 scatter gather elements cciss: Initialize scsi host max_sectors for tape drive support xen-blkfront: make blkif_io_lock spinlock per-device xen/blkfront: don't put bdev right after getting it xen-blkfront: use bitmap_set() and bitmap_clear() xen/blkback: Enable blkback on HVM guests xen/blkback: use grant-table.c hypercall wrappers	2012-04-13 18:45:13 -07:00
Linus Torvalds	9479f0f801	Two fixes for regressions: * one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree, * the other is to fix drivers to load on PV (a previous patch made them only load in PVonHVM mode). The rest are just minor fixes in the various drivers and some cleanup in the core code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJPfyVUAAoJEFjIrFwIi8fJUjUH/jbY5JavRqSlNELZW2A4Ta76 8p00LqLHw/C56iHZcWKke8mqtWNb+ZfcQt7ZYcxDIYa4QWBL28x0OLAO2tOBIt37 ZjYESWSdFJaJvmpADluWtFyGyZ9TYJllDTBm/jWj1ZtKSZvR1YkhuMXCS0f4AmGQ xFzSWJZUDdiOAqpN+VQD8wP00gfR8knQLg16XE2fvFdQo4XwpCtqLfHV/5pMMGdy Cs/ep6rq/7cdv/nshKOcBnw7RW8l3Xoi/28ht8k3DvAQ2VtFq1Tugv2G9pcCHwQG DIBkB3SOU6/v6P5at5+egKS5xR1fJetCWlkMd8kkbcdz2NPI4UDMkvOW6Q8yQls= =6Ve+ -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two fixes for regressions: * one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree, * the other is to fix drivers to load on PV (a previous patch made them only load in PVonHVM mode). The rest are just minor fixes in the various drivers and some cleanup in the core code." * tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success xen/pciback: fix XEN_PCI_OP_enable_msix result xen/smp: Remove unnecessary call to smp_processor_id() xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries' xen: only check xen_platform_pci_unplug if hvm	2012-04-06 17:54:53 -07:00
Igor Mammedov	e95ae5a493	xen: only check xen_platform_pci_unplug if hvm commit b9136d207f08 xen: initialize platform-pci even if xen_emul_unplug=never breaks blkfront/netfront by not loading them because of xen_platform_pci_unplug=0 and it is never set for PV guest. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-04-06 12:12:52 -04:00
Linus Torvalds	e22057c859	One tiny feature that accidentally got lost in the initial git pull: * Add fast-EOI acking of interrupts (clear a bit instead of hypercall) And bug-fixes: * Fix CPU bring-up code missing a call to notify other subsystems. * Fix reading /sys/hypervisor even if PVonHVM drivers are not loaded. * In Xen ACPI processor driver: remove too verbose WARN messages, fix up the Kconfig dependency to be a module by default, and add dependency on CPU_FREQ. * Disable CPU frequency drivers from loading when booting under Xen (as we want the Xen ACPI processor to be used instead). * Cleanups in tmem code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJPbc3DAAoJEFjIrFwIi8fJTQkIAMnH2fPhcHAb4mNaz+3gdmsZ Flo6V1gMBcO8xKZlUkFgKKPYoOm7lLmvoceXLVSH5oOKSnSJo1zSinzKmcdJQo/D kPo4/EguNwtzcAcQh2dmT6/IM9O3ihMKUli7Oajif9PLCFFFqTaG3Y3YNBo/rxTY D3HAnJrIfmIyG0NpLnaFCWhCzUvcB4M7ysutECqcF8l5gnbHxRVeCKD0blM+n9GH Wyum00dQCwo6h6wTduhPOAxHAM4rncyR3heOB2vDxq9YJHSUhhcva5QCgQ+tdUVt 6U2TQT1L2Px8iXXzr2w9YBpepOVajZReoKhajLjJ5VbkpBZFz5dVNfJ8LpF8RV8= =z8IB -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.4-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull more xen updates from Konrad Rzeszutek Wilk: "One tiny feature that accidentally got lost in the initial git pull: * Add fast-EOI acking of interrupts (clear a bit instead of hypercall) And bug-fixes: * Fix CPU bring-up code missing a call to notify other subsystems. * Fix reading /sys/hypervisor even if PVonHVM drivers are not loaded. * In Xen ACPI processor driver: remove too verbose WARN messages, fix up the Kconfig dependency to be a module by default, and add dependency on CPU_FREQ. * Disable CPU frequency drivers from loading when booting under Xen (as we want the Xen ACPI processor to be used instead). * Cleanups in tmem code." * tag 'stable/for-linus-3.4-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/acpi: Fix Kconfig dependency on CPU_FREQ xen: initialize platform-pci even if xen_emul_unplug=never xen/smp: Fix bringup bug in AP code. xen/acpi: Remove the WARN's as they just create noise. xen/tmem: cleanup xen: support pirq_eoi_map xen/acpi-processor: Do not depend on CPU frequency scaling drivers. xen/cpufreq: Disable the cpu frequency scaling drivers from loading. provide disable_cpufreq() function to disable the API.	2012-03-24 12:20:25 -07:00
Igor Mammedov	b9136d207f	xen: initialize platform-pci even if xen_emul_unplug=never When xen_emul_unplug=never is specified on kernel command line reading files from /sys/hypervisor is broken (returns -EBUSY). It is caused by xen_bus dependency on platform-pci and platform-pci isn't initialized when xen_emul_unplug=never is specified. Fix it by allowing platform-pci to ignore xen_emul_unplug=never, and do not intialize xen_[blk\|net]front instead. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-22 11:37:11 -04:00
Steven Noonan	3467811e26	xen-blkfront: make blkif_io_lock spinlock per-device This patch moves the global blkif_io_lock to the per-device structure. The spinlock seems to exists for two reasons: to disable IRQs when in the interrupt handlers for blkfront, and to protect the blkfront VBDs when a detachment is requested. Having a global blkif_io_lock doesn't make sense given the use case, and it drastically hinders performance due to contention. All VBDs with pending IOs have to take the lock in order to get work done, which serializes everything pretty badly. Signed-off-by: Steven Noonan <snoonan@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Andrew Jones	dad5cf659b	xen/blkfront: don't put bdev right after getting it We should hang onto bdev until we're done with it. Signed-off-by: Andrew Jones <drjones@redhat.com> [v1: Fixed up git commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Akinobu Mita	34ae2e47d9	xen-blkfront: use bitmap_set() and bitmap_clear() Use bitmap_set and bitmap_clear rather than modifying individual bits in a memory region. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Linus Torvalds	16008d6416	Merge branch 'for-3.3/drivers' of git://git.kernel.dk/linux-block * 'for-3.3/drivers' of git://git.kernel.dk/linux-block: mtip32xx: do rebuild monitoring asynchronously xen-blkfront: Use kcalloc instead of kzalloc to allocate array mtip32xx: uninitialized variable in mtip_quiesce_io() mtip32xx: updates based on feedback xen-blkback: convert hole punching to discard request on loop devices xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io xen/blk[front\|back]: Enhance discard support with secure erasing support. xen/blk[front\|back]: Squash blkif_request_rw and blkif_request_discard together mtip32xx: update to new ->make_request() API mtip32xx: add module.h include to avoid conflict with moduleh tree mtip32xx: mark a few more items static mtip32xx: ensure that all local functions are static mtip32xx: cleanup compat ioctl handling mtip32xx: fix warnings/errors on 32-bit compiles block: Add driver for Micron RealSSD pcie flash cards	2012-01-15 12:48:41 -08:00
Jan Beulich	73db144b58	Xen: consolidate and simplify struct xenbus_driver instantiation The 'name', 'owner', and 'mod_name' members are redundant with the identically named fields in the 'driver' sub-structure. Rather than switching each instance to specify these fields explicitly, introduce a macro to simplify this. Eliminate further redundancy by allowing the drvname argument to DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from the ID table will be used for .driver.name). Also eliminate the questionable xenbus_register_{back,front}end() wrappers - their sole remaining purpose was the checking of the 'owner' field, proper setting of which shouldn't be an issue anymore when the macro gets used. v2: Restore DRV_NAME for the driver name in xen-pciback. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-01-04 17:01:17 -05:00
Thomas Meyer	f094148a17	xen-blkfront: Use kcalloc instead of kzalloc to allocate array The advantage of kcalloc is, that will prevent integer overflows which could result from the multiplication of number of elements and size and it is also a bit nicer to read. The semantic patch that makes this change is available in https://lkml.org/lkml/2011/11/25/107 Signed-off-by: Thomas Meyer <thomas@m3y3r.de> [v1: Seperated the drivers/block/cciss_scsi.c out of this patch] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-12-16 12:36:52 -05:00
Konrad Rzeszutek Wilk	5ea4298669	xen/blk[front\|back]: Enhance discard support with secure erasing support. Part of the blkdev_issue_discard(xx) operation is that it can also issue a secure discard operation that will permanantly remove the sectors in question. We advertise that we can support that via the 'discard-secure' attribute and on the request, if the 'secure' bit is set, we will attempt to pass in REQ_DISCARD \| REQ_SECURE. CC: Li Dongyang <lidongyang@novell.com> [v1: Used 'flag' instead of 'secure:1' bit] [v2: Use 'reserved' uint8_t instead of adding a new value] [v3: Check for nseg when mapping instead of operation] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-11-18 13:28:01 -05:00
Konrad Rzeszutek Wilk	97e36834f5	xen/blk[front\|back]: Squash blkif_request_rw and blkif_request_discard together In a union type structure to deal with the overlapping attributes in a easier manner. Suggested-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-11-18 13:27:59 -05:00
Linus Torvalds	3d0a8d10cf	Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-block * 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits) virtio-blk: use ida to allocate disk index hpsa: add small delay when using PCI Power Management to reset for kump cciss: add small delay when using PCI Power Management to reset for kump xen/blkback: Fix two races in the handling of barrier requests. xen/blkback: Check for proper operation. xen/blkback: Fix the inhibition to map pages when discarding sector ranges. xen/blkback: Report VBD_WSECT (wr_sect) properly. xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests. xen-blkfront: plug device number leak in xlblk_init() error path xen-blkfront: If no barrier or flush is supported, use invalid operation. xen-blkback: use kzalloc() in favor of kmalloc()+memset() xen-blkback: fixed indentation and comments xen-blkfront: fix a deadlock while handling discard response xen-blkfront: Handle discard requests. xen-blkback: Implement discard requests ('feature-discard') xen-blkfront: add BLKIF_OP_DISCARD and discard request struct drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd() drivers/block/loop.c: emit uevent on auto release drivers/block/cpqarray.c: use pci_dev->revision loop: always allow userspace partitions and optionally support automatic scanning ... Fic up trivial header file includsion conflict in drivers/block/loop.c	2011-11-04 17:22:14 -07:00
Laszlo Ersek	469738e675	xen-blkfront: plug device number leak in xlblk_init() error path ... though after a failed xenbus_register_frontend() all may be lost. Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-10-13 09:48:35 -04:00
Konrad Rzeszutek Wilk	d11e615830	xen-blkfront: If no barrier or flush is supported, use invalid operation. Guard against issuing BLKIF_OP_WRITE_BARRIER or BLKIF_OP_FLUSH_CACHE by checking whether we successfully negotiated with the backend. The negotiation with the backend also sets the q->flush_flags which fortunately for us is also used when submitting an bio to us. If we don't support barriers or flushes it would be set to zero so we should never end up having to deal with REQ_FLUSH \| REQ_FUA. However, other third party implementations of __make_request that might be stacked on top of us might not be so smart, so lets fix this up. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-10-13 09:48:35 -04:00
Li Dongyang	69ef68cef9	xen-blkfront: fix a deadlock while handling discard response When we get -EOPNOTSUPP response for a discard request, we will clear the discard flag on the request queue so we won't attempt to send discard requests to backend again, and this should be protected under rq->queue_lock. However, when we setup the request queue, we pass blkif_io_lock to blk_init_queue so rq->queue_lock is blkif_io_lock indeed, and this lock is already taken when we are in blkif_interrpt, so remove the spin_lock/spin_unlock when we clear the discard flag or we will end up with deadlock here Signed-off-by: Li Dongyang <lidongyang@novell.com> [v1: Updated description a bit and removed comment from source] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-10-13 09:48:32 -04:00
Li Dongyang	ed30bf317c	xen-blkfront: Handle discard requests. If the backend advertises 'feature-discard', then interrogate the backend for alignment and granularity. Setup the request queue with the appropiate values and send the discard operation as required. Signed-off-by: Li Dongyang <lidongyang@novell.com> [v1: Amended commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-10-13 09:48:31 -04:00
Stefan Bader	89153b5cae	xen-blkfront: Fix one off warning about name clash Avoid telling users to use xvde and onwards when using xvde. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-07-14 14:19:51 -04:00
Stefan Bader	196cfe2ae8	xen-blkfront: Drop name and minor adjustments for emulated scsi devices These were intended to avoid the namespace clash when representing emulated IDE and SCSI devices. However that seems to confuse users more than expected (a disk defined as sda becomes xvde). So for now go back to the scheme which does no adjustments. This will break when mixing IDE and SCSI names in the configuration of guests but should be by now expected. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-07-14 14:19:33 -04:00
Konrad Rzeszutek Wilk	edf6ef59ec	xen-blkfront: Introduce BLKIF_OP_FLUSH_DISKCACHE support. If the backend supports the 'feature-flush-cache' mode, use that instead of the 'feature-barrier' support. Currently there are three backends that support the 'feature-flush-cache' mode: NetBSD, Solaris and Linux kernel. The 'flush' option is much light-weight version than the 'barrier' support so lets try to use as there are no filesystems in the kernel that use full barriers anymore. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 08:56:03 -04:00
Marek Marczykowski	4352b47ab7	xen-blkfront: fix data size for xenbus_gather in blkfront_connect barrier variable is int, not long. This overflow caused another variable override: "err" (in PV code) and "binfo" (in xenlinux code - drivers/xen/blkfront/blkfront.c). The later caused incorrect device flags (RO/removable etc). Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> [v1: Changed title] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 08:55:51 -04:00
Linus Torvalds	76ca078328	Merge branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: suspend: remove xen_hvm_suspend xen: suspend: pull pre/post suspend hooks out into suspend_info xen: suspend: move arch specific pre/post suspend hooks into generic hooks xen: suspend: refactor non-arch specific pre/post suspend hooks xen: suspend: add "arch" to pre/post suspend hooks xen: suspend: pass extra hypercall argument via suspend_info struct xen: suspend: refactor cancellation flag into a structure xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding xen: switch to new schedop hypercall by default. xen: use new schedop interface for suspend xen: do not respond to unknown xenstore control requests xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled xen: PV on HVM: support PV spinlocks and IPIs xen: make the ballon driver work for hvm domains xen-blkfront: handle Xen major numbers other than XENVBD xen: do not use xen_info on HVM, set pv_info name to "Xen HVM" xen: no need to delay xen_setup_shutdown_event for hvm guests anymore	2011-03-15 10:59:09 -07:00
Owen Smith	51de69523f	xen: Union the blkif_request request specific fields Prepare for extending the block device ring to allow request specific fields, by moving the request specific fields for reads, writes and barrier requests to a union member. Acked-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Owen Smith <owen.smith@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-03-08 15:07:00 -05:00
Stefano Stabellini	c80a420995	xen-blkfront: handle Xen major numbers other than XENVBD This patch makes sure blkfront handles correctly virtual device numbers corresponding to Xen emulated IDE and SCSI disks: in those cases blkfront translates the major number to XENVBD and the minor number to a low xvd minor. Note: this behaviour is different from what old xenlinux PV guests used to do: they used to steal an IDE or SCSI major number and use it instead. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>	2011-02-25 16:43:05 +00:00
Linus Torvalds	23d69b09b7	Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.	2011-01-07 16:58:04 -08:00
Tejun Heo	30d65030fd	xen: don't use flush_scheduled_work() flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush info->work instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-12-24 15:59:06 +01:00
Jeremy Fitzhardinge	667c78afae	xen: Provide a variant of __RING_SIZE() that is an integer constant expression Without this, gcc 4.5 won't compile xen-netfront and xen-blkfront, where this is being used to specify array sizes. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Miller <davem@davemloft.net> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-12-15 12:34:28 -08:00
Christoph Hellwig	02e031cbc8	block: remove REQ_HARDBARRIER REQ_HARDBARRIER is dead now, so remove the leftovers. What's left at this point is: - various checks inside the block layer. - sanity checks in bio based drivers. - now unused bio_empty_barrier helper. - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while, but Xen really needs to sort out it's barrier situaton. - setting of ordered tags in uas - dead code copied from old scsi drivers. - scsi different retry for barriers - it's dead and should have been removed when flushes were converted to FS requests. - blktrace handling of barriers - removed. Someone who knows blktrace better should add support for REQ_FLUSH and REQ_FUA, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-11-10 14:54:09 +01:00
Jeremy Fitzhardinge	dcb8baecea	xen/blkfront: cope with backend that fail empty BLKIF_OP_WRITE_BARRIER requests Some(?) Xen block backends fail BLKIF_OP_WRITE_BARRIER requests, which Linux uses as a cache flush operation. In that case, disable use of FLUSH. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Daniel Stodden <daniel.stodden@citrix.com>	2010-11-02 13:46:46 -04:00
Jeremy Fitzhardinge	be2f8373c1	xen/blkfront: Implement FUA with BLKIF_OP_WRITE_BARRIER The BLKIF_OP_WRITE_BARRIER is a full ordered barrier, so we can use it to implement FUA as well as a plain FLUSH. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Christoph Hellwig <hch@lst.de>	2010-11-02 11:27:59 -04:00
Jeremy Fitzhardinge	a945b9801a	xen/blkfront: change blk_shadow.request to proper pointer Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-02 11:27:58 -04:00
Jeremy Fitzhardinge	c64e38ea17	xen/blkfront: map REQ_FLUSH into a full barrier Implement a flush as a full barrier, since we have nothing weaker. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Christoph Hellwig <hch@lst.de>	2010-11-02 10:43:51 -04:00
Linus Torvalds	18cb657ca1	Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: register xen pci notifier xen: initialize cpu masks for pv guests in xen_smp_init xen: add a missing #include to arch/x86/pci/xen.c xen: mask the MTRR feature from the cpuid xen: make hvc_xen console work for dom0. xen: add the direct mapping area for ISA bus access xen: Initialize xenbus for dom0. xen: use vcpu_ops to setup cpu masks xen: map a dummy page for local apic and ioapic in xen_set_fixmap xen: remap MSIs into pirqs when running as initial domain xen: remap GSIs as pirqs when running as initial domain xen: introduce XEN_DOM0 as a silent option xen: map MSIs into pirqs xen: support GSI -> pirq remapping in PV on HVM guests xen: add xen hvm acpi_register_gsi variant acpi: use indirect call to register gsi in different modes xen: implement xen_hvm_register_pirq xen: get the maximum number of pirqs from xen xen: support pirq != irq * 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits) X86/PCI: Remove the dependency on isapnp_disable. xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright. x86: xen: Sanitse irq handling (part two) swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it. MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer. xen/pci: Request ACS when Xen-SWIOTLB is activated. xen-pcifront: Xen PCI frontend driver. xenbus: prevent warnings on unhandled enumeration values xenbus: Xen paravirtualised PCI hotplug support. xen/x86/PCI: Add support for the Xen PCI subsystem x86: Introduce x86_msi_ops msi: Introduce default_[teardown\|setup]_msi_irqs with fallback. x86/PCI: Export pci_walk_bus function. x86/PCI: make sure _PAGE_IOMAP it set on pci mappings x86/PCI: Clean up pci_cache_line_size xen: fix shared irq device passthrough xen: Provide a variant of xen_poll_irq with timeout. xen: Find an unbound irq number in reverse order (high to low). xen: statically initialize cpu_evtchn_mask_p ... Fix up trivial conflicts in drivers/pci/Makefile	2010-10-28 17:11:17 -07:00
Linus Torvalds	a2887097f2	Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits) xen-blkfront: disable barrier/flush write support Added blk-lib.c and blk-barrier.c was renamed to blk-flush.c block: remove BLKDEV_IFL_WAIT aic7xxx_old: removed unused 'req' variable block: remove the BH_Eopnotsupp flag block: remove the BLKDEV_IFL_BARRIER flag block: remove the WRITE_BARRIER flag swap: do not send discards as barriers fat: do not send discards as barriers ext4: do not send discards as barriers jbd2: replace barriers with explicit flush / FUA usage jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier jbd: replace barriers with explicit flush / FUA usage nilfs2: replace barriers with explicit flush / FUA usage reiserfs: replace barriers with explicit flush / FUA usage gfs2: replace barriers with explicit flush / FUA usage btrfs: replace barriers with explicit flush / FUA usage xfs: replace barriers with explicit flush / FUA usage block: pass gfp_mask and flags to sb_issue_discard dm: convey that all flushes are processed as empty ...	2010-10-22 17:07:18 -07:00
Jens Axboe	005a1d15f5	xen-blkfront: disable barrier/flush write support The driver doesn't handle empty flushes. Disable barrier/flush write support until this is fixed up. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-10-22 10:58:33 +02:00
Jens Axboe	fa251f8990	Merge branch 'v2.6.36-rc8' into for-2.6.37/barrier Conflicts: block/blk-core.c drivers/block/loop.c mm/swapfile.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-10-19 09:13:04 +02:00
Noboru Iwamatsu	b78c951256	xenbus: prevent warnings on unhandled enumeration values XenbusStateReconfiguring/XenbusStateReconfigured were introduced by c/s 437, but aren't handled in many switch statements. .. also pulled from the linux-2.6-sparse-tree tree. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-10-18 10:49:36 -04:00
Arnd Bergmann	2a48fc0ab2	block: autoconvert trivial BKL users to private mutex The block device drivers have all gained new lock_kernel calls from a recent pushdown, and some of the drivers were already using the BKL before. This turns the BKL into a set of per-driver mutexes. Still need to check whether this is safe to do. file=$1 name=$2 if grep -q lock_kernel ${file} ; then if grep -q 'include.linux.mutex.h' ${file} ; then sed -i '/include.<linux\/smp_lock.h>/d' ${file} else sed -i 's/include.<linux\/smp_lock.h>.$/include <linux\/mutex.h>/g' ${file} fi sed -i ${file} \ -e "/^#include.linux.mutex.h/,$ { 1,/^$static\\|int\\|long$/ { /^$static\\|int\\|long$/istatic DEFINE_MUTEX(${name}_mutex); } }" \ -e "s/$un$lock_kernel\>[ ]()/mutex_\1lock(\&${name}_mutex)/g" \ -e '/[ ]cycle_kernel_lock();/d' else sed -i -e '/include.*\<smp_lock.h\>/d' ${file} \ -e '/cycle_kernel_lock()/d' fi Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-05 15:01:10 +02:00
Tejun Heo	4913efe456	block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush() Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA requests. Deprecate barrier. All REQ_HARDBARRIERs are failed with -EOPNOTSUPP and blk_queue_ordered() is replaced with simpler blk_queue_flush(). blk_queue_flush() takes combinations of REQ_FLUSH and FUA. If a device has write cache and can flush it, it should set REQ_FLUSH. If the device can handle FUA writes, it should also set REQ_FUA. All blk_queue_ordered() users are converted. * ORDERED_DRAIN is mapped to 0 which is the default value. * ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH. * ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH \| REQ_FUA. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Boaz Harrosh <bharrosh@panasas.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-10 12:35:36 +02:00
Tejun Heo	6958f14545	block: kill QUEUE_ORDERED_BY_TAG Nobody is making meaningful use of ORDERED_BY_TAG now and queue draining for barrier requests will be removed soon which will render the advantage of tag ordering moot. Kill ORDERED_BY_TAG. The following users are affected. * brd: converted to ORDERED_DRAIN. * virtio_blk: ORDERED_TAG path was already marked deprecated. Removed. * xen-blkfront: ORDERED_TAG case dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-10 12:35:36 +02:00
Ian Campbell	1dc7ce99b0	xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary It is not immediately clear what this option causes to become ignored. The actual meaning is that it is not necessary to unplug the emulated devices to safely use the PV ones, even if the platform does not support the unplug protocol. (pressumably the user will only add this option if they have ensured that their domain configuration is safe). I think xen_emul_unplug=unnecessary better captures this. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>	2010-08-23 11:59:29 +01:00
Linus Torvalds	2f9e825d3e	Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block: (149 commits) block: make sure that REQ_* types are seen even with CONFIG_BLOCK=n xen-blkfront: fix missing out label blkdev: fix blkdev_issue_zeroout return value block: update request stacking methods to support discards block: fix missing export of blk_types.h writeback: fix bad _bh spinlock nesting drbd: revert "delay probes", feature is being re-implemented differently drbd: Initialize all members of sync_conf to their defaults [Bugz 315] drbd: Disable delay probes for the upcomming release writeback: cleanup bdi_register writeback: add new tracepoints writeback: remove unnecessary init_timer call writeback: optimize periodic bdi thread wakeups writeback: prevent unnecessary bdi threads wakeups writeback: move bdi threads exiting logic to the forker thread writeback: restructure bdi forker loop a little writeback: move last_active to bdi writeback: do not remove bdi from bdi_list writeback: simplify bdi code a little writeback: do not lose wake-ups in bdi threads ... Fixed up pretty trivial conflicts in drivers/block/virtio_blk.c and drivers/scsi/scsi_error.c as per Jens.	2010-08-10 15:22:42 -07:00
Jens Axboe	a4cc14ec9f	xen-blkfront: fix missing out label Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-08 21:50:05 -04:00
Jeremy Fitzhardinge	7901d14144	xen/blkfront: Use QUEUE_ORDERED_DRAIN for old backends If there's no feature-barrier key in xenstore, then it means its a fairly old backend which does uncached in-order writes, which means ORDERED_DRAIN is appropriate. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:52:53 +02:00
Jeremy Fitzhardinge	4dab46ff26	xen/blkfront: use tagged queuing for barriers When barriers are supported, then use QUEUE_ORDERED_TAG to tell the block subsystem that it doesn't need to do anything else with the barriers. Previously we used ORDERED_DRAIN which caused the block subsystem to drain all pending IO before submitting the barrier, which would be very expensive. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:52:53 +02:00
Daniel Stodden	d54142c71f	blkfront: Klog the unclean release path Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:51:21 +02:00
Daniel Stodden	7b32d1044a	blkfront: Remove obsolete info->users This is just bd_openers, protected by the bd_mutex. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:49:20 +02:00
Daniel Stodden	acfca3c622	blkfront: Remove obsolete info->users This is just bd_openers, protected by the bd_mutex. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:47:26 +02:00
Daniel Stodden	fa1bd3591a	blkfront: Lock blockfront_info during xbdev removal Same approach as blkfront_closing: * Grab the bdev safely, holding the info mutex. * Zap xbdev safely, holding the info mutex. * Try bdev removal safely, holding bd_mutex. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:45:27 +02:00
Daniel Stodden	7fd152f4b6	blkfront: Fix blkfront backend switch race (bdev release) We cannot read backend state within bdev operations, because it risks grabbing the state change before xenbus gets to do it. Fixed by tracking deferral with a frontend switch to Closing. State exposure isn't strictly necessary, but the backends won't mind. For a 'clean' deferral this seems actually a more decent protocol than raising errors. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:45:12 +02:00
Daniel Stodden	139617437a	blkfront: Fix blkfront backend switch race (bdev open) We need not mind if users grab a late handle on a closing disk. We probably even should not. But we have to make sure it's not a dead one already Let the bdev deal with a gendisk deleted under its feet. Takes the info mutex to decide a race against backend closing. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:38:43 +02:00
Daniel Stodden	b70f5fa043	blkfront: Lock blkfront_info when closing The bdev .open/.release fops race against backend switches to Closing, handled by the XenBus thread. The original code attempted to serialize block device holders and xenbus only via bd_mutex. This is insufficient, the info->bd pointer may already be stale (or null) while xenbus tries to bump up the refcount. Protect blkfront_info with a dedicated mutex. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:38:43 +02:00
Daniel Stodden	a66b5aebb7	blkfront: Clean up vbd release * Current blkfront_closing is rather a xlvbd_release_gendisk. Renamed in preparation of later patches (need the name again). * Removed the misleading comment -- this only applied to the backend switch handler, and the queue is already flushed btw. * Break out the xenbus call, callers know better when to switch frontend state. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:38:43 +02:00
Daniel Stodden	9897cb5323	blkfront: Fix gendisk leak Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:31:37 +02:00
Daniel Stodden	89de1669ac	blkfront: Fix backtrace in del_gendisk The call to del_gendisk follows an non-refcounted gd->queue pointer. We release the last ref in blk_cleanup_queue. Fixed by reordering releases accordingly. Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:31:35 +02:00
K. Y. Srinivasan	2def141e71	xen/blkfront: revalidate after setting capacity Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:31:31 +02:00
Jeremy Fitzhardinge	b4dddb498c	xen/blkfront: avoid compiler warning from missing cases Fix: drivers/block/xen-blkfront.c: In function ‘blkfront_connect’: drivers/block/xen-blkfront.c:933: warning: enumeration value ‘BLKIF_STATE_DISCONNECTED’ not handled in switch Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:31:29 +02:00
K. Y. Srinivasan	1fa73be6be	xen/front: Propagate changed size of VBDs Support dynamic resizing of virtual block devices. This patch supports both file backed block devices as well as physical devices that can be dynamically resized on the host side. Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:31:27 +02:00
Jan Beulich	5d7ed20e82	blkfront: don't access freed struct xenbus_device Unfortunately commit "blkfront: fixes for 'xm block-detach ... --force'" still wasn't quite right - there was a reference to freed memory left from blkfront_closing(). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:31:12 +02:00
Jan Beulich	0e34582699	blkfront: fixes for 'xm block-detach ... --force' Prevent prematurely freeing 'struct blkfront_info' instances (when the xenbus data structures are gone, but the Linux ones are still needed). Prevent adding a disk with the same (major, minor) [and hence the same name and sysfs entries, which leads to oopses] when the previous instance wasn't fully de-allocated yet. This still doesn't address all issues resulting from forced detach: I/O submitted after the detach still blocks forever, likely preventing subsequent un-mounting from completing. It's not clear to me (not knowing much about the block layer) how this can be avoided. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:28:55 +02:00
Ian Campbell	203fd61f42	xen: use less generic names in blkfront driver. All Xen frontend drivers have a couple of identically named functions which makes figuring out which device went wrong from a stacktrace harder than it needs to be. Rename them to something specificto the device type. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-08-07 18:26:39 +02:00
Arnd Bergmann	6e9624b8ca	block: push down BKL into .open and .release The open and release block_device_operations are currently called with the BKL held. In order to change that, we must first make sure that all drivers that currently rely on this have no regressions. This blindly pushes the BKL into all .open and .release operations for all block drivers to prepare for the next step. The drivers can subsequently replace the BKL with their own locks or remove it completely when it can be shown that it is not needed. The functions blkdev_get and blkdev_put are the only remaining users of the big kernel lock in the block layer, besides a few uses in the ioctl code, none of which need to serialize with blkdev_{get,put}. Most of these two functions is also under the protection of bdev->bd_mutex, including the actual calls to ->open and ->release, and the common code does not access any global data structures that need the BKL. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:25:34 +02:00
Arnd Bergmann	8a6cfeb6de	block: push down BKL into .locked_ioctl As a preparation for the removal of the big kernel lock in the block layer, this removes the BKL from the common ioctl handling code, moving it into every single driver still using it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:25:00 +02:00
FUJITA Tomonori	00fff26539	block: remove q->prepare_flush_fn completely This removes q->prepare_flush_fn completely (changes the blk_queue_ordered API). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:24:15 +02:00
Christoph Hellwig	33659ebbae	block: remove wrappers for request type/flags Remove all the trivial wrappers for the cmd_type and cmd_flags fields in struct requests. This allows much easier grepping for different request types instead of unwinding through macros. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:17:56 +02:00
Stefano Stabellini	b98a409b80	blkfront: do not create a PV cdrom device if xen_hvm_guest It is not possible to unplug emulated cdrom devices, and PV cdroms don't handle media insert, eject and stream, so we are better off disabling PV cdroms when running as a Xen HVM guest. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-07-29 11:11:08 -07:00
Stefano Stabellini	c1c5413ad5	x86: Unplug emulated disks and nics. Add a xen_emul_unplug command line option to the kernel to unplug xen emulated disks and nics. Set the default value of xen_emul_unplug depending on whether or not the Xen PV frontends and the Xen platform PCI driver have been compiled for this kernel (modules or built-in are both OK). The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM even if the host platform doesn't support unplug. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:25 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Martin K. Petersen	8a78362c4e	block: Consolidate phys_segment and hw_segment limits Except for SCSI no device drivers distinguish between physical and hardware segment limits. Consolidate the two into a single segment limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-02-26 13:58:08 +01:00
Martin K. Petersen	086fa5ff08	block: Rename blk_queue_max_sectors to blk_queue_max_hw_sectors The block layer calling convention is blk_queue_<limit name>. blk_queue_max_sectors predates this practice, leading to some confusion. Rename the function to appropriately reflect that its intended use is to set max_hw_sectors. Also introduce a temporary wrapper for backwards compability. This can be removed after the merge window is closed. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-02-26 13:58:08 +01:00
Márton Németh	ec9c42ec79	block: make xenbus device id constant The ids field of the struct xenbus_device_id is constant in <linux/xen/xenbus.h> so it is worth to make blkfront_ids also constant. The semantic match that finds this kind of pattern is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ disable decl_init,const_decl_init; identifier I1, I2, x; @@ struct I1 { ... const struct I2 *x; ... }; @s@ identifier r.I1, y; identifier r.x, E; @@ struct I1 y = { .x = E, }; @c@ identifier r.I2; identifier s.E; @@ const struct I2 E[] = ... ; @depends on !c@ identifier r.I2; identifier s.E; @@ + const struct I2 E[] = ...; // </smpl> Signed-off-by: Márton Németh <nm127@freemail.hu> Cc: Julia Lawall <julia@diku.dk> Cc: cocci@diku.dk Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-01-11 14:31:27 +01:00
Jeremy Fitzhardinge	1ccbf5344c	xen: move Xen-testing predicates to common header Move xen_domain and related tests out of asm-x86 to xen/xen.h so they can be included whenever they are necessary. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:24 -08:00
Alexey Dobriyan	83d5cde47d	const: make block_device_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Greg Kroah-Hartman	a1b4b12b37	xen block: remove driver_data direct access of struct device In the near future, the driver core is going to not allow direct access to the driver_data pointer in struct device. Instead, the functions dev_get_drvdata() and dev_set_drvdata() should be used. These functions have been around since the beginning, so are backwards compatible with all older kernel versions. Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.osdl.org Acked-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-06-15 21:30:27 -07:00
Martin K. Petersen	e1defc4ff0	block: Do away with the notion of hardsect_size Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:54 +02:00
Jens Axboe	9bd7de51ee	Merge branch 'master' into for-2.6.31 Conflicts: drivers/ide/ide-io.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 20:28:35 +02:00
Roel Kluin	b9ed7252d2	xen-blkfront: beyond ARRAY_SIZE of info->shadow Do not go beyond ARRAY_SIZE of info->shadow Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 09:59:51 +02:00
Ian Campbell	31a14400e8	xen/blkfront: fix warning when deleting gendisk on unplug/shutdown Currently blkfront gives a warning when hot unplugging due to calling del_gendisk() with interrupts disabled (due to blkif_io_lock). WARNING: at kernel/softirq.c:124 local_bh_enable+0x36/0x84() Modules linked in: xenfs xen_netfront ext3 jbd mbcache xen_blkfront Pid: 13, comm: xenwatch Not tainted 2.6.29-xs5.5.0.13 #3 Call Trace: [<c012611c>] warn_slowpath+0x80/0xb6 [<c0104cf1>] xen_sched_clock+0x16/0x63 [<c0104710>] xen_force_evtchn_callback+0xc/0x10 [<c0104e32>] check_events+0x8/0xe [<c0104d9b>] xen_restore_fl_direct_end+0x0/0x1 [<c0103749>] xen_mc_flush+0x10a/0x13f [<c0105bd2>] __switch_to+0x114/0x14e [<c011d92b>] dequeue_task+0x62/0x70 [<c0123b6f>] finish_task_switch+0x2b/0x84 [<c0299877>] schedule+0x66d/0x6e7 [<c0104710>] xen_force_evtchn_callback+0xc/0x10 [<c0104710>] xen_force_evtchn_callback+0xc/0x10 [<c012a642>] local_bh_enable+0x36/0x84 [<c022f9a7>] sk_filter+0x57/0x5c [<c0233dae>] netlink_broadcast+0x1d5/0x315 [<c01c6371>] kobject_uevent_env+0x28d/0x331 [<c01e7ead>] device_del+0x10f/0x120 [<c01e7ec6>] device_unregister+0x8/0x10 [<c015f86d>] bdi_unregister+0x2d/0x39 [<c01bf6f4>] unlink_gendisk+0x23/0x3e [<c01ac946>] del_gendisk+0x7b/0xe7 [<d0828c19>] blkfront_closing+0x28/0x6e [xen_blkfront] [<d082900c>] backend_changed+0x3ad/0x41d [xen_blkfront] We can fix this by calling del_gendisk() later in blkfront_closing, after releasing blkif_io_lock. Since the queue is stopped during the interrupts disabled phase I don't think there is any danger of an event occuring between releasing the blkif_io_lock and deleting the disk. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-19 08:27:42 +02:00
Ian Campbell	28afea5b2f	xen/blkfront: allow xenbus state transition to Closing->Closed when not Connected This situation can occur when attempting to attach a block device whose backend is an empty physical CD-ROM driver. The backend in this case will go directly from the Initialising state to Closing->Closed. Previously this would result in a NULL pointer deref on info->gd (xenbus_dev_fatal does not return as `a1a15ac5` seems to expect) Cc: stable@kernel.org Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-19 08:25:48 +02:00
Tejun Heo	9934c8c045	block: implement and enforce request peek/start/fetch Till now block layer allowed two separate modes of request execution. A request is always acquired from the request queue via elv_next_request(). After that, drivers are free to either dequeue it or process it without dequeueing. Dequeue allows elv_next_request() to return the next request so that multiple requests can be in flight. Executing requests without dequeueing has its merits mostly in allowing drivers for simpler devices which can't do sg to deal with segments only without considering request boundary. However, the benefit this brings is dubious and declining while the cost of the API ambiguity is increasing. Segment based drivers are usually for very old or limited devices and as converting to dequeueing model isn't difficult, it doesn't justify the API overhead it puts on block layer and its more modern users. Previous patches converted all block low level drivers to dequeueing model. This patch completes the API transition by... * renaming elv_next_request() to blk_peek_request() * renaming blkdev_dequeue_request() to blk_start_request() * adding blk_fetch_request() which is combination of peek and start * disallowing completion of queued (not started) requests * applying new API to all LLDs Renamings are for consistency and to break out of tree code so that it's apparent that out of tree drivers need updating. [ Impact: block request issue API cleanup, no functional change ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Mike Miller <mike.miller@hp.com> Cc: unsik Kim <donari75@gmail.com> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Laurent Vivier <Laurent@lvivier.info> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Stefan Weinhuber <wein@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:52:18 +02:00
Tejun Heo	296b2f6ae6	block: convert to dequeueing model (easy ones) plat-omap/mailbox, floppy, viocd, mspro_block, i2o_block and mmc/card/queue are already pretty close to dequeueing model and can be converted with simple changes. Convert them. While at it, * xen-blkfront: !fs check moved downwards to share dequeue call with normal path. * mspro_block: __blk_end_request(..., blk_rq_cur_byte()) converted to __blk_end_request_cur() * mmc/card/queue: loop of __blk_end_request() converted to __blk_end_request_all() [ Impact: dequeue in-flight request ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Alex Dubov <oakad@yahoo.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:52:17 +02:00
Tejun Heo	83096ebf12	block: convert to pos and nr_sectors accessors With recent cleanups, there is no place where low level driver directly manipulates request fields. This means that the 'hard' request fields always equal the !hard fields. Convert all rq->sectors, nr_sectors and current_nr_sectors references to accessors. While at it, drop superflous blk_rq_pos() < 0 test in swim.c. [ Impact: use pos and nr_sectors accessors ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Mike Miller <mike.miller@hp.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Dario Ballabio <ballabio_dario@emc.com> Cc: David S. Miller <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: unsik Kim <donari75@gmail.com> Cc: Laurent Vivier <Laurent@lvivier.info> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:54 +02:00
Tejun Heo	f06d9a2b52	block: replace end_request() with [__]blk_end_request_cur() end_request() has been kept around for backward compatibility; however, it's about time for it to go away. * There aren't too many users left. * Its use of @updtodate is pretty confusing. * In some cases, newer code ends up using mixture of end_request() and [__]blk_end_request[_all](), which is way too confusing. So, add [__]blk_end_request_cur() and replace end_request() with it. Most conversions are straightforward. Noteworthy ones are... * paride/pcd: next_request() updated to take 0/-errno instead of 1/0. * paride/pf: pf_end_request() and next_request() updated to take 0/-errno instead of 1/0. * xd: xd_readwrite() updated to return 0/-errno instead of 1/0. * mtd/mtd_blkdevs: blktrans_discard_request() updated to return 0/-errno instead of 1/0. Unnecessary local variable res initialization removed from mtd_blktrans_thread(). [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Joerg Dorchain <joerg@dorchain.net> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Laurent Vivier <Laurent@lvivier.info> Cc: Tim Waugh <tim@cyberelk.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: unsik Kim <donari75@gmail.com>	2009-04-28 07:37:36 +02:00
Tejun Heo	40cbbb781d	block: implement and use [__]blk_end_request_all() There are many [__]blk_end_request() call sites which call it with full request length and expect full completion. Many of them ensure that the request actually completes by doing BUG_ON() the return value, which is awkward and error-prone. This patch adds [__]blk_end_request_all() which takes @rq and @error and fully completes the request. BUG_ON() is added to to ensure that this actually happens. Most conversions are simple but there are a few noteworthy ones. * cdrom/viocd: viocd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/block/dasd: dasd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/char/tape_block: tapeblock_end_request() replaced with direct calls to blk_end_request_all(). [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Miller <mike.miller@hp.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-04-28 07:37:35 +02:00
Kris Shannon	a1a15ac5f9	Fix kernel NULL pointer dereference in xen-blkfront When booting Xen Dom0 on a pre-release 3.2.1 hypervisor the system Oopses on a "Unable to handle kernel NULL pointer dereference" in xenwatch. From the backtrace it looks like backend_changed is calling bdget_disk with a NULL pointer. Checking for NULL and returning ENODEV instead allows the kernel to boot.	2009-03-05 12:04:57 +01:00
Jens Axboe	9e973e64ac	xen/blkfront: use blk_rq_map_sg to generate ring entries On occasion, the request will apparently have more segments than we fit into the ring. Jens says: > The second problem is that the block layer then appears to create one > too many segments, but from the dump it has rq->nr_phys_segments == > BLKIF_MAX_SEGMENTS_PER_REQUEST. I suspect the latter is due to > xen-blkfront not handling the merging on its own. It should check that > the new page doesn't form part of the previous page. The > rq_for_each_segment() iterates all single bits in the request, not dma > segments. The "easiest" way to do this is to call blk_rq_map_sg() and > then iterate the mapped sg list. That will give you what you are > looking for. > Here's a test patch, compiles but otherwise untested. I spent more > time figuring out how to enable XEN than to code it up, so YMMV! > Probably the sg list wants to be put inside the ring and only > initialized on allocation, then you can get rid of the sg on stack and > sg_init_table() loop call in the function. I'll leave that, and the > testing, to you. [Moved sg array into info structure, and initialize once. -J] Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-02-26 10:45:48 +01:00
Fernando Luis Vázquez Cao	66d352e1e4	xen-blkfront: set queue paravirt flag Xen's blkfront sets noop as the default I/O scheduler at initialization time to avoid elevator overheads such as idling, but with the advent of basic disk profiling capabilities this is not necessary anymore. We should just tell the block layer that we are a paravirt front-end driver and the elevator will automatically make the necessary adjustments. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:41 +01:00
Zhaolei	68aee07f9b	Release old elevator on change elevator We should release old elevator when change to use a new one. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-11-18 15:08:56 +01:00
Al Viro	a63c848b04	[PATCH] switch xen Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:48:13 -04:00
Al Viro	d4430d62fa	[PATCH] beginning of methods conversion To keep the size of changesets sane we split the switch by drivers; to keep the damn thing bisectable we do the following: 1) rename the affected methods, add ones with correct prototypes, make (few) callers handle both. That's this changeset. 2) for each driver convert to new methods. ALL drivers are converted in this series. 3) kill the old (renamed) methods. Note that it _is_ a flagday; all in-tree drivers are converted and by the end of this series no trace of old methods remain. The only reason why we do that this way is to keep the damn thing bisectable and allow per-driver debugging if anything goes wrong. New methods: open(bdev, mode) release(disk, mode) ioctl(bdev, mode, cmd, arg) /* Called without BKL / compat_ioctl(bdev, mode, cmd, arg) locked_ioctl(bdev, mode, cmd, arg) / Called with BKL, legacy */ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-21 07:47:32 -04:00
Ingo Molnar	365d46dc9b	Merge branch 'linus' into x86/xen Conflicts: arch/x86/kernel/cpu/common.c arch/x86/kernel/process_64.c arch/x86/xen/enlighten.c	2008-10-12 12:37:32 +02:00
Chris Lalancette	9246b5f06d	block: Expand Xen blkfront for > 16 xvd Until recently, the maximum number of xvd block devices you could attach to a Xen domU was 16. This limitation turned out to be problematic for some users, so it was expanded to handle a much larger number of disks. However, this requires a couple of changes in the way that blkfront scans for disks. This functionality is already present in the Xen linux-2.6.18-xen.hg tree; the attached patch adds this functionality to the mainline xen-blkfront implementation. I successfully tested it on a 2.6.25 tree, and build tested it on 2.6.27-rc3. Signed-off-by: Chris Lalancette <clalance@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:18 +02:00
Jeremy Fitzhardinge	6e833587e1	xen: clean up domain mode predicates There are four operating modes Xen code may find itself running in: - native - hvm domain - pv dom0 - pv domU Clean up predicates for testing for these states to make them more consistent. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Xen-devel <xen-devel@lists.xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-20 12:40:07 +02:00
Adrian Bunk	62aa0054da	xen-blkfront.c: make blkif_ioctl() static This patch makes the needlessly global blkif_ioctl() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-06 12:30:04 +02:00
Ian Campbell	a144ff09bc	xen: Avoid allocations causing swap activity on the resume path Avoid allocations causing swap activity on the resume path by preventing the allocations from doing IO and allowing them to access the emergency pools. These paths are used when a frontend device is trying to connect to its backend driver over Xenbus. These reconnections are triggered on demand by IO, so by definition there is already IO underway, and further IO would naturally deadlock. On resume, this path is triggered when the running system tries to continue using its devices. If it cannot then the resume will fail; to try to avoid this we let it dip into the emergency pools. [ linux-2.6.18-xen changesets e8b49cfbdac, fdb998e79aba ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:13 +02:00
Jan Beulich	5a60d0cd4f	xen/blkfront: add __exit to module_exit() handlers Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:13 +02:00
Wim Colgate	04c0635058	xen/blkfront: Make sure that the device is fully ready before allowing release. [ linux-2.6.18-xen changeset c1c57fea77e9 ] Signed-off-by: Wim Colgate <wim@xensource.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:13 +02:00
Christian Limpach	440a01a7f4	xen/blkfront: Add the CDROM_GET_CAPABILITY ioctl to blkfront. Return 0 instead of -EINVAL if the blkfront device is a cdrom, i.e. had the VDISK_CDROM attribute. This allows udev's cdrom_id to correctly detect the device as a cdrom device. [ Add blkif_ioctl, and CDROMMULTISESSION ] [ linux-2.6.18-xen changeset d2bd9af846b5 ] Signed-off-by: Christian Limpach <Christian.Limpach@xensource.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:13 +02:00
Ian Campbell	1c91fe1a0d	xen/blkfront: Make sure we don't use bounce buffers, we don't need them. [ linux-2.6.18-xen changeset 667228bf8fc5 ] Signed-off-by: Ian Campbell <ian.campbell@xensource.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-07-03 13:21:12 +02:00
Harvey Harrison	afe42d7dea	xen: make blkif_getgeo static Introduced between 2.6.25-rc2 and -rc3 drivers/block/xen-blkfront.c:139:5: warning: symbol 'blkif_getgeo' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
Mark McLoughlin	4f93f09b72	xen: Add compatibility aliases for frontend drivers Before getting merged, xen-blkfront was xenblk and xen-netfront was xennet. Temporarily adding compatibility module aliases eases upgrades from older versions by e.g. allowing mkinitrd to find the new version of the module. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Mark McLoughlin	d2f0c52bec	xen: Module autoprobing support for frontend drivers Add module aliases to support autoprobing modules for xen frontend devices. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Christian Limpach	1d78d70556	xen blkfront: Delay wait for block devices until after the disk is added When the xen block frontend driver is built as a module the module load is only synchronous up to the point where the frontend and the backend become connected rather than when the disk is added. This means that there can be a race on boot between loading the module and loading the dm-* modules and doing the scan for LVM physical volumes (all in the initrd). In the failure case the disk is not present until after the scan for physical volumes is complete. Taken from: http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/11483a00c017 Signed-off-by: Christian Limpach <Christian.Limpach@xensource.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Jeremy Fitzhardinge	53f0e8afcb	xen/blkfront: use bdget_disk info->dev is never initialized to anything, so bdget(info->dev) is meaningless. Get rid of info->dev, and use bdget_disk on the gendisk. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Markus Armbruster	3e334239d8	xen: Make xen-blkfront write its protocol ABI to xenstore Frontends are expected to write their protocol ABI to xenstore. Since the protocol ABI defaults to the backend's native ABI, things work fine without that as long as the frontend's native ABI is identical to the backend's native ABI. This is not the case for xen-blkfront running 32-on-64, because its ABI differs between 32 and 64 bit, and thus needs this fix. Based on http://xenbits.xensource.com/xen-unstable.hg?rev/c545932a18f3 and http://xenbits.xensource.com/xen-unstable.hg?rev/ffe52263b430 by Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Ian Campbell	597592d951	xen: Implement getgeo for Xen virtual block device. The below implements the getgeo hook for Xen block devices. Extracted from the xen-unstable tree where it has been used for ages. It is useful to have because it allows things like grub2 (used by the Debian installer images) to work in a guest domain without having to sprinkle Xen specific hacks around the place. Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-21 16:19:13 -08:00
Kiyoshi Ueda	f530f03637	blk_end_request: changing xen-blkfront (take 4) This patch converts xen-blkfront to use blk_end_request interfaces. Related 'uptodate' arguments are converted to 'error'. Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:36:46 +01:00
Jens Axboe	6c92e699b5	Fixup rq_for_each_segment() indentation Remove one level of nesting where appropriate. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
NeilBrown	5705f70217	Introduce rq_for_each_segment replacing rq_for_each_bio Every usage of rq_for_each_bio wraps a usage of bio_for_each_segment, so these can be combined into rq_for_each_segment. We define "struct req_iterator" to hold the 'bio' and 'index' that are needed for the double iteration. Signed-off-by: Neil Brown <neilb@suse.de> Various compile fixes by me... Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
Jens Axboe	165125e1e4	[BLOCK] Get rid of request_queue_t typedef Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweet of the kernel and get rid of this typedef and replace its uses with the proper type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-24 09:28:11 +02:00
Jeremy Fitzhardinge	9f27ee5950	xen: add virtual block device driver. The block device frontend driver allows the kernel to access block devices exported exported by a virtual machine containing a physical block device driver. Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Greg KH <greg@kroah.com> Cc: Jens Axboe <axboe@kernel.dk>	2007-07-18 08:47:45 -07:00

1 2 3 4 5

246 Commits