linux

Author	SHA1	Message	Date
Jonas Gorski	19c860d932	MIPS: BCM63XX: Add PCIe Support for BCM6328 Add support for the PCIe port found on BCM6328. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: linux-mips@linux-mips.org Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Florian Fainelli <florian@openwrt.org> Cc: Kevin Cernekee <cernekee@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/3956/ Reviewed-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:13 +02:00
Jonas Gorski	76f42fe811	MIPS: BCM63XX: Move the PCI initialization into its own function Also make the cpu check a bit more explicit. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: linux-mips@linux-mips.org Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Florian Fainelli <florian@openwrt.org> Cc: Kevin Cernekee <cernekee@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/3953/ Reviewed-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:13 +02:00
Jonas Gorski	e5766aea5b	MIPS: BCM63XX: Add basic BCM6328 support This includes CPU speed, memory size detection and working UART, but lacking the appropriate drivers, no support for attached flash. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: linux-mips@linux-mips.org Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Florian Fainelli <florian@openwrt.org> Cc: Kevin Cernekee <cernekee@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/3951/ Reviewed-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:12 +02:00
Jonas Gorski	288752a8aa	MIPS: BCM63XX: Use the Chip ID register for identifying the SoC Newer BCM63XX SoCs use virtually the same CPU ID, differing only in the revision bits. But since they all have the Chip ID register at the same location, we can use that to identify the SoC we are running on. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: linux-mips@linux-mips.org Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Florian Fainelli <florian@openwrt.org> Cc: Kevin Cernekee <cernekee@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/3955/ Reviewed-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:12 +02:00
Jonas Gorski	aaf3fedb56	MIPS: BCM63XX: Add flash type detection On BCM6358 and BCM6368 the attached flash type is exposed through a bootstrapping register. Use it for auto detecting the flash type on those and default to parallel flash for earlier SoCs. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: linux-mips@linux-mips.org Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Florian Fainelli <florian@openwrt.org> Cc: Kevin Cernekee <cernekee@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/3954/ Reviewed-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:11 +02:00
Jonas Gorski	4b897d5483	MIPS: BCM63XX: Move flash registration out of board_bcm963xx.c board_bcm963xx.c is already large enough. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Cc: linux-mips@linux-mips.org Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Florian Fainelli <florian@openwrt.org> Cc: Kevin Cernekee <cernekee@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/3952/ Reviewed-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:11 +02:00
Florian Fainelli	553072b27e	hw_random: add Broadcom BCM63xx RNG driver Signed-off-by: Florian Fainelli <florian@openwrt.org> Cc: linux-mips@linux-mips.org Cc: mpm@selenic.com Cc: herbert@gondor.apana.org.au Patchwork: https://patchwork.linux-mips.org/patch/3327/ Patchwork: https://patchwork.linux-mips.org/patch/4072/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:11 +02:00
Florian Fainelli	b73ab84199	MIPS: BCM63XX: add RNG driver platform_device stub Signed-off-by: Florian Fainelli <florian@openwrt.org> Cc: linux-mips@linux-mips.org Cc: mpm@selenic.com Cc: herbert@gondor.apana.org.au Patchwork: https://patchwork.linux-mips.org/patch/3325/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:10 +02:00
Florian Fainelli	8aecfe9462	MIPS: BCM63XX: add RNG peripheral definitions Signed-off-by: Florian Fainelli <florian@openwrt.org> Cc: linux-mips@linux-mips.org Cc: mpm@selenic.com Cc: herbert@gondor.apana.org.au Patchwork: https://patchwork.linux-mips.org/patch/3326/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:10 +02:00
Florian Fainelli	0b55561bc6	MIPS: BCM63XX: add support for "ipsec" clock This module is only available on BCM6368 so far and does not require resetting the block. Signed-off-by: Florian Fainelli <florian@openwrt.org> Cc: linux-mips@linux-mips.org Cc: mpm@selenic.com Cc: herbert@gondor.apana.org.au Patchwork: https://patchwork.linux-mips.org/patch/3324/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:09 +02:00
David Daney	a03822ea5d	MIPS: OCTEON: Remove some unused files. These FPA related files are not used anywhere in the kernel. Remove them. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/3892/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:09 +02:00
Florian Fainelli	94c58b7f23	MIPS: BCM63XX: Fix platform_devices id There is only one watchdog and VoIP DSP platform devices per board, use -1 as the platform_device id accordingly. Signed-off-by: Florian Fainelli <florian@openwrt.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/3313/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2012-07-24 16:33:09 +02:00
Lars Ellenberg	a73ff3231d	drbd: announce FLUSH/FUA capability to upper layers Unconditionally announce FLUSH/FUA to upper layers. If the lower layers on either node do not actually support this, generic_make_request() will deal with it. If this causes performance regressions on your setup, make sure there are no volatile caches involved, and mount -o nobarrier or equivalent. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 15:14:28 +02:00
Lars Ellenberg	db141b2f42	drbd: fix max_bio_size to be unsigned We capped our max_bio_size respectively max_hw_sectors with min_t(int, lower level limit, our limit); unfortunately, some drivers, e.g. the kvm virtio block driver, initialize their limits to "-1U", and that is of course a smaller "int" value than our limit. Impact: we started to request 16 MB resync requests, which lead to protocol error and a reconnect loop. Fix all relevant constants and parameters to be unsigned int. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 15:14:00 +02:00
Lars Ellenberg	7ee1fb93f3	drbd: flush drbd work queue before invalidate/invalidate remote If you do back to back wait-sync/invalidate on a Primary in a tight loop, during application IO load, you could trigger a race: kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending? Fix this by changing the order of the drbd_queue_work() and the wake_up() in dec_ap_pending(), and adding the additional drbd_flush_workqueue() before requesting the full sync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:15:58 +02:00
Lars Ellenberg	c12e9c8964	drbd: fix potential access after free Occasionally, if we disconnect, we triggered this assert: block drbd7: ASSERT FAILED tl_hash[27] == c30b0f04, expected NULL hlist_del() happens only on master bio completion. We used to wait for pending IO to complete before freeing tl_hash on disconnect. We no longer do so, since we learned to "freeze" IO on disconnect. If the local disk is too slow, we may reach C_STANDALONE early, and there are still some requests pending locally when we call drbd_free_tl_hash(). If we now free the tl_hash, and later the local IO completion completes the master bio, which then does hlist_del() and clobbers freed memory. Do hlist_del_init() and hlist_add_fake() before kfree(tl_hash), so the hlist_del() on master bio completion is harmless. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:15:16 +02:00
Laurent Pinchart	fb604a3d58	i2c-omap: Add support for I2C_M_STOP message flag Generate a stop condition after each message marked with I2C_M_STOP. [JD: Add I2C_FUNC_PROTOCOL_MANGLING.] Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:59 +02:00
Laurent Pinchart	72fc2c7f78	i2c: Fall back to emulated SMBus if the operation isn't supported natively Adapter drivers might support only a subset of the SMBus operations natively. Those drivers currently have to manually emulate unsupported operations using I2C. Make the i2c_smbus_xfer() function fall back to i2c_smbus_xfer_emulated() when the adapter's .smbus_xfer() operation returns -EOPNOTSUPP, like it already does when the .smbus_xfer() operation isn't available at all. [JD: Minor optimization.] Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:59 +02:00
Laurent Pinchart	d47726c521	i2c: Add SCCB support SCCB is a serial communication bus developed by Omnivision. Its 2-wire mode is very similar to SMBus byte data transactions, but requires the controller to ignore the ACK bit and to insert a stop condition after each message. Add a device SCCB flag and a message stop flag to be passed to controller drivers. [JD: Kill rogue definition in go7007 driver.] Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:59 +02:00
Emmanuel Deloget	68a7602f09	i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter Robofuzz OSIF is a generic USB/iIC interface that embeds an ATMega8A AVR-RISC microcontroler. The device is based upon Till Harbaum's i2c-tiny-usb and although it enhances the original design with further functionnalities it still maintain compatibility with it with respect to the USB/I2C interface. Signed-off-by: Emmanuel Deloget <logout@free.fr> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:59 +02:00
Daniel Kurtz	d3ff6ce400	i2c-i801: Enable IRQ for byte_by_byte transactions Byte-by-byte transactions are used primarily for accessing I2C devices with an SMBus controller. For these transactions, for each byte that is read or written, the SMBus controller generates a BYTE_DONE IRQ. The isr reads/writes the next byte, and clears the IRQ flag to start the next byte. On the penultimate IRQ, the isr also sets the LAST_BYTE flag. There is no locking around the cmd/len/count/data variables, since the I2C adapter lock ensures there is never multiple simultaneous transactions for the same device, and the driver thread never accesses these variables while interrupts might be occurring. The end result is faster I2C block read and write transactions. Note: This patch has only been tested and verified by doing I2C read and write block transfers on Cougar Point 6 Series PCH, as well as I2C read block transfers on ICH5. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:59 +02:00
Jean Delvare	29b608540b	i2c-i801: Enable interrupts on ICH5/7/8/9/10 Enable interrupts on more devices. ICH5, ICH7(-M) and ICH10 have been tested to work OK. ICH8 and ICH9 are expected to work just fine as they are very close to ICH7 and ICH10. Ultimately we want to enable this feature on at least every device since the ICH5, but for now we limit the exposure. We'll enable it for other devices if we don't get negative feedback. As a bonus, let the user know when interrupts are used. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Daniel Kurtz <djkurtz@chromium.org>	2012-07-24 14:13:59 +02:00
Daniel Kurtz	636752bcb5	i2c-i801: Enable IRQ for SMBus transactions Add a new 'feature' to i2c-i801 to enable using PCI interrupts. When the feature is enabled, then an isr is installed for the device's PCI IRQ. An I2C/SMBus transaction is always terminated by one of the following interrupt sources: FAILED, BUS_ERR, DEV_ERR, or on success: INTR. When the isr fires for one of these cases, it sets the ->status variable and wakes up the waitq. The waitq then saves off the status code, and clears ->status (in preparation for some future transaction). The SMBus controller generates an INTR irq at the end of each transaction where INTREN was set in the HST_CNT register. No locking is needed around accesses to priv->status since all writes to it are serialized: it is only ever set once in the isr at the end of a transaction, and cleared while no interrupts can occur. In addition, the I2C adapter lock guarantees that entire I2C transactions for a single adapter are always serialized. For this patch, the INTREN bit is set only for SMBus block, byte and word transactions, but not for I2C reads or writes. The use of the DS (BYTE_DONE) interrupt with byte-by-byte I2C transactions is implemented in a subsequent patch. The interrupt feature has only been enabled for COUGARPOINT hardware. In addition, it is disabled if SMBus is using the SMI# interrupt. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:58 +02:00
Jean Delvare	6cad93c4bb	i2c-i801: Consolidate polling (Based on earlier work by Daniel Kurtz.) Come up with a consistent, driver-wide strategy for event polling. For intermediate steps of byte-by-byte block transactions, check for BYTE_DONE or any error flag being set. At the end of every transaction (regardless of PEC being used), check for both BUSY being cleared and INTR or any error flag being set. This ensures proper action for all transaction types. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Daniel Kurtz <djkurtz@chromium.org>	2012-07-24 14:13:58 +02:00
Daniel Kurtz	37af871112	i2c-i801: Drop ENABLE_INT9 Later patches enable interrupts. This preliminary patch removes the older unsupported ENABLE_INT9 flag. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:58 +02:00
Daniel Kurtz	edbeea6383	i2c-i801: Rename some SMBHSTCNT bit constants Rename the SMBHSTCNT register bit access constants to match the style of other register bits. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:58 +02:00
Daniel Kurtz	70a1cc1952	i2c-i801: Check and return errors during byte-by-byte transfers If an error is detected in the polling loop, abort the transaction and return an error code. * DEV_ERR is set if the device does not respond with an acknowledge, and the SMBus controller times out (minimum 25ms). * BUS_ERR is set if a bus arbitration collision is detected. In other words, when the SMBus controller tries to generate a START condition, but detects that the SMBDATA is being held low, usually by another SMBus/I2C master. * FAILED is only set if a transaction is stopped by software (using the SMBHSTCNT KILL bit). Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:58 +02:00
Daniel Kurtz	0ba8b8bfd5	i2c-i801: Clear only status bits in HST_STS Writing back the whole status register could clear unwanted bits. In particular, it could clear the "INUSE_STS" bit, which is a 'hardware semaphore', that might be useful to use some day. To prepare for this, let's ban writing back the whole status to register HST_STS, of which this is the only instance. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:57 +02:00
Daniel Kurtz	efa3cb15ad	i2c-i801: Refactor use of LAST_BYTE in i801_block_transaction_byte_by_byte As a slight optimization, pull some logic out of the polling loop during byte-by-byte transactions by just setting the I801_LAST_BYTE bit, as defined in the i801 (PCH) datasheet, when reading the last byte of a byte-by-byte I2C_SMBUS_READ. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:57 +02:00
Fabio Estevam	fda2f4af37	i2c-smbus: Use module_i2c_driver() Using module_i2c_driver() makes the code smaller and cleaner. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:57 +02:00
Jean Delvare	9cd3f2e849	i2c/writing-clients: Mention module_i2c_driver() Based on a previous patch from Peter Meerwald. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Peter Meerwald <p.meerwald@bct-electronic.com>	2012-07-24 14:13:57 +02:00
Andrew Armenia	2a2f7404a1	i2c-piix4: Support AMD auxiliary SMBus controller Some AMD chipsets, such as the SP5100, have an auxiliary SMBus controller with a second set of registers. This patch adds support for this auxiliary controller. Tested on ASUS KCMA-D8 motherboard. Signed-off-by: Andrew Armenia <andrew@asquaredlabs.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:57 +02:00
Andrew Armenia	e154bf6fbf	i2c-piix4: Separate registration and probing code Some chipsets have multiple sets of SMBus registers each controlling a separate SMBus. Supporting these chipsets properly will require registering multiple I2C adapters for one piix4. The code to initialize and register the i2c_adapter structure has been separated from piix4_probe and allows registration of a piix4 adapter given its base address. Note that the i2c_adapter and i2c_piix4_adapdata structures are now dynamically allocated. Signed-off-by: Andrew Armenia <andrew@asquaredlabs.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:56 +02:00
Andrew Armenia	14a8086d27	i2c-piix4: Eliminate piix4_smba global variable Some chipsets have multiple sets of piix4-compatible SMBus registers. Eliminating the global variable will allow these chipsets to be fully supported. Return value from piix4_setup and piix4_sb800_setup now returns the smba value detected. This is stored in a struct i2c_piix4_adapdata. Thus the global variable is eliminated. Signed-off-by: Andrew Armenia <andrew@asquaredlabs.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:56 +02:00
Axel Lin	56f2178898	i2c/busses: Use module_pci_driver Convert the drivers in drivers/i2c/busses/* to use module_pci_driver() macro which makes the code smaller and a bit simpler. Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Rudolf Marek <r.marek@assembler.cz> Cc: Olof Johansson <olof@lixom.net> Cc: "Mark M. Hoffman" <mhoffman@lightlink.com> Cc: Tomoya MORINAGA <tomoya.rohm@gmail.com>	2012-07-24 14:13:56 +02:00
Guenter Roeck	83a638df36	i2c: Update Guenter Roeck's e-mail address My old e-mail address won't be valid for much longer. Time to update it. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2012-07-24 14:13:56 +02:00
Lars Ellenberg	63a6d0bb3d	drbd: call local-io-error handler early In case we want to hard-reset from the local-io-error handler, we need to call it before notifying the peer or aborting local IO. Otherwise the peer will advance its data generation UUIDs even if secondary. This way, local io error looks like a "regular" node crash, which reduces the number of different failure cases. This may be useful in a bigger picture where crashed or otherwise "misbehaving" nodes are automatically re-deployed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:10:41 +02:00
Lars Ellenberg	0029d62434	drbd: do not reset rs_pending_cnt too early Fix asserts like block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 ! We reset the resync lru cache and related information (rs_pending_cnt), once we successfully finished a resync or online verify, or if the replication connection is lost. We also need to reset it if a resync or online verify is aborted because a lower level disk failed. In that case the replication link is still established, and we may still have packets queued in the network buffers which want to touch rs_pending_cnt. We do not have any synchronization mechanism to know for sure when all such pending resync related packets have been drained. To avoid this counter to go negative (and violate the ASSERT that it will always be >= 0), just do not reset it when we lose a disk. It is good enough to make sure it is re-initialized before the next resync can start: reset it when we re-attach a disk. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:09:53 +02:00
Lars Ellenberg	88437879fb	drbd: reset congestion information before reporting it in /proc/drbd We cache the congestion status in mdev->congestion_reason whenever drbd_congested() was called. Reset this cached info before reporting it when reading /proc/drbd. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:07:48 +02:00
Lars Ellenberg	c2ba686f35	drbd: report congestion if we are waiting for some userland callback If the drbd worker thread is synchronously waiting for some userland callback, we don't want some casual pageout to block on us. Have drbd_congested() report congestion in that case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:07:18 +02:00
Lars Ellenberg	383606e0de	drbd: differentiate between normal and forced detach Aborting local requests (not waiting for completion from the lower level disk) is dangerous: if the master bio has been completed to upper layers, data pages may be re-used for other things already. If local IO is still pending and later completes, this may cause crashes or corrupt unrelated data. Only abort local IO if explicitly requested. Intended use case is a lower level device that turned into a tarpit, not completing io requests, not even doing error completion. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:06:18 +02:00
Lars Ellenberg	d264580145	drbd: cleanup, remove two unused global flags Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-07-24 14:02:41 +02:00
Peter Zijlstra	8323f26ce3	sched: Fix race in task_group() Stefan reported a crash on a kernel before `a3e5d1091c` ("sched: Don't call task_group() too many times in set_task_rq()"), he found the reason to be that the multiple task_group() invocations in set_task_rq() returned different values. Looking at all that I found a lack of serialization and plain wrong comments. The below tries to fix it using an extra pointer which is updated under the appropriate scheduler locks. Its not pretty, but I can't really see another way given how all the cgroup stuff works. Reported-and-tested-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1340364965.18025.71.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:58:20 +02:00
Srivatsa Vaddagiri	88b8dac0a1	sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task Current load balance scheme requires only one cpu in a sched_group (balance_cpu) to look at other peer sched_groups for imbalance and pull tasks towards itself from a busy cpu. Tasks thus pulled by balance_cpu could later get picked up by cpus that are in the same sched_group as that of balance_cpu. This scheme however fails to pull tasks that are not allowed to run on balance_cpu (but are allowed to run on other cpus in its sched_group). That can affect fairness and in some worst case scenarios cause starvation. Consider a two core (2 threads/core) system running tasks as below: Core0 Core1 / \ / \ C0 C1 C2 C3 \| \| \| \| v v v v F0 T1 F1 [idle] T2 F0 = SCHED_FIFO task (pinned to C0) F1 = SCHED_FIFO task (pinned to C2) T1 = SCHED_OTHER task (pinned to C1) T2 = SCHED_OTHER task (pinned to C1 and C2) F1 could become a cpu hog, which will starve T2 unless C1 pulls it. Between C0 and C1 however, C0 is required to look for imbalance between cores, which will fail to pull T2 towards Core0. T2 will starve eternally in this case. The same scenario can arise in presence of non-rt tasks as well (say we replace F1 with high irq load). We tackle this problem by having balance_cpu move pinned tasks to one of its sibling cpus (where they can run). We first check if load balance goal can be met by ignoring pinned tasks, failing which we retry move_tasks() with a new env->dst_cpu. This patch modifies load balance semantics on who can move load towards a given cpu in a given sched_domain. Before this patch, a given_cpu or a ilb_cpu acting on behalf of an idle given_cpu is responsible for moving load to given_cpu. With this patch applied, balance_cpu can in addition decide on moving some load to a given_cpu. 
There is a remote possibility that excess load could get moved as a result of this (balance_cpu and given_cpu/ilb_cpu deciding independently and at same time to move some load to a given_cpu). However we should see less of such conflicting decisions in practice and moreover subsequent load balance cycles should correct the excess load moved to given_cpu. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FE06CDB.2060605@linux.vnet.ibm.com [ minor edits ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:58:06 +02:00
Prashanth Nageshappa	bbf18b1949	sched: Reset loop counters if all tasks are pinned and we need to redo load balance While load balancing, if all tasks on the source runqueue are pinned, we retry after excluding the corresponding source cpu. However, loop counters env.loop and env.loop_break are not reset before retrying, which can lead to failure in moving the tasks. In this patch we reset env.loop and env.loop_break to their inital values before we retry. Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FE06EEF.2090709@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:55:37 +02:00
Prashanth Nageshappa	85c1e7dae1	sched: Reorder 'struct lb_env' members to reduce its size Members of 'struct lb_env' are not in appropriate order to reuse compiler added padding on 64bit architectures. In this patch we reorder those struct members and help reduce the size of the structure from 96 bytes to 80 bytes on 64 bit architectures. Suggested-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FE06DDE.7000403@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:55:20 +02:00
Mike Galbraith	970e178985	sched: Improve scalability via 'CPU buddies', which withstand random perturbations Traversing an entire package is not only expensive, it also leads to tasks bouncing all over a partially idle and possible quite large package. Fix that up by assigning a 'buddy' CPU to try to motivate. Each buddy may try to motivate that one other CPU, if it's busy, tough, it may then try its SMT sibling, but that's all this optimization is allowed to cost. Sibling cache buddies are cross-wired to prevent bouncing. 4 socket 40 core + SMT Westmere box, single 30 sec tbench runs, higher is better: clients 1 2 4 8 16 32 64 128 .......................................................................... pre 30 41 118 645 3769 6214 12233 14312 post 299 603 1211 2418 4697 6847 11606 14557 A nice increase in performance. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1339471112.7352.32.camel@marge.simpson.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:53:34 +02:00
Srivatsa S. Bhat	a1cd2b13f7	cpusets: Remove/update outdated comments cpuset_track_online_cpus() is no longer present. So remove the outdated comment and replace it with reference to cpuset_update_active_cpus() which is its equivalent. Also, we don't lack memory hot-unplug anymore. And David Rientjes pointed out how it is dealt with. So update that comment as well. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120524141700.3692.98192.stgit@srivatsabhat.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:53:28 +02:00
Srivatsa S. Bhat	7ddf96b02f	cpusets, hotplug: Restructure functions that are invoked during hotplug Separate out the cpuset related handling for CPU/Memory online/offline. This also helps us exploit the most obvious and basic level of optimization that any notification mechanism (CPU/Mem online/offline) has to offer us: "We know why we have been invoked. So stop pretending that we are lost, and do only the necessary amount of processing!". And while at it, rename scan_for_empty_cpusets() to scan_cpusets_upon_hotplug(), which is more appropriate considering how it is restructured. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120524141650.3692.48637.stgit@srivatsabhat.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:53:22 +02:00
Srivatsa S. Bhat	80d1fa6463	cpusets, hotplug: Implement cpuset tree traversal in a helper function At present, the functions that deal with cpusets during CPU/Mem hotplug are quite messy, since a lot of the functionality is mixed up without clear separation. And this takes a toll on optimization as well. For example, the function cpuset_update_active_cpus() is called on both CPU offline and CPU online events; and it invokes scan_for_empty_cpusets(), which makes sense only for CPU offline events. And hence, the current code ends up unnecessarily traversing the cpuset tree during CPU online also. As a first step towards cleaning up those functions, encapsulate the cpuset tree traversal in a helper function, so as to facilitate upcoming changes. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120524141635.3692.893.stgit@srivatsabhat.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-07-24 13:53:18 +02:00
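
Note on db141b2f42 ("drbd: fix max_bio_size to be unsigned"): the message describes a limit of "-1U" winning a min_t(int, ...) comparison because it becomes a negative int. The sketch below only illustrates that effect in plain C. It is not the DRBD code; the helper names cap_as_int() and cap_as_unsigned() are hypothetical, and the 1 MiB value used in the usage note is an arbitrary example limit. It assumes a typical two's-complement platform where converting UINT_MAX to int yields -1.

```c
#include <limits.h>

/*
 * Hypothetical helper: mimics the effect of min_t(int, a, b) by
 * converting both operands to int before comparing them.
 * A lower-level limit of -1U (UINT_MAX) becomes -1 here, so it is
 * treated as the smaller value and the cap is silently lost.
 */
static unsigned int cap_as_int(unsigned int lower_limit, unsigned int our_limit)
{
    int a = (int)lower_limit;
    int b = (int)our_limit;

    return (unsigned int)(a < b ? a : b);
}

/*
 * Hypothetical helper: the same comparison done on unsigned values,
 * which is the direction the commit message describes for the fix.
 */
static unsigned int cap_as_unsigned(unsigned int lower_limit, unsigned int our_limit)
{
    return lower_limit < our_limit ? lower_limit : our_limit;
}
```

With a lower-level limit of UINT_MAX and an example limit of 1u << 20, cap_as_int() hands back UINT_MAX (the cap is not applied), while cap_as_unsigned() returns 1u << 20. For ordinary limits such as 4096 both helpers agree.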

321314 Commits