android_kernel_zuk_msm8996.git

	Commit message (Collapse)	Author
2015-03-31	IRQCHIP: irq-mips-gic: Add new functions to start/stop the GIC counter	Markos Chandras
	We add new functions to start and stop the GIC counter since there are no guarantees the counter will be running after a CPU reset. The GIC counter is stopped by setting the 29th bit on the GIC Config register and it is started by clearing that bit. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9594/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-03-31	IRQCHIP: mips-gic: Add function for retrieving FDC IRQ	James Hogan
	Add a function to the MIPS GIC driver for retrieving the Fast Debug Channel (FDC) interrupt number, similar to the existing ones for the timer and perf counter interrupts. This will be used by platform implementations of get_c0_fdc_int() if a GIC is present. A workaround exists for interAptiv and proAptiv which claim to be able to route the FDC interrupt but don't seem to be able to in practice (at least on Malta). [ralf@linux-mips.org: Fix conflict.] Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/9142/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-03-31	MIPS: Add CDMM bus support	James Hogan
	Add MIPS Common Device Memory Map (CDMM) support in the form of a bus in the standard Linux device model. Each device attached via CDMM is discoverable via an 8-bit type identifier and may contain a number of blocks of memory mapped registers in the CDMM region. IRQs are expected to be handled separately. Due to the per-cpu (per-VPE for MT cores) nature of the CDMM devices, all the driver callbacks take place from workqueues which are run on the right CPU for the device in question, so that the driver doesn't need to be as concerned about which CPU it is running on. Callbacks also exist for when CPUs are taken offline, so that any per-CPU resources used by the driver can be disabled so they don't get forcefully migrated. CDMM devices are created as children of the CPU device they are attached to. Any existing CDMM configuration by the bootloader will be inherited, however platforms wishing to enable CDMM should implement the weak mips_cdmm_phys_base() function (see asm/cdmm.h) so that the bus driver knows where it should put the CDMM region in the physical address space if the bootloader hasn't already enabled it. A mips_cdmm_early_probe() function is also provided to allow early boot or particularly low level code to set up the CDMM region and probe for a specific device type, for example early console or KGDB IO drivers for the EJTAG Fast Debug Channel (FDC) CDMM device. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/9599/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-03-31	IRQCHIP: mips-gic: Add missing definitions for FDC IRQ	James Hogan
	Add missing VPE_PEND, VPE_RMASK and VPE_SMASK definitions for the local FDC interrupt. These local interrupt definitions aren't directly used, but if they exist they should be complete. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: linux-mips@linux-mips.org Reviewed-by: Andrew Bresticker <abrestic@chromium.org> Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/9127/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-03-27	CLK: Add binding document for Pistachio clock controllers	Andrew Bresticker
	Add a device-tree binding document describing the four clock controllers present on the IMG Pistachio SoC. Signed-off-by: Damien Horsley <Damien.Horsley@imgtec.com> Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: Mike Turquette <mturquette@linaro.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: devicetree@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Ezequiel Garcia <ezequiel.garcia@imgtec.com> Cc: James Hartley <james.hartley@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Damien Horsley <Damien.Horsley@imgtec.com> Acked-by: Stephen Boyd <sboyd@codeaurora.org> Patchwork: https://patchwork.linux-mips.org/patch/9319/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-03-19	target: do not reject FUA CDBs when write cache is enabled but ↵	Christophe Vu-Brugier
	emulate_write_cache is 0 A check that rejects a CDB with FUA bit set if no write cache is emulated was added by the following commit: fde9f50 target: Add sanity checks for DPO/FUA bit usage The condition is as follows: if (!dev->dev_attrib.emulate_fua_write \|\| !dev->dev_attrib.emulate_write_cache) However, this check is wrong if the backend device supports WCE but "emulate_write_cache" is disabled. This patch uses se_dev_check_wce() (previously named spc_check_dev_wce) to invoke transport->get_write_cache() if the device has a write cache or check the "emulate_write_cache" attribute otherwise. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christophe Vu-Brugier <cvubrugier@fastmail.fm> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-03-17	netdevice.h: fix ndo_bridge_* comments	Nicolas Dichtel
	The argument 'flags' was missing in ndo_bridge_setlink(). ndo_bridge_dellink() was missing. Fixes: 407af3299ef1 ("bridge: Add netlink interface to configure vlans on bridge ports") Fixes: add511b38266 ("bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink") CC: Vlad Yasevich <vyasevic@redhat.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17	livepatch: Fix subtle race with coming and going modules	Petr Mladek
	There is a notifier that handles live patches for coming and going modules. It takes klp_mutex lock to avoid races with coming and going patches but it does not keep the lock all the time. Therefore the following races are possible: 1. The notifier is called sometime in STATE_MODULE_COMING. The module is visible by find_module() in this state all the time. It means that new patch can be registered and enabled even before the notifier is called. It might create wrong order of stacked patches, see below for an example. 2. New patch could still see the module in the GOING state even after the notifier has been called. It will try to initialize the related object structures but the module could disappear at any time. There will stay mess in the structures. It might even cause an invalid memory access. This patch solves the problem by adding a boolean variable into struct module. The value is true after the coming and before the going handler is called. New patches need to be applied when the value is true and they need to ignore the module when the value is false. Note that we need to know state of all modules on the system. The races are related to new patches. Therefore we do not know what modules will get patched. Also note that we could not simply ignore going modules. The code from the module could be called even in the GOING state until mod->exit() finishes. If we start supporting patches with semantic changes between function calls, we need to apply new patches to any still usable code. See below for an example. Finally note that the patch solves only the situation when a new patch is registered. There are no such problems when the patch is being removed. It does not matter who disable the patch first, whether the normal disable_patch() or the module notifier. There is nothing to do once the patch is disabled. Alternative solutions: ====================== + reject new patches when a patched module is coming or going; this is ugly + wait with adding new patch until the module leaves the COMING and GOING states; this might be dangerous and complicated; we would need to release kgr_lock in the middle of the patch registration to avoid a deadlock with the coming and going handlers; also we might need a waitqueue for each module which seems to be even bigger overhead than the boolean + stop modules from entering COMING and GOING states; wait until modules leave these states when they are already there; looks complicated; we would need to ignore the module that asked to stop the others to avoid a deadlock; also it is unclear what to do when two modules asked to stop others and both are in COMING state (situation when two new patches are applied) + always register/enable new patches and fix up the potential mess (registered patches order) in klp_module_init(); this is nasty and prone to regressions in the future development + add another MODULE_STATE where the kallsyms are visible but the module is not used yet; this looks too complex; the module states are checked on "many" locations Example of patch stacking breakage: =================================== The notifier could _not_ _simply_ ignore already initialized module objects. For example, let's have three patches (P1, P2, P3) for functions a() and b() where a() is from vmcore and b() is from a module M. Something like: a() b() P1 a1() b1() P2 a2() b2() P3 a3() b3(3) If you load the module M after all patches are registered and enabled. The ftrace ops for function a() and b() has listed the functions in this order: ops_a->func_stack -> list(a3,a2,a1) ops_b->func_stack -> list(b3,b2,b1) , so the pointer to b3() is the first and will be used. Then you might have the following scenario. Let's start with state when patches P1 and P2 are registered and enabled but the module M is not loaded. Then ftrace ops for b() does not exist. Then we get into the following race: CPU0 CPU1 load_module(M) complete_formation() mod->state = MODULE_STATE_COMING; mutex_unlock(&module_mutex); klp_register_patch(P3); klp_enable_patch(P3); # STATE 1 klp_module_notify(M) klp_module_notify_coming(P1); klp_module_notify_coming(P2); klp_module_notify_coming(P3); # STATE 2 The ftrace ops for a() and b() then looks: STATE1: ops_a->func_stack -> list(a3,a2,a1); ops_b->func_stack -> list(b3); STATE2: ops_a->func_stack -> list(a3,a2,a1); ops_b->func_stack -> list(b2,b1,b3); therefore, b2() is used for the module but a3() is used for vmcore because they were the last added. Example of the race with going modules: ======================================= CPU0 CPU1 delete_module() #SYSCALL try_stop_module() mod->state = MODULE_STATE_GOING; mutex_unlock(&module_mutex); klp_register_patch() klp_enable_patch() #save place to switch universe b() # from module that is going a() # from core (patched) mod->exit(); Note that the function b() can be called until we call mod->exit(). If we do not apply patch against b() because it is in MODULE_STATE_GOING, it will call patched a() with modified semantic and things might get wrong. [jpoimboe@redhat.com: use one boolean instead of two] Signed-off-by: Petr Mladek <pmladek@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-14	arm/arm64: KVM: Keep elrsr/aisr in sync with software model	Christoffer Dall
	There is an interesting bug in the vgic code, which manifests itself when the KVM run loop has a signal pending or needs a vmid generation rollover after having disabled interrupts but before actually switching to the guest. In this case, we flush the vgic as usual, but we sync back the vgic state and exit to userspace before entering the guest. The consequence is that we will be syncing the list registers back to the software model using the GICH_ELRSR and GICH_EISR from the last execution of the guest, potentially overwriting a list register containing an interrupt. This showed up during migration testing where we would capture a state where the VM has masked the arch timer but there were no interrupts, resulting in a hung test. Cc: Marc Zyngier <marc.zyngier@arm.com> Reported-by: Alex Bennee <alex.bennee@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-03-13	vxlan: fix wrong usage of VXLAN_VID_MASK	Alexey Kodanev
	commit dfd8645ea1bd9127 wrongly assumes that VXLAN_VDI_MASK includes eight lower order reserved bits of VNI field that are using for remote checksum offload. Right now, when VNI number greater then 0xffff, vxlan_udp_encap_recv() will always return with 'bad_flag' error, reducing the usable vni range from 0..16777215 to 0..65535. Also, it doesn't really check whether RCO bits processed or not. Fix it by adding new VNI mask which has all 32 bits of VNI field: 24 bits for id and 8 bits for other usage. Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-13	of/platform: Fix sparc:allmodconfig build	Guenter Roeck
	sparc:allmodconfig fails to build with: drivers/built-in.o: In function `platform_bus_init': (.init.text+0x3684): undefined reference to `of_platform_register_reconfig_notifier' of_platform_register_reconfig_notifier is only declared if both OF_ADDRESS and OF_DYNAMIC are configured. Yet, the include file only declares a dummy function if OF_DYNAMIC is not configured. The sparc architecture does not configure OF_ADDRESS, but does configure OF_DYNAMIC, causing above error. Fixes: 801d728c10db ("of/reconfig: Add OF_DYNAMIC notifier for platform_bus_type") Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rob Herring <robh@kernel.org>
2015-03-13	uapi/virtio_scsi: allow overriding CDB/SENSE size	Michael S. Tsirkin
	QEMU wants to use virtio scsi structures with a different VIRTIO_SCSI_CDB_SIZE/VIRTIO_SCSI_SENSE_SIZE, let's add ifdefs to allow overriding them. Keep the old defines under new names: VIRTIO_SCSI_CDB_DEFAULT_SIZE/VIRTIO_SCSI_SENSE_DEFAULT_SIZE, since that's what these values really are: defaults for cdb/sense size fields. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-12	kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h>	Andrey Ryabinin
	include/linux/moduleloader.h is more suitable place for this macro. Also change alignment to PAGE_SIZE for CONFIG_KASAN=n as such alignment already assumed in several places. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-12	kasan, module, vmalloc: rework shadow allocation for modules	Andrey Ryabinin
	Current approach in handling shadow memory for modules is broken. Shadow memory could be freed only after memory shadow corresponds it is no longer used. vfree() called from interrupt context could use memory its freeing to store 'struct llist_node' in it: void vfree(const void addr) { ... if (unlikely(in_interrupt())) { struct vfree_deferred p = this_cpu_ptr(&vfree_deferred); if (llist_add((struct llist_node *)addr, &p->list)) schedule_work(&p->wq); Later this list node used in free_work() which actually frees memory. Currently module_memfree() called in interrupt context will free shadow before freeing module's memory which could provoke kernel crash. So shadow memory should be freed after module's memory. However, such deallocation order could race with kasan_module_alloc() in module_alloc(). Free shadow right before releasing vm area. At this point vfree()'d memory is not used anymore and yet not available for other allocations. New VM_KASAN flag used to indicate that vm area has dynamically allocated shadow memory so kasan frees shadow only if it was previously allocated. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-11	xps: must clear sender_cpu before forwarding	Eric Dumazet
	John reported that my previous commit added a regression on his router. This is because sender_cpu & napi_id share a common location, so get_xps_queue() can see garbage and perform an out of bound access. We need to make sure sender_cpu is cleared before doing the transmit, otherwise any NIC busy poll enabled (skb_mark_napi_id()) can trigger this bug. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: John <jw@nuclearfallout.net> Bisected-by: John <jw@nuclearfallout.net> Fixes: 2bd82484bb4c ("xps: fix xps for stacked devices") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-11	clk: introduce clk_is_match	Michael Turquette
	Some drivers compare struct clk pointers as a means of knowing if the two pointers reference the same clock hardware. This behavior is dubious (drivers must not dereference struct clk), but did not cause any regressions until the per-user struct clk patch was merged. Now the test for matching clk's will always fail with per-user struct clk's. clk_is_match is introduced to fix the regression and prevent drivers from comparing the pointers manually. Fixes: 035a61c314eb ("clk: Make clk API return per-user struct clk instances") Cc: Russell King <linux@arm.linux.org.uk> Cc: Shawn Guo <shawn.guo@linaro.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Michael Turquette <mturquette@linaro.org> [arnd@arndb.de: Fix COMMON_CLK=N && HAS_CLK=Y config] Signed-off-by: Arnd Bergmann <arnd@arndb.de> [sboyd@codeaurora.org: const arguments to clk_is_match() and remove unnecessary ternary operation] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2015-03-10	virtio_blk: fix comment for virtio 1.0	Michael S. Tsirkin
	Fix up comment to match virtio 1.0 logic: virtio_blk_outhdr isn't the first elements anymore, the only requirement is that it comes first in the s/g list. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-10	virtio_blk: typo fix	Michael S. Tsirkin
	Now that QEmu reuses linux virtio headers, we noticed a typo in the exported virtio block header. Fix it up. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-08	irqchip: gicv3-its: Define macros for GITS_CTLR fields	Yun Wu
	Define macros for GITS_CTLR fields to avoid using magic numbers. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Yun Wu <wuyun.wu@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/1425659870-11832-11-git-send-email-marc.zyngier@arm.com Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-08	irqchip: gicv3-its: Allocate enough memory for the full range of DeviceID	Marc Zyngier
	The ITS table allocator is only allocating a single page per table. This works fine for most things, but leads to silent lack of interrupt delivery if we end-up with a device that has an ID that is out of the range defined by a single page of memory. Even worse, depending on the page size, behaviour changes, which is not a very good experience. A solution is actually to allocate memory for the full range of ID that the ITS supports. A massive waste memory wise, but at least a safe bet. Tested on a Phytium SoC. Tested-by: Chen Baozi <chenbaozi@kylinos.com.cn> Acked-by: Chen Baozi <chenbaozi@kylinos.com.cn> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/1425659870-11832-3-git-send-email-marc.zyngier@arm.com Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-07	serial: uapi: Declare all userspace-visible io types	Peter Hurley
	ioctl(TIOCGSERIAL\|TIOCSSERIAL) report and can change the port->iotype. UART drivers use the UPIO_* definitions, but the uapi header defines parallel values and userspace uses these parallel values for ioctls; thus the userspace values are definitive. Define UPIO_* iotypes in terms of the uapi defines, SERIAL_IO_*; extend the uapi defines to include all values in use by the serial core. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-07	serial: core: Fix iotype userspace breakage	Peter Hurley
	commit 3ffb1a8193bea ("serial: core: Add big-endian iotype") re-numbered userspace-dependent values; ioctl(TIOCSSERIAL) can assign the port iotype (which is expected to match the selected i/o accessors), so iotype values must not be changed. Cc: Kevin Cernekee <cernekee@gmail.com> Cc: <stable@vger.kernel.org> # 3.19+ Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Reviewed-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-06	ARM: dts: am43xx: fix SLEWCTRL_FAST pinctrl binding	Dave Gerlach
	According to AM437x TRM, Document SPRUHL7B, Revised December 2014, Section 7.2.1 Pad Control Registers, setting bit 19 of the pad control registers actually sets the SLEWCTRL value to slow rather than fast as the current macro indicates. Introduce a new macro, SLEWCTRL_SLOW, that sets the bit, and modify SLEWCTRL_FAST to 0 but keep it for completeness. Current users of the macro (i2c, mdio, and uart) are left unmodified as SLEWCTRL_FAST was the macro used and actual desired state. Tested on am437x-gp-evm with no difference in software performance seen. Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-03-06	ARM: dts: am33xx: fix SLEWCTRL_FAST pinctrl binding	Dave Gerlach
	According to AM335x TRM, Document spruh73l, Revised February 2015, Section 9.2.2 Pad Control Registers, setting bit 6 of the pad control registers actually sets the SLEWCTRL value to slow rather than fast as the current macro indicates. Introduce a new macro, SLEWCTRL_SLOW, that sets the bit, and modify SLEWCTRL_FAST to 0 but keep it for completeness. Current users of the macro (i2c and mdio) are left unmodified as SLEWCTRL_FAST was the macro used and actual desired state. Tested on am335x-gp-evm with no difference in software performance seen. Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-03-05	cpuidle / sleep: Use broadcast timer for states that stop local timer	Rafael J. Wysocki
	Commit 381063133246 (PM / sleep: Re-implement suspend-to-idle handling) overlooked the fact that entering some sufficiently deep idle states by CPUs may cause their local timers to stop and in those cases it is necessary to switch over to a broadcast timer prior to entering the idle state. If the cpuidle driver in use does not provide the new ->enter_freeze callback for any of the idle states, that problem affects suspend-to-idle too, but it is not taken into account after the changes made by commit 381063133246. Fix that by changing the definition of cpuidle_enter_freeze() and re-arranging of the code in cpuidle_idle_call(), so the former does not call cpuidle_enter() any more and the fallback case is handled by cpuidle_idle_call() directly. Fixes: 381063133246 (PM / sleep: Re-implement suspend-to-idle handling) Reported-and-tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-03-05	workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for ↵	Tejun Heo
	PREEMPT_NONE cancel[_delayed]_work_sync() are implemented using __cancel_work_timer() which grabs the PENDING bit using try_to_grab_pending() and then flushes the work item with PENDING set to prevent the on-going execution of the work item from requeueing itself. try_to_grab_pending() can always grab PENDING bit without blocking except when someone else is doing the above flushing during cancelation. In that case, try_to_grab_pending() returns -ENOENT. In this case, __cancel_work_timer() currently invokes flush_work(). The assumption is that the completion of the work item is what the other canceling task would be waiting for too and thus waiting for the same condition and retrying should allow forward progress without excessive busy looping Unfortunately, this doesn't work if preemption is disabled or the latter task has real time priority. Let's say task A just got woken up from flush_work() by the completion of the target work item. If, before task A starts executing, task B gets scheduled and invokes __cancel_work_timer() on the same work item, its try_to_grab_pending() will return -ENOENT as the work item is still being canceled by task A and flush_work() will also immediately return false as the work item is no longer executing. This puts task B in a busy loop possibly preventing task A from executing and clearing the canceling state on the work item leading to a hang. task A task B worker executing work __cancel_work_timer() try_to_grab_pending() set work CANCELING flush_work() block for work completion completion, wakes up A __cancel_work_timer() while (forever) { try_to_grab_pending() -ENOENT as work is being canceled flush_work() false as work is no longer executing } This patch removes the possible hang by updating __cancel_work_timer() to explicitly wait for clearing of CANCELING rather than invoking flush_work() after try_to_grab_pending() fails with -ENOENT. Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com v3: bit_waitqueue() can't be used for work items defined in vmalloc area. Switched to custom wake function which matches the target work item and exclusive wait and wakeup. v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if the target bit waitqueue has wait_bit_queue's on it. Use DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu Vizoso. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Rabin Vincent <rabin.vincent@axis.com> Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com> Cc: stable@vger.kernel.org Tested-by: Jesper Nilsson <jesper.nilsson@axis.com> Tested-by: Rabin Vincent <rabin.vincent@axis.com>
2015-03-05	Revert "pinctrl: consumer: use correct retval for placeholder functions"	Linus Walleij
	This reverts commit 5a7d2efdd93f6c4bb6cd3d5df3d2f5611c9b87ac. As per discussion on the mailing list, this is not the right thing to do. NULL cookies are valid in the stubs. Reported-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-03-05	drm/ttm: device address space != CPU address space	Alex Deucher
	We need to store device offsets in 64 bit as the device address space may be larger than the CPU's. Fixes GPU init failures on radeons with 4GB or more of vram on 32 bit kernels. We put vram at the start of the GPU's address space so the gart aperture starts at 4 GB causing all GPU addresses in the gart aperture to get truncated. bug: https://bugs.freedesktop.org/show_bug.cgi?id=89072 [airlied: fix warning on nouveau build] Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: thellstrom@vmware.com Acked-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-05	drm/mm: Support 4 GiB and larger ranges	Thierry Reding
	The current implementation is limited by the number of addresses that fit into an unsigned long. This causes problems on 32-bit Tegra where unsigned long is 32-bit but drm_mm is used to manage an IOVA space of 4 GiB. Given the 32-bit limitation, the range is limited to 4 GiB - 1 (or 4 GiB - 4 KiB for page granularity). This commit changes the start and size of the range to be an unsigned 64-bit integer, thus allowing much larger ranges to be supported. [airlied: fix i915 warnings and coloring callback] Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com> fixupo
2015-03-04	genirq / PM: Add flag for shared NO_SUSPEND interrupt lines	Rafael J. Wysocki
	It currently is required that all users of NO_SUSPEND interrupt lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the WARN_ON_ONCE() in irq_pm_install_action() will trigger. That is done to warn about situations in which unprepared interrupt handlers may be run unnecessarily for suspended devices and may attempt to access those devices by mistake. However, it may cause drivers that have no technical reasons for using IRQF_NO_SUSPEND to set that flag just because they happen to share the interrupt line with something like a timer. Moreover, the generic handling of wakeup interrupts introduced by commit 9ce7a25849e8 (genirq: Simplify wakeup mechanism) only works for IRQs without any NO_SUSPEND users, so the drivers of wakeup devices needing to use shared NO_SUSPEND interrupt lines for signaling system wakeup generally have to detect wakeup in their interrupt handlers. Thus if they happen to share an interrupt line with a NO_SUSPEND user, they also need to request that their interrupt handlers be run after suspend_device_irqs(). In both cases the reason for using IRQF_NO_SUSPEND is not because the driver in question has a genuine need to run its interrupt handler after suspend_device_irqs(), but because it happens to share the line with some other NO_SUSPEND user. Otherwise, the driver would do without IRQF_NO_SUSPEND just fine. To make it possible to specify that condition explicitly, introduce a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND, that, when set, will indicate to the IRQ core that the interrupt user is generally fine with suspending the IRQ, but it also can tolerate handler invocations after suspend_device_irqs() and, in particular, it is capable of detecting system wakeup and triggering it as appropriate from its interrupt handler. That will allow us to work around a problem with a shared timer interrupt line on at91 platforms. Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2 Link: http://marc.info/?t=142252775300011&r=1&w=2 Link: https://lkml.org/lkml/2014/12/15/552 Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com>
2015-03-04	netfilter: nf_tables: fix userdata length overflow	Patrick McHardy
	The NFT_USERDATA_MAXLEN is defined to 256, however we only have a u8 to store its size. Introduce a struct nft_userdata which contains a length field and indicate its presence using a single bit in the rule. The length field of struct nft_userdata is also a u8, however we don't store zero sized data, so the actual length is udata->len + 1. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-03	pm: at91: Workaround DDRSDRC self-refresh bug with LPDDR1 memories.	Peter Rosin
	The DDRSDR controller fails miserably to put LPDDR1 memories in self-refresh. Force the controller to think it has DDR2 memories during the self-refresh period, as the DDR2 self-refresh spec is equivalent to LPDDR1, and is correctly implemented in the controller. Assume that the second controller has the same fault, but that is untested. Signed-off-by: Peter Rosin <peda@axentia.se> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2015-03-03	NFS: Fix a regression in the read() syscall	Trond Myklebust
	When invalidating the page cache for a regular file, we want to first sync all dirty data to disk and then call invalidate_inode_pages2(). The latter relies on nfs_launder_page() and nfs_release_page() to deal respectively with dirty pages, and unstable written pages. When commit 9590544694bec ("NFS: avoid deadlocks with loop-back mounted NFS filesystems.") changed the behaviour of nfs_release_page(), then it made it possible for invalidate_inode_pages2() to fail with an EBUSY. Unfortunately, that error is then propagated back to read(). Let's therefore work around the problem for now by protecting the call to sync the data and invalidate_inode_pages2() so that they are atomic w.r.t. the addition of new writes. Later on, we can revisit whether or not we still need nfs_launder_page() and nfs_release_page(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-03	spi: fix a typo in comment.	Marcin Bis
	alway -> always Signed-off-by: Marcin Bis <marcin@bis.org.pl> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-03-02	net/mlx4_core: Fix wrong mask and error flow for the update-qp command	Or Gerlitz
	The bit mask for currently supported driver features (MLX4_UPDATE_QP_SUPPORTED_ATTRS) of the update-qp command was defined twice (using enum value and pre-processor define directive) and wrong. The return value of the call to mlx4_update_qp() from within the SRIOV resource-tracker was wrongly voided down. Fix both issues. issue: none Fixes: 09e05c3f78e9 ('net/mlx4: Set vlan stripping policy by the right command') Fixes: ce8d9e0d6746 ('net/mlx4_core: Add UPDATE_QP SRIOV wrapper support') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	xen: Remove trailing semicolon from xenbus_register_frontend() definition	Yuval Shaia
	Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-03-01	NFS: Add attribute update barriers to NFS writebacks	Trond Myklebust
	Ensure that other operations that race with our write RPC calls cannot revert the file size updates that were made on the server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01	NFS: Add attribute update barriers to nfs_setattr_update_inode()	Trond Myklebust
	Ensure that other operations which raced with our setattr RPC call cannot revert the file attribute changes that were made on the server. To do so, we artificially bump the attribute generation counter on the inode so that all calls to nfs_fattr_init() that precede ours will be dropped. The motivation for the patch came from Chuck Lever's reports of readaheads racing with truncate operations and causing the file size to be reverted. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-03-01	NFS: Add a helper to set attribute barriers	Trond Myklebust
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
2015-02-27	rhashtable: remove indirection for grow/shrink decision functions	Daniel Borkmann
	Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27	dm snapshot: suspend merging snapshot when doing exception handover	Mikulas Patocka
	The "dm snapshot: suspend origin when doing exception handover" commit fixed a exception store handover bug associated with pending exceptions to the "snapshot-origin" target. However, a similar problem exists in snapshot merging. When snapshot merging is in progress, we use the target "snapshot-merge" instead of "snapshot-origin". Consequently, during exception store handover, we must find the snapshot-merge target and suspend its associated mapped_device. To avoid lockdep warnings, the target must be suspended and resumed without holding _origins_lock. Introduce a dm_hold() function that grabs a reference on a mapped_device, but unlike dm_get(), it doesn't crash if the device has the DMF_FREEING flag set, it returns an error in this case. In snapshot_resume() we grab the reference to the origin device using dm_hold() while holding _origins_lock (_origins_lock guarantees that the device won't disappear). Then we release _origins_lock, suspend the device and grab _origins_lock again. NOTE to stable@ people: When backporting to kernels 3.18 and older, use dm_internal_suspend and dm_internal_resume instead of dm_internal_suspend_fast and dm_internal_resume_fast. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2015-02-26	Revert "USB: serial: make bulk_out_size a lower limit"	Johan Hovold
	This reverts commit 5083fd7bdfe6760577235a724cf6dccae13652c2. A bulk-out size smaller than the end-point size is indeed valid. The offending commit broke the usb-debug driver for EHCI debug devices, which use 8-byte buffers. Fixes: 5083fd7bdfe6 ("USB: serial: make bulk_out_size a lower limit") Reported-by: "Li, Elvin" <elvin.li@intel.com> Cc: stable <stable@vger.kernel.org> # v3.15 Signed-off-by: Johan Hovold <johan@kernel.org>
2015-02-26	OMAPDSS: fix regression with display sysfs files	Tomi Valkeinen
	omapdss's sysfs directories for displays used to have 'name' file, giving the name for the display. This file was later renamed to 'display_name' to avoid conflicts with i2c sysfs 'name' file. Looks like at least xserver-xorg-video-omap3 requires the 'name' file to be present. To fix the regression, this patch creates new kobjects for each display, allowing us to create sysfs directories for the displays. This way we have the whole directory for omapdss, and there will be no sysfs file clashes with the underlying display device's sysfs files. We can thus add the 'name' sysfs file back. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Tested-by: NeilBrown <neilb@suse.de>
2015-02-26	genirq / PM: better describe IRQF_NO_SUSPEND semantics	Mark Rutland
	The IRQF_NO_SUSPEND flag is intended to be used for interrupts required to be enabled during the suspend-resume cycle. This mostly consists of IPIs and timer interrupts, potentially including chained irqchip interrupts if these are necessary to handle timers or IPIs. If an interrupt does not fall into one of the aforementioned categories, requesting it with IRQF_NO_SUSPEND is likely incorrect. Using IRQF_NO_SUSPEND does not guarantee that the interrupt can wake the system from a suspended state. For an interrupt to be able to trigger a wakeup, it may be necessary to program various components of the system. In these cases it is necessary to use {enable,disabled}_irq_wake. Unfortunately, several drivers assume that IRQF_NO_SUSPEND ensures that an IRQ can wake up the system, and the documentation can be read ambiguously w.r.t. this property. This patch updates the documentation regarding IRQF_NO_SUSPEND to make this caveat explicit, hopefully making future misuse rarer. Cleanup of existing misuse will occur as part of later patch series. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-24	thermal: Introduce dummy functions when thermal is not defined	Nishanth Menon
	When CONFIG_THERMAL is not enabled, it is better to introduce equivalent dummy functions in the exported header than to introduce #ifdeffery in drivers using the function. This will prevent issues such as that reported in: http://www.spinics.net/lists/linux-next/msg31573.html While at it switch over to IS_ENABLED for thermal macros to allow for thermal framework to be built as framework and relevant APIs be usable by relevant drivers as a result. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-02-23	net: sched: export tc_connmark.h so it is uapi accessible	Jamal Hadi Salim
	Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23	x86/xen: allow privcmd hypercalls to be preempted	David Vrabel
	Hypercalls submitted by user space tools via the privcmd driver can take a long time (potentially many 10s of seconds) if the hypercall has many sub-operations. A fully preemptible kernel may deschedule such as task in any upcall called from a hypercall continuation. However, in a kernel with voluntary or no preemption, hypercall continuations in Xen allow event handlers to be run but the task issuing the hypercall will not be descheduled until the hypercall is complete and the ioctl returns to user space. These long running tasks may also trigger the kernel's soft lockup detection. Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to bracket hypercalls that may be preempted. Use these in the privcmd driver. When returning from an upcall, call xen_maybe_preempt_hcall() which adds a schedule point if if the current task was within a preemptible hypercall. Since _cond_resched() can move the task to a different CPU, clear and set xen_in_preemptible_hcall around the call. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2015-02-23	drm/i915/bdw: PCI IDs ending in 0xb are ULT.	Rodrigo Vivi
	When reviewing patch that fixes VGA on BDW Halo Jani noticed that we also had other ULT IDs that weren't listed there. So this follow-up patch add these pci-ids as halo and fix comments on i915_pciids.h Cc: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-02-22	VFS: Split DCACHE_FILE_TYPE into regular and special types	David Howells
	Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and socket files). d_is_reg() and d_is_special() are added to detect these subtypes and d_is_file() is left as the union of the two. This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to use d_is_reg(dentry) instead. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-22	VFS: Add a fallthrough flag for marking virtual dentries	David Howells
	Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is a virtual dentry that covers another one in a lower layer that should be used instead. This may be recorded on medium if directory integration is stored there. The flag can be set with d_set_fallthru() and tested with d_is_fallthru(). Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>