summaryrefslogtreecommitdiff
path: root/virt/kvm/arm/arch_timer.c (follow)
Commit message (Collapse)AuthorAge
* KVM: arm/arm64: Fix occasional warning from the timer work functionChristoffer Dall2017-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 63e41226afc3f7a044b70325566fa86ac3142538 ] When a VCPU blocks (WFI) and has programmed the vtimer, we program a soft timer to expire in the future to wake up the vcpu thread when appropriate. Because such as wake up involves a vcpu kick, and the timer expire function can get called from interrupt context, and the kick may sleep, we have to schedule the kick in the work function. The work function currently has a warning that gets raised if it turns out that the timer shouldn't fire when it's run, which was added because the idea was that in that case the work should never have been cancelled. However, it turns out that this whole thing is racy and we can get spurious warnings. The problem is that we clear the armed flag in the work function, which may run in parallel with the kvm_timer_unschedule->timer_disarm() call. This results in a possible situation where the timer_disarm() call does not call cancel_work_sync(), which effectively synchronizes the completion of the work function with running the VCPU. As a result, the VCPU thread proceeds before the work function completees, causing changes to the timer state such that kvm_timer_should_fire(vcpu) returns false in the work function. All we do in the work function is to kick the VCPU, and an occasional rare extra kick never harmed anyone. Since the race above is extremely rare, we don't bother checking if the race happens but simply remove the check and the clearing of the armed flag from the work function. Reported-by: Matthias Brugger <mbrugger@suse.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* KVM: arm/arm64: Handle forward time correction gracefullyMarc Zyngier2016-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1c5631c73fc2261a5df64a72c155cb53dcdc0c45 upstream. On a host that runs NTP, corrections can have a direct impact on the background timer that we program on the behalf of a vcpu. In particular, NTP performing a forward correction will result in a timer expiring sooner than expected from a guest point of view. Not a big deal, we kick the vcpu anyway. But on wake-up, the vcpu thread is going to perform a check to find out whether or not it should block. And at that point, the timer check is going to say "timer has not expired yet, go back to sleep". This results in the timer event being lost forever. There are multiple ways to handle this. One would be record that the timer has expired and let kvm_cpu_has_pending_timer return true in that case, but that would be fairly invasive. Another is to check for the "short sleep" condition in the hrtimer callback, and restart the timer for the remaining time when the condition is detected. This patch implements the latter, with a bit of refactoring in order to avoid too much code duplication. Reported-by: Alexander Graf <agraf@suse.de> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* KVM: arm/arm64: Fix reference to uninitialised VGICAndre Przywara2016-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b3aff6ccbb1d25e506b60ccd9c559013903f3464 upstream. Commit 4b4b4512da2a ("arm/arm64: KVM: Rework the arch timer to use level-triggered semantics") brought the virtual architected timer closer to the VGIC. There is one occasion were we don't properly check for the VGIC actually having been initialized before, but instead go on to check the active state of some IRQ number. If userland hasn't instantiated a virtual GIC, we end up with a kernel NULL pointer dereference: ========= Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = ffffffc9745c5000 [00000000] *pgd=00000009f631e003, *pud=00000009f631e003, *pmd=0000000000000000 Internal error: Oops: 96000006 [#2] PREEMPT SMP Modules linked in: CPU: 0 PID: 2144 Comm: kvm_simplest-ar Tainted: G D 4.5.0-rc2+ #1300 Hardware name: ARM Juno development board (r1) (DT) task: ffffffc976da8000 ti: ffffffc976e28000 task.ti: ffffffc976e28000 PC is at vgic_bitmap_get_irq_val+0x78/0x90 LR is at kvm_vgic_map_is_active+0xac/0xc8 pc : [<ffffffc0000b7e28>] lr : [<ffffffc0000b972c>] pstate: 20000145 .... ========= Fix this by bailing out early of kvm_timer_flush_hwstate() if we don't have a VGIC at all. Reported-by: Cosmin Gorgovan <cosmin@linux-geek.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.activeChristoffer Dall2015-11-24
| | | | | | | | | | | | | | | | | We were incorrectly removing the active state from the physical distributor on the timer interrupt when the timer output level was deasserted. We shouldn't be doing this without considering the virtual interrupt's active state, because the architecture requires that when an LR has the HW bit set and the pending or active bits set, then the physical interrupt must also have the corresponding bits set. This addresses an issue where we have been observing an inconsistency between the LR state and the physical distributor state where the LR state was active and the physical distributor was not active, which shouldn't happen. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: Add tracepoints for vgic and timerChristoffer Dall2015-10-22
| | | | | | | | | | | The VGIC and timer code for KVM arm/arm64 doesn't have any tracepoints or tracepoint infrastructure defined. Rewriting some of the timer code handling showed me how much we need this, so let's add these simple trace points once and for all and we can easily expand with additional trace points in these files as we go along. Cc: Wei Huang <wei@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: Rework the arch timer to use level-triggered semanticsChristoffer Dall2015-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The arch timer currently uses edge-triggered semantics in the sense that the line is never sampled by the vgic and lowering the line from the timer to the vgic doesn't have any effect on the pending state of virtual interrupts in the vgic. This means that we do not support a guest with the otherwise valid behavior of (1) disable interrupts (2) enable the timer (3) disable the timer (4) enable interrupts. Such a guest would validly not expect to see any interrupts on real hardware, but will see interrupts on KVM. This patch fixes this shortcoming through the following series of changes. First, we change the flow of the timer/vgic sync/flush operations. Now the timer is always flushed/synced before the vgic, because the vgic samples the state of the timer output. This has the implication that we move the timer operations in to non-preempible sections, but that is fine after the previous commit getting rid of hrtimer schedules on every entry/exit. Second, we change the internal behavior of the timer, letting the timer keep track of its previous output state, and only lower/raise the line to the vgic when the state changes. Note that in theory this could have been accomplished more simply by signalling the vgic every time the state *potentially* changed, but we don't want to be hitting the vgic more often than necessary. Third, we get rid of the use of the map->active field in the vgic and instead simply set the interrupt as active on the physical distributor whenever the input to the GIC is asserted and conversely clear the physical active state when the input to the GIC is deasserted. Fourth, and finally, we now initialize the timer PPIs (and all the other unused PPIs for now), to be level-triggered, and modify the sync code to sample the line state on HW sync and re-inject a new interrupt if it is still pending at that time. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_blockChristoffer Dall2015-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently schedule a soft timer every time we exit the guest if the timer did not expire while running the guest. This is really not necessary, because the only work we do in the timer work function is to kick the vcpu. Kicking the vcpu does two things: (1) If the vpcu thread is on a waitqueue, make it runnable and remove it from the waitqueue. (2) If the vcpu is running on a different physical CPU from the one doing the kick, it sends a reschedule IPI. The second case cannot happen, because the soft timer is only ever scheduled when the vcpu is not running. The first case is only relevant when the vcpu thread is on a waitqueue, which is only the case when the vcpu thread has called kvm_vcpu_block(). Therefore, we only need to make sure a timer is scheduled for kvm_vcpu_block(), which we do by encapsulating all calls to kvm_vcpu_block() with kvm_timer_{un}schedule calls. Additionally, we only schedule a soft timer if the timer is enabled and unmasked, since it is useless otherwise. Note that theoretically userspace can use the SET_ONE_REG interface to change registers that should cause the timer to fire, even if the vcpu is blocked without a scheduled timer, but this case was not supported before this patch and we leave it for future work for now. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: Fix arch timer behavior for disabled interruptsChristoffer Dall2015-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have an interesting issue when the guest disables the timer interrupt on the VGIC, which happens when turning VCPUs off using PSCI, for example. The problem is that because the guest disables the virtual interrupt at the VGIC level, we never inject interrupts to the guest and therefore never mark the interrupt as active on the physical distributor. The host also never takes the timer interrupt (we only use the timer device to trigger a guest exit and everything else is done in software), so the interrupt does not become active through normal means. The result is that we keep entering the guest with a programmed timer that will always fire as soon as we context switch the hardware timer state and run the guest, preventing forward progress for the VCPU. Since the active state on the physical distributor is really part of the timer logic, it is the job of our virtual arch timer driver to manage this state. The timer->map->active boolean field indicates whether we have signalled this interrupt to the vgic and if that interrupt is still pending or active. As long as that is the case, the hardware doesn't have to generate physical interrupts and therefore we mark the interrupt as active on the physical distributor. We also have to restore the pending state of an interrupt that was queued to an LR but was retired from the LR for some reason, while remaining pending in the LR. Cc: Marc Zyngier <marc.zyngier@arm.com> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0Christoffer Dall2015-09-04
| | | | | | | | | | | | | | | | | | Provide a better quality of implementation and be architecture compliant on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on reset of the timer. This change alone fixes the UEFI reset issue reported by Laszlo back in February. Cc: Laszlo Ersek <lersek@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Drew Jones <drjones@redhat.com> Cc: Wei Huang <wei@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* KVM: arm/arm64: timer: Allow the timer to control the active stateMarc Zyngier2015-08-12
| | | | | | | | | | | | In order to remove the crude hack where we sneak the masked bit into the timer's control register, make use of the phys_irq_map API control the active state of the interrupt. This causes some limited changes to allow for potential error propagation. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* arm/arm64: KVM: Fix migration race in the arch timerChristoffer Dall2015-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a VCPU is no longer running, we currently check to see if it has a timer scheduled in the future, and if it does, we schedule a host hrtimer to notify is in case the timer expires while the VCPU is still not running. When the hrtimer fires, we mask the guest's timer and inject the timer IRQ (still relying on the guest unmasking the time when it receives the IRQ). This is all good and fine, but when migration a VM (checkpoint/restore) this introduces a race. It is unlikely, but possible, for the following sequence of events to happen: 1. Userspace stops the VM 2. Hrtimer for VCPU is scheduled 3. Userspace checkpoints the VGIC state (no pending timer interrupts) 4. The hrtimer fires, schedules work in a workqueue 5. Workqueue function runs, masks the timer and injects timer interrupt 6. Userspace checkpoints the timer state (timer masked) At restore time, you end up with a masked timer without any timer interrupts and your guest halts never receiving timer interrupts. Fix this by only kicking the VCPU in the workqueue function, and sample the expired state of the timer when entering the guest again and inject the interrupt and mask the timer only then. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* timecounter: keep track of accumulated fractional nanosecondsRichard Cochran2014-12-30
| | | | | | | | | | | | | The current timecounter implementation will drop a variable amount of resolution, depending on the magnitude of the time delta. In other words, reading the clock too often or too close to a time stamp conversion will introduce errors into the time values. This patch fixes the issue by introducing a fractional nanosecond field that accumulates the low order bits. Reported-by: Janusz Użycki <j.uzycki@elproma.com.pl> Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* arm/arm64: KVM: Require in-kernel vgic for the arch timersChristoffer Dall2014-12-15
| | | | | | | | | | | | | | | | | | | | | | | It is curently possible to run a VM with architected timers support without creating an in-kernel VGIC, which will result in interrupts from the virtual timer going nowhere. To address this issue, move the architected timers initialization to the time when we run a VCPU for the first time, and then only initialize (and enable) the architected timers if we have a properly created and initialized in-kernel VGIC. When injecting interrupts from the virtual timer to the vgic, the current setup should ensure that this never calls an on-demand init of the VGIC, which is the only call path that could return an error from kvm_vgic_inject_irq(), so capture the return value and raise a warning if there's an error there. We also change the kvm_timer_init() function from returning an int to be a void function, since the function always succeeds. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* arm, kvm: fix double lock on cpu_add_remove_lockMing Lei2014-04-08
| | | | | | | | | | | | | | | | | | | | Commit 8146875de7d4 (arm, kvm: Fix CPU hotplug callback registration) holds the lock before calling the two functions: kvm_vgic_hyp_init() kvm_timer_hyp_init() and both the two functions are calling register_cpu_notifier() to register cpu notifier, so cause double lock on cpu_add_remove_lock. Considered that both two functions are only called inside kvm_arch_init() with holding cpu_add_remove_lock, so simply use __register_cpu_notifier() to fix the problem. Fixes: 8146875de7d4 (arm, kvm: Fix CPU hotplug callback registration) Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* ARM/KVM: save and restore generic timer registersAndre Przywara2013-12-21
| | | | | | | | | | | | | For migration to work we need to save (and later restore) the state of each core's virtual generic timer. Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export the three needed registers (control, counter, compare value). Though they live in cp15 space, we don't use the existing list, since they need special accessor functions and the arch timer is optional. Acked-by: Marc Zynger <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2013-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Paolo Bonzini: "On the x86 side, there are some optimizations and documentation updates. The big ARM/KVM change for 3.11, support for AArch64, will come through Catalin Marinas's tree. s390 and PPC have misc cleanups and bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits) KVM: PPC: Ignore PIR writes KVM: PPC: Book3S PR: Invalidate SLB entries properly KVM: PPC: Book3S PR: Allow guest to use 1TB segments KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry KVM: PPC: Book3S PR: Fix proto-VSID calculations KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL KVM: Fix RTC interrupt coalescing tracking kvm: Add a tracepoint write_tsc_offset KVM: MMU: Inform users of mmio generation wraparound KVM: MMU: document fast invalidate all mmio sptes KVM: MMU: document fast invalidate all pages KVM: MMU: document fast page fault KVM: MMU: document mmio page fault KVM: MMU: document write_flooding_count KVM: MMU: document clear_spte_count KVM: MMU: drop kvm_mmu_zap_mmio_sptes KVM: MMU: init kvm generation close to mmio wrap-around value KVM: MMU: add tracepoint for check_mmio_spte KVM: MMU: fast invalidate all mmio sptes ...
* ARM: KVM: Allow host virt timer irq to be different from guest timer virt irqAnup Patel2013-06-26
| | | | | | | | | | | | | The arch_timer irq numbers (or PPI numbers) are implementation dependent, so the host virtual timer irq number can be different from guest virtual timer irq number. This patch ensures that host virtual timer irq number is read from DTB and guest virtual timer irq is determined based on vcpu target type. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
* ARM: KVM: move GIC/timer code to a common locationMarc Zyngier2013-05-19
As KVM/arm64 is looming on the horizon, it makes sense to move some of the common code to a single location in order to reduce duplication. The code could live anywhere. Actually, most of KVM is already built with a bunch of ugly ../../.. hacks in the various Makefiles, so we're not exactly talking about style here. But maybe it is time to start moving into a less ugly direction. The include files must be in a "public" location, as they are accessed from non-KVM files (arch/arm/kernel/asm-offsets.c). For this purpose, introduce two new locations: - virt/kvm/arm/ : x86 and ia64 already share the ioapic code in virt/kvm, so this could be seen as a (very ugly) precedent. - include/kvm/ : there is already an include/xen, and while the intent is slightly different, this seems as good a location as any Eventually, we should probably have independant Makefiles at every levels (just like everywhere else in the kernel), but this is just the first step. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>