path: root/lib/assoc_array.c
author    Vinayak Menon <vinmenon@codeaurora.org>    2016-06-28 09:48:42 +0530
committer Vinayak Menon <vinmenon@codeaurora.org>    2016-08-25 11:49:45 +0530
commit    47b41f43ee8d38cbceb8b1513b712e4924e4e2e7 (patch)
tree      557fa5e419cbb9722eac246f8edc855d3bf2e939 /lib/assoc_array.c
parent    e97b6a0e0217f7c072fdad6c50673cd7a64348e1 (diff)
mm: zbud: fix the locking scenarios with zcache
With zcache using zbud, strange locking scenarios are observed. The first problem seen is:

Core 2 waiting on mapping->tree_lock, which is taken by core 6:

  do_raw_spin_lock
  raw_spin_lock_irq
  atomic_cmpxchg
  page_freeze_refs
  __remove_mapping
  shrink_page_list

Core 6, after taking mapping->tree_lock, is waiting on the zbud pool lock, which is held by core 5:

  zbud_alloc
  zcache_store_page
  __cleancache_put_page
  cleancache_put_page
  __delete_from_page_cache
  spin_unlock_irq
  __remove_mapping
  shrink_page_list
  shrink_inactive_list

Core 5, after taking the zbud pool lock in zbud_free, received an IRQ. After IRQ exit, softirqs were run and end_page_writeback tried to take mapping->tree_lock, which is already held by core 6. Deadlock:

  do_raw_spin_lock
  raw_spin_lock_irqsave
  test_clear_page_writeback
  end_page_writeback
  ext4_finish_bio
  ext4_end_bio
  bio_endio
  blk_update_request
  end_clone_bio
  bio_endio
  blk_update_request
  blk_update_bidi_request
  blk_end_bidi_request
  blk_end_request
  mmc_blk_cmdq_complete_rq
  mmc_cmdq_softirq_done
  blk_done_softirq
  static_key_count
  static_key_false
  trace_softirq_exit
  __do_softirq()
  tick_irq_exit
  irq_exit()
  set_irq_regs
  __handle_domain_irq
  gic_handle_irq
  el1_irq exception
  __list_del_entry
  list_del
  zbud_free
  zcache_load_page
  __cleancache_get_page

This shows that allowing softirqs while holding the zbud pool lock can result in deadlocks. To fix this, commit 6a1fdaa36272 ("mm: zbud: prevent softirq during zbud alloc, free and reclaim") decided to take spin_lock_bh during zbud_free, zbud_alloc and zbud_reclaim. But this resulted in another deadlock.
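The bh-based locking from that earlier commit can be sketched as below. This is a simplified, non-runnable sketch, not the literal patch: names follow mm/zbud.c, and the allocator body is elided.

```c
/* Sketch of the earlier spin_lock_bh() approach (commit 6a1fdaa36272);
 * simplified, bodies elided. */
static int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
		      unsigned long *handle)
{
	spin_lock_bh(&pool->lock);
	/* ... find a free buddy or allocate a new zbud page ... */

	/* Problem: spin_unlock_bh() re-enables bottom halves and may run
	 * pending softirqs right here -- even though the caller
	 * (zcache_store_page) still holds mapping->tree_lock with
	 * interrupts disabled. */
	spin_unlock_bh(&pool->lock);
	return 0;
}
```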
  spin_bug()
  do_raw_spin_lock()
  _raw_spin_lock_irqsave()
  test_clear_page_writeback()
  end_page_writeback()
  ext4_finish_bio()
  ext4_end_bio()
  bio_endio()
  blk_update_request()
  end_clone_bio()
  bio_endio()
  blk_update_request()
  blk_update_bidi_request()
  blk_end_request()
  mmc_blk_cmdq_complete_rq()
  mmc_cmdq_softirq_done()
  blk_done_softirq()
  __do_softirq()
  do_softirq()
  __local_bh_enable_ip()
  _raw_spin_unlock_bh()
  zbud_alloc()
  zcache_store_page()
  __cleancache_put_page()
  __delete_from_page_cache()
  __remove_mapping()
  shrink_page_list()

Here, spin_unlock_bh resulted in an explicit invocation of do_softirq, which resulted in an attempt to acquire mapping->tree_lock, already taken by __remove_mapping.

The new fix considers the following facts:

1) zcache_store_page is always called from __delete_from_page_cache with mapping->tree_lock held and interrupts disabled. Thus zbud_alloc, which is called only from zcache_store_page, is always called with interrupts disabled.

2) zbud_free and zbud_reclaim_page can be called from zcache with or without interrupts disabled. So an interrupt arriving while the zbud pool lock is held can result in do_softirq and the acquisition of mapping->tree_lock.

(1) implies zbud_alloc need not explicitly disable bottom halves. But disable interrupts anyway, to make sure zbud_alloc stays safe with zcache irrespective of future changes. This fixes the second scenario.

(2) implies zbud_free and zbud_reclaim_page should use spin_lock_irqsave, so that interrupts, and in turn softirqs, will not be triggered. spin_lock_bh can't be used because spin_unlock_bh can trigger a softirq even in interrupt context. This fixes the first scenario.

Change-Id: Ibc810525dddf97614db41643642fec7472bd6a2c
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
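The resulting locking rules can be sketched as below. Again a simplified, non-runnable sketch of the approach described above, not the literal patch: names follow mm/zbud.c, bodies elided.

```c
/* Sketch of the fixed locking: irqsave everywhere the pool lock is taken. */
static int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
		      unsigned long *handle)
{
	unsigned long flags;

	/* The caller (zcache_store_page) already runs with IRQs disabled
	 * under mapping->tree_lock, but disable them here too so that
	 * zbud_alloc stays safe irrespective of future callers. */
	spin_lock_irqsave(&pool->lock, flags);
	/* ... find a free buddy or allocate a new zbud page ... */
	spin_unlock_irqrestore(&pool->lock, flags);
	return 0;
}

static void zbud_free(struct zbud_pool *pool, unsigned long handle)
{
	unsigned long flags;

	/* irqsave, not _bh: with interrupts off, no softirq can run while
	 * pool->lock is held, and unlocking never invokes do_softirq(). */
	spin_lock_irqsave(&pool->lock, flags);
	/* ... return the buddy; free the zbud page when both halves are free ... */
	spin_unlock_irqrestore(&pool->lock, flags);
}
```

zbud_reclaim_page follows the same pattern as zbud_free.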
Diffstat (limited to 'lib/assoc_array.c')
0 files changed, 0 insertions, 0 deletions