From 0b29643a58439dc9a8b0c0cacad0e7cb608c8199 Mon Sep 17 00:00:00 2001 From: Fenghua Yu Date: Thu, 29 May 2014 11:12:33 -0700 Subject: x86/xsaves: Change compacted format xsave area header The XSAVE area header is changed to support both compacted format and standard format of xsave area. The XSAVE header of an xsave area comprises the 64 bytes starting at offset 512 from the area base address: - Bytes 7:0 of the xsave header is a state-component bitmap called xstate_bv. It identifies the state components in the xsave area. - Bytes 15:8 of the xsave header is a state-component bitmap called xcomp_bv. It is used as follows: - xcomp_bv[63] indicates the format of the extended region of the xsave area. If it is clear, the standard format is used. If it is set, the compacted format is used. - xcomp_bv[62:0] indicate which features (starting at feature 2) have space allocated for them in the compacted format. - Bytes 63:16 of the xsave header are reserved. Signed-off-by: Fenghua Yu Link: http://lkml.kernel.org/r/1401387164-43416-6-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin --- arch/x86/include/asm/processor.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'arch/x86/include/asm/processor.h') diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index a4ea02351f4d..2c8d3b8e7fcb 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -386,8 +386,8 @@ struct bndcsr_struct { struct xsave_hdr_struct { u64 xstate_bv; - u64 reserved1[2]; - u64 reserved2[5]; + u64 xcomp_bv; + u64 reserved[6]; } __attribute__((packed)); struct xsave_struct { -- cgit v1.2.3 From 3a6bfbc91df04b081a44d419e0260bad54abddf7 Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso Date: Sun, 29 Jun 2014 15:09:33 -0700 Subject: arch, locking: Ciao arch_mutex_cpu_relax() The arch_mutex_cpu_relax() function, introduced by 34b133f, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() calls include yielding on s390, and thus impact the optimistic spinning functionality of mutexes. Nowadays we use this function well beyond mutexes: rwsem, qrwlock, mcs and lockref. Since the macro that defines the call is in the mutex header, any users must include mutex.h and the naming is misleading as well. This patch (i) renames the call to cpu_relax_lowlatency ("relax, but only if you can do it with very low latency") and (ii) defines it in each arch's asm/processor.h local header, just like for regular cpu_relax functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax, and thus we can take it out of mutex.h. While this can seem redundant, I believe it is a good choice as it allows us to move out arch specific logic from generic locking primitives and enables future(?) archs to transparently define it, similarly to System Z. Signed-off-by: Davidlohr Bueso Signed-off-by: Peter Zijlstra Cc: Andrew Morton Cc: Anton Blanchard Cc: Aurelien Jacquiot Cc: Benjamin Herrenschmidt Cc: Bharat Bhushan Cc: Catalin Marinas Cc: Chen Liqin Cc: Chris Metcalf Cc: Christian Borntraeger Cc: Chris Zankel Cc: David Howells Cc: David S. Miller Cc: Deepthi Dharwar Cc: Dominik Dingel Cc: Fenghua Yu Cc: Geert Uytterhoeven Cc: Guan Xuetao Cc: Haavard Skinnemoen Cc: Hans-Christian Egtvedt Cc: Heiko Carstens Cc: Helge Deller Cc: Hirokazu Takata Cc: Ivan Kokshaysky Cc: James E.J. Bottomley Cc: James Hogan Cc: Jason Wang Cc: Jesper Nilsson Cc: Joe Perches Cc: Jonas Bonn Cc: Joseph Myers Cc: Kees Cook Cc: Koichi Yasutake Cc: Lennox Wu Cc: Linus Torvalds Cc: Mark Salter Cc: Martin Schwidefsky Cc: Matt Turner Cc: Max Filippov Cc: Michael Neuling Cc: Michal Simek Cc: Mikael Starvik Cc: Nicolas Pitre Cc: Paolo Bonzini Cc: Paul Burton Cc: Paul E. McKenney Cc: Paul Gortmaker Cc: Paul Mackerras Cc: Qais Yousef Cc: Qiaowei Ren Cc: Rafael Wysocki Cc: Ralf Baechle Cc: Richard Henderson Cc: Richard Kuo Cc: Russell King Cc: Steven Miao Cc: Steven Rostedt Cc: Stratos Karafotis Cc: Tim Chen Cc: Tony Luck Cc: Vasily Kulikov Cc: Vineet Gupta Cc: Vineet Gupta Cc: Waiman Long Cc: Will Deacon Cc: Wolfram Sang Cc: adi-buildroot-devel@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-alpha@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-cris-kernel@axis.com Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux@lists.openrisc.net Cc: linux-m32r-ja@ml.linux-m32r.org Cc: linux-m32r@ml.linux-m32r.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-metag@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net Signed-off-by: Ingo Molnar --- arch/x86/include/asm/processor.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'arch/x86/include/asm/processor.h') diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index a4ea02351f4d..32cc237f8e20 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -696,6 +696,8 @@ static inline void cpu_relax(void) rep_nop(); } +#define cpu_relax_lowlatency() cpu_relax() + /* Stop speculative execution and prefetching of modified code. */ static inline void sync_core(void) { -- cgit v1.2.3 From e9f4e0a9fe2723078b7a1a1169828dd46a7b2f9e Mon Sep 17 00:00:00 2001 From: Dave Hansen Date: Thu, 31 Jul 2014 08:40:55 -0700 Subject: x86/mm: Rip out complicated, out-of-date, buggy TLB flushing I think the flush_tlb_mm_range() code that tries to tune the flush sizes based on the CPU needs to get ripped out for several reasons: 1. It is obviously buggy. It uses mm->total_vm to judge the task's footprint in the TLB. It should certainly be using some measure of RSS, *NOT* ->total_vm since only resident memory can populate the TLB. 2. Haswell, and several other CPUs are missing from the intel_tlb_flushall_shift_set() function. Thus, it has been demonstrated to bitrot quickly in practice. 3. It is plain wrong in my vm: [ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.037444] tlb_flushall_shift: 6 Which leads to it to never use invlpg. 4. The assumptions about TLB refill costs are wrong: http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com (more on this in later patches) 5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59 I believe the sample times were too short. Running the benchmark in a loop yields times that vary quite a bit. Note that this leaves us with a static ceiling of 1 page. This is a conservative, dumb setting, and will be revised in a later patch. This also removes the code which attempts to predict whether we are flushing data or instructions. We expect instruction flushes to be relatively rare and not worth tuning for explicitly. Signed-off-by: Dave Hansen Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.com Acked-by: Rik van Riel Acked-by: Mel Gorman Signed-off-by: H. Peter Anvin --- arch/x86/include/asm/processor.h | 1 - 1 file changed, 1 deletion(-) (limited to 'arch/x86/include/asm/processor.h') diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index a4ea02351f4d..43d61daea966 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -72,7 +72,6 @@ extern u16 __read_mostly tlb_lld_4k[NR_INFO]; extern u16 __read_mostly tlb_lld_2m[NR_INFO]; extern u16 __read_mostly tlb_lld_4m[NR_INFO]; extern u16 __read_mostly tlb_lld_1g[NR_INFO]; -extern s8 __read_mostly tlb_flushall_shift; /* * CPU type and hardware bug flags. Kept separately for each CPU. -- cgit v1.2.3