From f94ad404f8934e0cae299dd6520707dc02bbf2fc Mon Sep 17 00:00:00 2001 From: Julian Anastasov Date: Sat, 5 Mar 2016 15:03:22 +0200 Subject: ipvs: drop first packet to redirect conntrack commit f719e3754ee2f7275437e61a6afd520181fdd43b upstream. Jiri Bohac is reporting for a problem where the attempt to reschedule existing connection to another real server needs proper redirect for the conntrack used by the IPVS connection. For example, when IPVS connection is created to NAT-ed real server we alter the reply direction of conntrack. If we later decide to select different real server we can not alter again the conntrack. And if we expire the old connection, the new connection is left without conntrack. So, the only way to redirect both the IPVS connection and the Netfilter's conntrack is to drop the SYN packet that hits existing connection, to wait for the next jiffie to expire the old connection and its conntrack and to rely on client's retransmission to create new connection as usually. Jiri Bohac provided a fix that drops all SYNs on rescheduling, I extended his patch to do such drops only for connections that use conntrack. Here is the original report from Jiri Bohac: Since commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead"), new connections to dead servers are redistributed immediately to new servers. The old connection is expired using ip_vs_conn_expire_now() which sets the connection timer to expire immediately. However, before the timer callback, ip_vs_conn_expire(), is run to clean the connection's conntrack entry, the new redistributed connection may already be established and its conntrack removed instead. Fix this by dropping the first packet of the new connection instead, like we do when the destination server is not available. The timer will have deleted the old conntrack entry long before the first packet of the new connection is retransmitted. Fixes: dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead") Signed-off-by: Jiri Bohac Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman Signed-off-by: Greg Kroah-Hartman --- include/net/ip_vs.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'include') diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 0816c872b689..a6cc576fd467 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -1588,6 +1588,23 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp) } #endif /* CONFIG_IP_VS_NFCT */ +/* Really using conntrack? */ +static inline bool ip_vs_conn_uses_conntrack(struct ip_vs_conn *cp, + struct sk_buff *skb) +{ +#ifdef CONFIG_IP_VS_NFCT + enum ip_conntrack_info ctinfo; + struct nf_conn *ct; + + if (!(cp->flags & IP_VS_CONN_F_NFCT)) + return false; + ct = nf_ct_get(skb, &ctinfo); + if (ct && !nf_ct_is_untracked(ct)) + return true; +#endif + return false; +} + static inline int ip_vs_dest_conn_overhead(struct ip_vs_dest *dest) { -- cgit v1.2.3 From f0e92143b8e2e6fa1e854385667427011cfe1059 Mon Sep 17 00:00:00 2001 From: Heiko Stuebner Date: Thu, 21 Jan 2016 21:53:09 +0100 Subject: clk-divider: make sure read-only dividers do not write to their register commit 50359819794b4a16ae35051cd80f2dab025f6019 upstream. Commit e6d5e7d90be9 ("clk-divider: Fix READ_ONLY when divider > 1") removed the special ops struct for read-only clocks and instead opted to handle them inside the regular ops. On the rk3368 this results in breakage as aclkm now gets set a value. While it is the same divider value, the A53 core still doesn't like it, which can result in the cpu ending up in a hang. The reason being that "ACLKENMasserts one clock cycle before the rising edge of ACLKM" and the clock should only be touched when STANDBYWFIL2 is asserted. To fix this, reintroduce the read-only ops but do include the round_rate callback. That way no writes that may be unsafe are done to the divider register in any case. The Rockchip use of the clk_divider_ops is adapted to this split again, as is the nxp, lpc18xx-ccu driver that was included since the original commit. On lpc18xx-ccu the divider seems to always be read-only so only uses the new ops now. Fixes: e6d5e7d90be9 ("clk-divider: Fix READ_ONLY when divider > 1") Reported-by: Zhang Qing Signed-off-by: Heiko Stuebner Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman --- include/linux/clk-provider.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include') diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h index c56988ac63f7..7cd0171963ae 100644 --- a/include/linux/clk-provider.h +++ b/include/linux/clk-provider.h @@ -384,6 +384,7 @@ struct clk_divider { #define CLK_DIVIDER_MAX_AT_ZERO BIT(6) extern const struct clk_ops clk_divider_ops; +extern const struct clk_ops clk_divider_ro_ops; unsigned long divider_recalc_rate(struct clk_hw *hw, unsigned long parent_rate, unsigned int val, const struct clk_div_table *table, -- cgit v1.2.3 From fe21a25e8c0cc97a080cc73c135e92ddce61a660 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Mon, 2 May 2016 12:46:42 -0700 Subject: Minimal fix-up of bad hashing behavior of hash_64() commit 689de1d6ca95b3b5bd8ee446863bf81a4883ea25 upstream. This is a fairly minimal fixup to the horribly bad behavior of hash_64() with certain input patterns. In particular, because the multiplicative value used for the 64-bit hash was intentionally bit-sparse (so that the multiply could be done with shifts and adds on architectures without hardware multipliers), some bits did not get spread out very much. In particular, certain fairly common bit ranges in the input (roughly bits 12-20: commonly with the most information in them when you hash things like byte offsets in files or memory that have block factors that mean that the low bits are often zero) would not necessarily show up much in the result. There's a bigger patch-series brewing to fix up things more completely, but this is the fairly minimal fix for the 64-bit hashing problem. It simply picks a much better constant multiplier, spreading the bits out a lot better. NOTE! For 32-bit architectures, the bad old hash_64() remains the same for now, since 64-bit multiplies are expensive. The bigger hashing cleanup will replace the 32-bit case with something better. The new constants were picked by George Spelvin who wrote that bigger cleanup series. I just picked out the constants and part of the comment from that series. Cc: George Spelvin Cc: Thomas Gleixner Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- include/linux/hash.h | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/linux/hash.h b/include/linux/hash.h index 1afde47e1528..79c52fa81cac 100644 --- a/include/linux/hash.h +++ b/include/linux/hash.h @@ -32,12 +32,28 @@ #error Wordsize not 32 or 64 #endif +/* + * The above primes are actively bad for hashing, since they are + * too sparse. The 32-bit one is mostly ok, the 64-bit one causes + * real problems. Besides, the "prime" part is pointless for the + * multiplicative hash. + * + * Although a random odd number will do, it turns out that the golden + * ratio phi = (sqrt(5)-1)/2, or its negative, has particularly nice + * properties. + * + * These are the negative, (1 - phi) = (phi^2) = (3 - sqrt(5))/2. + * (See Knuth vol 3, section 6.4, exercise 9.) + */ +#define GOLDEN_RATIO_32 0x61C88647 +#define GOLDEN_RATIO_64 0x61C8864680B583EBull + static __always_inline u64 hash_64(u64 val, unsigned int bits) { u64 hash = val; -#if defined(CONFIG_ARCH_HAS_FAST_MULTIPLIER) && BITS_PER_LONG == 64 - hash = hash * GOLDEN_RATIO_PRIME_64; +#if BITS_PER_LONG == 64 + hash = hash * GOLDEN_RATIO_64; #else /* Sigh, gcc can't optimise this alone like it does for 32 bits. */ u64 n = hash; -- cgit v1.2.3 From 0f7ea0699ac02fb7c5d67e8eac8f8581912f4988 Mon Sep 17 00:00:00 2001 From: Ross Lagerwall Date: Thu, 17 Mar 2016 16:51:59 +0000 Subject: xen: Fix page <-> pfn conversion on 32 bit systems commit 60901df3aed230d4565dca003f11b6a95fbf30d9 upstream. Commit 1084b1988d22dc165c9dbbc2b0e057f9248ac4db (xen: Add Xen specific page definition) caused a regression in 4.4. The xen functions to convert between pages and pfns fail due to an overflow on systems where a physical address may not fit in an unsigned long (e.g. x86 32 bit PAE systems). Rework the conversion to avoid overflow. This should also result in simpler object code. This bug manifested itself as disk corruption with Linux 4.4 when using blkfront in a Xen HVM x86 32 bit guest with more than 4 GiB of memory. Signed-off-by: Ross Lagerwall Signed-off-by: David Vrabel Signed-off-by: Greg Kroah-Hartman --- include/xen/page.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include') diff --git a/include/xen/page.h b/include/xen/page.h index 96294ac93755..9dc46cb8a0fd 100644 --- a/include/xen/page.h +++ b/include/xen/page.h @@ -15,9 +15,9 @@ */ #define xen_pfn_to_page(xen_pfn) \ - ((pfn_to_page(((unsigned long)(xen_pfn) << XEN_PAGE_SHIFT) >> PAGE_SHIFT))) + (pfn_to_page((unsigned long)(xen_pfn) >> (PAGE_SHIFT - XEN_PAGE_SHIFT))) #define page_to_xen_pfn(page) \ - (((page_to_pfn(page)) << PAGE_SHIFT) >> XEN_PAGE_SHIFT) + ((page_to_pfn(page)) << (PAGE_SHIFT - XEN_PAGE_SHIFT)) #define XEN_PFN_PER_PAGE (PAGE_SIZE / XEN_PAGE_SIZE) -- cgit v1.2.3