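On checking, we found that the preface of a recent international monograph on sieve methods specifically mentions the modern significance of Chen Jingrun's theorem, yet people in our own country fail to understand Chen Jingrun. Alas!
Please see the attachment to this post.
Yuan Meng and Chen Qiqing, February 4
Attachment: the preface of a recent monograph on sieve methods, which specifically mentions the modern significance of Chen Jingrun's theorem.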
Sieve Methods
DENIS XAVIER CHARLES
Preface
Sieve methods have had a long and fruitful history. The sieve of Eratosthenes (around the 3rd century B.C.) was a device to generate prime numbers. Later Legendre used it in his studies of the prime counting function $\pi(x)$. Sieve methods bloomed and became a topic of intense investigation after the pioneering work of Viggo Brun (see [Bru16], [Bru19], [Bru22]). Using his formulation of the sieve, Brun proved that the sum
\[ \sum_{p,\ p+2 \text{ both prime}} \frac{1}{p} \]
converges. This was the first result of its kind regarding the twin-prime problem. A slew of sieve methods were developed over the years: Selberg's upper bound sieve, Rosser's sieve, the Large Sieve, and the Asymptotic sieve, to name a few. Many beautiful results have been proved using these sieves. The Brun-Titchmarsh theorem and the extremely powerful result of Bombieri are two important examples. Chen's theorem [Che73], namely that there are infinitely many primes $p$ such that $p+2$ is a product of at most two primes, is another indication of the power of sieve methods.
Sieve methods are of importance even in applied fields of number theory such as Algorithmic Number Theory and Cryptography. There are many direct applications, for example finding all the prime numbers below a certain bound, or constructing numbers free of large prime factors. There are indirect applications too; for example, the running time of several factoring algorithms depends directly on the distribution of smooth numbers in short intervals. The so-called undeniable signature schemes require prime numbers of the form $2p+1$ such that $p$ is also prime. Sieve methods can yield valuable clues about these distributions and hence allow us to bound the running times of these algorithms.
In this treatise we survey the major sieve methods and their important applications in number theory. We apply sieves to study the distribution of square-free numbers, smooth numbers, and prime numbers. The first chapter is a discussion of the basic sieve formulation of Legendre. We show that the distribution of square-free numbers can be deduced using a square-free sieve (see footnote 1). We give an account of improvements in the error term of this distribution, using known results regarding the Riemann zeta function.
The second chapter deals with Brun's Combinatorial sieve as presented in the modern language of [HR74]. We apply the general sieve to give a simpler proof of a theorem of Rademacher [Rad24]. The bound obtained by this simpler proof is slightly inferior, but still sufficient for applications such as the result of Erdős, Chowla and Briggs on the number of mutually orthogonal Latin squares. The formulation of Brun's sieve in [HR74] also includes a proof of the important Buchstab identity. We use it to derive some bounds on the distribution of smooth numbers ([Hal70]).
The third chapter deals with the development and the applications of Selberg's upper bound method. The proof by van Lint and Richert [vLR65] of the Brun-Titchmarsh theorem is given as the chief application. Hooley's improvement of bounds on prime factors in a problem studied by Chebyshev is also outlined here. The last chapter is a study of the Large Sieve. We give an outline of a proof of Bombieri's central theorem on the error term in the distribution of primes. A new application of the Bombieri theorem is shown; we prove that there are infinitely many primes $p$ such that $p+2$ is a square-free number with at most 7 prime factors.
Footnote 1: This is not a new proof; it is implicit in the work of Erdős [Erd60].
Acknowledgements: I would like to thank my advisor Dr. Ken Regan for allowing me to work on a topic of my own interest. His support, encouragement and advice have been invaluable for my work. I thank him for proofreading the entire document and for his constructive comments. A special word of thanks to Dr. Jin-Yi for helping me with character sums. I thank him for answering my queries in such a way that I gained a new insight into the problem. I thank Dr. Alan Selman for his encouragement and advice. I am deeply grateful to Professors Eric Bach, Tom Cusick, Kevin Ford, and Andrew Granville for promptly answering my queries. Their suggestions, pointers, and ideas were invaluable for this work. I am indebted to the National Science Foundation for the monetary support for this work, under my advisor's grant CCR 98-20140.
I thank my parents for their love, encouragement and prayers. I thank Pavan, Maurice, and Samik for pretending to be interested in sieves, and for reviewing the proofs. A special word of thanks to all my friends for anchoring me in sanity through this summer.
Denis Charles. July 2000
To Truth and Purity
Contents
Preface
Chapter 0. Notation and preliminaries
  0.1. Standard Nomenclature
  0.2. Conventions
  0.3. Preliminaries
Chapter 1. The sieve of Eratosthenes
  1.1. Introduction
  1.2. Sieve of Eratosthenes-Legendre
  1.3. Smooth numbers
  1.4. Density of squarefree numbers
  1.5. The error term in the distribution of Squarefree numbers
  1.6. Pairs of squarefree numbers
  1.7. The smallest squarefree number in an arithmetic progression
  1.8. The Sieve Problem
Chapter 2. The Combinatorial Sieve
  2.1. Brun's Pure Sieve
  2.2. Brun's Sieve
  2.3. Orthogonal Latin Squares and the Euler Conjecture
  2.4. A Theorem of Schinzel
  2.5. Smooth Numbers
  2.6. On the number of integers prime to a given number
Chapter 3. Selberg's Sieve
  3.1. The Selberg upper-bound method
  3.2. The Brun-Titchmarsh Theorem
  3.3. Prelude to a theorem of Hooley
  3.4. A theorem of Hooley
Chapter 4. The Large Sieve
  4.1. Bounds on exponential sums
  4.2. The Large Sieve
  4.3. The Brun-Titchmarsh Theorem revisited
  4.4. Bombieri's Theorem
  4.5. Prime and Squarefree pairs
Bibliography
CHAPTER 0
Notation and preliminaries
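0.1. Standard Nomenclature
The largest integer not exceeding $x$ is denoted $\lfloor x \rfloor$. We write $a \backslash b$ for two integers $a, b$ with $a \neq 0$ if $a$ divides $b$. The Möbius function is denoted by $\mu(n)$ and defined as
\[ \mu(n) = \begin{cases} (-1)^k & \text{if } n = p_1 \cdots p_k \text{ with } p_i \neq p_j \text{ for } 1 \le i < j \le k, \\ 0 & \text{otherwise.} \end{cases} \]
The prime counting function $\pi(x)$ is defined as the cardinality of the set $P = \{p \le x \mid p \text{ a prime}\}$, while $\pi(x; q, a)$ will denote the cardinality of $\{p \le x \mid p \equiv a \bmod q\}$. We denote the von Mangoldt function by $\Lambda(n)$:
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^k \text{ for a prime } p, \\ 0 & \text{otherwise,} \end{cases} \]
and its cumulation by $\psi(x) = \sum_{n \le x} \Lambda(n)$. If $n = p_1^{e_1} \cdots p_k^{e_k}$ is the prime factorization of $n$, then $\nu(n) = k$ denotes the number of distinct primes in the factorization. We write $\varphi(n)$ for Euler's totient function:
\[ \varphi(n) = n \prod_{p \backslash n} \left(1 - \frac{1}{p}\right). \]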
0.2. Conventions
The letter $p$ will always denote a prime number. Consequently, $\sum_{n \le p \le m} f(p)$ will denote a sum over the prime numbers in the range of summation. $A$ will stand for a general integer sequence to be sifted, and $P$ for the sifting set of primes. We employ the standard $O$ and $o$ notation. We use the Vinogradov notation $\ll$ to mean that the inequality holds with some constant, i.e., $f(n) \ll g(n) \iff \exists c > 0 : f(n) \le c\,g(n)$. If $\gcd(a,b) = 1$ for two integers $a$ and $b$, then we also write $a \perp b$.
0.3. Preliminaries
THEOREM 0.3.1. Let $n \ge 1$ be an integer. Then
\[ \sum_{d \backslash n} \mu(d) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{otherwise.} \end{cases} \]
Proof: The case $n = 1$ is immediate, so let $n > 1$. Since divisors that are not squarefree drop out of the sum by the definition of $\mu$, we may without loss of generality assume that $n$ is squarefree. Let $n = p_1 p_2 \cdots p_l$; then any divisor $d$ of $n$ has the form $p_1^{e_1} p_2^{e_2} \cdots p_l^{e_l}$ with $e_i \in \{0,1\}$ for $1 \le i \le l$. Using this we can split up the sum we wish to evaluate according to the parity of $e_1 + \cdots + e_l$:
\[ \sum_{d \backslash n} \mu(d) = \sum_{e_1 + \cdots + e_l \text{ even}} 1 \;-\; \sum_{e_1 + \cdots + e_l \text{ odd}} 1 = \binom{l}{0} - \binom{l}{1} + \binom{l}{2} - \cdots + (-1)^l \binom{l}{l} = (1-1)^l = 0. \]
There is another way we could have evaluated the sum. Let $T(l)$ be the number of 0-1 strings of length $l$ that have an odd number of 1s in them. Consider the last position of such a string. If it is a 1, then we must fill the rest of the positions with an even number of 1s, which can be done in $2^{l-1} - T(l-1)$ ways. If the last position is a 0, then the rest of the string must have an odd number of 1s, which can be done in $T(l-1)$ ways. We have argued that $T(l)$ satisfies the following recurrence:
\[ T(l) = T(l-1) + \left(2^{l-1} - T(l-1)\right) = 2^{l-1}. \]
Thus the number of sequences with an odd number of 1s and the number of them with an even number of 1s is the same, and so the above sum is zero.
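The counting argument above can be checked by brute force. The following is a minimal sketch, not part of the original text; the function names `count_odd_strings` and `count_odd_recurrence` are illustrative choices. It enumerates all 0-1 strings of a given length and compares the count with the recurrence and with the closed form $2^{l-1}$.

```python
from itertools import product


def count_odd_strings(l):
    """Count 0-1 strings of length l with an odd number of 1s by enumeration."""
    return sum(1 for bits in product((0, 1), repeat=l) if sum(bits) % 2 == 1)


def count_odd_recurrence(l):
    """Same count via T(l) = T(l-1) + (2^(l-1) - T(l-1)), starting from T(0) = 0."""
    t = 0
    for length in range(1, l + 1):
        t = t + (2 ** (length - 1) - t)
    return t


if __name__ == "__main__":
    for l in range(1, 11):
        # Enumeration, recurrence and closed form should all agree.
        print(l, count_odd_strings(l), count_odd_recurrence(l), 2 ** (l - 1))
```

Enumeration is exponential in the length, so this is only meant for small values of `l`.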
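THEOREM 0.3.2. (Möbius Inversion) If
\[ f(n) = \sum_{d \backslash n} g(d) \]
then
\[ g(n) = \sum_{d \backslash n} \mu(d)\, f\!\left(\frac{n}{d}\right). \]
Proof:
\[ \begin{aligned}
\sum_{d \backslash n} \mu(d)\, f\!\left(\frac{n}{d}\right) &= \sum_{d \backslash n} \mu(d) \sum_{l \backslash (n/d)} g(l) \\
&= \sum_{l \backslash n} g(l) \sum_{d \backslash (n/l)} \mu(d) \\
&= \sum_{l = n} g(l) \quad \text{by Theorem 0.3.1} \\
&= g(n).
\end{aligned} \]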
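THEOREM 0.3.3. If
\[ f(n) = \sum_{d \backslash n} g(d) \]
then
\[ g(n) = \sum_{d \backslash n} \mu\!\left(\frac{n}{d}\right) f(d). \]
Proof:
\[ \begin{aligned}
\sum_{d \backslash n} \mu\!\left(\frac{n}{d}\right) f(d) &= \sum_{d \backslash n} \mu\!\left(\frac{n}{d}\right) \sum_{l \backslash d} g(l) \\
&= \sum_{l \backslash n} g(l) \sum_{d \backslash (n/l)} \mu\!\left(\frac{n}{dl}\right) \\
&= \sum_{l = n} g(l) \quad \text{by Theorem 0.3.1} \\
&= g(n).
\end{aligned} \]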
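Both inversion formulas, and Theorem 0.3.1 on which they rest, are easy to test numerically. The sketch below is an illustration only: the test function `g(n) = n*n + 1` is an arbitrary assumption, and `mobius_table`, `divisors`, `invert` and `invert_alt` are hypothetical helper names, not notation from the treatise.

```python
def mobius_table(n):
    """Return a list mu with mu[k] = Möbius function of k for 0 <= k <= n (mu[0] = 0)."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_composite = [False] * (n + 1)
    for p in range(2, n + 1):
        if not is_composite[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_composite[m] = True
                mu[m] = -mu[m]          # one sign flip per distinct prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0               # not squarefree
    return mu


def divisors(n):
    """All positive divisors of n (naive, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def g(n):
    """Arbitrary test function (an assumption made for this example)."""
    return n * n + 1


def f(n):
    """Divisor sum f(n) = sum of g(d) over d dividing n."""
    return sum(g(d) for d in divisors(n))


def invert(n, mu):
    """Recover g(n) as in Theorem 0.3.2: sum of mu(d) * f(n/d)."""
    return sum(mu[d] * f(n // d) for d in divisors(n))


def invert_alt(n, mu):
    """Recover g(n) as in Theorem 0.3.3: sum of mu(n/d) * f(d)."""
    return sum(mu[n // d] * f(d) for d in divisors(n))


if __name__ == "__main__":
    mu = mobius_table(60)
    for n in range(1, 61):
        assert sum(mu[d] for d in divisors(n)) == (1 if n == 1 else 0)
        assert invert(n, mu) == g(n) == invert_alt(n, mu)
    print("Theorems 0.3.1, 0.3.2 and 0.3.3 hold for n = 1..60")
```

Any other integer-valued `g` should work equally well, since the inversion is purely formal.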
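THEOREM 0.3.4.
\[ \sum_{d \backslash n} \frac{\mu(d)}{d} = \prod_{p \backslash n} \left(1 - \frac{1}{p}\right) = \prod_{p \backslash n} \left(1 + \frac{\mu(p)}{p}\right). \]
Proof: We know that $\sum_{d \backslash n} \varphi(d) = n$. Using Möbius inversion on this we get:
\[ n \prod_{p \backslash n} \left(1 - \frac{1}{p}\right) = \varphi(n) = \sum_{d \backslash n} \mu(d)\, \frac{n}{d} = n \sum_{d \backslash n} \frac{\mu(d)}{d}. \]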
REMARK 0.3.5. The proof of Theorem 0.3.4 actually works for any multiplicative function of the divisors of $n$ in the denominator, provided it is zero at non-squarefree divisors. We could have also proved Theorem 0.3.1 using the identity:
\[ \sum_{d \backslash n} \mu(d) = \prod_{p \backslash n} \left(1 + \mu(p)\right). \]
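The identity of Theorem 0.3.4 can be verified with exact rational arithmetic. This is a small sketch under stated assumptions: the helper names are hypothetical, the totient is computed directly from its definition as a count of residues coprime to $n$, and the Möbius sum is taken over squarefree divisors only, since the other divisors contribute zero.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, prod


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def totient(n):
    """Euler's totient, counted directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def mobius_sum(n):
    """Sum of mu(d)/d over squarefree divisors d of n (other divisors contribute 0)."""
    primes = prime_factors(n)
    total = Fraction(0)
    for k in range(len(primes) + 1):
        for combo in combinations(primes, k):
            total += Fraction((-1) ** k, prod(combo))  # mu(d) = (-1)^k for k distinct primes
    return total


def euler_product(n):
    """Product of (1 - 1/p) over primes p dividing n."""
    return prod((Fraction(p - 1, p) for p in prime_factors(n)), start=Fraction(1))


if __name__ == "__main__":
    for n in (1, 12, 30, 97, 360):
        print(n, mobius_sum(n), euler_product(n), Fraction(totient(n), n))
```

Using `Fraction` avoids floating-point error, so the three quantities can be compared with `==`.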
CHAPTER 1
The sieve of Eratosthenes
1.1. Introduction
The sieve of Eratosthenes is a simple, effective procedure for finding all the primes up to a certain bound $x$. Take a list of the numbers $2, 3, \ldots, \lfloor x \rfloor$. Call 2 a prime, and start by crossing out all the multiples of 2. Because 3 is uncrossed at this stage, 3 must be prime. Cross out the multiples of 3 since they are composite, and then pick the next number that is still uncrossed and repeat. If after a stage the next uncrossed number exceeds $\sqrt{x}$, then stop. At this stage all the numbers that are not crossed out are prime.
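The procedure just described translates almost line by line into code. The following is a minimal sketch (the function name `eratosthenes` is an illustrative choice): it crosses out proper multiples of each uncrossed number and stops once the next candidate exceeds $\sqrt{x}$.

```python
from math import isqrt


def eratosthenes(x):
    """Return the list of primes <= x using the sieve of Eratosthenes."""
    if x < 2:
        return []
    crossed = [False] * (x + 1)
    p = 2
    while p <= isqrt(x):                    # stop once the next candidate exceeds sqrt(x)
        if not crossed[p]:                  # p is still uncrossed, hence prime
            for multiple in range(2 * p, x + 1, p):
                crossed[multiple] = True    # proper multiples of p are composite
        p += 1
    return [n for n in range(2, x + 1) if not crossed[n]]


if __name__ == "__main__":
    print(eratosthenes(30))
```

Starting the inner loop at `p * p` instead of `2 * p` would skip multiples that are already crossed out; the version above follows the textual description more literally.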
Legendre realized that this procedure can be captured succinctly in a theoretical analog of the sifting process, and used this in his study of the function $\pi(x) = |\{p \le x \mid p \text{ a prime}\}|$. In this chapter we will try to apply this basic technique to study some simple problems. First we shall look at the sieve applied to the problem of estimating $\pi(x)$. Although the method would lead to an exact formula for $\pi(x) - \pi(\sqrt{x})$, this does not give useful estimates for $\pi(x)$ owing to a huge error term. However, we can adapt the basic method to study other sequences of numbers, for example the squarefree numbers, meaning numbers that are products of distinct primes. The basic sieve we develop will be more successful in dealing with squarefree numbers, essentially because they are denser than the primes. We will be able to give interesting bounds on the density of these numbers in arithmetic progressions and in pairs $(n, n+2)$. We shall also find a bound on the smallest squarefree number in an arithmetic progression. Finally we shall give the general setup of a sieve problem and re-formulate the classical sieve of Eratosthenes-Legendre in this framework.
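The exact formula mentioned above can be tried out on small inputs. The sketch below is an illustration, not the author's code: it evaluates $\sum_{d} \mu(d) \lfloor x/d \rfloor$ over the squarefree products $d$ of primes up to $\sqrt{x}$, which counts the integers $n \le x$ with no prime factor up to $\sqrt{x}$, namely 1 together with the primes in $(\sqrt{x}, x]$. The names `primes_up_to` and `legendre_count` are hypothetical, and the code sifts by primes $p \le \sqrt{x}$ (an assumption that only matters when $\sqrt{x}$ is itself prime).

```python
from math import isqrt


def primes_up_to(n):
    """Return all primes <= n with a basic sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [p for p in range(2, n + 1) if is_prime[p]]


def legendre_count(x):
    """Sum of mu(d) * floor(x / d) over squarefree d built from primes <= sqrt(x)."""
    small = primes_up_to(isqrt(x))
    total = 0
    stack = [(0, 1, 1)]  # (index of next prime, divisor d, mu(d))
    while stack:
        i, d, sign = stack.pop()
        if i == len(small):
            total += sign * (x // d)
            continue
        stack.append((i + 1, d, sign))                   # leave small[i] out of d
        if d * small[i] <= x:                            # larger d would contribute floor(x/d) = 0
            stack.append((i + 1, d * small[i], -sign))   # include small[i] in d
    return total


if __name__ == "__main__":
    x = 100
    pi_x = len(primes_up_to(x))
    pi_sqrt = len(primes_up_to(isqrt(x)))
    print(legendre_count(x), pi_x - pi_sqrt + 1)
```

The number of divisors visited grows like $2^{\pi(\sqrt{x})}$ before pruning, which is the same exponential growth that makes the error term of the method so large.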
1.2. Sieve of Eratosthenes-Legendre
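Let $P_z = \prod_{p<z} p$, let $S = \{n \le x \mid n \perp P_z\}$ be the set of integers up to $x$ that have no prime factor below $z$, and define
\[ s(n) = \begin{cases} 1 & \text{if } n \in S, \\ 0 & \text{otherwise.} \end{cases} \]
This is the characteristic function of the set $S$. Using the properties of the Möbius function (see Chapter 0), we can write an explicit expression for $s(n)$:
\[ s(n) = \sum_{d \backslash \gcd(n, P_z)} \mu(d). \]
We will call such a function $s(n)$ the sifting function. Then
\[ \begin{aligned}
|S| = \sum_{n \le x} s(n) &= \sum_{n \le x} \sum_{d \backslash \gcd(n, P_z)} \mu(d) = \sum_{d \backslash P_z} \mu(d) \sum_{\substack{n \le x \\ d \backslash n}} 1 = \sum_{d \backslash P_z} \mu(d) \left\lfloor \frac{x}{d} \right\rfloor \\
&= \sum_{d \backslash P_z} \mu(d) \left( \frac{x}{d} + \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right) = \sum_{d \backslash P_z} \mu(d)\, \frac{x}{d} + \sum_{d \backslash P_z} \mu(d) \left( \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right).
\end{aligned} \]
Since each term in the second sum has absolute value at most 1, we obtain
\[ |S| \le x \sum_{d \backslash P_z} \frac{\mu(d)}{d} + 2^{\pi(z)} = x \prod_{p \backslash P_z} \left(1 - \frac{1}{p}\right) + 2^{\pi(z)}. \]
Now a theorem of Mertens states that
\[ \prod_{p < z} \left(1 - \frac{1}{p}\right) \sim \frac{e^{-\gamma}}{\ln z}, \]
and this yields the estimate
\[ |S| \le x\, \frac{e^{-\gamma}}{\ln z} + 2^{\pi(z)}, \]
provided $z \to \infty$ as $x \to \infty$. The usefulness of the above scheme is restricted by the huge error term $2^{\pi(z)}$. For $z = O(\ln x)$, for example, we get $\pi(x) - \pi(\ln x) = O\!\left(\frac{x}{\ln \ln x}\right)$, and since $\pi(x) \le x$ (so that $\pi(\ln x) \le \ln x$) we get the estimate $\pi(x) = O\!\left(\frac{x}{\ln \ln x}\right)$. This is markedly inferior to the truth $\pi(x) \sim \frac{x}{\ln x}$. Note that if $z = \sqrt{x}$ then $|S|$ measures $\pi(x) - \pi(\sqrt{x})$, for which we have derived the following exact formula:
\[ \pi(x) - \pi(\sqrt{x}) + 1 = x \prod_{p < \sqrt{x}} \left(1 - \frac{1}{p}\right) + \sum_{d \backslash P_{\sqrt{x}}} \mu(d) \left( \left\lfloor \frac{x}{d} \right\rfloor - \frac{x}{d} \right). \]
1.3. Smooth numbers
DEFINITION 1.3.1. A number $n$ will be called $k$-smooth if $\forall p : (p \backslash n) \Rightarrow (p < k)$.
Let $\Psi(x,k) = |\{n \le x \mid n \text{ is } k\text{-smooth}\}|$, i.e., the number of $k$-smooth numbers up to a bound $x$. We can use our sieve argument to try to find a bound on $\Psi(x,k)$. The weakness of this simple sieve will be apparent in the bound it gives us.
PROPOSITION 1.3.2.
\[ \Psi(x,k) = O\!\left( x\, \frac{\ln k}{\ln x} + 2^{\pi(x) - \pi(k)} \right). \]
Proof: Since a number is $k$-smooth only if all its prime divisors are below $k$, we can find the $k$-smooth numbers below a bound $x$ by using as our sifting set $P = \{p \mid k < p \le x\}$. Let $P_{k,x} = \prod_{p \in P} p$. Let $S = \{n \mid n \text{ is } k\text{-smooth}\}$, and this time define
\[ s(n) = \begin{cases} 1 & \text{if } n \in S \text{ or } n = 1, \\ 0 & \text{otherwise.} \end{cases} \]
Now rewriting $s(n)$ using the Möbius function, we obtain
\[ s(n) = \sum_{d \backslash \gcd(n, P_{k,x})} \mu(d). \]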
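Setting $S(n) = |S|$, we apply Mertens' Theorem at the end to conclude:
\[ \begin{aligned}
S(n) = \sum_{n \le x} s(n) &= \sum_{n \le x} \sum_{d \backslash \gcd(n, P_{k,x})} \mu(d) = \sum_{d \backslash P_{k,x}} \mu(d) \sum_{\substack{n \le x \\ d \backslash n}} 1 = \sum_{d \backslash P_{k,x}} \mu(d) \left\lfloor \frac{x}{d} \right\rfloor \\
&= x \prod_{k < p \le x} \left(1 - \frac{1}{p}\right) + O\!\left(2^{\pi(x) - \pi(k)}\right) = O\!\left( x\, \frac{\ln k}{\ln x} + 2^{\pi(x) - \pi(k)} \right).
\end{aligned} \]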
The bound is clearly very poor. However, we can improve it using more advanced sieve techniques. In [Warl90], a much better bound is given under some conditions on the sifting primes.
1.4. Density of squarefree numbers
The basic method of the sieve of Eratosthenes-Legendre can be adapted to prove a more interesting result. Let $S = \{n \mid n \le x,\ n \text{ is squarefree}\}$, and let $\kappa(x) = |S|$. To obtain $S$ as a result of a sifting process, all we need to do is take primes $p < \sqrt{x}$ and cross off multiples of $p^2$ from the list. We shall show that a variant of the function $s(n)$ introduced earlier works in this case.
THEOREM 1.4.1.
\[ \kappa(x) = \frac{6}{\pi^2}\, x + O(\sqrt{x}). \]
Proof: The sifting function for this set is now $s(n) = |\mu(n)|$, and $\kappa(x) = \sum_{n \le x} s(n) = \sum_{n \le x} |\mu(n)|$. Now we reach an impasse, because there does not seem to be any easy way of evaluating this sum. The trick is to look for another expression for the sifting function.
Claim: $s(n) = \sum_{d^2 \backslash n} \mu(d)$.
Any number $n$ can be represented as $n = m^2 w$, where $w$ is squarefree and $m^2$ is the largest square divisor of $n$. If $n = p_1^{e_1} p_2^{e_2} \cdots p_l^{e_l}$ with $e_i = 2q_i + r_i$, $0 \le r_i \le 1$, then $m = \prod_i p_i^{q_i}$ satisfies the expression. We shall write $m(n)$ for this $m$, so that $m(n)^2$ is the largest square divisor of $n$. Now
\[ \sum_{d^2 \backslash n} \mu(d) = \sum_{d \backslash m(n)} \mu(d), \]
and this sum is 0 unless $m(n) = 1$, in which case it is 1. This proves the claim. Setting $m = \sqrt{x}$, we obtain:
\[ \begin{aligned}
\kappa(x) = \sum_{n \le x} s(n) &= \sum_{n \le x} \sum_{d^2 \backslash n} \mu(d) = \sum_{d \le m} \mu(d) \sum_{\substack{n \le x \\ d^2 \backslash n}} 1 = \sum_{d \le m} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor \\
&= x \sum_{d \le m} \frac{\mu(d)}{d^2} + \sum_{d \le m} \mu(d) \left( \left\lfloor \frac{x}{d^2} \right\rfloor - \frac{x}{d^2} \right) = x \sum_{d \le m} \frac{\mu(d)}{d^2} + O(m).
\end{aligned} \]
Using the fact that
\[ \prod_p \left(1 - \frac{1}{p^2}\right) = \sum_{n \ge 1} \frac{\mu(n)}{n^2} \]
we get
\[ \kappa(x) = x \left( \prod_p \left(1 - \frac{1}{p^2}\right) - \sum_{d > m} \frac{\mu(d)}{d^2} \right) + O(m) = x \prod_p \left(1 - \frac{1}{p^2}\right) + O(m). \]
Also $\prod_p \left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)}$, so that we finally get
\[ \kappa(x) = \frac{x}{\zeta(2)} + O(\sqrt{x}). \]
Euler showed that $\zeta(2) = \frac{\pi^2}{6}$, and using this in the above expression we have
\[ \kappa(x) = \frac{6}{\pi^2}\, x + O(\sqrt{x}). \]
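The identity $\kappa(x) = \sum_{d \le \sqrt{x}} \mu(d) \lfloor x/d^2 \rfloor$ used in the proof of Theorem 1.4.1 gives a quick way to count squarefree numbers. The sketch below is illustrative only (the function names are hypothetical); it compares that sum with a direct count and with the main term $6x/\pi^2$.

```python
from math import isqrt, pi


def mobius_table(n):
    """Return a list mu with mu[k] = Möbius function of k for 0 <= k <= n (mu[0] = 0)."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_composite = [False] * (n + 1)
    for p in range(2, n + 1):
        if not is_composite[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_composite[m] = True
                mu[m] = -mu[m]
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0
    return mu


def squarefree_count(x):
    """kappa(x) computed as the sum of mu(d) * floor(x / d^2) over d <= sqrt(x)."""
    m = isqrt(x)
    mu = mobius_table(m)
    return sum(mu[d] * (x // (d * d)) for d in range(1, m + 1))


def squarefree_count_naive(x):
    """Direct count: n is squarefree if no d^2 with d >= 2 divides it."""
    return sum(
        1
        for n in range(1, x + 1)
        if all(n % (d * d) != 0 for d in range(2, isqrt(n) + 1))
    )


if __name__ == "__main__":
    for x in (100, 1000, 10_000):
        print(x, squarefree_count(x), squarefree_count_naive(x), 6 * x / pi ** 2)
```

The Möbius-sum version needs only about $\sqrt{x}$ terms, whereas the direct count tests every $n \le x$.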
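Another natural question to ask is: what is the density of squarefree numbers in an arithmetic progression? We shall give a partial answer to that question in the next theorem. Let $\kappa(x; a, l) = |\{n \le x \mid n \text{ is squarefree},\ n \equiv a \bmod l\}|$.
THEOREM 1.4.2. Let $q > 2$ be a prime, and let $a$ be a positive integer relatively prime to $q$. Then there is a constant $c > 0$ depending only on $q$ such that
\[ \kappa(x; a, q) \ge c\,x + O(\sqrt{x}). \]
Proof: Using the same idea as in the previous proof we have:
\[ \begin{aligned}
\kappa(x; a, q) &= \sum_{\substack{n \equiv_q a \\ n \le x}} \sum_{d^2 \backslash n} \mu(d) && (1.1) \\
&= \sum_{d \le m} \mu(d) \Bigg( \sum_{\substack{d^2 \backslash n \\ n \equiv_q a \\ n \le x}} 1 \Bigg), \quad \text{where } m \text{ is } \lfloor \sqrt{x} \rfloor. && (1.2)
\end{aligned} \]
The quantity we need to bound is defined by
\[ N(x; d, a, q) = \sum_{\substack{d^2 \backslash n \\ n \equiv_q a \\ n \le x}} 1. \]
This is essentially the number of solutions in $k$ to the congruence $k d^2 \equiv a \bmod q$. There are two cases:
Case $d \perp q$: In this case there is a unique solution $k$ such that $k \equiv a\,(d^{-2}) \bmod q$. However, if $k \in \{0, 1, \ldots, q-1\}$ is such a solution, then for $e \ge 1$, $k + eq$ is also a solution. Now $(k + eq)d^2 = n \le x$, so
\[ (k + eq) \le \frac{x}{d^2}, \qquad e \le \frac{x}{d^2 q} - \frac{k}{q}, \qquad e \le \left\lfloor \frac{x}{d^2 q} \right\rfloor \text{ as } k < q. \]
Case $d \not\perp q$: In this case there are no solutions to the congruence, since $q \backslash d$ while $a \perp q$.
Thus $N(x; d, a, q) = \left\lfloor \frac{x}{d^2 q} \right\rfloor$ if $d \perp q$, and 0 otherwise. Substituting in (1.2) we get
\[ \kappa(x; a, q) = \sum_{d \le m} \mu(d) \left\lfloor \frac{x}{d^2 q} \right\rfloor - \sum_{\substack{d \le m \\ d \not\perp q}} \mu(d) \left\lfloor \frac{x}{d^2 q} \right\rfloor = \frac{x}{q} \left( \sum_{d \le m} \frac{\mu(d)}{d^2} - \sum_{d \not\perp q} \frac{\mu(d)}{d^2} \right) + O(m). \]
Now
\[ \left| \sum_{d \not\perp q} \frac{\mu(d)}{d^2} \right| \le \sum_{d \not\perp q} \frac{1}{d^2} = \sum_{q \backslash d,\ d \le x} \frac{1}{d^2} = \sum_{k \le (x/q)} \frac{1}{k^2 q^2} = \frac{1}{q^2} \sum_{k \le (x/q)} \frac{1}{k^2} \le \frac{1}{q^2} \sum_{k \ge 1} \frac{1}{k^2} \le \frac{\pi^2}{6 q^2}. \]
Thus we get
\[ \kappa(x; a, q) \ge x \left( \frac{1}{q\,\zeta(2)} - \frac{\zeta(2)}{q^2} \right) + O(\sqrt{x}), \]
and hence $\kappa(x; a, q) \ge c\,x + O(\sqrt{x})$.
1.5. The error term in the distribution of Squarefree numbers
We proved in the previous section that $\kappa(x) - \frac{6}{\pi^2} x = O(\sqrt{x})$, and it turns out to be extremely difficult to improve on this bound. In this section we briefly digress from the topic of sieves to show a strengthening of the error term if one assumes the Riemann Hypothesis (henceforth called RH). First we shall strengthen the error term (unconditionally) using a theorem of Walfisz.
THEOREM 1.5.1 ([Wal63] Satz §5.5.3).
\[ \left| \sum_{n \le x} \mu(n) \right| \le B\,x \exp\!\left( -A \log^{3/5} x\, (\log\log x)^{-1/5} \right) \]
for some positive constants $A$ and $B$.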
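We simplify the proof in [Wal63] of the following theorem:
THEOREM 1.5.2 ([Wal63] Satz §5.6.1).
\[ \kappa(x) = \frac{6}{\pi^2}\, x + O\!\left( \sqrt{x} \exp\!\left( -c \log^{3/5} x\, (\log\log x)^{-1/5} \right) \right) \]
for some positive constant $c > 0$.
Proof:
\[ \kappa(x) = \sum_{1 \le n \le x} \sum_{d^2 \backslash n} \mu(d) = \sum_{d^2 m \le x} \mu(d) = \sum_{d^2 \le x} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor. \]
Let $S_2(x,y) = \sum_{d \le y} \mu(d)\, \delta\!\left(\frac{x}{d^2}\right)$, where $\delta(z) = z - \lfloor z \rfloor - \frac{1}{2}$, and $M(y) = \sum_{n \le y} \mu(n)$. Then
\[ \kappa(x) = x \sum_{d^2 \le x} \frac{\mu(d)}{d^2} - S_2(x, \sqrt{x}) - \frac{1}{2} M(\sqrt{x}). \]
In [MV81] (see p. 255) the following bound is proved:
\[ S_2(x,y) = O\!\left( x^{2/7} + y^{1/2} x^{1/7 + \varepsilon} \right), \]
and this implies that $S_2(x, \sqrt{x}) = O(x^{11/28})$. Now consider:
\[ \begin{aligned}
\sum_{d > y} \frac{\mu(d)}{d^2} &= 2 \sum_{d > y} \mu(d) \int_d^\infty \frac{dz}{z^3} = 2 \int_y^\infty \frac{dz}{z^3} \sum_{y < n \le z} \mu(n) \\
&\qquad \text{(interchanging the sum and the integral is valid since both of them are convergent)} \\
&= 2 \int_y^\infty \frac{M(z)\,dz}{z^3} - 2 M(y) \int_y^\infty \frac{dz}{z^3} = O\!\left( M(y) \int_y^\infty \frac{dz}{z^3} \right) - o(1) = O\!\left( \frac{M(y)}{y^2} \right).
\end{aligned} \]
Hence
\[ \sum_{d > \sqrt{x}} \frac{\mu(d)}{d^2} = O\!\left( \frac{\exp\{ -c \log^{3/5} x\, (\log\log x)^{-1/5} \}}{\sqrt{x}} \right) \]
and also
\[ \sum_{d \le \sqrt{x}} \frac{\mu(d)}{d^2} = \frac{1}{\zeta(2)} + O\!\left( \frac{\exp\{ -c \log^{3/5} x\, (\log\log x)^{-1/5} \}}{\sqrt{x}} \right). \]
The theorem follows from these estimates.
COROLLARY 1.5.3. The number of squarefree numbers in the interval $[x, \ldots, x + \sqrt{x}]$ is asymptotic to $\frac{6\sqrt{x}}{\pi^2}$.
The corresponding problem for primes seems to be far more difficult; see [HB88]. It turns out that if the Riemann Hypothesis holds then $M(y) = O(y^{1/2+\varepsilon})$, and using this in the above proof we get the following theorem:
THEOREM 1.5.4. Assuming the Riemann Hypothesis,
\[ \kappa(x) = \frac{6}{\pi^2}\, x + O(x^{11/28}). \]
It turns out that if we assume the Riemann Hypothesis we can do better even without the strong bound on $S_2(x,y)$. We begin as we did before:
\[ \kappa(x) = \sum_{1 \le d \le x} \mu(d) \sum_{\substack{1 \le n \le x \\ d^2 \backslash n}} 1 = \sum_{d^2 n \le x} \mu(d) = \sum_{\substack{d^2 n \le x \\ d \le y}} \mu(d) + \sum_{\substack{d^2 n \le x \\ d > y}} \mu(d) = \Sigma_1 + \Sigma_2 \quad \text{(say)}. \]
Now (as in the proof of the previous theorem)
\[ \Sigma_1 = \sum_{d \le y} \mu(d) \left\lfloor \frac{x}{d^2} \right\rfloor = \sum_{d \le y} \mu(d) \left( \frac{x}{d^2} - \left( \frac{x}{d^2} - \left\lfloor \frac{x}{d^2} \right\rfloor - \frac{1}{2} \right) \right) - \frac{1}{2} \sum_{d \le y} \mu(d). \]
Let as before $S_2(x,y) = \sum_{d \le y} \mu(d)\, \delta\!\left(\frac{x}{d^2}\right)$ and $M(y) = \sum_{d \le y} \mu(d)$, where $\delta(z) = z - \lfloor z \rfloor - \frac{1}{2}$, so that
\[ \Sigma_1 = x \sum_{d \le y} \frac{\mu(d)}{d^2} - S_2(x,y) - \frac{1}{2} M(y). \]
Let
\[ f_y(s) = \frac{1}{\zeta(s)} - \sum_{d \le y} \frac{\mu(d)}{d^s}. \]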
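We adopt the standard convention of referring to the real part of $s$ as $\sigma$ and the imaginary part as $t$. If $\sigma > 1$ then we have
\[ f_y(s) = \sum_{d > y} \frac{\mu(d)}{d^s}, \]
since in this case we also have $\frac{1}{\zeta(s)} = \sum_{1 \le d} \frac{\mu(d)}{d^s}$. Consider
\[ \zeta(s) f_y(2s) = \left( \sum_{1 \le n} \frac{1}{n^s} \right) \left( \sum_{d > y} \frac{\mu(d)}{d^{2s}} \right) = \sum_{1 \le n} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \backslash n}} \mu(d). \]
If we look at the restricted version of this sum, namely
\[ \sum_{1 \le n \le x} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \backslash n}} \mu(d), \]
then as $s \to 0$ this sum equals $\Sigma_2$. Thus we need a way of evaluating this sum when $s \to 0$. The following result (Lemma (3.12) [Tit86] p. 60) will help us do just that.
LEMMA 1.5.5. [Tit86] Let $\langle a_n \rangle$ be a sequence of real numbers such that, as $\sigma \to 1$ from above,
\[ \sum_{n \ge 1} \frac{|a_n|}{n^\sigma} = O\!\left( \frac{1}{(\sigma - 1)^\alpha} \right) \]
for some $\alpha \ge 1$. Let $\psi(n)$ be an upper bound for $|a_n|$, and define
\[ f(s) = \sum_{n \ge 1} \frac{a_n}{n^s}, \quad \text{for } \sigma > 1. \]
If $c > 0$, $\sigma \ge 0$, $\sigma + c > 1$, $x$ is not an integer, and $N$ is the nearest integer to $x$, then for all $T > 0$:
\[ \sum_{n < x} \frac{a_n}{n^s} = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} f(s + w)\, \frac{x^w}{w}\, dw + O\!\left( \frac{x^c}{T(\sigma + c - 1)^\alpha} \right) + O\!\left( \frac{\psi(2x)\, x^{1-\sigma} \log x}{T} \right) + O\!\left( \frac{\psi(N)\, x^{1-\sigma}}{T\,|x - N|} \right). \]
Applying this lemma to the series
\[ \sum_{1 \le n \le x} \frac{1}{n^s} \sum_{\substack{d > y \\ d^2 \backslash n}} \mu(d) \]
with $c = 1 + \frac{1}{\log x}$ and $T = x$ gives remainder terms of $O(x^\varepsilon)$, since $\psi(z) = O(\sqrt{z})$. Making the change of variable $w \leftarrow s$, taking the $s$ in the lemma to be 0, and setting $x_0 = \lfloor x \rfloor + \frac{1}{2}$ so that $x_0$ is not an integer, we obtain
\[ \Sigma_2 = \frac{1}{2\pi i} \int_{c - ix}^{c + ix} \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds + O(x^\varepsilon). \]
Now consider splitting the integral into four regions:
\[ \int_{c - ix}^{c + ix} + \int_{c + ix}^{\frac{1}{2} + ix} + \int_{\frac{1}{2} + ix}^{\frac{1}{2} - ix} + \int_{\frac{1}{2} - ix}^{c - ix} \]
(where the integrand is the same as above). Since the integrand has a simple pole at $s = 1$, with residue $f_y(2)\,x_0$, we have
\[ \int_{c - ix}^{c + ix} + \int_{c + ix}^{\frac{1}{2} + ix} + \int_{\frac{1}{2} + ix}^{\frac{1}{2} - ix} + \int_{\frac{1}{2} - ix}^{c - ix} = 2\pi i\, f_y(2)\, x_0 \]
and so
\[ \Sigma_2 = f_y(2)\, x_0 + \frac{1}{2\pi i} \int_C \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds + O(x^\varepsilon), \]
where $C$ is the path made up of the line segments
\[ c - ix \to \tfrac{1}{2} - ix, \qquad \tfrac{1}{2} - ix \to \tfrac{1}{2} + ix, \qquad \tfrac{1}{2} + ix \to c + ix. \]
By Theorem (14.2) on p. 337 of [Tit86], RH implies that $\frac{1}{\zeta(s)} = O(|t|^\varepsilon)$. Also:
THEOREM 1.5.6 ([Tit86] (14.25A)). Assume RH. For $s$ with $\sigma > \frac{1}{2}$,
\[ \sum_{n < x} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)} + O\!\left( T^{-1+\varepsilon} x^2 \right) + O\!\left( T^\varepsilon x^{\frac{1}{2} - \sigma + \delta} \right). \]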
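Using this we can take $T$ large so that
\[ f_y(s) = O\!\left( y^{\frac{1}{2} - \sigma + \delta'} \right) \qquad (1.3) \]
under RH. Also by Theorem (14.25C) [Tit86], RH implies $M(z) = O(z^{\frac{1}{2} + \varepsilon})$. Using all this information we can bound
\[ \int_C \zeta(s) f_y(2s)\, \frac{x_0^s}{s}\, ds \]
on the contour $C$: we have $f_y(2s) = O(y^{\frac{1}{2} - 1 + \varepsilon}) = O(y^{-\frac{1}{2} + \varepsilon})$ and $\zeta(s) = \frac{1}{s-1} + O(t^\varepsilon)$, and since $x^s = x^{\sigma + it} = e^{(\sigma + it)\log x} = e^{\sigma \log x + it \log x}$, we have $|x^s| = x^\sigma$. Thus the integral over $C$ is
\[ O\!\left( x^{\frac{1}{2} + \varepsilon}\, y^{-\frac{1}{2} + \varepsilon} \right). \]
Combining all these estimates we get the following bounds:
THEOREM 1.5.7 ([MV81]). Assuming the Riemann Hypothesis, for any $y > 0$,
\[ \kappa(x) = \frac{x}{\zeta(2)} - S_2(x,y) + O\!\left( x^{\frac{1}{2} + \varepsilon}\, y^{-\frac{1}{2} + \varepsilon} + y^{\frac{1}{2} + \varepsilon} \right). \]
COROLLARY 1.5.8. Assuming the Riemann Hypothesis,
\[ \kappa(x) = \frac{x}{\zeta(2)} + O\!\left( x^{\frac{1}{3} + \delta} \right). \]
Proof: Clearly we have $S_2(x,y) = O(y)$; now setting $y = x^{1/3}$ in the above theorem we get the result.
In the same article [MV81] Montgomery and Vaughan went on to estimate the sums involved more precisely to show that
\[ \kappa(x) = \frac{1}{\zeta(2)}\, x + O\!\left( x^{\frac{9}{28} + \varepsilon} \right). \]
Subsequently the exponent of the error term was reduced to $\frac{7}{22}$ by various authors (see [BakPin85]).
1.6. Pairs of squarefree numbers
The famous twin prime problem asks whether there are infinitely many primes $p$ such that $p+2$ is also prime. Although this problem is still open, the analogous question for the squarefree numbers can be settled rather easily using the methods we have seen so far. For a more general version of this result see [Mir49]. Let $\kappa_2(x) = |\{n \le x \mid n \text{ and } n+2 \text{ are both squarefree}\}|$. The estimate established in this section is
\[ \kappa_2(x) = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( x^{2/3} \ln^{4/3} x \right). \]
For a parameter $y$ we write
\[ \kappa_2(x) = \sum_{ab \le y} \mu(a)\mu(b)\, N(x; a^2, b^2, 2) + \sum_{\substack{ab > y \\ k_1 a^2 - k_2 b^2 = 2,\ k_2 b^2 \le x}} \mu(a)\mu(b). \]
Here $N(x; a^2, b^2, 2)$ is a count of the number of solutions to the equation
\[ k_1 a^2 - k_2 b^2 = 2, \qquad k_2 b^2 \le x. \]
It is clear that $N(x; a^2, b^2, 2) = 0$ if $\gcd(a^2, b^2)$ does not divide 2, and otherwise
\[ N(x; a^2, b^2, 2) = \frac{x}{\operatorname{lcm}(a^2, b^2)} + O(1) = \frac{x}{(ab)^2} + O(1), \]
since $a \perp b$. Using this we have
\[ \sum_{\substack{ab \le y \\ a \perp b}} \mu(a)\mu(b)\, N(x; a^2, b^2, 2) \le \sum_{\substack{ab \le y \\ a \perp b}} \mu(a)\mu(b) \left( \frac{x}{(ab)^2} + O(1) \right) = x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} + \sum_{ab \le y} \mu(a)\mu(b), \]
since the terms with $a \not\perp b$ are killed by the Möbius function.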
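Thus
\[ \left| \sum_{ab \le y} \mu(a)\mu(b) \right| \le \sum_{ab \le y} 1 = \left\lfloor \frac{y}{1} \right\rfloor + \left\lfloor \frac{y}{2} \right\rfloor + \cdots + \left\lfloor \frac{y}{y} \right\rfloor = O\!\left( y \sum_{1 \le k \le y} \frac{1}{k} \right) = O(y \ln y). \]
Now the sum $\sum_{ab \le y} \frac{\mu(ab)}{(ab)^2}$ can be evaluated by looking at the terms with $\nu(ab) = k$. Write $a = p_1^{\varepsilon_1} p_2^{\varepsilon_2} \cdots p_k^{\varepsilon_k}$ and $b = p_1^{\delta_1} p_2^{\delta_2} \cdots p_k^{\delta_k}$. Since $a \perp b$ we should have $(\forall i : 1 \le i \le k)\ \varepsilon_i + \delta_i = 1$, so there are $2^{\nu(ab)}$ terms whose denominator is $(ab)^2$. Hence
\[ \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = \sum_{n \le y} \frac{\mu(n)\, 2^{\nu(n)}}{n^2} = \prod_{p \le y} \left(1 - \frac{2}{p^2}\right). \]
So
\[ \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = \prod_p \left(1 - \frac{2}{p^2}\right) - \sum_{n > y} \frac{\mu(n)\, 2^{\nu(n)}}{n^2}. \]
We need a bound on the sum on the right hand side of the above equation. Now
\[ \sum_{ab > y} \frac{1}{(ab)^2} = \sum_{\substack{b \le y \\ ab > y}} \frac{1}{(ab)^2} + \sum_{a > y,\ b > y} \frac{1}{(ab)^2}. \]
The second sum converges, so we need to bound the first part of the sum. Now:
\[ \sum_{\substack{b \le y \\ ab > y}} \frac{1}{(ab)^2} \le \sum_{1 \le b \le y} \frac{1}{b^2} \sum_{a > \frac{y}{b}} \frac{1}{a^2}, \qquad \sum_{a > \frac{y}{b}} \frac{1}{a^2} \le \int_{\frac{y}{b}}^\infty \frac{da}{a^2} = \frac{b}{y}, \]
so we have
\[ \sum_{\substack{b \le y \\ ab > y}} \frac{1}{(ab)^2} \le \frac{1}{y} \sum_{1 \le b \le y} \frac{1}{b} = \frac{1}{y} \ln y. \]
We finally get
\[ x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( \frac{x}{y} \ln y \right). \]
Now we have to bound the sum
\[ \sum_{\substack{ab > y \\ k_1 a^2 - k_2 b^2 = 2,\ k_2 b^2 \le x}} \mu(a)\mu(b). \]
We re-express this sum as follows:
\[ \left| \sum_{\substack{ab > y \\ a^2 c - b^2 d = 2 \\ b^2 d \le x}} \mu(a)\mu(b) \right| \le \sum_{\substack{a^2 c - b^2 d = 2 \\ b^2 d \le x \\ ab > y}} 1. \]
Since $a^2 c = 2 + b^2 d$, we have $a^2 c \le 2 + x$, and this gives us $c \le \frac{x+2}{a^2}$. Since $d \le \frac{x}{b^2}$ and $y < ab$, we have $cd \le \frac{x(x+2)}{a^2 b^2}$ and hence $cd \le \frac{x(x+2)}{y^2}$. This gives
\[ \sum_{\substack{a^2 c - b^2 d = 2 \\ b^2 d \le x,\ ab > y}} 1 \le \sum_{cd \le \frac{x(x+2)}{y^2}} M(x; c, d, 2), \]
where $M(x; c, d, 2)$ is the number of solutions of
\[ c a^2 - d b^2 = 2, \qquad d b^2 \le x. \qquad (1.4) \]
The above equation implies that
\[ \left( \frac{2c^{-1}}{p} \right) \equiv 1 \bmod p \ \text{ for all } p \backslash d, \qquad \left( \frac{-2d^{-1}}{p} \right) \equiv 1 \bmod p \ \text{ for all } p \backslash c. \]
Estermann studied these congruences and for the case $cd$ not a square he proved [Est31]:
\[ M(x; c, d, 2) = O(\ln x), \]
in fact that $M(x; c, d, 2) \le 4(\ln(x+2) + 1)$. If $cd$ is a square then, since the equation (1.4) implies $c \perp d$, we can set $c = l^2$, $d = m^2$ to obtain:
\[ M(x; c, d, 2) = \sum_{l^2 a^2 - m^2 b^2 = 2} 1 \le \sum_{r^2 - s^2 = 2} 1 = 0. \]
In any case we have $M(x; c, d, 2) = O(\ln x)$, and using this we have:
\[ \sum_{cd \le \frac{x(x+2)}{y^2}} M(x; c, d, 2) = O(\ln x) \sum_{cd \le \frac{x(x+2)}{y^2}} 1. \]
For any positive constant $K$ we have
\[ \sum_{cd \le K} 1 = \sum_{c \le K} \left\lfloor \frac{K}{c} \right\rfloor \le K \ln K, \]
so
\[ \sum_{cd \le \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le \ln^2 x \cdot \frac{x(x+2)}{y^2} = O\!\left( \frac{x^2}{y^2} \ln^2 x \right). \]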
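Setting $y = x^{2/3} \ln^{1/3} x$ we have
\[ \left| \sum_{ab > y} \mu(ab) \right| \le \sum_{cd \le \frac{x(x+2)}{y^2}} M(x; c, d, 2) \le x^{2/3} \ln^{4/3} x, \]
and also
\[ x \sum_{ab \le y} \frac{\mu(ab)}{(ab)^2} = x \prod_p \left(1 - \frac{2}{p^2}\right) + O\!\left( x^{1/3} \ln^{2/3} x \right) + o(1). \]
The theorem follows from these two bounds.
1.7. The smallest squarefree number in an arithmetic progression
The simple methods that we have seen so far are surprisingly powerful and provide a quick bound on the smallest squarefree number in an arithmetic progression. The following result is from [Erd60] and is one of the early uses of a squarefree sieve.
THEOREM 1.7.1. Let $a \perp D$, $1 \le a < D$. Then the smallest squarefree number in the arithmetic progression $\langle a + kD : k \ge 0 \rangle$ is $O\!\left( \frac{D^{3/2}}{\ln D} \right)$.
Proof: Let $A = \langle a + kD : k \ge 0 \rangle$ be the sequence. The first step is to sift $A$ by all squares of primes below a certain limit $z$. This will leave only those numbers that could have a large prime as their square divisor. We will finally bound the number of such integers below $x$ and show that there are still some numbers left over, and that will prove the theorem. Let $P_z = \prod_{p < z} p$, and let $S(A; P_z, x)$ denote the number of $n \in A$, $n \le x$, that are divisible by no $p^2$ with $p < z$, so that
\[ S(A; P_z, x) = \sum_{d \backslash P_z} \mu(d) \sum_{\substack{n \in A,\ n \le x \\ d^2 \backslash n}} 1. \]
Now $\sum_{n \in A,\ n \le x,\ d^2 \backslash n} 1$ is exactly the number of solutions to the following pair of congruences:
\[ n \equiv 0 \bmod d^2, \qquad n \equiv a \bmod D. \]
Suppose $d \perp D$. Then there is exactly one solution in any interval of length $\operatorname{lcm}(D, d^2) = D d^2$, so the total number of solutions in $1 \le n \le x$ is at most $\frac{x}{D d^2} + 1$. If $\gcd(d, D) = \delta$ then $n = k\delta$ by the first congruence and $n - a = k'\delta$ by the second congruence. This yields $a = (k - k')\delta$ and so $\gcd(a, D) \ne 1$. This is a contradiction, so if $d \not\perp D$ there are no solutions to the congruences. Let $k = \left\lfloor \frac{x - a}{D} \right\rfloor$, which is the maximum value of $k$ for $a + kD$ to be in $A$ and at most $x$. Then
\[ \begin{aligned}
S(A; P_z, x) &= \sum_{d \backslash P_z,\ d \perp D} \mu(d) \left( \frac{x}{D d^2} + 1 \right) = \frac{x}{D} \sum_{d \backslash P_z,\ d \perp D} \frac{\mu(d)}{d^2} \\
&= k \sum_{\substack{d \backslash P_z \\ d \perp D}} \frac{\mu(d)}{d^2} + o(1) = k \prod_{p \backslash P_z,\ p \not\backslash D} \left(1 - \frac{1}{p^2}\right) + o(1) \\
&\ge k \prod_p \left(1 - \frac{1}{p^2}\right) + o(1) = k\, \frac{6}{\pi^2} + o(1).
\end{aligned} \]
Taking $k$ to be $\frac{c\sqrt{D}}{\ln D}$ we have
\[ S(A; P_z, k) \ge \frac{6}{\pi^2}\, \frac{c\sqrt{D}}{\ln D}. \]
The number of integers $a + kD$ in $A$ for which $k < \frac{c\sqrt{D}}{\ln D}$ and also
\[ n \equiv 0 \bmod p^2, \qquad n \equiv a \bmod D \]
is at most $\frac{c\sqrt{D}}{p^2 \ln D} + 1$. Let $N$ stand for the number of integers $k < \frac{c\sqrt{D}}{\ln D}$ in $A$ for which $a + kD \not\equiv 0 \bmod p^2$ for all $p \le \sqrt{cD}$.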
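Then
\[ \begin{aligned}
N &\ge \frac{6}{\pi^2}\, \frac{c\sqrt{D}}{\ln D} - \frac{c\sqrt{D}}{\ln D} \sum_{p \ge z} \frac{1}{p^2} - \sum_{p \ge z,\ p \le \sqrt{cD}} 1 && (1.5) \\
&\ge \frac{6}{\pi^2}\, \frac{c\sqrt{D}}{\ln D} - \frac{c\sqrt{D}}{\ln D} \cdot \frac{1}{z} - \pi(\sqrt{cD}), && (1.6)
\end{aligned} \]
and so for large enough $c$ and $L$
\[ N > \frac{1}{2}\, \frac{c\sqrt{D}}{\ln D}. \qquad (1.7) \]
We have used the fact that $\pi(x) < \frac{2x}{\ln x}$ for large enough $x$.
Now we are left with the numbers that are either squarefree or divisible by the square of a prime $p > \sqrt{cD}$. For these numbers $a + kD$ we have $a + kD \equiv 0 \bmod p^2$, $k < \frac{c\sqrt{D}}{\ln D}$ and $p > \sqrt{cD}$, that is, $a + kD = \alpha p^2$ with $\alpha < \frac{\sqrt{D}}{\ln D}$. Supposing $p > D^{\frac{1}{2} + \varepsilon}$, we would have $\alpha < D^{\frac{1}{2} - \varepsilon}$ if $D$ is large enough, so we also have $p < D$. Thus $a + kD = \alpha p^2$ yields a congruence $a \equiv \alpha p^2 \bmod D$. Let us fix an $\alpha$; then clearly the number of such prime solutions is less than the number of solutions of the congruence
\[ x^2 \equiv a\alpha^{-1} \bmod D, \qquad 0 < x < D. \]
If $a\alpha^{-1}$ is a quadratic residue modulo $D$, then by the Chinese Remainder Theorem there are at most $2^{\nu(D)}$ such solutions to this congruence. Since $\nu(n) = o(\ln n)$, we can write $2^{\nu(D)} = o(D^{\varepsilon/2})$. If $p > D^{\frac{1}{2} + \varepsilon}$ then there are only $D^{\frac{1}{2} - \varepsilon}$ choices for $\alpha$, so on the whole there are only $o(D^{\frac{1}{2} - \frac{\varepsilon}{2}})$ such solutions.
Let us consider the solutions for $\sqrt{cD} < p < D^{\frac{1}{2} + \varepsilon}$. We have
\[ p^2 \equiv a\alpha^{-1} \bmod D, \qquad \alpha < \frac{\sqrt{D}}{\ln D}, \qquad \sqrt{cD} < p < D^{\frac{1}{2} + \varepsilon}. \]
Let $c_\alpha$ be the number of solutions of this congruence for a fixed $\alpha$. These solutions give rise to $\sum \binom{c_\alpha}{2}$ solutions to the congruence
\[ p^2 \equiv q^2 \bmod D, \qquad p, q < D^{\frac{1}{2} + \varepsilon}. \qquad (1.8) \]
Since (1.8) implies $(p - q)(p + q) \equiv 0 \bmod D$, the number of such solutions is at most the number of solutions to
\[ uv \equiv 0 \bmod D, \qquad u < 2D^{\frac{1}{2} + \varepsilon}, \quad v < 2D^{\frac{1}{2} + \varepsilon}. \]
This gives us
\[ uv = \beta D, \qquad 1 \le \beta < 4D^{2\varepsilon}. \qquad (1.9) \]
Also, for a fixed $\beta$ the number of such solutions is less than the number of factors of the number $\beta D$, which is $o((\beta D)^\varepsilon)$, so the number of solutions of (1.9) is $o((\beta D)^\varepsilon) \cdot 4D^{2\varepsilon} = o(D^{4\varepsilon})$. This gives $\sum \binom{c_\alpha}{2} = o(D^{4\varepsilon})$ and hence
\[ \sum_{c_\alpha > 1} c_\alpha = o(D^{4\varepsilon}). \]
Since $\alpha < \frac{\sqrt{D}}{\ln D}$,
\[ \sum c_\alpha \le \frac{\sqrt{D}}{\ln D} + o(D^{4\varepsilon}). \]
Thus the number of integers $0 \le k < \frac{c\sqrt{D}}{\ln D}$ for which $a + kD \equiv 0 \bmod p^2$ for some $p > \sqrt{cD}$ is at most
\[ \frac{D^{1/2}}{\ln D} + o\!\left( D^{\frac{1}{2} - \frac{\varepsilon}{2}} \right). \]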
So the number of integers $k$, $0 \le k < \frac{c\sqrt{D}}{\ln D}$, for which $a + kD$ is squarefree is at least
\[ \frac{1}{2}\, \frac{c\sqrt{D}}{\ln D} - \frac{\sqrt{D}}{\ln D} - o\!\left( D^{\frac{1 - \varepsilon}{2}} \right) > 0 \]
for large enough $c$.
1.8. The Sieve Problem
Now that we have seen some examples of sieve techniques at work, we can formulate the sieve problem in a generic setting so that the essential quantities are clearly visible. The notation we shall adopt is that of the seminal book by Halberstam and Richert [HR74].
1.8.1. Notation.
1. $A, B, \ldots$ will stand for integer sequences.
2. $A_d = \langle a \in A : a \equiv 0 \bmod d \rangle$.
3. $A^z = \langle a \in A : a \le z \rangle$.
4. If $A$ is a finite sequence then $|A|$ will denote the length of the sequence.
5. $P = \langle p_i : p_i \text{ is the } i\text{-th prime} \rangle$.
6. $P_z = \prod_{p \in P^z} p$.
7. $S(A; P_z, x)$ will be the number of elements in $A^x$ that survive the sifting process by the sequence $P^z$. In general the sifting is determined by a sifting function $\sigma : A \to \{0,1\}$ which determines whether a number survives the sifting, but usually we will only be considering simple sifting functions like
\[ \sigma(n) = 1 \iff n \perp \prod_{p \in S^z} p. \]
8. If $A$ is a finite sequence then $\omega(p)$ is defined such that $\frac{\omega(p)}{p}\, x$ is a good approximation to $|A^x_p|$. If $d$ is any squarefree integer we can generalize this notation by defining $\omega(d) = \prod_{p \backslash d} \omega(p)$.
9. Define $R_d(x) = |A^x_d| - \frac{\omega(d)}{d}\, x$, i.e. the remainder term in our estimate of $|A^x_d|$.
10. Define $W(z) = \prod_{p \backslash P_z} \left(1 - \frac{\omega(p)}{p}\right)$.
1.8.2. The Sieve of Eratosthenes-Legendre revisited.
The generic sieve problem is to estimate $S(A; P_z, x)$. Needless to say, solving the problem as stated in this generality is too great a task. This treatise will only be concerned with restricted versions of the sieve problem which nevertheless yield interesting and non-trivial results. The case of great importance is when $S^z = P^z$ and $A$ is some subsequence of positive integers. The sieve of Eratosthenes-Legendre can be recast in this framework as follows. Let $A$ be the sequence to be sifted, and let $\omega(d)$ and $R_d$ be the modulo counting function and the remainder function for the sequence, respectively.
Let $P^z$ be the sifting sequence; then the sifting function is
\[ \sigma(n) = \begin{cases} 1 & \text{if } n \perp P_z, \\ 0 & \text{otherwise.} \end{cases} \]
We can rewrite $\sigma(n)$ as
\[ \sigma(n) = \sum_{d \backslash \gcd(n, P_z)} \mu(d). \]
Thus we have
\[ \begin{aligned}
S(A, P_z, x) &= \sum_{n \in A,\ n \le x} \sigma(n) = \sum_{n \in A^x} \sum_{d \backslash \gcd(n, P_z)} \mu(d) = \sum_{d \backslash P_z} \mu(d) \sum_{\substack{n \in A^x \\ d \backslash n}} 1 = \sum_{d \backslash P_z} \mu(d)\, |A^x_d| \\
&= \sum_{d \backslash P_z} \mu(d) \left( \frac{\omega(d)}{d}\, x + R_d(x) \right) = x \sum_{d \backslash P_z} \frac{\mu(d)\,\omega(d)}{d} + \sum_{d \backslash P_z} \mu(d) R_d(x) \\
&= x \prod_{p \backslash P_z} \left(1 - \frac{\omega(p)}{p}\right) + \sum_{d \backslash P_z} \mu(d) R_d(x) = x W(z) + \sum_{d \backslash P_z} \mu(d) R_d(x) \\
&= x W(z) + \theta \sum_{d \backslash P_z} |R_d(x)|,
\end{aligned} \]
where $|\theta| \le 1$. If we assume that $|R_d(x)| \le \omega(d)$ and suppose that $\omega(p) \le C_0$, where $C_0$ is some constant, then $\omega(d) \le C_0^{\nu(d)}$. So
\[ \sum_{d \backslash P_z} |R_d(x)| \le \sum_{d \backslash P_z} C_0^{\nu(d)} = \prod_{p \backslash P_z} (1 + C_0) = (1 + C_0)^{\pi(z)}. \]
Thus we have proved the following theorem.
THEOREM 1.8.1. For all sufficiently large $x$ and $z < x$, there is a $\theta$ with $|\theta| \le 1$ ($\theta$ depending on $z$), such that
\[ S(A; P_z, x) = x W(z) + \theta \sum_{d \backslash P_z} |R_d(x)|. \]
If we have $|R_d(x)| \le \omega(d)$ and $\omega(p) \le C_0$ then
\[ S(A; P_z, x) = x W(z) + O\!\left( (1 + C_0)^{\pi(z)} \right). \]
It is very clear that the effectiveness of the basic sieve is limited by the fact that the remainder term is a sum over all the divisors of $P_z$. Beginning with the next chapter we shall systematically try to reduce this term.
CHAPTER 2
The Combinatorial Sieve
In this chapter we begin by exploring the ideas of Viggo Brun, who first showed how we can improve on the Legendre method if we relax our requirement of asymptotic results and instead look for inequalities. After developing Brun's sieve in general we shall look at applications that bring out the surprising power of the technique. We follow the presentation in Halberstam & Richert [HR74] rather closely since its form is well suited for our applications. However, our development will be targeted only at Brun's sieve.
2.1. Brun's Pure Sieve
Let $A^x$ be a finite sequence of integers and let $S^z$ be the sifting primes. In the previous chapter the sifting function was:
\[ \sigma(n) = \sum_{d \backslash \gcd(n, P_z)} \mu(d). \]
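Let us see what can be done if instead we have a pair of functions $\chi_1(d)$ and $\chi_2(d)$ such that
\[ \sigma_2(n) \equiv \sum_{d \mid n} \mu(d)\chi_2(d) \le \sum_{d \mid n} \mu(d) \le \sum_{d \mid n} \mu(d)\chi_1(d) \equiv \sigma_1(n). \]
Since
\[ S(\mathcal{A}; P_z, x) = \sum_{d \mid P_z} \mu(d)|\mathcal{A}_d| = |\mathcal{A}| - \sum_{p \mid P_z} |\mathcal{A}_p| + \sum_{pq \mid P_z} |\mathcal{A}_{pq}| - \cdots \]
we expect that truncating the series after an even (odd) number of sums will give us a lower (upper) bound. Brun's pure sieve is an application of this well-known idea. Using the notation developed in the last chapter we have
\[ \sum_{n \in \mathcal{A}} \sum_{\substack{d \mid n \\ d \mid P_z}} \mu(d)\chi_2(d) \le S(\mathcal{A}; P_z, x) \le \sum_{n \in \mathcal{A}} \sum_{\substack{d \mid n \\ d \mid P_z}} \mu(d)\chi_1(d). \]
Let us first look at the upper bound:
\[
\begin{aligned}
\sum_{n \in \mathcal{A}} \sum_{\substack{d \mid n \\ d \mid P_z}} \mu(d)\chi_1(d) &= \sum_{d \mid P_z} \mu(d)\chi_1(d)\,|\mathcal{A}^x_d| \\
&= \sum_{d \mid P_z} \mu(d)\chi_1(d)\left(\frac{\omega(d)}{d}x + R_d(x)\right) \\
&= x \sum_{d \mid P_z} \mu(d)\chi_1(d)\frac{\omega(d)}{d} + \sum_{d \mid P_z} \mu(d)\chi_1(d) R_d(x).
\end{aligned}
\]
Let $\sigma_1(n) = \sum_{d \mid n} \mu(d)\chi_1(d)$; then by Möbius inversion we get
\[ \mu(d)\chi_1(d) = \sum_{\delta \mid d} \mu\!\left(\frac{d}{\delta}\right)\sigma_1(\delta). \]
Substituting this in the above expression we get
\[
\begin{aligned}
x \sum_{d \mid P_z} \mu(d)\chi_1(d)\frac{\omega(d)}{d} &= x \sum_{d \mid P_z} \frac{\omega(d)}{d} \sum_{\delta \mid d} \mu\!\left(\frac{d}{\delta}\right)\sigma_1(\delta) \\
&= x \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta} \sum_{t \mid (P_z/\delta)} \mu(t)\frac{\omega(t)}{t} \quad \text{(since $\omega(d)$ is a multiplicative function)} \\
&= x \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta} \prod_{p \mid (P_z/\delta)}\left(1 - \frac{\omega(p)}{p}\right) \\
&= xW(z) \sum_{\delta \mid P_z} \sigma_1(\delta)\frac{\omega(\delta)}{\delta \prod_{p \mid \delta}\left(1 - \frac{\omega(p)}{p}\right)} \\
&= xW(z) \sum_{\delta \mid P_z} \sigma_1(\delta) g(\delta) \\
&= xW(z)\Bigl(1 + \sum_{1 < \delta \mid P_z} \sigma_1(\delta) g(\delta)\Bigr),
\end{aligned}
\]
where $g(d)$ abbreviates
\[ g(d) = \frac{\omega(d)}{d \prod_{p \mid d}\left(1 - \frac{\omega(p)}{p}\right)}. \]
The remainder term is clearly at most
\[ \Bigl|\sum_{d \mid P_z} \mu(d)\chi_1(d) R_d(x)\Bigr| \le \sum_{d \mid P_z} |\chi_1(d)|\,|R_d(x)|. \]
A similar argument works for the lower bound too. Thus we have:
\[ xW(z)\Bigl(1 + \sum_{1 < \delta \mid P_z} \sigma_2(\delta) g(\delta)\Bigr) - \sum_{d \mid P_z} |\chi_2(d)|\,|R_d(x)| \le S(\mathcal{A}; P_z, x) \tag{2.10} \]
\[ S(\mathcal{A}; P_z, x) \le xW(z)\Bigl(1 + \sum_{1 < \delta \mid P_z} \sigma_1(\delta) g(\delta)\Bigr) + \sum_{d \mid P_z} |\chi_1(d)|\,|R_d(x)|. \tag{2.11} \]
Our aim will be to minimize $\bigl|\sum_{1 < \delta \mid P_z} \sigma_i(\delta) g(\delta)\bigr|$ for $i = 1, 2$ such that the remainder term $\sum_{d \mid P_z} |\chi_i(d)|\,|R_d|$ is small. A whole class of estimates can be obtained by restricting the functions $\chi_i$ to be the characteristic sequences of two divisor sets $D^+$ and $D^-$ of $P_z$. The resulting sieves are called Combinatorial Sieves. Let us consider the following functions:
\[ \chi^{(r)}(d) = \begin{cases} 1 & \text{if } \nu(d) \le r \text{ and } \mu^2(d) = 1, \\ 0 & \text{otherwise.} \end{cases} \]
These functions restrict the divisor sets over which we take the sum.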
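In particular the restriction is on the number of distinct prime factors of the divisors. We will require the following lemma.

LEMMA 2.1.1.
\[ \sum_{0 \le i \le k} (-1)^i \binom{n}{i} = (-1)^k \binom{n-1}{k}. \]

Proof: The proof is by induction on $k$. For $k = 0$ we have
\[ (-1)^0 \binom{n}{0} = \binom{n-1}{0} + \binom{n-1}{-1} = \binom{n-1}{0}. \]
Now
\[
\begin{aligned}
\sum_{0 \le i \le k+1} (-1)^i \binom{n}{i} &= \sum_{0 \le i \le k} (-1)^i \binom{n}{i} + (-1)^{k+1}\binom{n}{k+1} \\
&= (-1)^k \binom{n-1}{k} + (-1)^{k+1}\binom{n}{k+1} \\
&= (-1)^k \binom{n-1}{k} + (-1)^{k+1}\left(\binom{n-1}{k} + \binom{n-1}{k+1}\right) \\
&= (-1)^{k+1}\binom{n-1}{k+1}.
\end{aligned}
\]

LEMMA 2.1.2. Let $n$ be a positive integer and $s$ a non-negative integer. Then
\[ \sum_{d \mid n} \mu(d)\chi^{(2s+1)}(d) \le \sum_{d \mid n} \mu(d) \le \sum_{d \mid n} \mu(d)\chi^{(2s)}(d). \]

Proof: When $n = 1$ all the sums are equal, so we can assume $n > 1$. Then
\[ \sum_{d \mid n} \mu(d)\chi^{(r)}(d) = \sum_{0 \le k \le r} (-1)^k \binom{\nu(n)}{k} = (-1)^r \binom{\nu(n) - 1}{r} \]
by Lemma (2.1.1). Since $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$, the sign $(-1)^r$ gives the two inequalities.

Now let us try to bound the terms involved in (2.11). Let $\sigma^{(r)}(n) = \sum_{d \mid n} \mu(d)\chi^{(r)}(d)$, so that we have
\[ \sigma^{(r)}(n) = \sum_{\substack{d \mid n \\ \nu(d) \le r}} \mu(d) = (-1)^r \binom{\nu(n) - 1}{r} \]
and hence $|\sigma^{(r)}(n)| = \binom{\nu(n)-1}{r} \le \binom{\nu(n)}{r}$. Then we have
\[ \sum_{1 < d \mid P_z} |\sigma^{(r)}(d)|\, g(d) \le \sum_{1 < d \mid P_z} \binom{\nu(d)}{r} g(d) \le \sum_{m \ge r} \binom{m}{r} \frac{1}{m!} \Bigl(\sum_{p < z} g(p)\Bigr)^m = \frac{1}{r!}\Bigl(\sum_{p < z} g(p)\Bigr)^r \exp\Bigl(\sum_{p < z} g(p)\Bigr). \]
Suppose we make the assumption $|R_d(x)| \le \omega(d)$; then we can also bound the remainder term as follows:
\[ \sum_{d \mid P_z} |\chi^{(r)}(d)|\,|R_d(x)| \le \sum_{\substack{d \mid P_z \\ \nu(d) \le r}} \omega(d) \le \Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^r. \]
Since
\[ \sum_{\substack{d \mid P_z \\ \nu(d) \le 2s+1}} \mu(d)|\mathcal{A}_d| = \sum_{\substack{d \mid P_z \\ \nu(d) \le 2s}} \mu(d)|\mathcal{A}_d| - \sum_{\substack{d \mid P_z \\ \nu(d) = 2s+1}} |\mathcal{A}_d| \le S(\mathcal{A}; P_z, x) \le \sum_{\substack{d \mid P_z \\ \nu(d) \le 2s}} \mu(d)|\mathcal{A}_d| \]
we can always write
\[ S(\mathcal{A}; P_z, x) = \sum_{\substack{d \mid P_z \\ \nu(d) \le r}} \mu(d)|\mathcal{A}_d| + \theta \sum_{\substack{d \mid P_z \\ \nu(d) = r+1}} |\mathcal{A}_d|, \qquad |\theta| \le 1. \]
Putting all these together we have:
\[ S(\mathcal{A}; P_z, x) = xW(z)\left(1 + \theta\,\frac{1}{r!}\Bigl(\sum_{p < z} g(p)\Bigr)^r \exp\Bigl(\sum_{p < z} g(p)\Bigr)\right) + \theta'\Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^r \]
for some positive integer $r$, and with $|\theta| \le 1$, $|\theta'| \le 1$. Thus we have proved:

THEOREM 2.1.3 (Brun's Pure Sieve). Let
\[ g(d) = \frac{\omega(d)}{d \prod_{p \mid d}\left(1 - \frac{\omega(p)}{p}\right)} \]
be well defined for all $d$ with $\mu(d) \ne 0$, and suppose $|R_d(x)| \le \omega(d)$. Then for every non-negative integer $r$ there exist $\theta, \theta'$ with $|\theta| \le 1$, $|\theta'| \le 1$ such that
\[ S(\mathcal{A}; P_z, x) = xW(z)\left(1 + \theta\,\frac{1}{r!}\Bigl(\sum_{p < z} g(p)\Bigr)^r \exp\Bigl(\sum_{p < z} g(p)\Bigr)\right) + \theta'\Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^r. \]

[…] and in this case, since $\mathcal{A}^x_p = \{n \le x \mid n \equiv 0 \bmod p\}$, we have
\[ |\mathcal{A}^x_p| = \left\lfloor \frac{x}{p} \right\rfloor = \frac{1}{p}x + \delta', \qquad |\delta'| < 1. \]
So we can take $\omega(p) = 1$, and the condition $|R_d(x)| \le \omega(d)$ is also satisfied. Also
\[ g(p) = \frac{1}{p\left(1 - \frac{1}{p}\right)} \le \frac{2}{p}, \]
and this gives us
\[ \sum_{p < z} g(p) \le 2 \sum_{p < z} \frac{1}{p} < 2(\ln\ln z + 1). \]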
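Lemma 2.1.1 and the alternating truncation behind Theorem 2.1.3 can be checked numerically for the sequence used in this application, $\mathcal{A} = \{n \le x\}$ with $\omega(p) = 1$. The sketch below is illustrative only; the helper names are assumptions and do not appear in the text.

```python
from itertools import combinations
from math import comb, prod


def primes_upto(z):
    """Primes p <= z by trial division."""
    return [p for p in range(2, z + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def alternating_binomial(n, k):
    """Left-hand side of Lemma 2.1.1: sum_{0 <= i <= k} (-1)^i C(n, i)."""
    return sum((-1) ** i * comb(n, i) for i in range(k + 1))


def sifted_count(x, z):
    """S(A; P_z, x) for A = {1..x}: n <= x with no prime factor <= z."""
    ps = primes_upto(z)
    return sum(1 for n in range(1, x + 1) if all(n % p for p in ps))


def truncated_sum(x, z, r):
    """sum over d | P_z with nu(d) <= r of mu(d) * floor(x / d)."""
    ps = primes_upto(z)
    total = 0
    for k in range(min(r, len(ps)) + 1):
        for combo in combinations(ps, k):
            total += (-1) ** k * (x // prod(combo))
    return total


x, z = 5_000, 20
exact = sifted_count(x, z)
for r in range(0, 9):
    kind = "upper" if r % 2 == 0 else "lower"
    print(r, kind, truncated_sum(x, z, r), exact)
```

Even values of $r$ give upper bounds and odd values give lower bounds, and the two sequences close in on the exact count as $r$ approaches $\pi(z)$.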
We use the trivial estimate $1 + \sum_{p}$ […] gives us $\sum_{p}$ […] $\ln z = \frac{\ln x}{\gamma \ln\ln x}$, and set
\[ \lambda = \frac{\xi \ln z\,(\ln\ln z + 1)}{\ln x} \]
so that for a large enough $x$ and appropriate settings of $\xi, \gamma$ we get $\lambda e^{1+\lambda} \le 1$. For this setting of $z$ and $r$ we have $z^r = o(x^{1-\varepsilon})$ for some $\varepsilon > 0$. Thus the theorem gives
\[ \pi(x) = O\!\left(\frac{x \ln\ln x}{\ln x}\bigl(1 + \theta e^{-c \ln\ln x}\bigr)\right) + o(x^{1-\varepsilon}). \]
This approximation is significantly better than our first and shows the improvement that can be made using this simple idea.

Next we will look at the twin primes problem, which was Brun's primary application of his pure sieve. In this case we take the sequence to be $\mathcal{A} = \{n(n+2) \mid n \le x\}$. Let $p > 2$; then $\mathcal{A}_p = \{n(n+2) \mid n \le x,\ n(n+2) \equiv 0 \bmod p\}$. Now $n(n+2) \equiv 0 \bmod p$ only if $n \equiv 0$ or $n + 2 \equiv 0$, since $p$ is an odd prime. Clearly $0$ and $p - 2$ are two solutions in the interval $0, \dots, p-1$. So we can take $\omega(p) = 2$ for $p > 2$. For $p = 2$ we have $\omega(p) = 1$. By the Chinese Remainder Theorem we have $|R_d(x)| \le \omega(d)$. We take the sifting primes to be $\mathcal{P} = \{p \mid p > 2\}$. Since $S(\mathcal{A}; P_z, x)$ counts all the twin-prime pairs above $z$, $S(\mathcal{A}; P_z, x) + 2z$ is an upper bound on the number of twin primes below $x$. Then:
\[ W(z) = \prod_{2 < p < z}\left(1 - \frac{2}{p}\right) \]
[…]
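The twin-prime setup above can be checked by brute force. The sketch below, with hypothetical helper names, counts the residues $n \bmod d$ with $d \mid n(n+2)$, verifies the remainder condition $|R_d(x)| \le \omega(d)$ on a few squarefree $d$, and confirms that $S(\mathcal{A}; P_z, x) + 2z$ bounds the number of twin-prime pairs up to $x$ when sifting by the odd primes $p \le z$.

```python
def prime_flags(limit):
    """flags[n] is True exactly when n is prime, for 0 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return flags


def omega(d):
    """Number of residues n mod d with n(n+2) = 0 (mod d), by brute force."""
    return sum(1 for n in range(d) if (n * (n + 2)) % d == 0)


def count_A_d(x, d):
    """|A_d|: number of 1 <= n <= x with d | n(n+2)."""
    return sum(1 for n in range(1, x + 1) if (n * (n + 2)) % d == 0)


def sifted_twin(x, z):
    """S(A; P_z, x): n <= x with n(n+2) free of odd primes p <= z."""
    flags = prime_flags(z)
    odd_primes = [p for p in range(3, z + 1) if flags[p]]
    return sum(1 for n in range(1, x + 1)
               if all((n * (n + 2)) % p for p in odd_primes))


def twin_prime_count(x):
    """Number of primes p <= x such that p + 2 is also prime."""
    flags = prime_flags(x + 2)
    return sum(1 for p in range(2, x + 1) if flags[p] and flags[p + 2])


x, z = 2_000, 10
print([(d, omega(d)) for d in (2, 3, 5, 7, 15, 105)])
print(twin_prime_count(x), sifted_twin(x, z) + 2 * z)
```

With such a small $z$ the bound is far from sharp; the point is only that every twin-prime pair above $z$ survives the sifting.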
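THEOREM 2.1.5. The sum $\sum_{p,\ p+2=q} \frac{1}{p}$, taken over primes $p$ for which $q = p + 2$ is also prime, converges.

Proof: Let $\pi_2(n)$ denote the number of such primes $p \le n$, and let $B$ be a constant with $\pi_2(n) \le B\,\frac{n(\ln\ln n)^2}{\ln^2 n}$. Then
\[
\begin{aligned}
\sum_{p,\ p+2=q} \frac{1}{p} &= \sum_n \frac{\pi_2(n) - \pi_2(n-1)}{n} = \sum_n \pi_2(n)\left(\frac{1}{n} - \frac{1}{n+1}\right) = \sum_n \pi_2(n)\frac{1}{n(n+1)} \\
&\le B \sum_n \frac{n(\ln\ln n)^2}{n(n+1)\ln^2 n} \le B \sum_n \frac{1}{n}\left(\frac{\ln\ln n}{\ln n}\right)^2 = O(1).
\end{aligned}
\]
The last step follows by comparing the sum with the integral
\[ \int \frac{1}{t}\left(\frac{\ln\ln t}{\ln t}\right)^2 dt = -\frac{2 + 2\ln\ln t + (\ln\ln t)^2}{\ln t}, \]
which stays bounded as the upper limit $x \to \infty$.

2.2. Brun's Sieve

The second idea of Brun was to limit the remainder term by restricting the size of the primes making up the divisors. This simple idea results in a sieve of remarkable power which can be used to prove rather sharp bounds on $S(\mathcal{A}; P_z, x)$. Since we are modifying the divisor sets in a non-trivial fashion, we would like to have some simple conditions on the characteristic functions $\chi$ of the divisor sets such that $\chi$ still yields good lower or upper bounds. Our first task is to find such a set of conditions. We begin with the following observation.

PROPOSITION 2.2.1.
\[ S(\mathcal{A}; P_z, x) = \sum_{d \mid P_z} \mu(d)\chi(d)|\mathcal{A}_d| - \sum_{1 < d \mid P_z} \sigma(d)\, S(\mathcal{A}_d; P_z^{(d)}, x), \]
where $P_z^{(d)} = \prod_{p \in \mathcal{P}^z,\ p \nmid d} p$.

Proof:
\[
\begin{aligned}
\sum_{d \mid P_z} \mu(d)\chi(d)|\mathcal{A}^x_d| &= \sum_{d \mid P_z} |\mathcal{A}^x_d| \sum_{\delta \mid d} \mu\!\left(\frac{d}{\delta}\right)\sigma(\delta) \\
&= \sum_{\delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|\mathcal{A}_{\delta t}| \\
&= \sum_{t \mid P_z} \mu(t)|\mathcal{A}_t| + \sum_{1 < \delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|\mathcal{A}_{\delta t}| \\
&= S(\mathcal{A}; P_z, x) + \sum_{1 < \delta \mid P_z} \sigma(\delta) \sum_{t \mid P_z/\delta} \mu(t)|\mathcal{A}_{\delta t}| \\
&= S(\mathcal{A}; P_z, x) + \sum_{1 < d \mid P_z} \sigma(d)\, S(\mathcal{A}_d; P_z^{(d)}, x),
\end{aligned}
\]
where we have used Möbius inversion on the expression for $\sigma(d)$ as in the previous section.

We will use the above proposition to compare $\sum_{d \mid P_z} \mu(d)|\mathcal{A}_d|$ with $\sum_{d \mid P_z} \mu(d)\chi(d)|\mathcal{A}_d|$. Now, for a prime $p \mid d$,
\[
\begin{aligned}
\sigma(d) = \sum_{l \mid d} \mu(l)\chi(l) &= \sum_{l \mid d/p} \mu(l)\chi(l) + \sum_{l \mid d/p} \mu(lp)\chi(lp) \\
&= \sum_{l \mid d/p} \mu(l)\chi(l) - \sum_{l \mid d/p} \mu(l)\chi(lp) \\
&= \sum_{l \mid d/p} \mu(l)\bigl(\chi(l) - \chi(lp)\bigr).
\end{aligned}
\]
Let $q(d)$ be the smallest prime divisor of $d$. Now using the above expression we can write
\[
\begin{aligned}
\sum_{1 < d \mid P_z} \sigma(d)\, S(\mathcal{A}_d; P_z^{(d)}, x) &= \sum_{\delta \mid P_z}\ \sum_{\substack{p \mid P_z \\ p < q(\delta)}} S(\mathcal{A}_{p\delta}; P_z^{(p\delta)}, x) \sum_{l \mid \delta} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr) \\
&= \sum_{l \mid P_z}\ \sum_{\substack{p \mid P_z \\ p < q(l)}} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr) \sum_{\substack{t \mid P_z/l \\ p < q(t)}} S(\mathcal{A}_{plt}; P_z^{(plt)}, x) \\
&= \sum_{l \mid P_z}\ \sum_{\substack{p \mid P_z \\ p < q(l)}} \mu(l)\bigl(\chi(l) - \chi(pl)\bigr)\, S(\mathcal{A}_{pl}; P_p^{(pl)}, x).
\end{aligned}
\]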
Using this in the above proposition,
\[
\begin{aligned}
S(\mathcal{A}; P_z, x) &= \sum_{d \mid P_z} \mu(d)\chi(d)|\mathcal{A}_d| - \sum_{d \mid P_z}\ \sum_{\substack{p \mid P_z \\ p < q(d)}} \mu(d)\bigl(\chi(d) - \chi(pd)\bigr)\, S(\mathcal{A}_{pd}; P_p^{(pd)}, x) \\
&= \sum_{d \mid P_z} \mu(d)\chi(d)|\mathcal{A}_d| - \sum_{d \mid P_z}\ \sum_{\substack{p \mid P_z \\ p < q(d)}} \mu(d)\bigl(\chi(d) - \chi(pd)\bigr)\, S(\mathcal{A}_{pd}; P_p, x),
\end{aligned}
\]
since $P_p^{(pd)} = P_p$. Suppose we have $\chi(1) = 1$ and $\chi(d) = 0$ for $d > 1$. Then
\[ S(\mathcal{A}; P_z, x) = |\mathcal{A}| - \sum_{p \mid P_z} S(\mathcal{A}_p; P_p, x). \]
Now let $\chi_1, \chi_2$ be the characteristic functions of the divisor sets that we wish to use to get upper and lower bounds respectively. If we arrange
\[ (-1)^{i-1}\mu(d)\bigl(\chi_i(d) - \chi_i(pd)\bigr) \ge 0 \quad \text{whenever } pd \mid P_z \text{ and } p < q(d), \]
for $i = 1, 2$, then
\[ \sum_{d \mid P_z} \mu(d)\chi_2(d)|\mathcal{A}_d| \le S(\mathcal{A}; P_z, x) \le \sum_{d \mid P_z} \mu(d)\chi_1(d)|\mathcal{A}_d|. \]
The above inequality is valid (needless to say) only if the sums involving $\chi_i$ are positive. This gives us a set of sufficient conditions for our functions $\chi_i$ to be well behaved. If $pd \mid P_z$ and $p < q(d)$ then the conditions can be satisfied in only one of the following ways:
1. $\chi_i(d) = \chi_i(pd)$;
2. $\chi_i(d) = 1$, $\chi_i(pd) = 0$ and $\mu(d) = (-1)^{i-1}$;
3. $\chi_i(d) = 0$, $\chi_i(pd) = 1$ and $\mu(d) = (-1)^i$.

We can avoid the last possibility by requiring that the functions $\chi_i$ be divisor closed, i.e. that $\chi_i(d) = 1 \Rightarrow \forall \delta \mid d : \chi_i(\delta) = 1$. So the functions $\chi_i$ for $i = 1, 2$ should have the following properties:
1. If $d \mid P_z$, then either $\chi_i(d) = 0$ or $\chi_i(d) = 1$;
2. $\chi_i(1) = 1$ (this is required for the derivation in Proposition (2.2.1));
3. $\chi_i(d) = 1 \Rightarrow \forall \delta \mid d : \chi_i(\delta) = 1$;
4. $\chi_i(d) = 1,\ \mu(d) = (-1)^i \Rightarrow \chi_i(pd) = 1$ for all $pd \mid P_z$, where $p < q(d)$.

Suppose we restrict $\chi^{(r)}$ (which was the divisor selection function of the previous section) to also limit the number of prime factors that come from a certain interval. Suppose at most $\delta_1$ prime divisors can come from the interval $z_1 < p < z$. Then the remainder term obeys
\[ \sum_{d \mid P_z} \chi^{(r)}(d)|R_d| \le \Bigl(1 + \sum_{p < z} \omega(p)\Bigr)^{\delta_1}\Bigl(1 + \sum_{p < z_1} \omega(p)\Bigr)^{r - \delta_1}. \]
This allows a more accurate estimation of the remainder term. The full Brun Sieve uses $n$ such intervals to minimize the remainder term. The first step is to compare $\sum_{d \mid P_z} \mu(d)\frac{\omega(d)}{d}$ with $\sum_{d \mid P_z} \mu(d)\chi_i(d)\frac{\omega(d)}{d}$.
By writing χi(d) = 1+ i(d), we can split the sum ∑ d\Pz μ(d)χi(d) ω(d) d = ∑ d\Pz μ(d)χi(d) ω(d) d + ∑ d\Pz μ(d) i(d) ω(d) d . Let d = p1···pr; then 1 χi(d) = χi(p2···pr) χi(p1···pr) +χi(p3···pr) χi(p2···pr) +··· +χi(1) χi(pr). If we write P(p+,z) = ∏p ∑ d\Pz μ(d)χi(d) ω(d) d =W(z)+ ∑ d\Pz ∑ p\d μd pχi(gcd(d,P(p+,z))) χi(gcd(d,P(p,z)))ω(d) d . Let d = δpt, where δ\Pp and t\P(p+,z). Rewriting the above expression we get: ∑ d\Pz μ(d)χi(d) ω(d) d =W(z)+ ∑ p ω(t) =W(z)+( 1)i 1 ∑ p ω(p) p W(p) ∑ t\P(p+,z)χi(t)(1 χi(pt)) t ω(t), where we have used χi(t) χi(pt) = ( 1)i 1μ(t)χi(t)(1 χi(pt)) if pt\Pz and p < q(t). To verify this, if χi(t) = χi(pt), then both sides are 0, and this is the case if χi(pt) = 1 (since the χi are divisor closed). Now if χi(t) = 1 and χi(pt) = 0, then from the properties of χi listed above, we have that μ(t) = ( 1)i 1, and so the relation holds. So nally we get ∑ d\Pz μ(d)χi(d) ω(d) d =W(z)1+( 1)i 1 ∑ p ω(p) p W(p) W(z) ∑ t\P(p+,z) χi(t)1 χi(pt) t ω(t). 2.2. BRUN’S SIEVE 39 This identity holds in general for every combinatorial sieve with χi satisfying the properties listed above, provided W(z) and W(p) are well de ned. This will happen if g(d) stays bounded. Construction of the Divisor sets: Let r be a positive integer and let zi for 1≤i≤r be real numbers. We will divide the interval [2···z] into r intervals as follows: let 2 = zr < zr 1 <···< z1 < z0 = z. Let d\Pz and βn = gcd(d,P(zn,z)) for 1≤n≤r. Let us set χi(d) = 1 if for all 1≤n≤r we have ν(βn)≤A+Cn, where A and C will be picked to make χi an acceptable function. For the current choice χi is already divisor closed, so the only property we need to check is: χi(t) = 1,μ(t) = ( 1)i pt\Pz,p < q(t) : χi(pt) = 1. Let zm ≤ p < zm 1. Since χi(t) = 1 we should have ν(βm)≤ A+Cm. If ν(βm) < A+Cm then χi(pt) = 1. Now if ν(βm) = A+Cm, then we also have μ(t) = ( 1)i. By de nition μ(t) = ( 1)ν(t), since ν(t) = A+Cm, we have μ(t) = ( 1)A+Cm. This suggests that we set A = B i. 
Then we have that ( 1)B+Cm = 1 or B+Cm should be even. If we make B+Cm an odd number, then the assumption that χi(t) = 1 and μ(t) = ( 1)i results in a contradiction. Consequently, ν(βm) = A+Cm cannot happen if χi(t) = 1. For some integer b we set B = 2b 1 and C = 2. This suggests using ν(βn)≤ 2b 1+i+2n to be the condition on the number of factors of d in the interval [zn,···,z). Summarizing, the characteristic functions of the divisor sets will be (for i = 1,2) χi(d) =(1 if m : 1≤m≤r, ν(βm)≤2b i 1+2m, 0 otherwise. The construction was such that the above function is the characteristic function of an acceptable divisor set. Derivation of the Sieve bounds: Now ∑ 1≤n≤r ∑ zn≤p ω(p)W(p) pW(z) ∑ t\P(p+,z) χi(t)1 χi(pt) t ω(t) ≤ ∑ 1≤n≤r W(zn) W(z) ∑ z≤p ω(p) p ∑ t\P(p+,z) χi(t)(1 χi(pt)) t ω(t). We have used the fact that W(p) ≤W(zn) if zn ≤ p < zn 1. Now for each t which makes a contribution we have χi(pt) = 0 and χi(t) = 1. So we must have ν(t) = 2b i+2n 1for zn ≤ p < zn 1. Hence this sum is at most ∑ 1≤n≤r W(zn) W(z) ∑ d\P(zn,z) ν(d)=2b i+2n ω(d) d , and so ∑ p ω(p)W(p) pW(z) ∑ t\P(p+,z) χi(t)1 χi(pt) t ω(t)≤ ∑ 1≤n≤r W(zn) W(z) 1 (2b i+2n)! ∑ zn≤p ω(p) p (2b i+2n). Now to simplify this sum further we have to make some assumptions about the function ω(p); instead of assuming ω(p) = O(1) we shall use the more general assumption: ∑ w≤p ω(p)ln p p ≤κlnz w+η, for 2≤w≤z.(2.12) If indeed we had ω(p) = 1, then we have ∑ w≤p lnp p ≤lnz w+1, for 2≤w≤z. 40 2. THE COMBINATORIAL SIEVE So the assumption we have made is an assumption on the average distribution of ω(p). Such an assumption usually holds, and is much easier to verify in more complicated situations. A question we can ask is: Does the above assumption imply a bound for the sum ∑ w≤p ω(p) p Let S(k)≡ ∑ w≤p ω(p)ln p p . Since S(k) S(k 1)= ω(k)lnk k if k is prime we have ∑ w≤p S(k) S(k 1) lnk = ∑ w≤k S(k) 1 lnk 1 lnk+1! = ∑ w≤k S(k) ln(k+1) lnk lnklnk+1 ! Now ln(k+1) = lnk+ln1+ 1 k, and since 1+x≤ex we have ln1+ 1 k≤ 1 k . 
Then ∑ w≤p ω(p) p ≤ ∑ w≤k S(k) kln2 k (2.13) ≤ ∑ w≤k κlnk w+η kln2 k (2.14) ≤κ ∑ w≤k 1 klnk lnw ∑ w≤k 1 kln2 k +η ∑ w≤k 1 kln2 k (2.15) ≤κlnlnz lnw+ η lnw .(2.16) Here we have used z w 1 xlnx dx = lnlnw+lnlnz, and z w 1 xln2 x dx = 1 lnw 1 lnz . Now returning to our original problem we need bounds on W(zn) W(z) = ∏ zn≤p 1 1 ω(p) p , 2.2. BRUN’S SIEVE 41 and so ln W(zn) W(z) ≈ ∑ zn≤p ω(p) p . Our assumption (2.12) yields a bound on ∑ zn≤p ω(p) p by (2.16). Thus we expect that a bound of the form W(zn) W(z) ≤eγnλ+ c lnz can be enforced with some— constants γ,λ and c. This can be achieved for example with a double-exponentialfall-off of zn with respect to z, in fact this is what we shall do later. If a bound for W(zn) W(z) of the above form exists, then this also gives us (as we might expect) ∑ zn≤p ω(p) p ≤ ∑ zn≤p ln 1 1 ω(p) p ≤ln W(zn) W(z) < γnλ+ c lnz. Let f = c lnz, and suppose we can enforce γ = 2 (this helps in the simpli cation to follow). Then ∑ 1≤n≤r W(zn) W(z) ∑ d\P(zn,z) ν(d)=2b i+2n ω(d) d ≤ ∑ 1≤n≤r e2nλ+2f (2nλ+2f)2b i+2n (2b i+2n)! ≤ ∑ 1≤n≤r e2feλ2n (2n)2b i+2n (2n)!(2n)2b i1+ f n2b i+2n (since (2b i+2n)!≥(2n)!(2n)2b i) = ∑ 1≤n≤r e2f(λeλ)2n(2ne 1)2ne2n (2n)! (λ2b i)1+ f nλ2b i1+ f nλ2n = e2f(λ+ f)2b i ∑ 1≤n≤r (2ne 1)2n (2n)! (λe1+λ)2n1+ f nλ2n since (ne 1)n n! is decreasing, and1+ f nλ2n ≤e2f λ . Also assuming λe1+λ ≤1; ∑ 1≤n≤r W(zn) W(z) ∑ d\P(zn,z) ν(d)=2b i+2n ω(d) d ≤e2f(λ+ f)2b i2e 2e2f λ ∑ 1≤nλe1+λ = 2λ2b i+2e2λ 1 (λe1+λ)21+ c λ2b ie2f(1+1 λ) ≤ 2λ2b i+2e2λ 1 (λe1+λ)2 e(2b i+4) f λ . Thus ∑ d\Pz μ(d)χi(d) ω(d) d =W(z)1+2θ λ2b i+2e2λ 1 (λe1+λ)2 e(2b i+4) f λfor i = 1,2. 42 2. THE COMBINATORIAL SIEVE Now we have to bound the remainder term, which is signi cantly easier. Let us assume that ω(p) ≤ A for some constant A > 0. Then ∑ d\Pz χi(d)|Rd|≤ ∑ d\Pz χi(d)ω(d) ≤1+ ∑ p and set zr = 2. Here r is selected such that lnzr 1 = e (r 1)Λlnz > ln2, and e rΛ lnz≤ln2, so we have e(r 1)Λ < lnz ln2 ≤erΛ. 
Thus for a suitable constant B the remainder term becomes ∑ d\Pz χi(d)|Rd|≤Bz lnz2b i+1 ∏ 1≤n z2 n. Now Be 1 2 rΛ lnz ≤ BeΛ/2 lnz rlnz ln2 < 1, and also ∏ 1≤n≤r 1 z2 n = exp2lnz ∑ 1≤n≤r 1 e nΛ≤z 2 eΛ 1 . Thus ∑ d\Pz χi(d)|Rd|= Oz2b i+1+ 2 eΛ 1for i = 1,2. We still have to check that W(zn) W(z) ≤e2(nλ+f). By our assumptions about the sum ∑w≤p 2.2. BRUN’S SIEVE 43 If 1≤ 1 1 ω(p) p ≤A, then c = η 21+Aκ+ ηA ln2. Since Λ > 0 we have enΛ 1 n ≤ erΛ 1 r , and this is at most Λ erΛ rΛ ≤Λ eΛ ln2 lnz ln(lnz/ln2) . So we get W(zn) W(z) ≤e2c exp nΛκ1+ 2ceΛ κln2 1 ln(lnz/ln2)!for n = 1,···,r. To meet our conditions on W(zn) W(z) we take Λ = 2λ κ 1 1+ε ε = 1 δe 1 κ , and so e 2λ κ eΛ ≤2λ κ Λe2λ κ ≤εΛe1 κ . Since eΛ 1≥Λ we have e 2λ κ 1 eΛ 1 ≤1+ εΛe 1 k eΛ 1 ≤1+εe 1 κ = 1+ 1 δ . With ξ = 1+ 1 δ we obtain ∑ d\Pz χi(d)|Rd|= Oz2b i+1+ 2ξ e 2λ κ 1for i = 1,2. Thus we have proved the following theorem. THEOREM 2.2.2. Assume that 1≤ 1 1 ω(p) p ≤A, ∑ w≤p ω(p)ln p p ≤κlnlnz lnw+ η lnw , and |Rd|≤ω(d). Let λ be such that 0 < λe1+λ < 1. Then S(A;Pz,x)≤xW(z)1+2 λ2b+1e2λ 1 (λe1+λ)2 exp(2b+3) c λlnz+Oz2b 1+ 2ξ e 2λ κ 1,(2.17) 44 2. THE COMBINATORIAL SIEVE and S(A;Pz,x)≥xW(z)1 2 λ2be2λ 1 (λe1+λ)2 exp(2b+2) c λlnz+Oz2b 1+ 2ξ e 2λ κ 1,(2.18) where c = η 21+Aκ+ η ln2 and ξ = 1+ε for 0 < ε < 1. Application to the Twin Primes problem : We set A ={n(n+2) | n ≤ x}. In this case we have ω(2) = 1 and ω(p) = 2. Further, all the conditions of Theorem (2.2.2) hold, and the lower bound is seen to be positive. Thus (2.18) tends to in nity with x, ([HR74], p.63) for z = x 1 u with u < 8. This implies that every divisor of a number in the sifted set is≥x 1 u so each number in the sifted set can have at most u < 8 factors1. Thus we have the following theorem. THEOREM 2.2.3. There are in nitely many n such that ν(n(n+2))≤7. We will look at some interesting applications of Brun’s sieve in the following sections. 2.3. Orthogonal Latin Squares and the Euler Conjecture DEFINITION 2.3.1. 
A Latin square of order $n$ is an $n \times n$ matrix with entries in $S = \{1, \dots, n\}$ such that every row and column is a permutation of the set $S$.

DEFINITION 2.3.2. Two Latin squares $A$ and $B$ of order $n$ are said to be mutually orthogonal if the $n^2$ pairs $(a_{ij}, b_{ij})$ are distinct.

Here is a Latin square of order 3:
\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}, \]
and here is a Latin square that is orthogonal to it:
\[ B = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{pmatrix}. \]
Euler conjectured that there is no pair of mutually orthogonal Latin squares of order $n$ when $n \equiv 2 \bmod 4$. The conjecture was disproved for the case $n = 10$, and later Bose, Parker and Shrikhande [BPS60] showed that the conjecture is false for every $n > 6$ with $n \equiv 2 \bmod 4$. Let $\perp(n)$ be the maximum number of mutually orthogonal Latin squares of order $n$. Chowla, Erdős and Straus [CES60], building on this and some previous results, established that $\perp(n) > \frac{1}{3} n^{\frac{1}{91}}$ for large enough $n$. The proof involves an interesting use of the Brun Sieve, and we shall give an account of this. The exponent $\frac{1}{91}$ is far from optimal and has been subsequently improved. The starting point for the proof is the following pair of results:

THEOREM 2.3.3. [BPS60] If $k \le \perp(m) + 1$ and $1 < u < m$ then
\[ \perp(km + u) \ge \min\{\perp(k),\ \perp(k+1),\ \perp(m) + 1,\ \perp(u) + 1\} - 1. \]

THEOREM 2.3.4 (MacNeish).
1. $\perp(ab) \ge \min\{\perp(a), \perp(b)\}$;
2. $\perp(q) = q - 1$ if $q$ is a power of a prime.

First we shall prove the following:

THEOREM 2.3.5. $\lim_{n \to \infty} \perp(n) = \infty$.

¹ For a similar derivation see Theorem (2.3.6).

Proof: The idea is to have a lower bound on each of the quantities involved in Theorem (2.3.3), and then use the theorem with $km + u = n$. Let $x$ be a large positive integer. If $k + 1 = \prod_{p \le x} p^x$, then by Theorem (2.3.4) we have $\perp(k+1) \ge 2^x - 1 \ge x$. Also, since $k \equiv -1 \bmod p$ for $p \le x$, all the prime factors of $k$ are larger than $x$, so applying Theorem (2.3.4) again we have $\perp(k) \ge x$. Now we select $m$ in two pieces $m_1$ and $m_2$. The first piece is set to be
\[ m_1 = k^k \prod_{\substack{q \nmid n \\ q \le x}} q^k. \]
Note that $m_1$ is bounded in terms of $x$ alone.
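Definitions 2.3.1 and 2.3.2 and the prime case of Theorem 2.3.4(2) are easy to check by machine. The sketch below verifies the order-3 example from the text and builds $q - 1$ mutually orthogonal squares for a prime $q$ using the standard construction $L_a[i][j] = (ai + j) \bmod q$; this construction is a well-known one assumed here for illustration, it is not taken from the text, and it covers primes only, not general prime powers.

```python
def is_latin(square):
    """True if every row and column is a permutation of {1, ..., n}."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """True if the n^2 pairs (a_ij, b_ij) are pairwise distinct."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


def mols_for_prime(q):
    """q - 1 squares L_a[i][j] = (a*i + j) mod q + 1, a = 1..q-1 (q prime)."""
    return [[[(a * i + j) % q + 1 for j in range(q)] for i in range(q)]
            for a in range(1, q)]


# The order-3 example from the text.
A = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
B = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(is_latin(A), is_latin(B), are_orthogonal(A, B))

squares = mols_for_prime(7)
print(len(squares),
      all(are_orthogonal(s, t)
          for i, s in enumerate(squares) for t in squares[i + 1:]))
```

The construction relies on $a$ being invertible modulo $q$, which is why it gives a full set of $q - 1$ squares only when $q$ is prime.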
Now if n is large enough the interval n (k+1)m1 ··· n 1 km1 contains an integer m2 such that m2 ≡1 mod k!, simply because the length of the interval becomes larger than k!. Now set m = m1m2 then⊥(m)≥min{⊥(m1),⊥(m2)}≥min{2k 1,k}≥k. Thus we have⊥(m)+1≥k to satisfy the condition of Theorem (2.3.3). Set u = n km; we need to bound⊥(u), but rst we need to verify that 1 < u < m. We have n (k+1)m1 < m2 < (n 1) km1 or n (k+1) < m < n 1 k . This yields km+1 Note that this has already disproved Euler’s conjecture. It is clear that our method of proof relied on our ability to produce some numbers with large prime factors and some congruence properties, this indicates that a sieve argument might help. The necessary machinery from sieves is encapsulated in the following theorem: THEOREM 2.3.6. [Rad24] Let p1,···,pr be primes, and let ai < pi,bi < pi be non-negativeintegers for 1≤i≤r. Let D > 1 be an integer with gcd(D,pi) = 1 for each i, 1≤i≤r, and Λ is an integer, 0 < Λ < D such that gcd(Λ,D) = 1. Let P(D,x; p1,a1,b1; p2,a2,b2;···; pr,ar,br) = REMARK 2.3.7. The original theorem has 7.9 instead of our slightly worse 7.938, but this can be improved using a more detailed analysis of our proof. Proof : The quantity S(A;Pz,x) is the number of integers in A that are6≡0 modulo pi for each pi ∈P,pi ≤z. In this case we have two constraints for each prime pi. But we can collapse these two constraints into one as follows. The constraint for the prime i is that n6≡ai, n6≡bi modulo pi. So the constraint fails iff (n ai)(n bi)≡0 mod pi. Let A = {n ≤ x | n ≡ Λ mod D}, Api = {n ≤ x | (n ai)(n bi) ≡ 0 mod pi}, and if d = pi1···pik then Ad = {n≤x| ∏1≤j≤k(n aij)(n bij)≡0 mod d}. Suppose|Api|= ω(pi) pi x+Rpi; then we see that if d is squarefree then |Ad|= ω(d) d x+Rd, where ω(d) is de ned multiplicatively. Thus we are interested in the estimate: 46 2. THE COMBINATORIAL SIEVE P(D,x; p1,a1,b1;···; pr,ar,br) =|A| ∑|Api|+∑|Apipj| ··· = ∑ d\p1···pr μ(d)|Ad|, which is just the sieve estimate. 
The congruence (n ai)(n bi)≡0 mod pi has at most 2 solutions modulo pi so ω(pi) = 2 for each i. We will try to apply Brun’s Sieve to this problem. We just need to verify that the conditions of the proof of Theorem (2.2.2) are valid. First 1 1 ω(p) p ≤3 so A = 3. Next ∑ w≤p 1+ 2+2ε eλ 1 ≤u≤7.938 and 1 2(λeλ)2 1 (λe1+λ)2 > 0. Then the second condition implies λeλ < 1 √2+e2 ≈0.3263540699··· and the rst implies 2+2ξ 6.938 +1≤eλ. Now set ξ = 10 9, so we must have λ≥log1.288267513692707. This value of λ also satis es the other constraint. Now we take z = pr, and using|Ax|= x D +θ,|θ|< 1, S(A;Ppr,x)≥ Cx D ∏ 1≤i≤r1 2 pi+O(p7.938 r ), and also ∏ i 1 2 pi≤ ∏ p≤pr1 2 pi. Now in ln ∏ p≤pr1 2 pi= 2 ∑ p≤pr 1 pi 2 ∑ p≤pr ∑ m>1 1 mpm the second sum converges, so we have ∏ p≤pr1 2 pi= 1 ln2 pr +o 1 ln2 pr. 2.3. ORTHOGONAL LATIN SQUARES AND THE EULER CONJECTURE 47 The theorem follows. Now we have the following simple lemma: LEMMA 2.3.8. For all c,0 < c < 1, the number of integers y≤x that are divisible by a prime factor p > nc of n, is at most x cnc . Proof : At most x p integers y≤x are divisble by p and so the total number of such integers is given by: ∑ p\np >nc x p ≤ x nc ∑ p\np >nc 1 ≤ x cnc . The last part follows because, there can be at most 1/c prime factors of a number n that are greater than nc. THEOREM 2.3.9. [CES60] There is an n0 > 0 such that for all n > n0,⊥(n) > 1 3n 1 91 . Proof : The idea as before is to apply Theorem (2.3.3) to suitable k, m and u for a given n such that n = km+u. For this to yield a lower bound on⊥(n) we need lower bounds on⊥(k),⊥(k+1),⊥(m) and⊥(u). We begin with the selection of k: we need k as well as k+1 to have no small prime factors. This is exactly the sort of problem handled by the theorem we have just proved. It turns out that the constraints on k depend on the parity of n. Case 1. (n even). Consider the constraints: k≡ 1 mod 2b 1 91 lgnc k6≡0 or 1 for p≤n 1 10 and k < n 1 10 . 
The rst congruence restricts k to lie in an arithmetic progression with difference 2b 1 91 lgnc < c1n 1 91 . The second incongruence implies that both k and k +1 are free of small prime factors, apart from the large power of 2 dividing k+1. Now applying Theorem (2.3.6) there are at least: Cn 1 10 c1 1 902 n 1 91 log2 n C0n79.38 10 1 90 = c2 n 81 910 log2 n C0n79.38 900 > c3 n 81 910 log2 n values of k satisfying the constraints. By Lemma (2.3.8) the number of integers below n 1 10 that have a prime factor greater than n 1 90 in common with n is at most 90n 8 90 . Thus from the bound for the values of k we have that there is a k such that gcd(k,n) = 1. Just by our selection of k we have that k has no small prime factors and though k+1 has 2 as a prime factor we still have that k+1≡0 mod 2b 1 91 lgnc and all the other factors are bigger than n 1 90 so using Theorem (2.3.4) ⊥(k) > n 1 90 1 > 1 3 n 1 91 ⊥(k+1) > min1 2 n 1 91 ,n 1 90 1 > 1 3 n 1 91 . We now set n = n1 +n2k where 0 < n1 < k. We cannot directly use n1 and n2 in our application of Theorem (2.3.3), since we have no bounds for⊥(n1) and⊥(n2). Though we have freedom in our choice of m we are still forced to pick k as our quotient in the division of n by m to write n = km+u. This suggests picking a u subject to certain conditions and then set m = n u k . Again this immediately restricts us to look at numbers that are congruent to n1 modulo k. Let 48 2. THE COMBINATORIAL SIEVE u = n1 +u1k where u1 is picked according to the following conditions: u1 6≡n1 mod 2, u1 6≡ n1 k mod p,p6\k, u1 6≡n2 mod p, for 3≤ p≤k, and u1 < n 159 200 . The rst incongruence forces u1k to be of opposite parity from n1 and always xes u to be odd. In this setup we will set m = n u k = n2 u1. We want m to be free of small prime factors to guarantee a good lower bound for ⊥(m) and this is taken care by the third incongruence. Meanwhile, the second incongruence arranges for u itself to have no small prime factors. 
The limit on u1 is forced on us because of the limitations of Theorem (2.3.6). The restrictions of the incongruences modulo the primes 2,3, and 5 can be handled by restricting u1 to belong to an arithmetic progression with difference 30. To apply Theorem (2.3.6) we need gcd(u1,30)= 1. If we had gcd(u1,30)> 1, then we can set u0 1 = u1 gcd(u1,30) and apply Theorem (2.3.6). Thus there are at least Cn 159 200 30log2 k C0k79.38 10 > c4 n 159 200 log2 n C0n79.38 100 > 0 choices for u1, if n is large enough. Now u is not divisible by any prime p ≤ k. First suppose that p 6\k, then this contradicts the incongruence n1 6≡ u1k mod p. Next, if p\k, then p\n1 which implies p\n a contradiction to gcd(k,n) = 1. Thus⊥(u)≥k, but k is not divisible by any prime≤n 1 90 , so ⊥(u)≥n 1 90 > 1 3n 1 91 . Now as promised we set m = n u k , we need to verify that m > u > 1 to apply Theorem (2.3.3), and observe that m > n n 1 10 (1+n 159 200 ) > 1 2 n 9 10 > n 1 10 +(1+n 159 200 ) > u > 1, for large enough n. Furthermore, all prime factors of m exceed k by our choice of u, and hence: ⊥(m)≥k > 1 3 n 1 91 . Finally putting all these together and applying Theorem (2.3.3) we get: ⊥(n)> 1 3n 1 91 for large enough even numbers n. Case 2. (n odd). We apply Theorem (2.3.6) to k+1 instead with the following constraints: k+1≡1 mod 2b 1 91 lgnc k+16≡0 or 1 mod p,p≤n 1 90 k+1≤n 1 10 . Now the argument proceeds with the role of k and k+1 interchanged, and the second set of constraints becomes: u1 6≡n2 mod 2, u1 6≡ n1 k mod p,p≤k,p6\k, u1 6≡n2 mod p,p≤k, and u1 < n 159 200 . So here both n and m are odd. The argument then proceeds similarly. Better estimates for⊥(n) are known— for example in [Wil74] a bound⊥(n)≥n 1 17 2 is proved (for large enough n). The current best estimate seems to be⊥(n)≥n 1 14.8 [Be83]. 2.4. A THEOREM OF SCHINZEL 49 2.4. A Theorem of Schinzel In this section we will give an application involving a variation of Theorem 2.3.6, where we look at some constant number of constraints. 
The proof is an interesting use of Brun’s sieve. THEOREM 2.4.1. [Sch66] For all positive integers h and N ≥3 there is an integer D such that: 1. 1≤D≤(logN)20h; 2. gcd(iD+1,N) = 1, for 1≤i≤h. Proof : For h = 1 we can take D = q 1, where q is the least prime not dividing N. Since ∑p≤D logp≤logN, we have from [RS62] Theorem 10, that either D≤100 or 0.84D≤logN. Since D≤N we have D≤(logN)20, for all N ≥3. If N ≤(logN)20h, then D = N satis es the conditions of the theorem, so we can assume N > (logN)20h, with h≥2. Now N > (logN)20h logN > 20hloglogN. If logN < 110h, then N < e110h and (logN)20h ≥(110h)20h = elog110h20h ≥elog110+logh20h = e94.0069h+20hlogh ≥e114.0096h, which is a contradiction to N > (logN)20h. Hence we must have logN ≥ 110h, and loglogN ≥ log110+logh ≥ 5.3936, or loglogN > 5. Let H = ∏p≤10h p, and we let p1,···,pr be the primes pi > 10h such that pi\N. Let p1 < p2 < ··· < pr. Let P(H,x; p1,···,pr) be the number of integers n≤x such that n≡0 mod H, and ( i j) : 1≤i≤h,1≤ j ≤r : in+16≡0 mod pj. Since pi > 10h for all the values of i in the incongruences, i is invertible. Thus, the above constraints are equivalent to a system of h incongruences per prime (we had 2 such constraints in Theorem 2.3.6). Thus we have a system of incongruences: x6≡aij mod pj, for some aij. Here we are in a special situation of the Sieve problem. The number of primes with respect to which we sift the sequence is very small, namely we sift only by the prime factors of N, of which there can be at most logN. Hence we shall re-do the analysis of the Brun sieve and thereby get a better estimate. Let A ={n≤x|n≡0 mod H}, P = ∏1≤i≤r pi and let Apj =n∈A| ∏ 1≤i≤h (n aij)≡0 mod pj. We extend the notation to Ad for d a divisor of P. We have that P(H;x; p1,···,pr) = ∑ d\P μ(d)|Ad|. For |Apj|, we can select ω(p) = hx Hpj , and Rpj ≤ h since for each congruence there is an error of at most 1 in the approximation. The denominator H can be taken out of our analysis if we set x← x H . 
We also have that Rd ≤ω(d). 50 2. THE COMBINATORIAL SIEVE Hence W(k) = ∏ 1≤i≤k1 h pi. From our earlier work in section (2), we have P(H;x,p1,···,pr) > xW(pr) H (1+Θ)+R, where Θ = 1 ∑ i≤r ω(pi) pi W(pi) W(pr) ∑ t\P(i···r] χ(t)(1 χ(pt)) t ω(t) and P(i···r) = ∏i Following the same argument as in Section 2 (with b = 1), we arrive at the following upper bound for Θ: ∑ 1≤n≤t W(rn) W(r) 1 (2n+1)! ∑ rn≤i≤r ω(pi) pi 2n+1. We will show later that we can pick ri such that W(rn) W(r) = 1 ∏rn≤i≤r1 h pi≤enγ,where γ = log1.3. As before ∑ rn≤i≤r ω(pi) pi ≤logW(rn) W(r)≤nγ. So the bound for Θ is ∑ 1≤n≤t enγ (2n+1)! (nγ)2n+1 = ∑ 1≤n≤t (ne 1)2n+1 (2n+1)! e2n+1γ2n+1enγ ≤ 1 e3(3!) ∑ 1≤n≤t (γe1+γ)2n+1 (since (ne 1)2n+1 (2n+1)! is decreasing) ≤ 1 e3(3!) γe1+γ ∑ 1≤n<∞ (γe1+γ)2n = 1 e3(3!) γe1+γ 1 1 (γe1+γ)2 . The last step follows because γe1+γ < 1. The nal expression is≈0.05478< 1. Thus Θ < 1. Let us de ne the intervals by selecting ri (for 1≤i≤t), as the least index such that πi = ∏ ri 2.4. A THEOREM OF SCHINZEL 51 Since pi > 10h this is always possible. This automatically satis es the requirements set earlier on γ. Select t such that πt = ∏ 1≤k≤rt 11 h pk≥ 1 1.3 . Since pi > 10h we have 1 h pi > 1 h 10h = 9 10 so 9 10 πi =1 h 10hπi <1 h priπi, which by the de nition of ri is such that < 1 1.3 . Thus πi ≤ 10 9 1 1.3 = 1 1.17 < 8 9 . We will show that log ∏ 1≤i≤r1 h pi> hloglogN elogeh > 0.2hloglogN. Using the series expansion of log(1+x) we see that log ∏ 1≤i≤r1 h pi+log ∏ 1≤i≤r1 h pi h ≥ ∑ 1≤i≤r ∑ 2≤m 1 mh pim ≥ ∑ 1≤i≤r 1 2 ∑ 1≤mh pim = 1 2 ∑ 1≤i≤rh pi2 1 1 h pi. We need a good bound on ∑i 1 p2 i . We have by [RS62] (p.87), that ∑ x
1 pn ≤ 1.02n xn 1 lnx . Using this with n = 2 and x = 10h, (all the primes pi > 10h by our choice) we have ∑ 1≤i≤r 1 p2 i ≤ 2.04 10hlog10h . Thus 1 2 ∑ 1≤i≤r 1 1 h pih pi2 ≥ 5 9 h2 ∑ 1≤i≤r 1 p2 i ≥ 0.2h log10h . Now if we can bound from above log ∏ 1≤i≤r1 h pi h, 52 2. THE COMBINATORIAL SIEVE then we can obtain a lower bound on log∏1≤i≤r1 h pi. Let N0 = N gcd(H,N). We have A (A) 1 ∏1≤i≤r1 1 pi = AN0 (AN0) . By [RS62] Theorem (15): For n≥3 n (n) < eγ loglogn+ 5 2loglogn , where γ is the Euler constant. Also by [RS62] Theorem (9): logH < 11h < 0.1logN. Using this we have HN0 (HN0) < eγ loglogHN0+ 2.51 loglogHN0 < eγ loglogN0+ eγ 10 + 2.51 5 < eγ(loglogN +0.4), as HN0≥N, loglogN > 5 by our conditions, and also N0≤N. Now by [RS62], where a lower bound of e γ logx1 1 log2 xfor ∏p≤x 1 1 1 p is given, we have: H (H) > eγ log10h1 1 2log2 10h> eγ(logh+2.1). Since loglogN > log10h, ∏ 1≤i≤r1 h pi 1 < 1 eγ(logh+2.1)eγ loglogN +0.4 yielding hlog ∏ 1≤i≤r1 1 pi< hlog(loglogN +0.4) log(logh+2.1)+ 0.2 log10h and nally log ∏ 1≤i≤r1 h pi> hlogloglogN loglogeh. Using logx loga = 1+logx ae≤ x ae, we have log ∏ 1≤i≤r1 h pi> hloglogN elogeh . Since πi ≤ 1 1.17, we obtain (t 1)log1.17≤log ∏ 1≤i≤r1 h pi 1 ≤ hloglogN elogeh < hloglogN elog(h+1) . This yields (2t +1)log(h+1) < 3log(h+1)+ 2hloglogN elog1.17 < 3log(h+1)+4.7hloglogN. 2.4. A THEOREM OF SCHINZEL 53 Now pi > ilogi, by [RS62] (Corollary to Theorem 3). Hence logπi = ∑ rn
log1 h pi > 10 9 ∑ rn
h ps > 10h 9 rn 1 rn dt t logt = 10h 9 log logrn 1 logrn . Since πi ≤ 1 1.17, we have logrn logrn 1 < 1 1.17 9 10h <1+ 9 10h log1.17 1 ≤(1+0.141h 1) 1, and so logrn logr < (1+0.141h 1) n for 1≤n≤t 1. Further logN ≥ ∑ 1≤i≤r logpi > rlog10h≥rlog20, so logr < loglogN 1. Now for the remainder term: R = ∑ d\P χ(d)|Rd|≤1+ ∑ 1≤i≤r ω(pi) ∏ 1≤i≤t 11+ ∑ j≤ri ω(pj)2 (since ω(p) = h) ≤(1+hr) ∏ 1≤i≤t 1 (1+hri)2. Thus logR≤log(1+h)+logr+2(t 1)log(h+1)+2 ∑ 1≤i≤t 1 logri = (2t 1)log(h+1)+logr+2 ∑ 1≤i≤t 1 logri < 3log(h+1)+4.7hloglogN +(loglogN 1)2 ∑ 0≤n (1+0.141h 1) n 1 < 3log(h+1)+4.7hloglogN +(loglogN 1)(14.2h+1) < 19.4hloglogN 11h 1. Since logH < 11h, we have logR < 19.4hloglogN logH 1, and logc(logN)20h H ∏ 1≤i≤r1 h pi> logR, 54 2. THE COMBINATORIAL SIEVE where c = 1 0.05478. Thus P(H,(logN)20h,p1,···,pr) > 0. Thus there is an integer D satisfying the conditions of the theorem. 2.5. Smooth Numbers Here we illustrate the surprising power of the indentity proved in Proposition 2.2.1. Let Px z ={p|z≤p THEOREM 2.5.1 ([Hal70]). Let y = x 1 θ where 1 < θ≤2. Then Ψ(x,y) = x1 logθ+O 1 logx. Proof : Applying the identity (2.19) with z = x and z1 = y, we have Ψ(x,y) = Ψ(x,x) ∑ y≤p Ψx p ,p.(2.20) Now Ψ(x,x) =bxc. Since 1 < θ≤2, p≥√x, we have that x p ≤√x≤ p. Consequently, Ψx p,p= x p.Substituting in (2.20), we have Ψ(x,y) =bxc ∑ y≤p The recurrence formula can be used to convert upper bounds to other useful lower bounds, and can also be used iteratively. Here is a simple example. Let us try to evaluate Ψ(x,x 1 δ ) for 2 < δ < e using the recurrence formula Ψ(x,x 1 δ ) = Ψ(x,x 1 2 ) ∑ x 1 δ≤p≤x 1 2 Ψx p ,p. 2.6. ON THE NUMBER OF INTEGERS PRIME TO A GIVEN NUMBER 55 Applying the trivial bound Ψx p,p≤ x p ∑ x 1 δ≤p≤x 1 2 Ψx p ,p≤x ∑ x 1 δ≤p≤x 1 2 1 p = x(logδ log2). Now applying the theorem with θ = 2, we have Ψ(x,x 1 2 ) = x1 log2+O 1 logx. Thus we obtain Ψ(x,x 1 δ )≥x1 logδ+O 1 logx. 
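The recurrence used above can be tested directly on small inputs. The following sketch assumes the convention that $\Psi(x, y)$ counts the integers $1 \le n \le x$ all of whose prime factors are $\le y$, and sums over primes $y < p \le x$; the text's own boundary conventions may differ slightly, and the function names are illustrative.

```python
from math import log


def largest_prime_factor_table(limit):
    """lpf[n] = largest prime factor of n for 2 <= n <= limit, lpf[1] = 1."""
    lpf = [0] * (limit + 1)
    lpf[1] = 1
    for p in range(2, limit + 1):
        if lpf[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, limit + 1, p):
                lpf[m] = p  # larger primes overwrite smaller ones
    return lpf


LIMIT = 20_000
LPF = largest_prime_factor_table(LIMIT)


def psi(x, y):
    """Psi(x, y): number of 1 <= n <= x whose prime factors are all <= y."""
    return sum(1 for n in range(1, int(x) + 1) if LPF[n] <= y)


def psi_buchstab(x, y):
    """Psi(x, x) minus the sum over primes y < p <= x of Psi(x / p, p)."""
    x = int(x)
    total = x  # Psi(x, x) = floor(x)
    for p in range(2, x + 1):
        if LPF[p] == p and p > y:
            total -= psi(x // p, p)
    return total


x = 20_000
for theta in (1.25, 1.5, 2.0):
    y = x ** (1 / theta)
    print(theta, psi(x, y) / x, 1 - log(theta))
```

For $1 < \theta \le 2$ the printed ratio can be compared with $1 - \log\theta$; the two agree only up to the $O(1/\log x)$ term, which is not small at this size of $x$.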
Of course, in this case we could have directly derived this result as in the theorem, but this just is an illustration of the usage of Buchstab’s identity. In estimating ψ(x,y) we could try to use Brun’s sieve as in section (1.3). It is clear however, that to obtain a good estimate we need to take lnz < εlnx, but this would make the error term very large, since that depends on the size of the interval x z. 2.6. On the number of integers prime to a given number Let k > 1 be an integer and x > 1 a real number, here we will nd bounds for the sum: ∑ n≤xgcd (n,k)=1 1. It is clear that in every interval mod k there are (k) such integers. However, it is not clear how uniform the distribution of these numbers are inside the interval. The sequence to be sifted is A ={n|1≤n≤x}, and the sifting primes are P ={p| p\k}. We assume x≥k. In this case we can take|Ad|= x d +Rd, where ω(d) = 1 and Rd ≤1. Now, 1 1 1 p ≤2. Hence A = 2, we also have ∑ w≤p≤z p∈P ω(p)ln p p ≤lnlnz lnw+ 1 lnw thus κ = η = 1. To apply the lower bound estimate of the Brun sieve (with b = 1), we need to nd λ such that 1 2(λeλ)2 1+(λe1+λ)2 > 0 and 1+ 2.01 e2λ 1 < γ, where we have used ξ = 1.005. It turns out that we can take γ < 5, and satisfy both the constraints for λ = 0.204. This gives S(A;P,z)≥xW(z)1 o(1)+Oz4.85.T aking z = x 1 5 , we obtain S(A;P,z)≥c∏ p\k1 1 px+Ox0.97. Now to get the actual estimate ∑n≤x,n⊥k 1 we need to account for the numbers that might have been included in this estimate which are not really prime to k. Clearly, by our choice of the limit for z, each number which is over-counted must share a factor p with k that is larger than x 1 5 . Let us assume that the largest prime factor of k is < x 1 5 . 56 2. THE COMBINATORIAL SIEVE Thus we have: ∑ n≤xgcd (n,k)=1 1≥ c (k) k x+Ox0.97, where c < 1. For the upper bound we can take the same value of λ as for the lower bound but this forces us to take z = x 1 6 in this case and we get ∑ n≤xgcd (n,k)=1 1≤ c0 (k) k x+Ox0.975, where c0 < 4. 
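For small parameters the count of integers n ≤ x with gcd(n, k) = 1 can be computed directly and set beside the main term φ(k)x/k that appears in the bounds above. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the second counter uses the elementary inclusion-exclusion (Legendre-type) formula Σ_{d | k} μ(d)⌊x/d⌋ rather than Brun's sieve.

```python
from math import gcd

def prime_divisors(k):
    """Distinct prime divisors of k, by trial division."""
    primes, p = [], 2
    while p * p <= k:
        if k % p == 0:
            primes.append(p)
            while k % p == 0:
                k //= p
        p += 1
    if k > 1:
        primes.append(k)
    return primes

def count_coprime_bruteforce(x, k):
    """Number of 1 <= n <= x with gcd(n, k) == 1."""
    return sum(1 for n in range(1, x + 1) if gcd(n, k) == 1)

def count_coprime_legendre(x, k):
    """Inclusion-exclusion over squarefree d | k: sum of mu(d) * floor(x / d)."""
    primes = prime_divisors(k)
    total = 0
    for mask in range(1 << len(primes)):
        d, sign = 1, 1
        for i, p in enumerate(primes):
            if (mask >> i) & 1:
                d *= p
                sign = -sign  # mu(d) flips sign with each prime factor
        total += sign * (x // d)
    return total

def euler_phi(k):
    """Euler's totient via k * prod(1 - 1/p) over primes p | k."""
    result = k
    for p in prime_divisors(k):
        result -= result // p
    return result

if __name__ == "__main__":
    for k in (6, 30, 210):
        for x in (10**3, 10**4):
            exact = count_coprime_bruteforce(x, k)
            print(k, x, exact, euler_phi(k) * x / k)
```

Running the loop shows how close the exact count stays to φ(k)x/k when k has few prime factors; the sieve bounds in this section are aimed at the situation where such exact bookkeeping is not available.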
In summary we have proved:

THEOREM 2.6.1. Let $x > 0$ and let $k$ be a positive integer whose largest prime factor $p$ is less than $x^{1/5}$. Then
$$c\,\frac{\varphi(k)}{k}\,x + O(x^{0.97}) \le \sum_{n \le x,\ \gcd(n,k)=1} 1 \le c'\,\frac{\varphi(k)}{k}\,x + O(x^{0.975}),$$
where $c < 1$ and $c' < 4$ are constants.

CHAPTER 3
Selberg's Sieve

Around 1946 Atle Selberg introduced a new method for finding upper bounds to the sieve estimate [Sel47]. The method usually gives much better bounds than Brun's sieve. To obtain lower bounds one can couple the Selberg sieve with the Buchstab identities. After developing the basic ideas of this sieve technique, we shall look at the most important application of this method: deriving inequalities of the Brun-Titchmarsh type.

3.1. The Selberg upper-bound method

Selberg's method of estimating the sum
$$S(A; P_z, x) = \sum_{a \in A} \sum_{d \mid \gcd(a, P_z)} \mu(d)$$
relies on finding a sequence of numbers $\lambda_d$ such that $\lambda_1 = 1$ and using the inequality
$$S(A; P_z, x) \le \sum_{a \in A} \Big( \sum_{d \mid \gcd(a, P_z)} \lambda_d \Big)^2.$$
This allows us complete freedom in our choice of the numbers $\lambda_d$ for $d > 1$, and the idea of this method is to select the $\lambda_d$ such that the sum is minimized. Note that setting $\lambda_1 = 1$ and $\lambda_d = 0$ for $d > 1$ leads to the trivial estimate $S(A; P_z, x) \le |A^x|$. Selberg's method relies on choices of $\lambda_d$ that mimic the cancellation occurring in the sum $\sum_{d \mid n} \mu(d)$. Such choices lead to better estimates when we interchange the sum. Now
$$\sum_{a \in A^x} \Big( \sum_{d \mid \gcd(a, P_z)} \lambda_d \Big)^2 = \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} \sum_{a \in A^x,\ a \equiv 0 \bmod D} 1,$$
where $D = \operatorname{lcm}(d_1, d_2)$. By our conventions about the sequence $A$, we have
$$\sum_{a \in A^x,\ a \equiv 0 \bmod D} 1 = |A^x_D| = \frac{\omega(D)}{D}\,x + R_D.$$
This yields
$$\sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} |A^x_D| = x \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} \frac{\omega(D)}{D} + \sum_{d_i \mid P_z,\ i=1,2} \lambda_{d_1} \lambda_{d_2} R_D = x\Sigma_1 + \Sigma_2.$$
The problem of selecting $\lambda_d$ already seems difficult. We can make the assumption that $\lambda_d = 0$ for $d > z$ and hope that, since the second sum $\Sigma_2$ contains only $z^2$ terms, we can concentrate on minimizing the leading sum $\Sigma_1$. Our first effort will be directed towards this.
Minimization of ∑1 : Using the fact that ω(d) is a multiplicative function, we have ω(D) D = ω(d1)ω(d2) ω(gcd(d1,d2)) gcd(d1,d2) d1d2 , 57 58 3. SELBERG’S SIEVE so Σ1 = ∑ di\Pz λd1λd2 ω(d1) d1 ω(d2) d2 gcd(d1,d2) d1d2 . Let f(d) = ω(d) d , so that the sum becomes Σ1 = ∑ di\Pz λd1λd2 f(d1)f(d2) f(d) ,(3.21) where d = gcd(d1,d2). We need to get rid of the term in the denominator, and to this end we introduce the function J(r) = 1 f(r) ∏ p\r1 f(p). Let r = ps, and consider: ∑ δ\ps J(δ) =∑ δ\sJ(pδ)+J(δ) =∑ δ\s 1 f(pδ) ∏ q\pδ1 f(q)+ 1 f(δ) ∏ q\δ1 f(q) =∑ δ\s J(δ) 1 f(p)1 f(p)+1 = 1 f(p) ∑ δ\s J(δ), together with ∑ δ\p J(δ) = J(p)+J(1) = 1 f(p) . Thus we have 1 f(d) = ∑ δ\d J(d). Substituting this for 1 f(d) in (3.21) we get, ∑ di\Pz λd1λd2 f(d1)f(d2) f(d) = ∑ di\Pz λd1λd2 f(d1)f(d2) ∑ δ\d1,δ\d2 J(d) = ∑ r≤z r\Pz J(r)∑ r\d d≤z λd f(d)2. Let ξr = ∑r\d d≤z λd f(d), so that Σ1 = ∑ r≤z r\Pz J(r)ξ2 r. This is what we need to minimize subject to the restriction λ1 = 1. We wish to write this constraint as a constraint among the variables ξi, which would allow us to convert the minimization problem to one entirely involving the variables ξi. 3.1. THE SELBERG UPPER-BOUND METHOD 59 The idea is to use M¨obius inversion to pick out λ1, and this is not dif cult: ∑ r≤z μ(r)ξr = ∑ r≤z μ(r) ∑ r\d d≤z λd f(d) = ∑ d≤z f(d)λd∑ r\d μ(d) = λ1 f(1) = λ1 = 1. Thus we need to minimize ∑r≤z J(r)ξ2 r, subject to the constraint ∑r≤z μ(r)ξr = 1. Let F = ∑r J(r)ξ2 r for some real . Since ∑r≤z μ(r)ξr = 1, we have F = ∑r J(r)ξ2 r ∑r μ(r)ξr. Minimizing F is the same as minimizing the function ∑r≤z J(r)ξ2 r. Let us try to complete the square term in the rst sum in F. This suggests setting ←2ω, so ∑ r≤z J(r)ξ2 r 2ω∑ r≤z μ(r)ξr = ∑ r≤z J(r)ξ2 r 2ωμ(r)ξr J(r) = ∑ r≤z J(r)ξ2 r 2ωμ(r)ξr J(r) +ωμ(r) J(r) 2 ∑ r≤z ω2μ(r)2 J(r) = ∑ r≤z J(r)ξr ωμ(r) J(r) 2 ω2 ∑ r≤z μ2(r) J(r) . Thus at the minimum value of F we should have ξr = ωμ(r) J(r) , and the minimum value of F would be ω2∑r≤z μ2(r) J(r) . 
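The completion of the square says that the minimizer of Σ J(r)ξ_r² under the linear constraint Σ μ(r)ξ_r = 1 is proportional to μ(r)/J(r). The sketch below checks this in exact arithmetic on a toy instance. It is only an illustration: the weights `J` and signs `mu` are made-up values rather than the sieve's actual J(r) and μ(r), the function names are hypothetical, and the scalar ω is fixed here simply by imposing the constraint.

```python
from fractions import Fraction

def quadratic_form(J, xi):
    """Value of sum_r J[r] * xi[r]**2."""
    return sum(j * x * x for j, x in zip(J, xi))

def constrained_minimum(J, mu):
    """Minimize sum J[r]*xi[r]**2 subject to sum mu[r]*xi[r] == 1 (all J[r] > 0).

    Returns (omega, xi, value) with xi[r] = omega * mu[r] / J[r].
    """
    total = sum(Fraction(m * m) / j for j, m in zip(J, mu))
    omega = 1 / total  # forced by the constraint sum mu[r]*xi[r] == 1
    xi = [omega * m / j for j, m in zip(J, mu)]
    return omega, xi, quadratic_form(J, xi)

if __name__ == "__main__":
    J = [Fraction(1), Fraction(2), Fraction(3)]  # hypothetical positive weights
    mu = [1, -1, -1]                             # hypothetical signs
    omega, xi, value = constrained_minimum(J, mu)
    print(omega, xi, value)
    # Any perturbation t with sum mu[r]*t[r] == 0 keeps the constraint
    # and can only increase the quadratic form.
    t = [Fraction(1), Fraction(1), Fraction(0)]
    print(quadratic_form(J, [x + s for x, s in zip(xi, t)]) >= value)
```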
To nd the value of ω, we can substitute ξr into the constraint ∑r≤z μ(r)ξr = 1, and this gives us immediately that ω = 1 ∑r≤z μ(r)2 J(r) . So min∑ r≤z J(r)ξ2 r = ∑ r≤z ω2 μ(r)2 J(r) = ω2 ∑ r≤z μ(r)2 J(r) = ω2 ω = ω = 1 ∑r≤z μ(r)2 J(r) . By our de nition of the function g(d) we have g(r) = 1 J(r), so ∑ r≤z μ(r)2 J(r) = ∑ r≤z μ(r)2g(r). Set G(z) = ∑ r≤z μ2(r)g(r). Then the minimum value of Σ1 is x G(z). Evaluation of Σ2: To estimate the remainder term Σ2, we need an estimate on the size of the λd. We had earlier used M¨obius inversion to extract λ1 from a combination of the ξr, and we can repeat the process to get λδ for any δ. 60 3. SELBERG’S SIEVE Now by de nition ξr = ∑ r\d d≤z λd f(d). Let r = γδ, so that ξγδ = ∑ γδ\d d≤z λd f(d) = ∑ γ\d δ d≤z λd f(d) = ∑ γ\v,v≤d δ gcd(v,δ)=1 λδv f(δv). Since we want to extract the term with γ = 1, we calculate: ∑ γ≤z δ γ⊥δ μ(γ)ξγδ = ∑ γ≤z δ γ⊥δ μ(γ) ∑ γ\v,v≤z δ v⊥δ λδv f(δv) = ∑ v≤z δ,v⊥δ λδv f(δv)∑ γ\v μ(k) = λδ f(δ). Thus λδ = 1 f(δ) ∑ γ≤z δ γ⊥δ μ(γ)ξγδ, and substituting for ξγδ gives λδ = ω f(δ) ∑ γ≤z δ γ⊥δ μ(γδ)μ(γ) J(γδ) = ωμ(δ) f(δ)J(δ) ∑ γ≤z δ γ⊥δ μ(γ)2 J(δ) . Let Gd(y) = ∑ δ μ2(δ)g(δ). Then λδ = ωμ(δ) f(δ)J(δ) Gδz δ.(3.22) 3.1. THE SELBERG UPPER-BOUND METHOD 61 We will show that|λd|≤1. Observe that G(z) =∑ l\d ∑ m≤zgcd (m,d)=l μ(m)2g(m) =∑ l\d ∑ h μ(lh)2h(lh) =∑ l\d μ(l)2g(l)Gdz l ≥∑ l\d μ(l)2g(l)Gdz d and ∑ l\d μ(l)2g(l) =∏ p\d1+g(p) =∏ p\d p p ω(p) = 1 ∏p\d1 ω(p) p , and so Gdz d≤∏ p\d1 ω(p) p G(z).(3.23) Now substituting for J(δ) in (3.22), we get: λd = μ(d) ∏p\d1 ω(p) p Gd(z/d) G(z) .(3.24) Thus by (3.23) and (3.24), we have|λd|≤1. Now Σ2 ≤ ∑ di Rlcm(d1,d2) . Fix a d; we can estimate the number of integers d1,d2 for which d = lcm(d1,d2) . Now d as well as d1 and d2 are squarefree. If d1 = ∏i pei i and d2 = ∏i pfi i , then d = ∏i pmax{ei,fi} i . Suppose p\d, then p\d1 or p\d2 or p divides both of them. So the number of integers which can give rise to d as their lcm is exactly 3ν(d). 
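The count 3^ν(d) of pairs with a prescribed least common multiple is easy to confirm by brute force for small squarefree d: each prime of d divides d1 only, d2 only, or both. The following sketch, with hypothetical helper names, enumerates ordered pairs (d1, d2) of divisors of d and compares the tally with 3^ν(d).

```python
from itertools import product
from math import gcd

def divisors(d):
    """All positive divisors of d."""
    return [t for t in range(1, d + 1) if d % t == 0]

def lcm(a, b):
    return a * b // gcd(a, b)

def count_lcm_pairs(d):
    """Number of ordered pairs (d1, d2) with lcm(d1, d2) == d."""
    divs = divisors(d)  # both members of such a pair must divide d
    return sum(1 for d1, d2 in product(divs, repeat=2) if lcm(d1, d2) == d)

def nu(d):
    """Number of distinct prime factors of d."""
    count, p = 0, 2
    while p * p <= d:
        if d % p == 0:
            count += 1
            while d % p == 0:
                d //= p
        p += 1
    return count + (1 if d > 1 else 0)

if __name__ == "__main__":
    for d in (1, 2, 6, 30, 210):  # squarefree examples
        print(d, count_lcm_pairs(d), 3 ** nu(d))
```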
Using this and the fact that d < z2, we get Σ2 ≤ ∑ d 3ν(d)|Rd|. If we also have the remainder condition|Rd|≤ω(d), then we can simplify further: 62 3. SELBERG’S SIEVE ∑ d 3ν(d)|Rd|≤ ∑ d 3ν(d)ω(d) ≤z2 ∑ d\Pz 3ν(d)ω(d) d = z2 ∏ p Thus we have proved: THEOREM 3.1.1. If|Rd|≤ω(d), then S(A;Pz,x)≤ x G(z) + z2 W3(z) , where G(z) = ∑ r≤z μ2(r)g(r). The second term can also be upper bounded by ∑ d 3ν(d)|Rd|, which is also upper bounded by ∑ d μ2(d)3ν(d)|Rd|. Here Γ(d) stands for the set of prime divisors of d. We will apply the Selberg method to the simple but important case where ω(d) = 1 and|Rd|≤1. THEOREM 3.1.2. Suppose ω(d) = 1 and|Rd|≤1. If d is squarefree and p / ∈P p⊥d then S(A;P,z)≤ x ∏p g(d) = ω(d) d∏p\d1 ω(p) p where d\Pz. In this case we have ω(d) = 1, so we have g(d) = 1 (d) . Let k = ∏p 3.1. THE SELBERG UPPER-BOUND METHOD 63 Let Sk(z) = ∑ d μ2(d) (d) . Then S1(z) = ∑ d μ2(d) (d) =∑ l\k ∑ d μ2(d) (d) =∑ l\k ∑ h μ2(lh) (lh) =∑ l\k μ2(l) (l) ∑ h μ2(h) (h) =∑ l\k μ2(l) (l) Skz l ≤∑ l\k μ2(l) (l) Sk(z), because Sk(z) is an increasing function of z. Now ∑ l\k μ2(l) (l) =∏ p\k1+ 1 p 1 = 1 ∏p\k1 1 p= k (k) , and so Sk(z)≥ (k) k S1(z). To apply Theorem 3.1.1 we need a good lower bound on G(z). Since G(z) = Sk(z), the above derivation says that we can translate a lower bound on S1(z) to a lower bound on Sk(z). We have S1(z) = ∑ d μ2(d) d 1 ∏p\d1 1 p = ∑ d μ2(d) d ∏ p\d1+ 1 p + 1 p2 +···. 64 3. SELBERG’S SIEVE If we set (n) to be the largest squarefree divisor of n, then S1(z) = ∑ (n) 1 n ≥ ∑ n 1 n ≥logz. So Sk(z)≥ (k) k logz. We know from the proof of Theorem (3.1.1) that the remainder term is at most ∑ di\Pz di Thus S(A;Pz,x)≤ 1 ∏p x+z2. 3.2. The Brun-Titchmarsh Theorem The prime number theorem for arithmetic progressions states that π(x;l,k) = lix (k) +Oxe A√logx uniformly for k≤(logx)B, where B is any positive constant and A is a positive constant depending only on B. This is a very narrow range of values of k. 
It turns out that if we assume the Extended Riemann Hypothesis, then
$$\pi(x; l, k) = \frac{\operatorname{li} x}{\varphi(k)} + O(\sqrt{x}\,\log x)$$
uniformly for $k \le \sqrt{x}/\log^2 x$. By a careful analysis of the Selberg sieve (especially the remainder term), van Lint and Richert [vLR65] showed a good upper bound for $\pi(x; l, k)$ valid for any $k < x$. In this section we shall look at the proof of this result (see Theorem 3.2.5). In a later chapter we shall improve on this result using the so-called Large sieve.

Let $k, l > 0$ be relatively prime integers, and let $x, y > 1$ be reals with $y \le x$. We will concentrate on the sequence $A = \{n \mid x - y < n \le x,\ n \equiv l \bmod k\}$. For $K$ a multiple of $k$, we take as the sifting primes $P_K = \{p \mid p \nmid K\}$. First we shall prove a form of the Selberg sieve, where we have a better estimate of the remainder term. We define
$$S_K(z) = \sum_{1 \le n \le z,\ n \perp K} \frac{\mu^2(n)}{\varphi(n)}$$
as in the proof of Theorem (3.1.2), and
$$H_K(z) = \sum_{1 \le n \le z,\ n \perp K} \frac{\mu^2(n)\,\sigma(n)}{\varphi(n)}$$
with $\sigma(n) = \sum_{d \mid n} d$.

LEMMA 3.2.1.
$$S(A; P^z_K, x, y) \le \frac{y}{k\,S_K(z)} + \frac{H_K^2(z)}{S_K^2(z)}.$$

Proof: The cardinality of the set $A_D = \{n \mid x - y < n \le x,\ n \equiv l \bmod k,\ n \equiv 0 \bmod D\}$ is $\frac{y}{kD} + R_D$. Following the proof of the Selberg sieve and using the analysis in Theorem (3.1.2), we get the first term to be $\frac{y}{k\,S_K(z)}$. Now the remainder term is (using $|R_d| \le 1$) at most
$$\sum_{d_i \mid P_K,\ i=1,2} |\lambda_{d_1} \lambda_{d_2}| = \Big( \sum_{d \mid P_K} |\lambda_d| \Big)^2.$$
In the notation of this proof we have
$$\lambda_d = \mu(d)\,\frac{d}{\varphi(d)}\,\frac{S_{Kd}(z/d)}{S_K(z)},$$
so
$$\sum_{d \mid P_K} |\lambda_d| = \sum_{1 \le d \le z,\ d \perp K} \frac{\mu^2(d)\,d}{\varphi(d)} \cdot \frac{1}{S_K(z)} \sum_{1 \le m \le z/d,\ m \perp Kd} \frac{\mu^2(m)}{\varphi(m)}$$
$$= \frac{1}{S_K(z)} \sum_{1 \le d \le z,\ d \perp K}\ \sum_{1 \le m \le z/d,\ m \perp Kd} \frac{\mu^2(md)}{\varphi(md)}\,d$$
$$= \frac{1}{S_K(z)} \sum_{1 \le n \le z,\ n \perp K} \frac{\mu^2(n)}{\varphi(n)} \sum_{d \mid n} d = \frac{H_K(z)}{S_K(z)}.$$
Hence the remainder term is at most $\frac{H_K^2(z)}{S_K^2(z)}$, and the lemma follows.

Our aim now is to find a good upper bound on $H_K^2(z)$. One idea is to use Cauchy's inequality on this sum, and this suggests that we first find a concrete upper bound for the sum $\sum_{n \le x,\ n \perp K} 1$, which we have seen in the last chapter. Using Theorem (3.1.2) we have

THEOREM 3.2.2.
If 1≤k < y≤x and P is a set of primes p with k⊥ p, then we have for any z≥2 that 66 3. SELBERG’S SIEVE Proof : Take k = 1,y = x and P ={p| p6\k}in Theorem (3.2.2). For z≤x we have ΦK(x) = +z2. Thus k (k) ΦK(x) x ≤ 1 ∏p≤x1 1 p 1 logz + z2 x, and using ∏ p≤x1 1 p 1 ≤eγlogx1+ 1 2log2 x and setting z = x 1 3 , we get k (k) ΦK(x) x ≤eγlogx1+ 1 2log2 x 3 logx + 1 x 1 3. The right hand side is decreasing, and for x = e6 is < 7. LEMMA 3.2.4. For z > 103, h even, H2 h(z) S2 h(z) < 22.5 h (h) z2 log2 z . Proof : Let Jh(z) = ∑ 1≤n≤z n⊥h μ2(n)σ2(n) 2(n) , and as above let Φh(z) = ∑1≤n≤z n⊥h 1. Now Hk(z) = ∑ 1≤n≤z n⊥k μ2(n)σ(n) (n) . Cauchy’s inequality states that ∑ 1≤n≤N anbn2 ≤ ∑ 1≤n≤N a2 n ∑ 1≤n≤N b2 n. Using this with bn = 1, an = μ2(n)σ(n) (n) and observing that μ4(n) = μ2(n), we have H2 h(z)≤Φh(z)Jh(z). Let n be an integer and p⊥n; then σ(np) = ∑ d\np d = ∑ d\n d + p∑ d\n d = σ(n)(1+ p), 3.2. THE BRUN-TITCHMARSH THEOREM 67 and also (np) = (n) (p). If n is squarefree, then σ2(np) 2(np) = σ2(n) 2(n)(1+ p)2 2(p) = σ2(n) 2(n) 2(p)+4p 2(p) = σ2(n) 2(n)1+ 4p 2(p). By induction we have σ2(n) 2(n) =∏ p\n1+ 4p 2(p) = ∑ d\n 4ν(d)d 2(d) ,μ2(n) = 1. Since 2\h we have Jh(z)≤J2(z) and J2(z) = ∑ 1≤n≤z n⊥2 μ2(n)∑ d\n 4ν(d)d 2(d) = ∑ 1≤d≤z d⊥2 μ2(d)4ν(d)d 2(d) ∑ 1≤m≤z/d m⊥2d μ2(m) ≤z ∑ 1≤d≤z d⊥2 μ2(d)4ν(d) 2(d) ≤z∏ p>21+ 4 (p 1)2 < 16 5 z. In the proof of Theorem (3.1.2) we had proved Sh(x)≥ (h) h logx; now using this and Lemma (3.2.3) we have: H2 h(z) S2 h(z) ≤ 7 (h) h z16 5 z 2(h) h2 log2 z = 22.5 z2 log2 z h (h) . THEOREM 3.2.5. If x and y are real numbers and k and l are integers satisfying 1≤k < y≤x with k⊥l, then π(x;k,l) π(x y;k,l) < 3y (k)logy k (3.25) and π(x;k,l) π(x y;k,l) < y (k)logqy k1+ 4 logqy k.(3.26) Proof: Let (x,y,k,l)=π(x;k,l) π(x y;k,l) andh= 2k gcd(2,k). Then thereis an l1 such that (x,y,k,l)≤ (x,y,h,l1)+ 1. For if k is even, then h = k, and we can take l1 = l. If k is odd, then the parity of mk+l changes alternately. 
In this case, we can set l1 to be the solution to l1 ≡1 mod 2 and l1 ≡l mod k. So at worst we miss one prime in the even subsequence. 68 3. SELBERG’S SIEVE By what we have proved so far, the sifting of the sequence A by Pz yields the following upper bound: (x,y,k,l)≤ (x,y,h,l1)+1(3.27) ≤ y (h)S1(z) + H2 h(z) S2 h(z) +π(z,h,l1)+1(3.28) ≤ y (k)S1(z) + H2 h(z) S2 h(z) +π(z,h,l1)+1 for any z > 1.(3.29) We begin with a trivial estimate (x,y,h,l1)≤ ∑ x y 1 ≤ y h +1. So (x,y,k,l)≤ y h +2. Let u =qy k. Since (k) = (h)≤ 1 2h, we have (x,y,k,l) y ≤ 1 k + 2 y . Using y = u2k we obtain (k) (x,y,k,l) y ≤ (k) k + 2 (k) y ≤ 1 2 + 2 (k) u2k ≤ 1 2 + 2 u2 . Thus Q = logqy k (k) y (x,y,k,l)≤logu1 2 + 2 u2 < 3 2 for 1 < u≤e2.9. Now π(z,h,l1)+1≤ ∑ 1≤n≤z,k⊥2 μ2(k)≤ z 1 2 for z≥9. The remainder term is at most ∑ d μ2(d)2 ≤ ∑ d μ2(d)2, since 2\h ≤z 1 2 2 if z≥9. By (3.27), and the above bounds we have: Q≤logu 1 logz + 1 u2z 1 2 2 + z 1 2 < logu 1 logz + z2 4u2if z≥9. De ne ω by u = ω √2eω, 3.3. PRELUDE TO A THEOREM OF HOOLEY 69 and set z = eω so that Q≤ logω √2+ω ω 1+ 1 2ωfor ω≥log9. For ω≥√2e > log9 this function is decreasing, and for ω =√2e it is < 3 2. This proves (3.25). Now (3.26) is a consequence of (3.25) for u≤e8. If e8 < u < e10, then using the above bound for Q, we obtain Q≤ logω √2+ω ω 1+ 1 2ω. If u > e8, then ω < 6.4 and this gives Q < 1.4 < 1+ 4 logu. This shows (3.26) for u < e10. Now using (3.27) and setting logz = logu 2, we get logry k (Q 1)≤logulogu logz 1+48 logu u2 z2 log2 z + logu u2 z, = logu 2 logu 2 + 48 e4 logu (logu 2)2 + logu e2u, which is a decreasing function in u. In particular it is < 4 if u≥e10. This proves (3.26). 3.3. Prelude to a theorem of Hooley In this section we will look at a variation of a problem of Chebyschev that we shall see in the next section. The problem is to prove a lower bound on the largest prime divisor of ∏ p≤x (p2 1) = ∏ p≤x (p+1)∏ p≤x (p 1). We will prove the following theorem of Motohashi [Mot70]. THEOREM 3.3.1. 
Let Px be the largest prime divisor of ∏ p≤x (p2 1). Then Px > xθ for any θ < 1 1 2e 1 4 . Proof : In this proof q will also stand for primes, and sums or products over q will represent sums or products over primes in the range. Consider the product Ξ = ∏p≤x(p2 1). Taking log on both sides, we have logΞ = log∏ p≤x p21 1 p2 = 2 ∑ p≤x logp O∑ p≤x 1 p2 = 2x+O(xe c√logx) O(1). Let π(x,k) be the number of primes below x such that p2 1≡0 mod k. We have that p2 1 = (p+1)(p 1) and for p > 2 we have gcd(p+1,p 1)= 2. If k = qa, q6= 2, then p2 1≡0 mod k implies that either p+1≡0 mod k or p 1 ≡ 0 mod k. In this case we have π(x,qa) = π(x; 1,qa)+π(x;+1,qa). Furthermore, π(x,2) = π(x), and π(x,4) = π(x). For a > 2, we have π(x,2a) = π(x; 1,2a 1)+π(x;+1,2a 1). Using the function π(x,qa), we can write Ξ as ∏ qa 70 3. SELBERG’S SIEVE We split up the sum as follows: ∑ qa π(x,qa)logq = ∑ q≤ √x logB x a=1 + ∑ √x logB x + ∑ xθ + ∑ qa = Σ1 +Σ2 +Σ3 +Σ4, where B is a positive real number. We wish to show that Σ3 is non-zero for the value of θ claimed. Since we already have an asymptotic formula for the sum, to obtain a lower bound for Σ3 we need upper bounds for the remaining sums. We have π(x,k)~ 2lix (k). Σ1 Bombieri’s Theorem— which we shall prove in Chapter 4, can be used directly to bound this sum we get: Σ1 = 2x logx ∑ q≤ √x logB x logq (q 1) +O x logx = x+Oxloglogx logx . Σ2 We have from the Brun-Titchmarsh Theorem (3.2.5) that π(x,q)≤4 x (q 1)logx q1+ 8 logx q.Hence Σ2 ≤4x( ∑ √x logB x ∑ y
logp qlogx q = loglog x z loglog x y+o(1). Thus ∑ √x logB x logq qlogx q = log2(1 θ)+o(1), 3.4. A THEOREM OF HOOLEY 71 and so Σ2 ≤ 4log2(1 θ)x+o(x). Σ4 We split up Σ4 into two parts, Σ4 = ∑ qa≤x 2 3 a≥2 + ∑ x 2 3 Using the Brun-Titchmarsh theorem: Σ41 = O∑ q≤√x logq x logx ∑ a≥2 1 (qa) = O x logx ∑ q≤√x logq q2 = O x logx and Σ42 = O∑ q≤√x logq ∑ x 2 3 x qa = Ox1 3 ∑ q≤√x logq logx logq = O(x 5 6 ). Thus Σ4 = O x logx. From the bounds we have derived we get: Σ3 > (1+4log2(1 θ))x+o(x). Hence if 1+4log2(1 θ) > 0 i.e., if 1 1 2e 1 4 > θ, then there is a prime factor exceeding xθ. Among known improvements to this result, the best one is that the largest prime factor exceeds xθ for θ = 0.677 (see [BakHar95], [BakHar98], and also [Ho73]). 3.4. A theorem of Hooley Chebyhev proved that if Px is the largest prime factor of ∏n≤x(n2+1), then Px x →∞. Hooley [Ho67] (see also [Ho76])impro ved the previous best known result of Px x > (logx)A1 logloglogx by Erd os [Erd52] to Px > x 11 10 using the Selberg sieve. In this section we shall outline the proof given by Hooley in [Ho76]. The exponent 11 10 has since been improved to θ< 1.202···, where θ is the solution to 2 θ 2log(2 θ) = 5 4, by Deshouillers and Iwaniec [DI83] (see also [Dar96]). 72 3. SELBERG’S SIEVE THEOREM 3.4.1 ([Ho76]). The largest prime factor of ∏ n≤x (n2 +1) exceeds x 11 10 for all large enough values of x. Proof : Let Px be the largest prime factor of ∏n≤x(n2 +1), and set Nx(l) = Taking logs, log∏ n≤x (n2 +1) = log∏ n≤x n21+ 1 n2 > logbxc!2 = 2xlogx+O(x) by Stirling’s theorem, and so ∑ p≤Px pα Nx(pα)log p > 2xlogx+O(x). Now ∑ p≤Px pα Nx(pα)logp = ∑ x≤p≤Px Nx(p)log p+ ∑ p≤x Nx(p)logp+ ∑ p≤Px α>1 Nx(pα)log p = ΣA +ΣB +ΣC. As before we proceed to upper-bound ΣB and ΣC, thereby obtaining a lower bound for ΣA. Now Nx(l) = ∑ n2+1≡0 mod l n≤x 1 = ∑ v2+1≡0 mod l 0 Let ρ(l) be the number of solution to the congruence v2 +1≡0 mod l. Then since ∑ n≡v mod l n≤x 1 x l = O(1), we have Nx(l) = xρ(l) l +O(ρ(l)). 
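The estimate N_x(l) = xρ(l)/l + O(ρ(l)) can be checked by direct enumeration for small x and l. The sketch below is an illustration with hypothetical function names: `rho` counts the residues v mod l with v² + 1 ≡ 0 mod l, and `N` counts the integers 1 ≤ n ≤ x with n² + 1 ≡ 0 mod l. Each admissible residue class contributes either ⌊x/l⌋ or ⌊x/l⌋ + 1 integers, so the difference from xρ(l)/l never exceeds ρ(l) in absolute value.

```python
def rho(l):
    """Number of residues v mod l with v*v + 1 divisible by l."""
    return sum(1 for v in range(l) if (v * v + 1) % l == 0)

def N(x, l):
    """Number of integers 1 <= n <= x with n*n + 1 divisible by l."""
    return sum(1 for n in range(1, x + 1) if (n * n + 1) % l == 0)

if __name__ == "__main__":
    x = 1000
    for l in (2, 3, 5, 13, 25, 65):
        # Exact count, main term x*rho(l)/l, and the size of the allowed error.
        print(l, N(x, l), x * rho(l) / l, rho(l))
```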
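Now $\rho(2) = 1$, and since $\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} \bmod p$, the congruence $v^2 + 1 \equiv 0 \bmod p$ has no solutions for $p \equiv 3 \bmod 4$ and exactly two solutions for $p \equiv 1 \bmod 4$. We conclude
$$\rho(p) = \begin{cases} 2 & \text{if } p \equiv 1 \bmod 4, \\ 0 & \text{if } p \equiv 3 \bmod 4. \end{cases}$$
The needed bounds are given by:
$$\Sigma_B = x \sum_{p \le x} \frac{\rho(p) \log p}{p} + O\Big( \sum_{p \le x} \rho(p) \log p \Big) = 2x \sum_{p \le x,\ p \equiv 1 \bmod 4} \frac{\log p}{p} + O(x) + O\Big( \sum_{p \le x} \log p \Big) = x \log x + O(x),$$
using
$$\sum_{p \le x,\ p \equiv l \bmod k} \frac{\log p}{p} = \frac{1}{\varphi(k)} \log x + O(1),$$
and
$$\Sigma_C = O\Big( \sum_{p \le \sqrt{x^2+1}} \log p \sum_{2 \le \alpha} \Big( \frac{x}{p^{\alpha}} + 1 \Big) \Big) = O\Big( x \sum_{p} \frac{\log p}{p^2 \left(1 - \frac{1}{p}\right)} \Big) = O\Big( x \sum_{p} \frac{\log p}{p(p-1)} \Big) = O(x),$$
since the sum converges. Thus we get $\Sigma_A > x \log x + O(x)$. Our next task is to upper-bound the sum Tx(y) = ∑x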
Tx(y) = ∑ x
Nx(p)log p+ ∑ xX
Nx(p)log p = Tx(xX)+T0 x(y). To evaluate Tx(xX), we let Vx(v) = ∑v
Tx(xX) = ∑ 0≤α ∑ xeα
Nx(p)log p ≤ ∑ 0≤α log(xeα+1)Vx(xeα). Now for the sum T0 x(y), using the de nition of Nx(l), we have: T0 x(y) = ∑ xX
logp = ∑ m> x2 ylog8 x log p+ ∑ m≤ x2 ylog8 x log p = T00 x (y)+T000 x (y)(say). 74 3. SELBERG’S SIEVE Now the conditions of the summation T000 x (y) yield m≤ x2 ylog8 x , and so n
. Since m≤n, we have m≤ x log4 x . Using this we have T000 x (y) = 2logx ∑ lm=n2+1 m,n≤ x log4 x 1 = 2logx ∑ m≤ x log4 x N x log4 x (m). Now if m = ∏i phi i , then ρ(m) = ∏iρ(phi), and each of the individual terms is a constant. So ρ(m)≤2ν(m), and this itself is upper bounded by d(m), i.e. the number of divisors of m. Therefore: T000 x (y)≤ 2x log3 x ∑ m≤ x log4 x ρ(m) m +Ologx ∑ m≤ x log4 x ρ(m) = O x log3 x ∑ m≤ x log4 x ρ(m) m = O x log3 x ∑ m≤x d(m) m . Now consider ∑ 1≤n≤x 1 n ∑ 1≤m≤x 1 m= ∑ 1≤n≤x2n is x smooth 1 n∑ u,v≤x uv=n 1 ≥ ∑ 1≤n≤x d(n) n . This yields ∑1≤n≤x d(n) n = O(log2 x), and so T000 x (y) = O x logx. In T00 x (y), we have m > x2 ylog8 x and pm≤x2 +1, so m≤ x2+1 p . Furthermore p > xX, and so m≤ x X1+ 1 x2≤ ex X . Thus we have T00 x (y)≤ ∑ x2 ylog8 x log ex2 m . Let Wx(w) = ∑ w 1. Then T00 x (y)≤ ∑ 0≤α logxXeα+1Wxxe α X , where Y = eylog8 x xX . Finally, T0 x(y)≤ ∑ 0≤α log(xXeα+1)Wxxe α x +O x logx. 3.4. A THEOREM OF HOOLEY 75 We will format the sums involved for application of the Selberg sieve. Let λ be a squarefree number, and de ne (u;λ) = ∑ u<λk≤eu Nx(λk). We impose the conditions x 4 5 < u < x 4 3 and λ < minu 5 4 x , x u 3 4 . By a rather ingenious and elaborate argument Hooleysho wed that (u;λ) = 3xρ(λ) 2πλ 1 ∏p\λ1+ 1 p +Ox1 2+εu3 8 λ 1 2 (see [Ho76]§2.3 -§2.6). Since the argument is not central to our application of the sieve, we exclude the derivation of this bound here. Application of the Sieve: Let x≤v < x 12 11 , so that v satis es the conditions on u imposed by our bounds on (u;λ). Let d denote a squarefree number, and let λd be the Selberg coef cients. Then Vx(v)≤ ∑ v Nx(l)∑ d\l λ2 d = ∑ d1,d2≤z λd1λd2 ∑ v
$\{n(n+2) \mid \mu(n)^2 = \mu(n+2)^2 = 1,\ n \le x\}$.
THEOREM 1.6.1. $\kappa_2(x) = \prod_p \left(1 - \frac{2}{p^2}\right) x + O\left(x^{2/3} \ln^{4/3} x\right)$.
Proof: Let $s(n) = \sum_{d^2 \mid n} \mu(d)$. Using this we have $\kappa_2(x) = \sum_{n \le x} s(n)\,s(n+2) = \sum_{n \le x} \sum_{a^2 \mid n} \mu(a) \sum_{b^2 \mid (n+2)} \mu(b)$. If $a^2 \mid n$ and $b^2 \mid (n+2)$, then writing $n = k_1 a^2$ and $n + 2 = k_2 b^2$ we have $k_1' a^2 + k_2 b^2 = 2$ (with $k_1' = -k_1$). This says that $\gcd(a^2, b^2)$ divides 2, so $\gcd(a, b)$ must be 1, i.e. $a \perp b$. Now interchanging the sum we get $\kappa_2(x) = \sum_{k_1 a^2 - k_2 b^2 = -2,\ k_2 b^2 \le x,\ a \perp b} \mu(a)\mu(b)$. The rest of the proof is now to bound the above sum, and to this end we split up the sum into two parts:
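$\{n \le x \mid n \equiv \Lambda \pmod{D},\ (\forall i : 1 \le i \le r) : n \not\equiv a_i \pmod{p_i},\ n \not\equiv b_i \pmod{p_i}\}$.
If $p_1 < p_2 < \cdots < p_r$ and $p_i > 2$, then $P(D, x; p_1, a_1, b_1; \cdots; p_r, a_r, b_r) > \frac{C x}{D \ln^2 p_r} - C' p_r^{7.938}$, where $C$ and $C'$ are positive constants.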
$\{n \mid x - y < n \le x,\ n \equiv l \bmod k,\ n \perp P_z\}$
≤ y ∏p
{n : n≤x,gcd(n, ∏ p
≤ x ∏p
$\{n \le x \mid n^2 \equiv -1 \bmod l\}$.
We begin by finding a lower bound for $\sum_{x \le p \le P_x} N_x(p) \log p$, as in the proof of Theorem (3.3.1). We have ∏ n≤x (n2 +1) = ∏ p≤Px pα