Introduction to Field Theory, Iain T. Adamson, 1964, Oliver & Boyd, pp. 26–37.
§ 5. Polynomials. In elementary books on algebra, polynomials are usually defined to be “expressions of the form
$$f(x) \equiv a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$$
Let R be a commutative ring with identity element e. We denote by P(R) the set of infinite sequences (a0, a1, …, an, …) of elements of R, each of which has the property that only finitely many of the members ai of the sequence are non-zero; thus for each sequence a = (a0, a1, a2, …) in P(R) there is an integer Na such that ai = 0 for all integers i > Na. It is important to be clear that two sequences are equal if and only if corresponding members are equal, i.e. if a = (a0, a1, a2, …) and b = (b0, b1, b2, …) then a = b if and only if ai = bi (i = 0, 1, 2, …).
We introduce an operation of addition in P(R) by setting
$$(a_0, a_1, a_2, \dots) + (b_0, b_1, b_2, \dots) = (a_0 + b_0,\ a_1 + b_1,\ a_2 + b_2,\ \dots).$$
Next we introduce an operation of multiplication in P(R) by setting
$$(a_0, a_1, a_2, \dots)(b_0, b_1, b_2, \dots) = (c_0, c_1, c_2, \dots),$$
where
$$c_n = \sum_{i=0}^{n} a_i b_{n-i} \qquad (n = 0, 1, 2, \dots).$$
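The two operations can be made concrete with a short Python sketch. It assumes the coefficients are ordinary Python integers and represents a sequence in P(R) by the finite list of its members up to the last non-zero one; the names `trim`, `poly_add` and `poly_mul` are illustrative choices, not notation from the text.

```python
def trim(a):
    """Drop trailing zeros so that equal sequences have equal list forms."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_add(a, b):
    """Member-by-member sum of two finite-support sequences."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim(x + y for x, y in zip(a, b))

def poly_mul(a, b):
    """Product by the rule c_n = sum over i of a_i * b_(n-i)."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return trim(c)

f = [1, 2]       # the sequence (1, 2, 0, 0, ...)
g = [3, 0, 1]    # the sequence (3, 0, 1, 0, ...)
print(poly_add(f, g))  # [4, 2, 1]
print(poly_mul(f, g))  # [3, 6, 1, 2]
```

Trimming trailing zeros is a design choice of this sketch: it makes list equality agree with the rule that two sequences are equal exactly when corresponding members are equal.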
Consider now the mapping κ of R into P(R) defined by setting κ(a0) = (a0, 0, 0, …) for all elements a0 of R. It is easy to see that κ is a monomorphism; we call it the canonical monomorphism of R into P(R). Then R and its image κ(R) under κ are isomorphic; they differ, of course, in the nature of their elements, but have exactly the same structure. We frequently find it convenient to blur the distinction between R and κ(R) and to use the same symbol a0 for both an element of R and for its image under κ in P(R); when we do this, we say that we are identifying R with its image under κ and regarding R as a subring of the ring P(R). It will be found in practice that very little confusion is likely to arise from this identification procedure; but any confusion which does arise can be resolved by a return to the strictly logical notation.
We now introduce a name for the special sequence (0, e, 0, 0, …) in P(R): we call it X. By induction we can prove at once that, for every positive integer n, $X^n$ is the sequence $(c_0, c_1, c_2, \dots)$ for which $c_n = e$ and $c_i = 0$ whenever $i \neq n$. Then if $f = (a_0, a_1, \dots, a_N, 0, 0, \dots)$ is any sequence in P(R), with $a_n = 0$ for all integers n > N, we have
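$$\begin{aligned}
f &= (a_0, 0, 0, \dots) + (0, a_1, 0, \dots) + \dots + (0, \dots, 0, a_N, 0, \dots) \\
  &= \kappa(a_0) + \kappa(a_1)X + \dots + \kappa(a_N)X^N. \qquad (5.1)
\end{aligned}$$
Identifying R with its image κ(R) as described above, we may write this more simply as
$$f = a_0 + a_1 X + \dots + a_N X^N.$$
Accordingly we call the elements of P(R) polynomials with coefficients in R.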
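As a small worked check of the statement about powers of X, take n = 2. Writing $X = (x_0, x_1, x_2, \dots)$ with $x_1 = e$ and $x_i = 0$ otherwise (the symbols $x_i$ are introduced here only for this computation), the multiplication rule gives $X \cdot X = (c_0, c_1, c_2, \dots)$ with

$$\begin{aligned}
c_0 &= x_0 x_0 = 0, \\
c_1 &= x_0 x_1 + x_1 x_0 = 0, \\
c_2 &= x_0 x_2 + x_1 x_1 + x_2 x_0 = e, \\
c_n &= \sum_{i=0}^{n} x_i x_{n-i} = 0 \qquad (n > 2),
\end{aligned}$$

the last because a term $x_i x_{n-i}$ can be non-zero only when $i = 1$ and $n - i = 1$. Hence $X^2 = (0, 0, e, 0, \dots)$, and the inductive step for general n follows the same pattern.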
Let f be a non-zero polynomial with coefficients in R: say f = (a0, a1, a2, …). We define the degree of f to be the greatest integer n such that an is non-zero; we denote the degree of f by ∂f. The polynomials of degree zero are precisely the non-zero elements of the subring κ(R); we call them constant polynomials or simply constants. Polynomials of degree 1 are also called linear polynomials. It is convenient to define the degree of the zero polynomial z = κ(0) to be −∞, with the usual understanding that for every integer n ≧ 0 we have n > −∞ and −∞ + n = −∞. We deduce immediately from the definitions of addition and multiplication that if f and g are polynomials with coefficients in R, then
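$$\partial(f + g) \leqq \max(\partial f, \partial g),$$
$$\partial(fg) \leqq \partial f + \partial g;$$
if R has no divisors of zero (in particular, if R is a field), the second relation is an equality, $\partial(fg) = \partial f + \partial g$.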
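The degree conventions can be tried out in a brief Python sketch. It assumes coefficients are integers reduced modulo m, so that m = 5 gives a field and m = 4 a ring with divisors of zero; `degree` and `mul_mod` are hypothetical helper names. The example illustrates that the degree of a product equals the sum of the degrees over a field, but can fall below it when the leading coefficients multiply to zero.

```python
import math

def degree(a):
    """Index of the last non-zero member; -infinity for the zero polynomial."""
    for i in range(len(a) - 1, -1, -1):
        if a[i] != 0:
            return i
    return -math.inf

def mul_mod(a, b, m):
    """Product of coefficient lists with coefficients reduced modulo m."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % m
    return c

f = [1, 2]   # 1 + 2X
g = [1, 2]   # 1 + 2X

print(degree(mul_mod(f, g, 5)))  # 2: over Z_5 the degrees add
print(degree(mul_mod(f, g, 4)))  # 0: over Z_4, (1 + 2X)^2 = 1
print(degree([]) + 3)            # -inf, matching the convention -∞ + n = -∞
```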
If $f = (a_0, a_1, \dots, a_N, \dots)$ is a non-zero polynomial with degree N, we call $a_N$ the leading coefficient of f; this name perhaps appears more reasonable when we express f in the form $f = a_N X^N + \dots + a_1 X + a_0$. If the leading coefficient of f is the identity e of R, we say that f is a monic polynomial, and we drop the leading coefficient, writing simply $f = X^N + \dots + a_1 X + a_0$.
We now concentrate our attention on polynomials with coefficients in a field. So let F be a field; a polynomial f with coefficients in F is said to be divisible by another such polynomial d, and d is said to be a factor of f, if there exists a polynomial q such that f = qd. In this situation we say also that f is a multiple of d. The polynomial f is said to be irreducible if it has no factor d such that 0 < ∂d < ∂f; thus the only factors of an irreducible polynomial f are the constant polynomials and the products of f by the constant polynomials.
We now state without proof two theorems to which we shall have constant recourse in the next chapter. Proofs of these results may be found in Turnbull, Theory of Equations, § 17; there the coefficients of the polynomials considered are described as “constants”, but if we interpret this to mean “elements of the field F” we can extract the following statements.
Theorem 5.1. Let f be any polynomial and let d be a non-zero polynomial with coefficients in F. Then there exist unique polynomials q and r with coefficients in F such that f = qd+r and ∂r<∂d.
Theorem 5.2. Let f and g be any two non-zero polynomials with coefficients in F. Then there exists a unique monic polynomial h with coefficients in F such that (1) h is a factor of both f and g; (2) if k is any polynomial which is a factor of both f and g, then k is a factor of h. Further, there exist polynomials a and b with coefficients in F such that h = af+bg.
The polynomials q and r in Theorem 5.1 are called respectively the quotient and remainder when f is divided by d. The unique polynomial h described in Theorem 5.2 is called the highest common factor or greatest common divisor of f and g. If the highest common factor of f and g is the constant polynomial e (to be quite precise we should call it κ(e)), we say that f and g are relatively prime.
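The quotient, remainder and highest common factor can be computed mechanically. The following is a minimal sketch, assuming F is the field of rational numbers (modelled with Python's `fractions.Fraction`) and that polynomials are coefficient lists with the constant term first; the function names are illustrative. It uses ordinary long division and the Euclidean algorithm, which is one standard route to Theorems 5.1 and 5.2 and is not claimed to be the method of the cited proofs.

```python
from fractions import Fraction

def trim(a):
    """Drop trailing zeros from a coefficient list."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def divmod_poly(f, d):
    """Return (q, r) with f = q*d + r and degree r < degree d; d must be non-zero."""
    f, d = trim(map(Fraction, f)), trim(map(Fraction, d))
    if not d:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(d) + 1, 0)
    r = f
    while len(r) >= len(d):
        shift = len(r) - len(d)
        coeff = r[-1] / d[-1]          # cancels the leading term of r
        q[shift] = coeff
        for i, di in enumerate(d):
            r[shift + i] -= coeff * di
        r = trim(r)
    return trim(q), r

def monic_gcd(f, g):
    """Monic highest common factor of two non-zero polynomials."""
    f, g = trim(map(Fraction, f)), trim(map(Fraction, g))
    while g:
        f, g = g, divmod_poly(f, g)[1]
    return [c / f[-1] for c in f]      # divide by the leading coefficient

q, r = divmod_poly([-1, 0, 0, 1], [-1, 1])           # (X^3 - 1) divided by (X - 1)
print(q == [1, 1, 1], r == [])                        # True True
print(monic_gcd([-1, 0, 1], [1, -2, 1]) == [-1, 1])   # True: the factor X - 1
```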
We now introduce a very important mapping of the polynomial ring P(R) into itself; this is the mapping D defined by setting
$$\begin{aligned}
Df &= D(a_0 + a_1 X + a_2 X^2 + \dots + a_n X^n) \\
   &= a_1 + 2a_2 X + 3a_3 X^2 + \dots + n a_n X^{n-1}.
\end{aligned}$$
We call Df the derivative of f.
It follows from the definition that, for all polynomials f and g,
$$D(f + g) = D(f) + D(g) \quad\text{and}\quad D(fg) = (Df)g + f(Dg).$$
It is, of course, quite impracticable to define derivatives of polynomials with coefficients in a general field F by the familiar type of limiting process used in the calculus; for one reason, we have not defined polynomials as functions; for another, in a general field we do not have any notion of “limit”. But the definition we have given makes sense in any field F, since the coefficients 2a2, 3a3, … are integral multiples of elements of F and hence are well-determined elements of F.
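Because D is defined purely in terms of coefficients, it can be computed without any limiting process. The sketch below assumes integer coefficients and uses illustrative helper names; it checks the sum rule and the product rule for D on one pair of polynomials, which is a spot check rather than a proof.

```python
def poly_add(a, b):
    """Add coefficient lists, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def poly_mul(a, b):
    """Multiply coefficient lists by the rule c_n = sum of a_i * b_(n-i)."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def D(a):
    """Formal derivative: (a0, a1, a2, ...) -> (a1, 2*a2, 3*a3, ...)."""
    return [k * a[k] for k in range(1, len(a))]

f = [1, 2, 3]       # 1 + 2X + 3X^2
g = [0, 1, 0, 4]    # X + 4X^3

print(D(f))                                           # [2, 6]
print(D(poly_add(f, g)) == poly_add(D(f), D(g)))      # True
print(D(poly_mul(f, g)) ==
      poly_add(poly_mul(D(f), g), poly_mul(f, D(g)))) # True
```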
The next result is also an immediate consequence of the definition.
Theorem 5.3. Let $f = a_0 + a_1 X + \dots + a_n X^n$ be a polynomial in P(F). If F has characteristic zero then Df = z (the zero polynomial) if and only if f is either zero or a constant polynomial, i.e., if and only if $a_1 = a_2 = \dots = a_n = 0$. If F has non-zero characteristic p then Df = z if and only if $a_k = 0$ for all integers k not divisible by p.
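The characteristic p case of Theorem 5.3 can be illustrated numerically. This sketch assumes F is the field of integers modulo 3 and uses a hypothetical helper `D_mod`; the polynomial chosen has non-zero coefficients only in positions divisible by 3, so its derivative should vanish modulo 3 although it does not vanish over the integers.

```python
def D_mod(a, p):
    """Formal derivative with coefficients reduced modulo the prime p."""
    return [(k * a[k]) % p for k in range(1, len(a))]

f = [1, 0, 0, 2, 0, 0, 1]    # 1 + 2X^3 + X^6
g = [0, 1, 0, 1]             # X + X^3

print(D_mod(f, 3))                             # [0, 0, 0, 0, 0, 0]
print(all(c == 0 for c in D_mod(f, 3)))        # True: Df is the zero polynomial
print(D_mod(g, 3))                             # [1, 0, 0]: Dg is non-zero
print([k * f[k] for k in range(1, len(f))])    # [0, 0, 6, 0, 0, 6] over the integers
```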
Let R be any commutative ring containing the field F, and let α be any element of R. An element β of R which can be expressed (not necessarily uniquely) in the form $\beta = a_0 + a_1 \alpha + a_2 \alpha^2 + \dots + a_n \alpha^n$, where $a_0, a_1, \dots, a_n$ are elements of F and n is a non-negative integer, is called a polynomial in α with coefficients in F. The set of all such elements is a subring of R, which we denote by F[α]. If κ denotes, as before, the canonical monomorphism of F into P(F), then equation (5.1) shows that P(F) = κ(F)[X] and so, identifying F and κ(F), we have P(F) = F[X] which is a standard notation for the polynomial ring with coefficients in F.
Returning to the general case, we define a mapping σα of the polynomial ring P(F) into R by setting
$$\begin{aligned}
\sigma_\alpha(f) &= \sigma_\alpha(a_0 + a_1 X + \dots + a_n X^n) \\
                 &= a_0 + a_1 \alpha + \dots + a_n \alpha^n
\end{aligned}$$
Theorem 5.4. If R is a commutative ring containing the field F and α is any element of R, then the mapping σα is an epimorphism of P(F) onto F[α].
Proof. Let $f = a_0 + a_1 X + a_2 X^2 + \dots$ and $g = b_0 + b_1 X + b_2 X^2 + \dots$ be any two polynomials in P(F). Then
$$f + g = (a_0 + b_0) + (a_1 + b_1)X + (a_2 + b_2)X^2 + \dots$$
$$fg = (a_0 b_0) + (a_0 b_1 + a_1 b_0)X + (a_0 b_2 + a_1 b_1 + a_2 b_0)X^2 + \dots.$$
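$$\begin{aligned}
\sigma_\alpha(f) + \sigma_\alpha(g) &= (a_0 + a_1 \alpha + a_2 \alpha^2 + \dots) + (b_0 + b_1 \alpha + b_2 \alpha^2 + \dots) \\
  &= (a_0 + b_0) + (a_1 + b_1)\alpha + (a_2 + b_2)\alpha^2 + \dots \\
  &= \sigma_\alpha(f + g).
\end{aligned}$$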
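$$\begin{aligned}
\sigma_\alpha(f)\sigma_\alpha(g) &= (a_0 + a_1 \alpha + a_2 \alpha^2 + \dots)(b_0 + b_1 \alpha + b_2 \alpha^2 + \dots) \\
  &= a_0 b_0 + (a_0 b_1 \alpha + a_1 \alpha b_0) + (a_0 b_2 \alpha^2 + a_1 \alpha b_1 \alpha + a_2 \alpha^2 b_0) + \dots \qquad (5.2) \\
  &= a_0 b_0 + (a_0 b_1 + a_1 b_0)\alpha + (a_0 b_2 + a_1 b_1 + a_2 b_0)\alpha^2 + \dots \qquad (5.3)
\end{aligned}$$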
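The homomorphism property of the substitution mapping can be spot-checked in Python. The sketch assumes F is the rational field and takes R = F with α = 1/2; `sigma` evaluates by Horner's rule, which is a computational convenience and not the form used in the proof, and all function names are illustrative.

```python
from fractions import Fraction

def poly_add(a, b):
    """Add coefficient lists, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def poly_mul(a, b):
    """Multiply coefficient lists by the rule c_n = sum of a_i * b_(n-i)."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def sigma(f, alpha):
    """Substitute alpha for X: a0 + a1*alpha + ... + an*alpha^n (Horner's rule)."""
    acc = 0
    for c in reversed(f):
        acc = acc * alpha + c
    return acc

f = [1, 2, 3]        # 1 + 2X + 3X^2
g = [4, 0, 5, 6]     # 4 + 5X^2 + 6X^3
alpha = Fraction(1, 2)

print(sigma(f, alpha))                                                   # 11/4
print(sigma(poly_add(f, g), alpha) == sigma(f, alpha) + sigma(g, alpha))  # True
print(sigma(poly_mul(f, g), alpha) == sigma(f, alpha) * sigma(g, alpha))  # True
```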
If f is a polynomial in P(F) and σα(f) = 0, then we say that α is a root of f in R. As in elementary algebra, there is an intimate connexion between the roots of f in the field F itself and the linear factors of f in P(F). If α is any element of F we denote the linear polynomial X − α by lα.
Theorem 5.5. If α is an element of F and f is a polynomial in P(F), then α is a root of f in F if and only if lα is a factor of f in P(F).
Proof. According to Theorem 5.1 there exist polynomials q and r in P(F) such that f = qlα + r and ∂r < ∂lα = 1. Thus r has the form κ(a) where a is an element of F, possibly zero. Then
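$$\sigma_\alpha(f) = \sigma_\alpha(q l_\alpha + r) = \sigma_\alpha(q)\sigma_\alpha(l_\alpha) + \sigma_\alpha(r) = \sigma_\alpha(r) = a,$$
since $\sigma_\alpha(l_\alpha) = \alpha - \alpha = 0$. Hence α is a root of f if and only if a = 0, that is, if and only if r is the zero polynomial; by the uniqueness assertion of Theorem 5.1 this holds if and only if lα is a factor of f.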
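The relation between division by X − α and evaluation at α can be seen in a short Python sketch. It assumes integer coefficients and uses synthetic division, with the hypothetical name `divide_by_linear`; the remainder it returns is the constant a of the proof, which equals the value of f at α.

```python
def divide_by_linear(f, alpha):
    """Divide f by X - alpha; return (q, a) with f = q*(X - alpha) + a."""
    if not f:
        return [], 0
    partial = []
    acc = 0
    for c in reversed(f):          # leading coefficient first
        acc = acc * alpha + c
        partial.append(acc)
    a = partial.pop()              # the final value is the remainder f(alpha)
    partial.reverse()              # remaining values are the quotient coefficients
    return partial, a

f = [-6, 11, -6, 1]                # (X - 1)(X - 2)(X - 3)

print(divide_by_linear(f, 2))      # ([3, -4, 1], 0): 2 is a root, X - 2 is a factor
print(divide_by_linear(f, 4))      # ([3, -2, 1], 6): 4 is not a root, remainder 6
```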
We now contend that if two fields are isomorphic then the rings of polynomials with coefficients in those fields are also isomorphic. This result is an easy consequence of the following theorem.
Theorem 5.6. Let τ be a monomorphism of a field F1 into a field F2; let κ1, κ2 be the canonical monomorphisms of F1 into P(F1), F2 into P(F2) respectively. Then there exists a monomorphism τP of P(F1) into P(F2) such that for every element a of F1 we have τP(κ1(a)) = κ2(τ(a)).
Proof. The mapping τP of P(F1) into P(F2) defined by setting
$$\begin{aligned}
\tau_P(f) &= \tau_P(a_0 + a_1 X + \dots + a_n X^n) \\
          &= \tau(a_0) + \tau(a_1)X + \dots + \tau(a_n)X^n
\end{aligned}$$
The condition that τP(κ1(a)) = κ2(τ(a)) for every element a in F1 is sometimes described by saying that the diagram in fig. 1 is commutative; for the condition asserts that if we start with any element a of F1 and “transport it” to P(F2) by either of the routes indicated in the diagram (by applying first κ1 and then τP, or by applying first τ and then κ2) we obtain the same result.
If we identify F1 and F2 with their images under the canonical monomorphisms and regard them as subfields of P(F1) and P(F2) respectively, then the condition on τP is simply that τP(a) = τ(a) for every element a of F1, i.e. that τP shall act like τ on the elements of F1. For this reason we call τP the canonical extension of τ to P(F1).
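The canonical extension acts on a polynomial coefficient by coefficient, and this is easy to sketch in Python. As an assumed example, take F1 = F2 to be the complex numbers and τ to be complex conjugation (an automorphism, hence a monomorphism); coefficients are chosen with small integer real and imaginary parts so that floating-point arithmetic is exact. The names `tau`, `tau_P` and `kappa` mirror the notation of the text but are otherwise illustrative.

```python
def poly_mul(a, b):
    """Multiply coefficient lists by the rule c_n = sum of a_i * b_(n-i)."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def tau(a):
    """The assumed field monomorphism: complex conjugation."""
    return complex(a).conjugate()

def tau_P(f):
    """Canonical extension: apply tau to every coefficient."""
    return [tau(c) for c in f]

def kappa(a):
    """Canonical monomorphism: the constant polynomial with value a."""
    return [a]

f = [1 + 2j, 3j]          # (1 + 2i) + 3i X
g = [2 - 1j, 1, 1j]       # (2 - i) + X + i X^2
a = 1 + 2j

print(tau_P(poly_mul(f, g)) == poly_mul(tau_P(f), tau_P(g)))  # True: products are preserved
print(tau_P(kappa(a)) == kappa(tau(a)))                       # True: the diagram commutes
```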
§ 6. Higher polynomial rings; rational functions. Let R be any commutative ring with identity e. We define inductively a family of “higher polynomial rings” with coefficients in R, as follows: $P_1(R)$ is simply the polynomial ring P(R) as we defined it in the last section; then for n > 1, we set $P_n(R) = P(P_{n-1}(R))$. We call $P_n(R)$ the nth order polynomial ring with coefficients in R.
In order to achieve some insight into the structure of these rings we shall examine P2(R) = P(P(R)). Let κ1 and κ2 be the canonical monomorphisms of R into P(R) and of P(R) into P2(R) respectively; the mapping κ of R into P2(R) defined by setting κ(a) = κ2(κ1(a)) for all elements a of R is clearly also a monomorphism. If we denote by X the element (0, e, 0, …) of P(R), then, as we saw in § 5, every element b of P(R) can be expressed in the form
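$$b = \kappa_1(a_0) + \kappa_1(a_1)X + \dots + \kappa_1(a_m)X^m$$

$$p = \kappa_2(b_0) + \kappa_2(b_1)X_2 + \dots + \kappa_2(b_n)X_2^n$$

$$b_j = \kappa_1(a_{0j}) + \kappa_1(a_{1j})X + \dots + \kappa_1(a_{m_j j})X^{m_j} \qquad (j = 0, 1, \dots, n)$$

$$\begin{aligned}
p &= \kappa(a_{00}) + \kappa(a_{10})X_1 + \dots + \kappa(a_{m_0 0})X_1^{m_0} \\
  &\quad + \bigl(\kappa(a_{01}) + \kappa(a_{11})X_1 + \dots + \kappa(a_{m_1 1})X_1^{m_1}\bigr)X_2 \\
  &\quad + \dots \\
  &\quad + \bigl(\kappa(a_{0n}) + \kappa(a_{1n})X_1 + \dots + \kappa(a_{m_n n})X_1^{m_n}\bigr)X_2^n \\
  &= \sum_{j=0}^{n} \sum_{i=0}^{m_j} \kappa(a_{ij}) X_1^i X_2^j.
\end{aligned}$$

$$p = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} a_{ij} X_1^i X_2^j$$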
A similar discussion shows that by suitable choice of elements X1, … Xn and the usual identification procedure, every element of Pn(R) can be expressed in the form
$$\sum_{i_1=0}^{\infty} \dots \sum_{i_n=0}^{\infty} a_{i_1 \dots i_n} X_1^{i_1} \dots X_n^{i_n}$$
Let now S be any commutative ring containing R and let α = (α1, …, αn) be an ordered n-tuple of elements of S. We may then define a mapping σα of Pn(R) into S by setting
$$\sigma_\alpha\Bigl(\sum a_{i_1 \dots i_n} X_1^{i_1} \dots X_n^{i_n}\Bigr) = \sum a_{i_1 \dots i_n} \alpha_1^{i_1} \dots \alpha_n^{i_n}$$
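Substitution of an n-tuple into an element of the nth order polynomial ring can be sketched as follows. The representation is an assumption of this example: a polynomial is stored as a dictionary sending each exponent tuple (i1, …, in) to its non-zero coefficient, and `evaluate` is a hypothetical name. Integer coefficients are used for simplicity.

```python
from math import prod

def evaluate(p, alpha):
    """Replace X_1, ..., X_n by alpha_1, ..., alpha_n and sum the resulting terms."""
    return sum(c * prod(a ** i for a, i in zip(alpha, exps))
               for exps, c in p.items())

# 1 + 2*X1 + 3*X1*X2^2 as {exponents: coefficient}
p = {(0, 0): 1, (1, 0): 2, (1, 2): 3}

print(evaluate(p, (2, -1)))   # 1 + 2*2 + 3*2*(-1)^2 = 11
print(evaluate(p, (0, 5)))    # 1
```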
5. Let F be a field. For each polynomial f in P(F) we may define a mapping f* of F into itself by setting f*(α) = σα(f) for every element α of F. Show that f* is not in general a homomorphism.
6. Let $F = \mathbb{Z}_p$ and let f be the polynomial $X^p - X$ in P(F). Show that while f is not the zero polynomial the mapping f* of F into itself determined by f as in Example 5 is the zero mapping, i.e., f*(α) = 0 for every element α of F.
8. Let F be a field. Show that the only elements of the polynomial ring P(F) which have multiplicative inverses are the constant polynomials.
9. Let F be a field, κ the canonical monomorphism of F into the polynomial ring P(F). If ϕ is an automorphism of P(F) show that there is an automorphism ϕ1 of F such that ϕ(κ(a)) = κ(ϕ1(a)) for all elements a of F and that there are elements c, d of F (c ≠ 0) such that ϕ(X) = cX + d.
10. If f is any polynomial of degree n with coefficients in a field F of characteristic zero with identity element e, prove that (in the notation of example 5)
$$f = \sum_{r=0}^{n} (r!\,e)^{-1} (D^r f)^*(0)\, X^r,$$
† Turnbull, Theory of Equations, Chapter VI.
Galois Theory, Joseph Rotman, 1990, Springer-Verlag, pp. 2–3.
3. If R is a ring, define a polynomial f(x) with coefficients in R (briefly, a polynomial over R) to be a sequence f = (r0, r1, …, rn, 0, 0, …) with ri ∈ R for all i and ri = 0 for all i > n. If g(x) = (s0, s1, …, sm, 0, 0, …) is another polynomial over R, it follows that f(x) = g(x) if and only if ri = si for all i. Denote the set of all such polynomials by R[x], and define addition and multiplication on R[x] as follows: (r0, r1, …, ri, …) + (s0, s1, …, si, …) = (r0 + s0, r1 + s1, …, ri + si, …) and (r0, r1, …, ri, …)(s0, s1, …, sj, …) = (t0, t1, …, tk, …), where t0 = r0s0, t1 = r0s1 + r1s0, and, in general, tk = ∑ risj, the summation being over all i, j with i + j = k. Let (1, 0, 0, …) be abbreviated to 1 (there are now two meanings for this symbol). It is routine but tedious to verify that R[x] is a ring, the polynomial ring over R.
What is the significance of the letter x in the notation f(x)? Let x denote the specific element of R[x]: x = (0, 1, 0, 0, …). It is easy to prove that $x^2 = (0, 0, 1, 0, 0, \dots)$ and, by induction, that $x^i$ is the sequence having 0 everywhere except for 1 in the ith spot. The reader may now prove (thereby recapturing the usual notation) that $f(x) = (r_0, r_1, \dots, r_n, 0, 0, \dots) = r_0 + r_1 x + \dots + r_n x^n = \sum r_i x^i$ ($r_0 = r_0 1$ if we identify $r_0$ with $(r_0, 0, 0, \dots)$ in R[x]). Notice that x is an honest element of a ring and not a variable; its role as a variable, however, is given in Exercise 18.
We remind the reader of the usual vocabulary associated with $f(x) = r_0 + r_1 x + \dots + r_n x^n$. The leading coefficient of f(x) is $r_n$, where n is the largest integer (if any) with $r_n \neq 0$; n is called the degree of f(x) and is denoted by ∂f; every polynomial except 0 = (0, 0, …) has a degree.
A monic polynomial is one whose leading coefficient is 1. The constant term of f(x) is r0; a constant (polynomial) is either the zero polynomial 0 or a polynomial of degree 0; linear, quadratic, cubic, quartic (or biquadratic), and quintic polynomials have degrees, respectively, 1, 2, 3, 4, and 5.
Recall from linear algebra that a linear homogeneous system over a field with r equations and n unknowns has a nontrivial solution if r < n; if r = n, one must examine a determinant. If $f(x) = (x - \alpha_1) \cdots (x - \alpha_n) = \sum r_i x^i$, then it is easy to see, by induction on n, that
18. If a ∈ R, define $e_a : R[x] \to R$ by $f(x) = \sum r_i x^i \mapsto \sum r_i a^i$ (denote this element of R by f(a)); prove that $e_a$ is a ring map (it is called evaluation at a). If f(a) = 0, then a is called a root of f(x).
(This exercise allows one to regard x as a variable ranging over R; that is, each polynomial f(x) ∈ R[x] determines a function R → R. But look at the next exercise.)
19. Give an example of distinct polynomials f(x), g(x) ∈ ℤp[x] with f(a) = g(a) for all a ∈ ℤp.
(Distinct polynomials (not all coefficients are the same) may determine the same function; this is one reason for our defining polynomials in such a formal way. Indeed, if F is any finite field (there are such other than ℤp), there are only finitely many functions F → F but there are infinitely many polynomials. We shall see after Theorem 11 that this exercise is false if ℤp is replaced by any infinite field.)
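The gap between polynomials and the functions they determine can be observed directly for a small prime. The sketch below assumes p = 5 and represents elements of ℤp by the integers 0, …, p − 1; `evaluate_mod` is a hypothetical helper. It evaluates the non-zero polynomial x^5 − x at every element of ℤ5, which is a numerical check for this one prime and not a proof.

```python
def evaluate_mod(f, a, p):
    """Value of the polynomial f at a, computed modulo p by Horner's rule."""
    acc = 0
    for c in reversed(f):
        acc = (acc * a + c) % p
    return acc

p = 5
f = [0, -1, 0, 0, 0, 1]      # x^5 - x, not the zero polynomial
zero = []                    # the zero polynomial

print([evaluate_mod(f, a, p) for a in range(p)])     # [0, 0, 0, 0, 0]
print([evaluate_mod(zero, a, p) for a in range(p)])  # [0, 0, 0, 0, 0]
```

Both polynomials determine the zero function on ℤ5 although their coefficient sequences differ.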
Theorem 11. If F is a field and f(x) ∈ F[x] has degree n, then F contains at most n roots of f(x).
This last theorem is false for arbitrary rings R; for example, $x^2 - 1$ has four roots in $\mathbb{Z}_8$.
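The count of roots in this example is quickly confirmed by exhaustive search. The sketch represents elements of ℤ8 by the integers 0 to 7, with `roots_of_x2_minus_1` as an illustrative name, and contrasts the result with the field ℤ7, where the bound of Theorem 11 holds.

```python
def roots_of_x2_minus_1(m):
    """All a in {0, ..., m-1} with a^2 - 1 congruent to 0 modulo m."""
    return [a for a in range(m) if (a * a - 1) % m == 0]

print(roots_of_x2_minus_1(8))   # [1, 3, 5, 7]: four roots for a polynomial of degree 2
print(roots_of_x2_minus_1(7))   # [1, 6]: at most two roots over the field Z_7
```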
Recall that every polynomial f(x) ∈ F[x] determines a function F → F, namely, a ↦ f(a). In Exercise 19, however, we saw that distinct polynomials in ℤp[x] may determine the same function. This pathology vanishes when the coefficient field is infinite. Let F be an infinite field and let f(x) ≠ g(x) in F[x] satisfy f(a) = g(a) for all a ∈ F. Then h(x) = f(x)−g(x) is not the zero polynomial; hence it has a degree, say, n. But each of the infinitely many elements a ∈ F is a root of h(x), and this contradicts Theorem 11.