2. Using polyrith
This section and the next can be read independently. They present a
series of extended examples, showing how to transform problems into nails which the polyrith
hammer can hit.
The focus in this section is on making sure that polyrith
has all the hypotheses it needs to
solve a problem.
2.1. Chebyshev polynomials
The Chebyshev polynomials (of the first kind) are a sequence of polynomials \(T_n(x)\) defined recursively:
So \(T_2(x)=2x^2-1\), \(T_3(x)=4x^3-3x\), etc.
We express the sequence of Chebyshev polynomials in Lean as a function from the natural numbers
ℕ
to the integer-coefficient polynomials ℤ[X]
, defined recursively.
/-- The Chebyshev polynomials, defined recursively. -/
noncomputable def T : ℕ → ℤ[X]
| 0 => 1
| 1 => X
| n + 2 => 2 * X * T (n + 1) - T n
-- now record the three pieces of the recursive definition for easy access
theorem T_zero : T 0 = 1 := rfl
theorem T_one : T 1 = X := rfl
theorem T_add_two (n : ℕ) : T (n + 2) = 2 * X * T (n + 1) - T n := by rw [T]
In this section we will prove the multiplication formula for Chebyshev polynomials:
Theorem
For all natural numbers \(m\) and \(k\),
This is proved by induction on \(m\): if the formula is true for \(m\) and \(m+1\) (for all \(k\)), then it is true for \(m+2\) (for all \(k\)). I leave the two base cases as exercises. Here is the paper proof of the inductive step:
Notice that two facts are being invoked (each repeatedly) during this proof: the recurrence
relation, in Lean T_add_two
, and the inductive hypothesis of the theorem, which within the
proof below will carry the same name, mul_T
, as the theorem itself. To make polyrith
do
this proof you should determine precisely which input values these facts are being used with in the
above proof, and state them in Lean with those input values. See the broken proof below for the
basic idea.
There is one subtlety. It will happen that the same natural number appears as the index of the
sequence T
, but in more than one way. For example, in the setup below, both k * 37
and
37 * k
appear as indices. This prevents polyrith
from parsing T (k * 37)
and
T (37 * k)
as the same token. So, before running polyrith
, normalize all indices by
running ring_nf
(“ring normal form”) simultaneously on all the hypotheses you care about and on
the goal.
/-- The product of two Chebyshev polynomials is the sum of two other Chebyshev polynomials. -/
theorem mul_T : ∀ m : ℕ, ∀ k, 2 * T m * T (m + k) = T (2 * m + k) + T k
| 0 => by
intro k
ring_nf
have h := T_zero
polyrith
| 1 => by
intro k
have h₁ := T_add_two k
have h₂ := T_one
ring_nf at h₁ h₂ ⊢
polyrith
| m + 2 => by
intro k
have H₁ := mul_T (m - 5) (k * 37) -- not actually a relevant pair of input values!
have h₁ := T_add_two 7 -- not actually a relevant input value!
have h₂ := T_add_two (37 * k) -- not actually a relevant input value!
-- ... add more/different instantiations of `mul_T` and `T_add_two` here
ring_nf at H₁ h₁ h₂ ⊢
polyrith -- this will work once the right facts have been provided above
2.2. Isometries of the complex plane
Let us identify the Euclidean plane with the complex numbers \(\mathbb{C}\). In this section we prove the following lemma:
Lemma
If \(f:\mathbb{C}\to\mathbb{C}\) is a real-linear isometry of the plane fixing \(1\), then \(f(i)\) is either \(i\) or \(-i\).
This lemma appears in Mathlib in the proof of the classical theorem that isometries \(f:\mathbb{C}\to\mathbb{C}\) of the plane are of the two forms
for \(a,b\in\mathbb{C}\) with \(|a|=1\): compositions of a translation, a rotation, and (possibly) a reflection. (An isometry has to be affine; by postcomposition with a translation we can arrange that \(f(0)=0\), i.e. \(f\) is linear; by postcomposition with a rotation we can arrange that \(f(1)=1\), and then after the above lemma \(f\) is determined on the real basis \(\{1, i\}\) for \(\mathbb{C}\), hence everywhere by linearity.
Real-linear isometries from \(\mathbb{C}\) to \(\mathbb{C}\) are expressed in
Lean by the type ℂ →ₗᵢ[ℝ] ℂ
. The ₗᵢ
subscript stands
for “linear isometry”, and the [ℝ]
denotes the scalar field to consider for the linearity.
The lemma is proved on paper as follows. Since \(f\) is linear and norm-preserving and \(f(1)=1\), we have
Exercise
Deduce from these two facts that
This fixes \(f(i)\) as either \(i\) (if the left factor vanishes) or \(-i\) (if the right factor vanishes).
We present the skeleton of this argument in Lean below, with the exercise indicated above left as a
sorry. A tricky point is that linear_combination
/polyrith
does not know facts specific to the ring of complex numbers, such as that \(i^2=-1\) or that
\(\overline{i}=-i\). You can state any needed such facts explicitly and prove them using
norm_num
(of course, there are lemmas for this, but why bother memorizing them?). Then you can
rewrite by these facts, or just include them in the context for polyrith
to pick up.
example {f : ℂ →ₗᵢ[ℝ] ℂ} (h : f 1 = 1) : f I = I ∨ f I = -I := by
have key : (f I - I) * (conj (f I) - conj (-I)) = 0 := by
have conj_fact : ∀ z, f z * conj (f z) = z * conj z := by simp [mul_conj']
have H₀ := conj_fact I
have H₁ := conj_fact (I - 1)
simp [h] at *
-- state some fact(s) about `ℂ` here so that `polyrith` can use them in the computation
-- when you have done this, `polyrith` will work
polyrith
-- From `key`, we deduce that either `f I - I = 0` or `conj (f I) - conj (- I) = 0`.
rw [mul_eq_zero] at key
obtain hI | hI := key
· left
linear_combination hI
· right
apply (conj : ℂ →+* ℂ).injective
linear_combination hI
A similar strategy applies when working over other concrete rings with special properties.
For example, when working over \(\mathbb{Z}/p\),
you can add the fact \(p=0\) to the context and this will give polyrith
more to work with.
2.3. Double cover of SO(3, ℝ)
In this section we construct the double cover of \(SO(3,\mathbb{R})\) by the unit quaternions. We first state the formula which will describe this double cover as a map from the quaternions \(\mathbb{H}\) to the 3 × 3 matrices. It sends the quaternion \(x+iy+jz+wk\) to
/-- Explicit matrix formula for the double cover of SO(3, ℝ) by the unit quaternions. -/
@[simp] def Quaternion.toMatrix (a : ℍ) : Matrix (Fin 3) (Fin 3) ℝ :=
let ⟨x, y, z, w⟩ := a
!![x ^ 2 + y ^ 2 - z ^ 2 - w ^ 2, 2 * (y * z - w * x), 2 * (y * w + z * x);
2 * (y * z + w * x), x ^ 2 + z ^ 2 - y ^ 2 - w ^ 2, 2 * (z * w - y * x);
2 * (y * w - z * x), 2 * (z * w + y * x), x ^ 2 + w ^ 2 - y ^ 2 - z ^ 2]
This turns out to be a “homomorphism of monoids”, denoted in Lean by →*
; that is, it preserves
multiplication and the identity. Here is what the proof of that fact looks like. It would be
extremely tedious on paper, but it’s very fast in Lean, because it simply requires checking
coefficient-wise \(9=3 \times 3\) identities of real-number expressions (for the preservation
of the identity element) and 9 identities of polynomials (for the preservation of multiplication).
/-- The explicit matrix formula `toMatrix` defines a monoid homomorphism from the quaternions
into the algebra of 3x3 matrices. -/
@[simps] def Quaternion.toMatrixHom' : ℍ →* Matrix (Fin 3) (Fin 3) ℝ where
toFun := Quaternion.toMatrix
map_one' := by
ext i j
fin_cases i <;> fin_cases j <;> simp
map_mul' := by
rintro ⟨x, y, z, w⟩ ⟨r, s, t, u⟩
ext i j
fin_cases i <;> fin_cases j <;> simp <;> ring
Now we write down a definition of the unit quaternions. We give it the abstract-nonsense
definition as the mker
(for “monoid kernel”) of the “norm_square” monoid homomorphism
Quaternion.normSq
from the quaternions to the reals; this already existed in Mathlib. Then we
prove a quick lemma that this is equivalent to the pedestrian definition, the sums of the squares
of the four components summing to 1.
/-- The group (we only prove it to be a monoid) of unit quaternions. -/
def unitQuaternions : Submonoid ℍ :=
MonoidHom.mker ((Quaternion.normSq : ℍ →*₀ ℝ) : ℍ →* ℝ)
@[simp] theorem mem_unitQuaternions (a : ℍ) :
a ∈ unitQuaternions ↔ a.re ^ 2 + a.imI ^ 2 + a.imJ ^ 2 + a.imK ^ 2 = 1 := by
rw [←Quaternion.normSq_def']
exact Iff.rfl
The 3 × 3 orthogonal group is available in Mathlib already, as a submonoid of the 3 x 3 matrices. (Exercise for the reader: the monoid structure is in Mathlib but no one seems yet to have proved it’s a group, i.e. closed under inversion.)
#check Matrix.orthogonalGroup (Fin 3) ℝ
-- Matrix.orthogonalGroup (Fin 3) ℝ : Submonoid (Matrix (Fin 3) (Fin 3) ℝ)
The first serious calculation is to prove that the matrix formula stated above is well-defined as a map from the unit quaternions to \(SO(3,\mathbb{R})\). That is, we have to prove that if \(x+iy+jz+kw\) is a unit quaternion then the formula produces an element of \(SO(3,\mathbb{R})\).
To prove this, we have to multiply it by its transpose, obtaining a 3 × 3 matrix of quartic polynomials in \(x,y,z,w\), and check entry by entry that if \(x^2+y^2+z^2+w^2=1\) then that quartic polynomial is one (respectively, zero) on (respectively, off) the diagonal.
All these 9 checks can be done by polyrith
. Click through in your Lean file to see the proofs
that polyrith
comes up with for each one.
/-- The explicit matrix formula `to_matrix` sends a unit quaternion to an element of SO(3, ℝ).
-/
theorem toMatrix_mem_orthogonal {a : ℍ} (ha : a ∈ unitQuaternions) :
toMatrix a ∈ Matrix.orthogonalGroup (Fin 3) ℝ := by
rw [Matrix.mem_orthogonalGroup_iff]
obtain ⟨x, y, z, w⟩ := a
have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unitQuaternions] at ha
ext i j
fin_cases i <;> fin_cases j <;>
simp
· polyrith
· polyrith
· polyrith
· polyrith
· polyrith
· polyrith
· polyrith
· polyrith
· polyrith
So the monoid homomorphism from \(\mathbb{H}\) to 3 × 3 matrices descends to a homomorphism
unitQuaternions.toMatrixHom
from the unit quaternions to \(SO(3, \mathbb{R})\), with a
couple more lines of abstract nonsense.
We now want to show that this homomorphism is a double cover; that is, its kernel has two elements, \(\{1, -1\}\).
There are two directions to this proof. The easy direction is that both these elements are in the
kernel; this is a numeric check that can be done with 2 × 3 × 3 applications of norm_num
. The
hard direction is that an element of the kernel is one of these two.
So let \(x+iy+jz+kw\) be an element of the kernel. We have that \(x^2+y^2+z^2+w^2=1\), since it is a unit quaternion, and we also have 9 further quadratic equations available about \(x,y,z,w\): for example, by looking at the second row and first column in the matrix (in Lean this is the first row and zero-th column!) we have that \(2(y z + w x)=0\).
We will use polyrith
to deduce that \(y=0\), \(z=0\), \(w=0\), and \(x^2=1\),
which lets us conclude that \(x+iy+jz+kw = \pm 1 + 0i+0j+0k\) as desired.
The proof below is on the right track, but it is broken: I have not given polyrith
all the
information it needs. (I was lazy and just wrote out the result of checking the (0,1)-th
coefficient of the matrix, as discussed above.) Figure out what further matrix coefficients to
check so that polyrith
has all the information it needs. Or just write down all nine!
/-- Verify the "double cover" property of the homomorphism from unit quaternions to SO(3, ℝ):
the kernel is {1, -1}. -/
theorem toMatrixHom_mker :
(MonoidHom.mker toMatrixHom : Set unitQuaternions) = {1, negOne} := by
ext a
obtain ⟨⟨x, y, z, w⟩, h⟩ := a
have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unitQuaternions] at h
simp [MonoidHom.mem_mker]
constructor
· -- hard direction: a quaternion in the kernel is ±1
intro h1
have h₀₁ := congr_fun₂ h1 0 1
-- Add more matrix entry inspections here as needed, and adjust the simplification line.
-- The `polyrith` applications that follow will be broken until you do this!
simp at h₀₁
obtain rfl : y = 0 := by
polyrith
obtain rfl : z = 0 := by
polyrith
obtain rfl : w = 0 := by
polyrith
have hx : x ^ 2 = (1 : ℝ) ^ 2 := by
polyrith
-- now do a case division depending on the two cases for `x ^ 2 = 1 ^ 2`
rw [sq_eq_sq_iff_eq_or_eq_neg] at hx
obtain rfl | rfl := hx
· left
ext <;> simp
· right
ext <;> simp
· -- easy direction: ±1 are in the kernel
rintro (⟨rfl, rfl, rfl, rfl⟩ | ⟨rfl, rfl, rfl, rfl⟩) <;> ext i j <;> fin_cases i <;>
fin_cases j <;> simp
You may notice a new syntax (exp := 2)
appearing in the polyrith
output in some of the above calculations.
This will be discussed in the next section.