2. Using polyrith
This section and the next can be read independently. They present a
series of extended examples, showing how to transform problems into nails which the polyrith
hammer can hit.
The focus in this section is on making sure that polyrith
has all the hypotheses it needs to
solve a problem.
2.1. Chebyshev polynomials
The Chebyshev polynomials (of the first kind) are a sequence of polynomials \(T_n(x)\) defined recursively:
So \(T_2(x)=2x^2-1\), \(T_3(x)=4x^3-3x\), etc.
We express the sequence of Chebyshev polynomials in Lean as a function from the natural numbers
ℕ
to the integer-coefficient polynomials ℤ[X]
, defined recursively.
noncomputable def T : ℕ → ℤ[X]
| 0 := 1
| 1 := X
| (n + 2) := 2 * X * T (n + 1) - T n
-- now record the three pieces of the recursive definition for easy access
lemma T_zero : T 0 = 1 := rfl
lemma T_one : T 1 = X := rfl
lemma T_add_two (n : ℕ) : T (n + 2) = 2 * X * T (n + 1) - T n := by rw T
In this section we will prove the multiplication formula for Chebyshev polynomials:
Theorem
For all natural numbers \(m\) and \(k\),
This is proved by induction on \(m\): if the formula is true for \(m\) and \(m+1\) (for all \(k\)), then it is true for \(m+2\) (for all \(k\)). I leave the two base cases as exercises. Here is the paper proof of the inductive step:
Notice that two facts are being invoked (each repeatedly) during this proof: the recurrence
relation, in Lean T_add_two
, and the inductive hypothesis of the theorem, which within the
proof below will carry the same name, mul_T
, as the theorem itself. To make polyrith
do
this proof you should determine precisely which input values these facts are being used with in the
above proof, and state them in Lean with those input values. See the broken proof below for the
basic idea.
There is one subtlety. It will happen that the same natural number appears as the index of the
sequence T, but in more than one way. For example, in the setup below, both m + 37
and
37 + m
appear as indices. This prevents polyrith
from parsing T (m + 37)
and
T (37 + m)
as the same token. So, before running polyrith
, normalize all indices by
running ring_nf
(“ring normal form”) simultaneously on all the hypotheses you care about and on
the goal.
lemma mul_T : ∀ m : ℕ, ∀ k, 2 * T m * T (m + k) = T (2 * m + k) + T k
| 0 := begin
sorry,
end
| 1 := begin
sorry,
end
| (m + 2) := begin
intros k,
have H₁ := mul_T (m + 37) (k * 37), -- not actually a relevant pair of input values!
have h₁ := T_add_two 7, -- not actually a relevant input value!
have h₂ := T_add_two (37 + m), -- not actually a relevant input value!
ring_nf at H₁ h₁ h₂ ⊢,
polyrith,
end
2.2. Isometries of the complex plane
Let us identify the Euclidean plane with the complex numbers \(\mathbb{C}\). In this section we prove the following lemma:
Lemma
If \(f:\mathbb{C}\to\mathbb{C}\) is a real-linear isometry of the plane fixing \(1\), then \(f(i)\) is either \(i\) or \(-i\).
This lemma appears in mathlib in the proof of the classical theorem that isometries \(f:\mathbb{C}\to\mathbb{C}\) of the plane are of the two forms
for \(a,b\in\mathbb{C}\) with \(|a|=1\): compositions of a translation, a rotation, and (possibly) a reflection. (An isometry has to be affine; by postcomposition with a translation we can arrange that \(f(0)=0\), i.e. \(f\) is linear; by postcomposition with a rotation we can arrange that \(f(1)=1\), and then after the above lemma \(f\) is determined on the real basis \(\{1, i\}\) for \(\mathbb{C}\), hence everywhere by linearity.
Real-linear isometries from \(\mathbb{C}\) to \(\mathbb{C}\) are expressed in
Lean by the type ℂ →ₗᵢ[ℝ] ℂ
. The ₗᵢ
subscript stands
for “linear isometry”, and the [ℝ]
denotes the scalar field to consider for the linearity.
The lemma is proved on paper as follows. Since \(f\) is linear and norm-preserving and \(f(1)=1\), we have
Exercise
Deduce from these two facts that
(Hint: Subtract and factor.)
This fixes \(f(i)\) as either \(i\) (if the left factor vanishes) or \(-i\) (if the right factor vanishes).
We present the skeleton of this argument in Lean below, with the exercise indicated above left as a
sorry. A tricky point is that linear_combination
/polyrith
does not know facts specific to the ring of complex numbers, such as that \(i^2=-1\) or that
\(\overline{i}=-i\). You can state any needed such facts explicitly and prove them using
norm_num
(of course, there are lemmas for this, but why bother memorizing them?). Then you can
rewrite by these facts, or just include them in the context for polyrith
to pick up.
example {f : ℂ →ₗᵢ[ℝ] ℂ} (h : f 1 = 1) : f I = I ∨ f I = - I :=
begin
have key : (f I - I) * (conj (f I) - conj (- I)) = 0,
{ have H₁ := congr_arg (λ t, (↑(t ^ 2) : ℂ)) (f.norm_map (I - 1)),
have H₂ := congr_arg (λ t, (↑(t ^ 2) : ℂ)) (f.norm_map I),
simp only [h, ←norm_sq_eq_abs] with complex_simps at H₁ H₂,
sorry },
-- From `key`, we deduce that either `f I - I = 0` or `conj (f I) - conj (- I) = 0`.
cases eq_zero_or_eq_zero_of_mul_eq_zero key with hI hI,
{ left,
linear_combination hI },
{ right,
apply (conj : ℂ →+* ℂ).injective,
linear_combination hI },
end
2.3. Double cover of SO(3, ℝ)
In this section we construct the double cover of \(SO(3,\mathbb{R})\) by the unit quaternions. We first state the formula which will describe this double cover as a map from the quaternions \(\mathbb{H}\) to the 3 × 3 matrices. It sends the quaternion \(x+iy+jz+wk\) to
def quaternion.to_matrix (a : ℍ) : matrix (fin 3) (fin 3) ℝ :=
let ⟨x, y, z, w⟩ := a in
![![x ^ 2 + y ^ 2 - z ^ 2 - w ^ 2, 2 * (y * z - w * x), 2 * (y * w + z * x)],
![2 * (y * z + w * x), x ^ 2 + z ^ 2 - y ^ 2 - w ^ 2, 2 * (z * w - y * x)],
![2 * (y * w - z * x), 2 * (z * w + y * x), x ^ 2 + w ^ 2 - y ^ 2 - z ^ 2]]
This turns out to be a “homomorphism of monoids”, denoted in Lean by →*
; that is, it preserves
multiplication and the identity. Here is what the proof of that fact looks like. It would be
extremely tedious on paper, but it’s very fast in Lean, because it simply requires checking
coefficient-wise \(9=3 \times 3\) identities of real-number expressions (for the preservation
of the identity element) and 9 identities of polynomials (for the preservation of multiplication).
def quaternion.to_matrix_hom' : ℍ →* matrix (fin 3) (fin 3) ℝ :=
{ to_fun := quaternion.to_matrix,
map_one' := show quaternion.to_matrix ⟨1, 0, 0, 0⟩ = 1, begin
ext i j; fin_cases i; fin_cases j;
simp only with quaternion_simps { discharger := `[norm_num1] }; norm_num,
end,
map_mul' := begin
rintros ⟨x, y, z, w⟩ ⟨r, s, t, u⟩,
show quaternion.to_matrix ⟨_, _, _, _⟩
= quaternion.to_matrix ⟨_, _, _, _⟩ * quaternion.to_matrix ⟨_, _, _, _⟩,
ext i j; fin_cases i; fin_cases j; simp only with quaternion_simps; ring,
end }
Now we write down a definition of the unit quaternions. We give it the abstract-nonsense
definition as the mker
(for “monoid kernel”) of the “norm_square” monoid homomorphism
quaternion.norm_sq
from the quaternions to the reals; this already existed in mathlib. Then we
prove a quick lemma that this is equivalent to the pedestrian definition, the sums of the squares
of the four components summing to 1.
def unit_quaternions : submonoid ℍ :=
monoid_hom.mker ((quaternion.norm_sq : ℍ →*₀ ℝ) : ℍ →* ℝ)
@[simp] lemma mem_unit_quaternions (a : ℍ) :
a ∈ unit_quaternions ↔ a.re ^ 2 + a.im_i ^ 2 + a.im_j ^ 2 + a.im_k ^ 2 = 1 :=
by { rw ← quaternion.norm_sq_def' a, exact iff.rfl }
The 3 × 3 orthogonal group is available in mathlib already, as a submonoid of the 3 x 3 matrices. (Exercise for the reader: the monoid structure is in mathlib but no one seems yet to have proved it’s a group, i.e. closed under inversion.)
#check matrix.orthogonal_group (fin 3) ℝ
-- matrix.orthogonal_group ℝ : submonoid (matrix (fin 3) (fin 3) ℝ)
The first serious calculation is to prove that the matrix formula stated above is well-defined as a map from the unit quaternions to \(SO(3,\mathbb{R})\). That is, we have to prove that if \(x+iy+jz+kw\) is a unit quaternion then the formula produces an element of \(SO(3,\mathbb{R})\).
To prove this, we have to multiply it by its transpose, obtaining a 3 × 3 matrix of quartic polynomials in \(x,y,z,w\), and check entry by entry that if \(x^2+y^2+z^2+w^2=1\) then that quartic polynomial is one (respectively, zero) on (respectively, off) the diagonal.
All these 9 checks can be done by polyrith
. Click through in your Lean file to see the proofs
that polyrith
comes up with for each one.
lemma to_matrix_mem_orthogonal {a : ℍ} (ha : a ∈ unit_quaternions) :
to_matrix a ∈ matrix.orthogonal_group (fin 3) ℝ :=
begin
rw matrix.mem_orthogonal_group_iff,
cases a with x y z w,
have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unit_quaternions] at ha,
ext i j,
fin_cases i; fin_cases j; simp only with quaternion_simps { discharger := `[norm_num1] },
{ polyrith },
{ polyrith },
{ polyrith },
{ polyrith },
{ polyrith },
{ polyrith },
{ polyrith },
{ polyrith },
{ polyrith },
end
So the monoid homomorphism from \(\mathbb{H}\) to 3 × 3 matrices descends to a homomorphism
unit_quaternions.to_matrix_hom
from the unit quaternions to \(SO(3, \mathbb{R})\), with a
couple more lines of abstract nonsense.
We now want to show that this homomorphism is a double cover; that is, its kernel has two elements, \(\{1, -1\}\).
There are two directions to this proof. The easy direction is that both these elements are in the
kernel; this is a numeric check that can be done with 2 × 3 × 3 applications of norm_num
. The
hard direction is that an element of the kernel is one of these two.
So let \(x+iy+jz+kw\) be an element of the kernel. We have that \(x^2+y^2+z^2+w^2=1\), since it is a unit quaternion, and we also have 9 further quadratic equations available about \(x,y,z,w\): for example, by looking at the second row and first column in the matrix (in Lean this is the first row and zero-th column!) we have that \(2(y z + w x)=0\).
We will use polyrith
to deduce that \(y=0\), \(z=0\), \(w=0\), and \(x^2=1\),
which lets us conclude that \(x+iy+jz+kw = \pm 1 + 0i+0j+0k\) as desired.
The proof below is on the right track, but it is broken: I have not given polyrith
all the
information it needs. (I was lazy and just wrote out the result of checking the (0,1)-th
coefficient of the matrix, as discussed above.) Figure out what further matrix coefficients to
check so that polyrith
has all the information it needs. Or just write down all nine!
lemma to_matrix_hom_mker : (to_matrix_hom.mker : set unit_quaternions) = {1, neg_one} :=
begin
ext a,
obtain ⟨⟨x, y, z, w⟩, h⟩ := a,
have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unit_quaternions] at h,
simp only with quaternion_simps,
split,
{ -- hard direction: a quaternion in the kernel is ±1
intros h,
have h₀₁ := congr_fun₂ h 0 1,
-- Add more matrix entry inspections here as needed, and adjust the simplification line.
-- The `polyrith` applications that follow will be broken until you do this!
simp only with quaternion_simps at h₀₁ { discharger := `[norm_num1] },
have hy : y = 0,
{ polyrith },
have hz : z = 0,
{ polyrith },
have hw : w = 0,
{ polyrith },
have hx : x ^ 2 = 1 ^ 2,
{ polyrith },
-- now do a case division depending on the two cases for `x ^ 2 = 1 ^ 2`
rw sq_eq_sq_iff_eq_or_eq_neg at hx,
cases hx,
{ simp [hx, hy, hz, hw] },
{ simp [hx, hy, hz, hw] } },
{ -- easy direction: ±1 are in the kernel
rintros (⟨rfl, rfl, rfl, rfl⟩ | ⟨rfl, rfl, rfl, rfl⟩); ext i j; fin_cases i; fin_cases j;
simp only with quaternion_simps { discharger := `[norm_num1] }; norm_num },
end