2. Using polyrith

This section and the next can be read independently. They present a series of extended examples, showing how to transform problems into nails which the polyrith hammer can hit.

The focus in this section is on making sure that polyrith has all the hypotheses it needs to solve a problem.

2.1. Chebyshev polynomials

The Chebyshev polynomials (of the first kind) are a sequence of polynomials \(T_n(x)\) defined recursively:

\[\begin{split}\begin{cases} T_0(x)&=1\\ T_1(x)&=x\\ \text{for $n\in\mathbb{N}$, } T_{n+2}(x)&=2xT_{n+1}(x)-T_n(x). \end{cases}\end{split}\]

So \(T_2(x)=2x^2-1\), \(T_3(x)=4x^3-3x\), etc.

We express the sequence of Chebyshev polynomials in Lean as a function from the natural numbers to the integer-coefficient polynomials ℤ[X], defined recursively.

noncomputable def T :   [X]
| 0       := 1
| 1       := X
| (n + 2) := 2 * X * T (n + 1) - T n

-- now record the three pieces of the recursive definition for easy access
lemma T_zero : T 0 = 1 := rfl
lemma T_one : T 1 = X := rfl
lemma T_add_two (n : ) : T (n + 2) = 2 * X * T (n + 1) - T n := by rw T

In this section we will prove the multiplication formula for Chebyshev polynomials:

Theorem

For all natural numbers \(m\) and \(k\),

\[2 T_m(x)T_{m+k}(x) = T_{2m+k}(x)+T_k(x).\]

This is proved by induction on \(m\): if the formula is true for \(m\) and \(m+1\) (for all \(k\)), then it is true for \(m+2\) (for all \(k\)). I leave the two base cases as exercises. Here is the paper proof of the inductive step:

\[\begin{split}2T_{m+2}T_{(m+2)+k} &=2\left(2xT_{m+1}-T_m\right)T_{m+k+2}\\ &=2x\left(2T_{m+1}T_{(m+1)+(k+1)}\right)-2T_mT_{m+(k+2)}\\ &=2x\left(T_{2(m+1)+(k+1)}+T_{k+1}\right)-\left(T_{2m+(k+2)}+T_{k+2}\right)\\ &=\left(2xT_{2m+k+3}-T_{2m+k+2}\right)+\left(2xT_{k+1}-T_{k+2}\right)\\ &=T_{2(m+2)+k}+T_{k}.\end{split}\]

Notice that two facts are being invoked (each repeatedly) during this proof: the recurrence relation, in Lean T_add_two, and the inductive hypothesis of the theorem, which within the proof below will carry the same name, mul_T, as the theorem itself. To make polyrith do this proof you should determine precisely which input values these facts are being used with in the above proof, and state them in Lean with those input values. See the broken proof below for the basic idea.

There is one subtlety. It will happen that the same natural number appears as the index of the sequence T, but in more than one way. For example, in the setup below, both m + 37 and 37 + m appear as indices. This prevents polyrith from parsing T (m + 37) and T (37 + m) as the same token. So, before running polyrith, normalize all indices by running ring_nf (“ring normal form”) simultaneously on all the hypotheses you care about and on the goal.

lemma mul_T :  m : ,  k, 2 * T m * T (m + k) = T (2 * m + k) + T k
| 0 := begin
    sorry,
  end
| 1 := begin
    sorry,
  end
| (m + 2) := begin
    intros k,
    have H₁ := mul_T (m + 37) (k * 37), -- not actually a relevant pair of input values!
    have h₁ := T_add_two 7, -- not actually a relevant input value!
    have h₂ := T_add_two (37 + m), -- not actually a relevant input value!
    ring_nf at H₁ h₁ h₂ ,
    polyrith,
  end

2.2. Isometries of the complex plane

Let us identify the Euclidean plane with the complex numbers \(\mathbb{C}\). In this section we prove the following lemma:

Lemma

If \(f:\mathbb{C}\to\mathbb{C}\) is a real-linear isometry of the plane fixing \(1\), then \(f(i)\) is either \(i\) or \(-i\).

This lemma appears in mathlib in the proof of the classical theorem that isometries \(f:\mathbb{C}\to\mathbb{C}\) of the plane are of the two forms

\[f(z)=az+b,\qquad f(z)=a\overline{z}+b\]

for \(a,b\in\mathbb{C}\) with \(|a|=1\): compositions of a translation, a rotation, and (possibly) a reflection. (An isometry has to be affine; by postcomposition with a translation we can arrange that \(f(0)=0\), i.e. \(f\) is linear; by postcomposition with a rotation we can arrange that \(f(1)=1\), and then after the above lemma \(f\) is determined on the real basis \(\{1, i\}\) for \(\mathbb{C}\), hence everywhere by linearity.

Real-linear isometries from \(\mathbb{C}\) to \(\mathbb{C}\) are expressed in Lean by the type →ₗᵢ[ℝ] . The ₗᵢ subscript stands for “linear isometry”, and the [ℝ] denotes the scalar field to consider for the linearity.

The lemma is proved on paper as follows. Since \(f\) is linear and norm-preserving and \(f(1)=1\), we have

\[\begin{split}f(i)\overline{f(i)}&=|f(i)|^2\\ &=|i|^2\\ &=i\overline{i},\\ \left(f(i)-1\right)\left(\overline{f(i)} - 1\right) &=\left(f(i)-f(1)\right)\left(\overline{f(i)} - \overline{f(1)}\right)\\ &=f(i-1)\overline{f(i - 1)}\\ &=|f(i-1)|^2\\ &=|i-1|^2\\ &=(i-1)(\overline{i}-1).\end{split}\]

Exercise

Deduce from these two facts that

\[\left(f(i) - i\right)\left(\overline{f(i)}-\overline{-i}\right) = 0.\]

(Hint: Subtract and factor.)

This fixes \(f(i)\) as either \(i\) (if the left factor vanishes) or \(-i\) (if the right factor vanishes).

We present the skeleton of this argument in Lean below, with the exercise indicated above left as a sorry. A tricky point is that linear_combination/polyrith does not know facts specific to the ring of complex numbers, such as that \(i^2=-1\) or that \(\overline{i}=-i\). You can state any needed such facts explicitly and prove them using norm_num (of course, there are lemmas for this, but why bother memorizing them?). Then you can rewrite by these facts, or just include them in the context for polyrith to pick up.

example {f :  →ₗᵢ[] } (h : f 1 = 1) : f I = I  f I = - I :=
begin
  have key : (f I - I) * (conj (f I) - conj (- I)) = 0,
  { have H₁ := congr_arg (λ t, ((t ^ 2) : )) (f.norm_map (I - 1)),
    have H₂ := congr_arg (λ t, ((t ^ 2) : )) (f.norm_map I),
    simp only [h, norm_sq_eq_abs] with complex_simps at H₁ H₂,
    sorry },
  -- From `key`, we deduce that either `f I - I = 0` or `conj (f I) - conj (- I) = 0`.
  cases eq_zero_or_eq_zero_of_mul_eq_zero key with hI hI,
  { left,
    linear_combination hI },
  { right,
    apply (conj :  →+* ).injective,
    linear_combination hI },
end

2.3. Double cover of SO(3, ℝ)

In this section we construct the double cover of \(SO(3,\mathbb{R})\) by the unit quaternions. We first state the formula which will describe this double cover as a map from the quaternions \(\mathbb{H}\) to the 3 × 3 matrices. It sends the quaternion \(x+iy+jz+wk\) to

\[\begin{split}\begin{pmatrix} x ^ 2 + y ^ 2 - z ^ 2 - w ^ 2 & 2 (y z - w x) & 2 (y w + z x) \\ 2 (y z + w x) & x ^ 2 + z ^ 2 - y ^ 2 - w ^ 2 & 2 (z w - y x) \\ 2 (y w - z x) & 2 (z w + y x) & x ^ 2 + w ^ 2 - y ^ 2 - z ^ 2 \end{pmatrix}.\end{split}\]
def quaternion.to_matrix (a : ) : matrix (fin 3) (fin 3)  :=
let x, y, z, w := a in
![![x ^ 2 + y ^ 2 - z ^ 2 - w ^ 2, 2 * (y * z - w * x), 2 * (y * w + z * x)],
  ![2 * (y * z + w * x), x ^ 2 + z ^ 2 - y ^ 2 - w ^ 2, 2 * (z * w - y * x)],
  ![2 * (y * w - z * x), 2 * (z * w + y * x), x ^ 2 + w ^ 2 - y ^ 2 - z ^ 2]]

This turns out to be a “homomorphism of monoids”, denoted in Lean by →*; that is, it preserves multiplication and the identity. Here is what the proof of that fact looks like. It would be extremely tedious on paper, but it’s very fast in Lean, because it simply requires checking coefficient-wise \(9=3 \times 3\) identities of real-number expressions (for the preservation of the identity element) and 9 identities of polynomials (for the preservation of multiplication).

def quaternion.to_matrix_hom' :  →* matrix (fin 3) (fin 3)  :=
{ to_fun := quaternion.to_matrix,
  map_one' := show quaternion.to_matrix 1, 0, 0, 0 = 1, begin
    ext i j; fin_cases i; fin_cases j;
      simp only with quaternion_simps { discharger := `[norm_num1] }; norm_num,
  end,
  map_mul' :=  begin
    rintros x, y, z, w r, s, t, u⟩,
    show quaternion.to_matrix _, _, _, _
      = quaternion.to_matrix _, _, _, _ * quaternion.to_matrix _, _, _, _⟩,
    ext i j; fin_cases i; fin_cases j; simp only with quaternion_simps; ring,
  end }

Now we write down a definition of the unit quaternions. We give it the abstract-nonsense definition as the mker (for “monoid kernel”) of the “norm_square” monoid homomorphism quaternion.norm_sq from the quaternions to the reals; this already existed in mathlib. Then we prove a quick lemma that this is equivalent to the pedestrian definition, the sums of the squares of the four components summing to 1.

def unit_quaternions : submonoid  :=
monoid_hom.mker ((quaternion.norm_sq :  →*₀ ) :  →* )

@[simp] lemma mem_unit_quaternions (a : ) :
  a  unit_quaternions  a.re ^ 2 + a.im_i ^ 2 + a.im_j ^ 2 + a.im_k ^ 2 = 1 :=
by { rw  quaternion.norm_sq_def' a, exact iff.rfl }

The 3 × 3 orthogonal group is available in mathlib already, as a submonoid of the 3 x 3 matrices. (Exercise for the reader: the monoid structure is in mathlib but no one seems yet to have proved it’s a group, i.e. closed under inversion.)

#check matrix.orthogonal_group (fin 3) 
-- matrix.orthogonal_group ℝ : submonoid (matrix (fin 3) (fin 3) ℝ)

The first serious calculation is to prove that the matrix formula stated above is well-defined as a map from the unit quaternions to \(SO(3,\mathbb{R})\). That is, we have to prove that if \(x+iy+jz+kw\) is a unit quaternion then the formula produces an element of \(SO(3,\mathbb{R})\).

To prove this, we have to multiply it by its transpose, obtaining a 3 × 3 matrix of quartic polynomials in \(x,y,z,w\), and check entry by entry that if \(x^2+y^2+z^2+w^2=1\) then that quartic polynomial is one (respectively, zero) on (respectively, off) the diagonal.

All these 9 checks can be done by polyrith. Click through in your Lean file to see the proofs that polyrith comes up with for each one.

lemma to_matrix_mem_orthogonal {a : } (ha : a  unit_quaternions) :
  to_matrix a  matrix.orthogonal_group (fin 3)  :=
begin
  rw matrix.mem_orthogonal_group_iff,
  cases a with x y z w,
  have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unit_quaternions] at ha,
  ext i j,
  fin_cases i; fin_cases j; simp only with quaternion_simps { discharger := `[norm_num1] },
  { polyrith },
  { polyrith },
  { polyrith },
  { polyrith },
  { polyrith },
  { polyrith },
  { polyrith },
  { polyrith },
  { polyrith },
end

So the monoid homomorphism from \(\mathbb{H}\) to 3 × 3 matrices descends to a homomorphism unit_quaternions.to_matrix_hom from the unit quaternions to \(SO(3, \mathbb{R})\), with a couple more lines of abstract nonsense.

We now want to show that this homomorphism is a double cover; that is, its kernel has two elements, \(\{1, -1\}\).

There are two directions to this proof. The easy direction is that both these elements are in the kernel; this is a numeric check that can be done with 2 × 3 × 3 applications of norm_num. The hard direction is that an element of the kernel is one of these two.

So let \(x+iy+jz+kw\) be an element of the kernel. We have that \(x^2+y^2+z^2+w^2=1\), since it is a unit quaternion, and we also have 9 further quadratic equations available about \(x,y,z,w\): for example, by looking at the second row and first column in the matrix (in Lean this is the first row and zero-th column!) we have that \(2(y z + w x)=0\).

We will use polyrith to deduce that \(y=0\), \(z=0\), \(w=0\), and \(x^2=1\), which lets us conclude that \(x+iy+jz+kw = \pm 1 + 0i+0j+0k\) as desired.

The proof below is on the right track, but it is broken: I have not given polyrith all the information it needs. (I was lazy and just wrote out the result of checking the (0,1)-th coefficient of the matrix, as discussed above.) Figure out what further matrix coefficients to check so that polyrith has all the information it needs. Or just write down all nine!

lemma to_matrix_hom_mker : (to_matrix_hom.mker : set unit_quaternions) = {1, neg_one} :=
begin
  ext a,
  obtain ⟨⟨x, y, z, w⟩, h := a,
  have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unit_quaternions] at h,
  simp only with quaternion_simps,
  split,
  { -- hard direction: a quaternion in the kernel is ±1
    intros h,
    have h₀₁ := congr_fun₂ h 0 1,
    -- Add more matrix entry inspections here as needed, and adjust the simplification line.
    -- The `polyrith` applications that follow will be broken until you do this!
    simp only with quaternion_simps at h₀₁ { discharger := `[norm_num1] },
    have hy : y = 0,
    { polyrith },
    have hz : z = 0,
    { polyrith },
    have hw : w = 0,
    { polyrith },
    have hx : x ^ 2 = 1 ^ 2,
    { polyrith },
    -- now do a case division depending on the two cases for `x ^ 2 = 1 ^ 2`
    rw sq_eq_sq_iff_eq_or_eq_neg at hx,
    cases hx,
    { simp [hx, hy, hz, hw] },
    { simp [hx, hy, hz, hw] } },
  { -- easy direction: ±1 are in the kernel
    rintros (⟨rfl, rfl, rfl, rfl | rfl, rfl, rfl, rfl⟩); ext i j; fin_cases i; fin_cases j;
      simp only with quaternion_simps { discharger := `[norm_num1] }; norm_num },
end