2. Using polyrith

This section and the next can be read independently. They present a series of extended examples, showing how to transform problems into nails which the polyrith hammer can hit.

The focus in this section is on making sure that polyrith has all the hypotheses it needs to solve a problem.

2.1. Chebyshev polynomials

The Chebyshev polynomials (of the first kind) are a sequence of polynomials \(T_n(x)\) defined recursively:

\[\begin{split}\begin{cases} T_0(x)&=1\\ T_1(x)&=x\\ \text{for $n\in\mathbb{N}$, } T_{n+2}(x)&=2xT_{n+1}(x)-T_n(x). \end{cases}\end{split}\]

So \(T_2(x)=2x^2-1\), \(T_3(x)=4x^3-3x\), etc.

We express the sequence of Chebyshev polynomials in Lean as a function from the natural numbers to the integer-coefficient polynomials ℤ[X], defined recursively.

/-- The Chebyshev polynomials, defined recursively. -/
noncomputable def T : ℕ → ℤ[X]
  | 0 => 1
  | 1 => X
  | n + 2 => 2 * X * T (n + 1) - T n

-- now record the three pieces of the recursive definition for easy access
theorem T_zero : T 0 = 1 := rfl

theorem T_one : T 1 = X := rfl

theorem T_add_two (n : ℕ) : T (n + 2) = 2 * X * T (n + 1) - T n := by rw [T]
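
As a quick sanity check on these definitions, the computation \(T_2(x)=2x^2-1\) claimed above can be verified in Lean. (This is a hedged sketch: it uses only the three lemmas just recorded, together with norm_num and linear_combination, but the exact incantation may need adjusting to your setup.)

example : T 2 = 2 * X ^ 2 - 1 := by
  have h := T_add_two 0          -- h : T (0 + 2) = 2 * X * T (0 + 1) - T 0
  norm_num [T_zero, T_one] at h  -- normalize the indices and unfold `T 0`, `T 1`
  linear_combination h           -- the goal agrees with `h` up to `ring`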

In this section we will prove the multiplication formula for Chebyshev polynomials:

Theorem

For all natural numbers \(m\) and \(k\),

\[2 T_m(x)T_{m+k}(x) = T_{2m+k}(x)+T_k(x).\]

This is proved by induction on \(m\): if the formula is true for \(m\) and \(m+1\) (for all \(k\)), then it is true for \(m+2\) (for all \(k\)). I leave the two base cases as exercises. Here is the paper proof of the inductive step:

\[\begin{split}2T_{m+2}T_{(m+2)+k} &=2\left(2xT_{m+1}-T_m\right)T_{m+k+2}\\ &=2x\left(2T_{m+1}T_{(m+1)+(k+1)}\right)-2T_mT_{m+(k+2)}\\ &=2x\left(T_{2(m+1)+(k+1)}+T_{k+1}\right)-\left(T_{2m+(k+2)}+T_{k+2}\right)\\ &=\left(2xT_{2m+k+3}-T_{2m+k+2}\right)+\left(2xT_{k+1}-T_{k+2}\right)\\ &=T_{2(m+2)+k}+T_{k}.\end{split}\]

Notice that two facts are being invoked (each repeatedly) during this proof: the recurrence relation, in Lean T_add_two, and the inductive hypothesis of the theorem, which within the proof below carries the same name, mul_T, as the theorem itself. To make polyrith complete this proof, you should determine precisely which input values these facts are invoked with in the proof above, and state them in Lean at those input values. See the broken proof below for the basic idea.

There is one subtlety. The same natural number can appear as an index of the sequence T, but written in more than one way. For example, in the setup below, both k * 37 and 37 * k appear as indices, and polyrith will not recognize T (k * 37) and T (37 * k) as the same atom. So, before running polyrith, normalize all the indices by running ring_nf (“ring normal form”) simultaneously on all the hypotheses you care about and on the goal.
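
Here is a minimal sketch of that workflow (the hypothesis and goal are artificial, chosen only to display the two spellings of the index): once ring_nf has normalized both index expressions to the same form, the hypothesis closes the goal directly.

example (k : ℕ) (h : T (k * 37) = 0) : T (37 * k) = 0 := by
  ring_nf at h ⊢  -- both indices are rewritten to the same normal form
  exact h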

/-- The product of two Chebyshev polynomials is the sum of two other Chebyshev polynomials. -/
theorem mul_T : ∀ m : ℕ, ∀ k, 2 * T m * T (m + k) = T (2 * m + k) + T k
  | 0 => by
    intro k
    ring_nf
    have h := T_zero
    polyrith
  | 1 => by
    intro k
    have h₁ := T_add_two k
    have h₂ := T_one
    ring_nf at h₁ h₂ ⊢
    polyrith
  | m + 2 => by
    intro k
    have H₁ := mul_T (m - 5) (k * 37) -- not actually a relevant pair of input values!
    have h₁ := T_add_two 7 -- not actually a relevant input value!
    have h₂ := T_add_two (37 * k) -- not actually a relevant input value!
    -- ... add more/different instantiations of `mul_T` and `T_add_two` here
    ring_nf at H₁ h₁ h₂ ⊢
    polyrith -- this will work once the right facts have been provided above

2.2. Isometries of the complex plane

Let us identify the Euclidean plane with the complex numbers \(\mathbb{C}\). In this section we prove the following lemma:

Lemma

If \(f:\mathbb{C}\to\mathbb{C}\) is a real-linear isometry of the plane fixing \(1\), then \(f(i)\) is either \(i\) or \(-i\).

This lemma appears in Mathlib in the proof of the classical theorem that isometries \(f:\mathbb{C}\to\mathbb{C}\) of the plane are of the two forms

\[f(z)=az+b,\qquad f(z)=a\overline{z}+b\]

for \(a,b\in\mathbb{C}\) with \(|a|=1\): compositions of a translation, a rotation, and (possibly) a reflection. (An isometry has to be affine; by postcomposition with a translation we can arrange that \(f(0)=0\), i.e. \(f\) is linear; by postcomposition with a rotation we can arrange that \(f(1)=1\), and then after the above lemma \(f\) is determined on the real basis \(\{1, i\}\) for \(\mathbb{C}\), hence everywhere by linearity.)

Real-linear isometries from \(\mathbb{C}\) to \(\mathbb{C}\) are expressed in Lean by the type ℂ →ₗᵢ[ℝ] ℂ. The ₗᵢ subscript stands for “linear isometry”, and the [ℝ] denotes the scalar field to consider for the linearity.
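
As an aside, to get a feel for this type, here are two facts that come straight from the bundled structure: a linear isometry is additive and preserves norms. (The lemma names LinearIsometry.map_add and LinearIsometry.norm_map are standard Mathlib names; this illustration is not needed for the proof below.)

example (f : ℂ →ₗᵢ[ℝ] ℂ) (z w : ℂ) : f (z + w) = f z + f w := f.map_add z w

example (f : ℂ →ₗᵢ[ℝ] ℂ) (z : ℂ) : ‖f z‖ = ‖z‖ := f.norm_map z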

The lemma is proved on paper as follows. Since \(f\) is linear and norm-preserving and \(f(1)=1\), we have

\[\begin{split}f(i)\overline{f(i)}&=|f(i)|^2\\ &=|i|^2\\ &=1,\\ \left(f(i)-1\right)\left(\overline{f(i)} - 1\right) &=\left(f(i)-f(1)\right)\left(\overline{f(i)} - \overline{f(1)}\right)\\ &=f(i-1)\overline{f(i - 1)}\\ &=|f(i-1)|^2\\ &=|i-1|^2\\ &=(i-1)(\overline{i}-1).\end{split}\]

Exercise

Deduce from these two facts that

\[\left(f(i) - i\right)\left(\overline{f(i)}-\overline{-i}\right) = 0.\]

This fixes \(f(i)\) as either \(i\) (if the left factor vanishes) or \(-i\) (if the right factor vanishes).

We present the skeleton of this argument in Lean below, with the exercise indicated above left for you to fill in (see the comments in the proof). A tricky point is that linear_combination/polyrith does not know facts specific to the ring of complex numbers, such as that \(i^2=-1\) or that \(\overline{i}=-i\). You can state any needed such facts explicitly and prove them using norm_num (of course, there are lemmas for this, but why bother memorizing them?). Then you can rewrite by these facts, or just include them in the context for polyrith to pick up.
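
For instance, the two facts mentioned above can be stated and proved as follows. (A hedged sketch: the relevant simp lemmas are passed to norm_num explicitly, which is a little more robust than relying on the default simp set.)

example : Complex.I ^ 2 = -1 := by norm_num [Complex.I_sq]

example : (starRingEnd ℂ) Complex.I = -Complex.I := by norm_num [Complex.conj_I]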

example {f : ℂ →ₗᵢ[ℝ] ℂ} (h : f 1 = 1) : f I = I ∨ f I = -I := by
  have key : (f I - I) * (conj (f I) - conj (-I)) = 0 := by
    have conj_fact : ∀ z, f z * conj (f z) = z * conj z := by simp [mul_conj']
    have H₀ := conj_fact I
    have H₁ := conj_fact (I - 1)
    simp [h] at *
    -- state some fact(s) about `ℂ` here so that `polyrith` can use them in the computation
    -- when you have done this, `polyrith` will work
    polyrith
  -- From `key`, we deduce that either `f I - I = 0` or `conj (f I) - conj (- I) = 0`.
  rw [mul_eq_zero] at key
  obtain hI | hI := key
  · left
    linear_combination hI
  · right
    apply (conj : ℂ →+* ℂ).injective
    linear_combination hI

A similar strategy applies when working over other concrete rings with special properties. For example, when working over \(\mathbb{Z}/p\), you can add the fact \(p=0\) to the context and this will give polyrith more to work with.
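
For instance, here is a hedged sketch over \(\mathbb{Z}/7\) (the numbers are arbitrary): once the hypothesis \(7 = 0\) is in the context, a linear_combination certificate, and hence polyrith, can make use of it.

example (x y : ZMod 7) (h : x = y + 7) : x = y := by
  have hp : (7 : ZMod 7) = 0 := by decide  -- `decide` checks this small numeric fact
  linear_combination h + hp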

2.3. Double cover of SO(3, ℝ)

In this section we construct the double cover of \(SO(3,\mathbb{R})\) by the unit quaternions. We first state the formula which will describe this double cover as a map from the quaternions \(\mathbb{H}\) to the 3 × 3 matrices. It sends the quaternion \(x+iy+jz+kw\) to

\[\begin{split}\begin{pmatrix} x ^ 2 + y ^ 2 - z ^ 2 - w ^ 2 & 2 (y z - w x) & 2 (y w + z x) \\ 2 (y z + w x) & x ^ 2 + z ^ 2 - y ^ 2 - w ^ 2 & 2 (z w - y x) \\ 2 (y w - z x) & 2 (z w + y x) & x ^ 2 + w ^ 2 - y ^ 2 - z ^ 2 \end{pmatrix}.\end{split}\]

/-- Explicit matrix formula for the double cover of SO(3, ℝ) by the unit quaternions. -/
@[simp] def Quaternion.toMatrix (a : ℍ) : Matrix (Fin 3) (Fin 3) ℝ :=
  let ⟨x, y, z, w⟩ := a
  !![x ^ 2 + y ^ 2 - z ^ 2 - w ^ 2, 2 * (y * z - w * x), 2 * (y * w + z * x);
    2 * (y * z + w * x), x ^ 2 + z ^ 2 - y ^ 2 - w ^ 2, 2 * (z * w - y * x);
    2 * (y * w - z * x), 2 * (z * w + y * x), x ^ 2 + w ^ 2 - y ^ 2 - z ^ 2]

This turns out to be a “homomorphism of monoids”, denoted in Lean by →*; that is, it preserves multiplication and the identity. Here is what the proof of that fact looks like. It would be extremely tedious on paper, but it is very fast in Lean, because it amounts to checking, entry by entry, \(9=3 \times 3\) identities of real-number expressions (for the preservation of the identity element) and 9 identities of polynomials (for the preservation of multiplication).

/-- The explicit matrix formula `toMatrix` defines a monoid homomorphism from the quaternions
into the algebra of 3x3 matrices. -/
@[simps] def Quaternion.toMatrixHom' : ℍ →* Matrix (Fin 3) (Fin 3) ℝ where
  toFun := Quaternion.toMatrix
  map_one' := by
    ext i j
    fin_cases i <;> fin_cases j <;> simp
  map_mul' := by
    rintro ⟨x, y, z, w⟩ ⟨r, s, t, u⟩
    ext i j
    fin_cases i <;> fin_cases j <;> simp <;> ring

Now we write down a definition of the unit quaternions. We give it the abstract-nonsense definition as the mker (for “monoid kernel”) of the “norm-squared” monoid homomorphism Quaternion.normSq from the quaternions to the reals; this already existed in Mathlib. Then we prove a quick lemma that this is equivalent to the pedestrian definition: the sum of the squares of the four components equals 1.

/-- The group (we only prove it to be a monoid) of unit quaternions. -/
def unitQuaternions : Submonoid ℍ :=
  MonoidHom.mker ((Quaternion.normSq : ℍ →*₀ ℝ) : ℍ →* ℝ)

@[simp] theorem mem_unitQuaternions (a : ℍ) :
    a ∈ unitQuaternions ↔ a.re ^ 2 + a.imI ^ 2 + a.imJ ^ 2 + a.imK ^ 2 = 1 := by
  rw [Quaternion.normSq_def']
  exact Iff.rfl

The 3 × 3 orthogonal group is available in Mathlib already, as a submonoid of the 3 × 3 matrices. (Exercise for the reader: the monoid structure is in Mathlib but no one seems yet to have proved it’s a group, i.e. closed under inversion.)

#check Matrix.orthogonalGroup (Fin 3) ℝ
-- Matrix.orthogonalGroup (Fin 3) ℝ : Submonoid (Matrix (Fin 3) (Fin 3) ℝ)

The first serious calculation is to prove that the matrix formula stated above is well-defined as a map from the unit quaternions to \(SO(3,\mathbb{R})\). That is, we have to prove that if \(x+iy+jz+kw\) is a unit quaternion then the formula produces an element of \(SO(3,\mathbb{R})\).

To prove this, we have to multiply it by its transpose, obtaining a 3 × 3 matrix of quartic polynomials in \(x,y,z,w\), and check entry by entry that if \(x^2+y^2+z^2+w^2=1\) then that quartic polynomial is one (respectively, zero) on (respectively, off) the diagonal.
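
For example (writing out just one of the nine identities by hand, to see what polyrith is being asked to verify), the \((0,0)\) entry of the product of the matrix with its transpose is the sum of the squares of the first row, so the required identity is

\[\left(x^2+y^2-z^2-w^2\right)^2 + 4(yz-wx)^2 + 4(yw+zx)^2 = \left(x^2+y^2+z^2+w^2\right)^2 = 1.\]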

All these 9 checks can be done by polyrith. Click through in your Lean file to see the proofs that polyrith comes up with for each one.

/-- The explicit matrix formula `toMatrix` sends a unit quaternion to an element of SO(3, ℝ).
-/
theorem toMatrix_mem_orthogonal {a : ℍ} (ha : a ∈ unitQuaternions) :
    toMatrix a ∈ Matrix.orthogonalGroup (Fin 3) ℝ := by
  rw [Matrix.mem_orthogonalGroup_iff]
  obtain ⟨x, y, z, w⟩ := a
  have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unitQuaternions] at ha
  ext i j
  fin_cases i <;> fin_cases j <;>
    simp
  · polyrith
  · polyrith
  · polyrith
  · polyrith
  · polyrith
  · polyrith
  · polyrith
  · polyrith
  · polyrith

So the monoid homomorphism from \(\mathbb{H}\) to 3 × 3 matrices descends to a homomorphism unitQuaternions.toMatrixHom from the unit quaternions to \(SO(3, \mathbb{R})\), with a couple more lines of abstract nonsense.

We now want to show that this homomorphism is a double cover; that is, its kernel has two elements, \(\{1, -1\}\).

There are two directions to this proof. The easy direction is that both these elements are in the kernel; this is a numeric check that can be done with 2 × 3 × 3 applications of norm_num. The hard direction is that an element of the kernel is one of these two.

So let \(x+iy+jz+kw\) be an element of the kernel. We have that \(x^2+y^2+z^2+w^2=1\), since it is a unit quaternion, and we also have 9 further quadratic equations available about \(x,y,z,w\): for example, by looking at the first row and second column in the matrix (in Lean this is the zero-th row and first column!) we have that \(2(yz - wx)=0\).

We will use polyrith to deduce that \(y=0\), \(z=0\), \(w=0\), and \(x^2=1\), which lets us conclude that \(x+iy+jz+kw = \pm 1 + 0i+0j+0k\) as desired.

The proof below is on the right track, but it is broken: I have not given polyrith all the information it needs. (I was lazy and just wrote out the result of checking the (0,1)-th coefficient of the matrix, as discussed above.) Figure out what further matrix coefficients to check so that polyrith can complete each step. Or just write down all nine!

/-- Verify the "double cover" property of the homomorphism from unit quaternions to SO(3, ℝ):
the kernel is {1, -1}. -/
theorem toMatrixHom_mker :
    (MonoidHom.mker toMatrixHom : Set unitQuaternions) = {1, negOne} := by
  ext a
  obtain ⟨⟨x, y, z, w⟩, h⟩ := a
  have H : x ^ 2 + y ^ 2 + z ^ 2 + w ^ 2 = 1 := by rwa [mem_unitQuaternions] at h
  simp [MonoidHom.mem_mker]
  constructor
  · -- hard direction: a quaternion in the kernel is ±1
    intro h1
    have h₀₁ := congr_fun₂ h1 0 1
    -- Add more matrix entry inspections here as needed, and adjust the simplification line.
    -- The `polyrith` applications that follow will be broken until you do this!
    simp at h₀₁
    obtain rfl : y = 0 := by
      polyrith
    obtain rfl : z = 0 := by
      polyrith
    obtain rfl : w = 0 := by
      polyrith
    have hx : x ^ 2 = (1 : ℝ) ^ 2 := by
      polyrith
    -- now do a case division depending on the two cases for `x ^ 2 = 1 ^ 2`
    rw [sq_eq_sq_iff_eq_or_eq_neg] at hx
    obtain rfl | rfl := hx
    · left
      ext <;> simp
    · right
      ext <;> simp
  · -- easy direction: ±1 are in the kernel
    rintro (⟨rfl, rfl, rfl, rfl⟩ | ⟨rfl, rfl, rfl, rfl⟩) <;> ext i j <;> fin_cases i <;>
      fin_cases j <;> simp

You may notice a new syntax (exp := 2) appearing in the polyrith output in some of the above calculations. This will be discussed in the next section.