The star-center of the quaternionic numerical range

In this paper we prove that the quaternionic numerical range is always star-shaped and that its star-center is given by the equivalence classes of the star-center of the bild. We determine the star-center of the bild, and consequently of the numerical range, by showing that the geometrical shape of the upper part of the center is defined by two lines, tangent to the lower bild.


Introduction
Let H denote the skew-field of Hamilton quaternions. Let A be an n × n matrix with quaternionic entries. It is well known that the numerical range W_H(A) = W(A) is a connected but not necessarily convex subset of the quaternions. The group of unitary quaternions S_H acts on H by automorphisms. Since every class [q], q ∈ H, has a representative in C^+ and each class of q ∈ W(A) is contained in W(A), it became clear from the early studies of the quaternionic numerical range that it is enough to study the bild of A, B(A) = W(A) ∩ C, or the upper bild B^+(A) = W(A) ∩ C^+. The latter has the advantage of being always convex, whereas B(A) is convex if, and only if, W(A) is convex, see [Zh, page 53] and theorem 3.1. The convexity of the numerical range, the bild and the upper bild has been studied by several authors, see [AY1, AY2, R, ST, STZ].
In the complex setting the numerical range is convex thanks to the celebrated Toeplitz-Hausdorff Theorem [GR]. Over time, several generalizations of the numerical range have been proposed, such as the C-numerical range and the joint numerical range, and in these cases convexity may fail. It then becomes natural to look for convexity-like geometric properties. For instance, the property of star-shapedness has been studied in [CT, LLPS18, LLPS19, LNT, LP]. We recall that a set B is star-shaped if there is an element b_0 ∈ B such that every segment connecting b_0 to any other element of B is contained in B, see definition 2.1. Accordingly, we say that b_0 is in the star-center of B.
For some generalizations of the (complex) numerical range, star-shapedness holds under certain conditions. In this article we tackle the question of star-shapedness in the quaternionic setting. We prove that the quaternionic numerical range is always star-shaped. In addition, we characterize the shape of the star-center for quaternionic matrices.
The star-shapedness of the numerical range is a consequence of two simple facts (see theorem 3.4). Firstly, the convexity of the upper and lower bilds implies that every segment with a real element of the bild as one of its endpoints is contained in the bild. Therefore the bild is star-shaped and the reals therein are part of its center. Secondly, all two-dimensional real subalgebras of the quaternions that include the reals (as a real subspace) are equal up to isomorphism, which leads us to the conclusion that the reals in W(A) are in fact part of the (star) center of the numerical range.
As mentioned before, the general reason to focus on the bild is that the whole numerical range can be reconstructed from it by using similarity classes. Our result is in line with the elements of the bild being the building blocks of the numerical range. In fact, we prove in theorem 3.8 that the center of the numerical range is given by the similarity classes of the center of the bild. Therefore, we only need to know the center of the bild, and then to build the similarity classes to obtain the center of the numerical range. When the matrix is non-hermitian the upper center (likewise for the lower center) is the region of the upper bild limited by two lines. These two lines are the tangents to the curve defining the boundary of the lower bild at the reals, see theorems 4.1, 4.3 and corollary 4.5. As a consequence of these results we establish a new proof of the important theorem by Au-Yeung [AY1, theorem 3], which establishes a necessary and sufficient condition for convexity of the numerical range, see corollary 4.4. We conclude with an example where we explicitly compute the center.

Preliminaries
The quaternionic skew-field H is an algebra of rank 4 over R with basis {1, i, j, k}, where the product is given by i^2 = j^2 = k^2 = ijk = −1. For any q = a_0 + a_1 i + a_2 j + a_3 k ∈ H we denote by q_r = a_0 and q_v = a_1 i + a_2 j + a_3 k the real and imaginary parts of q, respectively. Let the pure quaternions be P = span_R{i, j, k}. The conjugate of q is given by q* = q_r − q_v and the norm is defined by |q|^2 = qq*. Two quaternions q, q′ ∈ H are called similar if there exists a unitary quaternion s such that s* q′ s = q. Similarity is an equivalence relation and we denote by [q] the equivalence class containing q. A necessary and sufficient condition for the similarity of q and q′ is given by q_r = q′_r and |q_v| = |q′_v|, see [R, theorem 2.2.6]. We will denote the set of all equivalence classes of the elements of a set X ⊆ H by [X].
Let H^n be the n-dimensional H-space. The norm of x ∈ H^n is given by |x|^2 = x* x. The disk with center a ∈ H^n and radius r > 0 is the set D_{H^n}(a, r) = {x ∈ H^n : |x − a| ≤ r} and its boundary is the sphere S_{H^n}(a, r). In particular, if a = 0 and r = 1, we simply write D_{H^n} and S_{H^n}. With this notation, the group of unitary quaternions is S_H, whereas S_P denotes the unit sphere over the pure quaternions.
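The similarity criterion above can be checked numerically. The following is a minimal sketch, assuming quaternions are stored as 4-tuples (a_0, a_1, a_2, a_3); the helper names `qmul`, `conj`, `vnorm`, `are_similar` and `upper_representative` are hypothetical and introduced only for illustration.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def conj(q):
    """Conjugate q* = q_r - q_v."""
    return (q[0], -q[1], -q[2], -q[3])

def vnorm(q):
    """Norm |q_v| of the imaginary part."""
    return math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)

def are_similar(p, q, tol=1e-12):
    """Criterion quoted from [R, theorem 2.2.6]: equal real parts, equal |q_v|."""
    return abs(p[0] - q[0]) <= tol and abs(vnorm(p) - vnorm(q)) <= tol

def upper_representative(q):
    """Representative q_r + i|q_v| of the class [q] in the closed upper half-plane."""
    return (q[0], vnorm(q), 0.0, 0.0)

q = (1.0, 2.0, -2.0, 1.0)
s = (0.5, 0.5, 0.5, 0.5)             # a unitary quaternion, |s| = 1
rotated = qmul(qmul(conj(s), q), s)  # s* q s, which must be similar to q
```

Here `rotated` has the same real part and the same imaginary norm as `q`, so both collapse to the same representative in C^+.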
Let M_n(H) be the set of all n × n matrices with entries over H. The set W(A) = {x* A x : x ∈ S_{H^n}} is called the quaternionic numerical range of A in H. From the above definition we see that the quaternionic numerical range of A ∈ M_n(H) is the subset of H containing the images of the quadratic function f_A(x) = x* A x over the quaternionic unit sphere, x ∈ S_{H^n}. The numerical range is invariant under unitary equivalence, i.e. W(U* A U) = W(A) for every unitary U ∈ M_n(H).
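To make the definition concrete, the sketch below samples values of f_A(x) = x* A x over random unit vectors. It is only an illustration under stated assumptions: quaternions are 4-tuples, matrices are nested lists of such tuples, and the names `f_A`, `random_unit_vector` and `sample_numerical_range` are hypothetical. Random sampling gives a finite subset of W(A), not the set itself.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def random_unit_vector(n, rng):
    """Random x in H^n with |x| = 1, drawn from a normalized Gaussian in R^(4n)."""
    coords = [rng.gauss(0.0, 1.0) for _ in range(4 * n)]
    norm = math.sqrt(sum(c * c for c in coords))
    coords = [c / norm for c in coords]
    return [tuple(coords[4 * k:4 * k + 4]) for k in range(n)]

def f_A(A, x):
    """Quadratic form x* A x = sum over r, c of conj(x_r) * A[r][c] * x_c."""
    total = [0.0, 0.0, 0.0, 0.0]
    for r in range(len(x)):
        for c in range(len(x)):
            term = qmul(qmul(conj(x[r]), A[r][c]), x[c])
            total = [t + u for t, u in zip(total, term)]
    return tuple(total)

def sample_numerical_range(A, count, seed=0):
    """Finite sample of points of W(A)."""
    rng = random.Random(seed)
    return [f_A(A, random_unit_vector(len(A), rng)) for _ in range(count)]

ZERO = (0.0, 0.0, 0.0, 0.0)
A_herm = [[(1.0, 0.0, 0.0, 0.0), ZERO], [ZERO, (3.0, 0.0, 0.0, 0.0)]]
samples = sample_numerical_range(A_herm, 50)
```

For the hermitian matrix diag(1, 3) used here, every sampled value is real (up to rounding) and lies in [1, 3], in agreement with the statement that W(A) is contained in R when A is hermitian.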
It is well known that if q ∈ W(A) then [q] ⊆ W(A), see [R, page 38]. This means that if q_1 ∼ q_2 and q_2 ∈ W(A) then q_1 ∈ W(A). For simplicity we just say that q_1 belongs to W(A) by similarity. Therefore, it is enough to study the subset of complex elements in each similarity class. This set is known as B(A), the bild of A: B(A) = W(A) ∩ C. We will freely use both notations B(A) and W(A) ∩ C for the bild of A. Although the bild may not be convex, the upper bild B^+ = W(A) ∩ C^+ is always convex, see [ST]. Analogously, the lower bild B^− = W(A) ∩ C^− is also always convex.
For p ∈ P, let span{1, p}^+ = {α + βp : α ∈ R, β ∈ R^+_0}. For any w ∈ W(A) and p ∈ P, let w_(p) be the representative of the class [w] in span{1, p}^+. For h_0, h_1 ∈ H we will denote by [h_0, h_1] the set of convex linear combinations of h_0 and h_1: [h_0, h_1] = {αh_0 + (1 − α)h_1 : α ∈ [0, 1]}. For simplicity, we refer to the star-center of a set as the center.

Star-shapedness of the bild and numerical range
The upper bild and the bild fully specify the numerical range, but the former is considered better suited to represent the quaternionic numerical range. This is not only because it is convex but also because it has the advantage of containing one single element from each similarity class. In a sense, the upper bild can be interpreted as the set of equivalence classes for the similarity relation ∼, that is, the quotient set B^+ = W/∼. However, from the convexity of the upper bild we cannot infer the convexity of the numerical range, as the former is always convex and the latter is not.
The first result of this paper relates the convexity of the bild with the convexity of the numerical range. This is a known result (see [Zh, page 53]); however, we present a different proof based on elementary properties of the numerical range.
We will prove that c_(i) ∈ B^+, thus proving by similarity that c ∈ W(A). Since the upper bild is convex, By similarity, ω* ∈ B^−. Note that c_(i) = c_(i),r + c_(i),v = c_r + i|c_v|. From (3.2), c_(i),r = ω_r = ω*_r and from (3.1),
Since the numerical range is invariant under unitary equivalence, we can work with U* A U, that can be written in the form We claim that 0 ∈ W_H(S). To prove this we will find a vector x ∈ S_{H^n} such that f_S(x) = 0. Let x_3 = ... = x_n = 0, then take z_1 and z_2 in S_H such that q_1 = z_1* s_1 z_1 ∈ C^+ and q_2 = z_2* s_2 z_2 ∈ C^−. The quaternions q_1 and q_2 are either zero or the representatives of s_1 in C^+ and s_2 in C^−, respectively. Thus they are pure complex. Finally, choose β ∈ [0, 1] such that βq_1 + (1 − β)q_2 = 0. Take x_1 = β^{1/2} z_1 and x_2 = (1 − β)^{1/2} z_2. Then the vector x ∈ S_{H^n} satisfies the stated conditions. It is now clear that W(A) ∩ R ≠ ∅. In fact, take vector x and compute We have proved the following result.¹
From now on, we fix a matrix with quaternionic entries, A ∈ M_n(H), and we denote the quaternionic numerical range of A simply by W = W(A).
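The choice of β in the argument above can be made explicit. The following worked equation is a sketch under one assumption not spelled out in the surviving text: that "pure complex" means q_1 and q_2 have zero real part, so that q_1 = ia and q_2 = −ib with a, b ≥ 0 (the symbols a and b are introduced here only for illustration).

```latex
% Assumption: q_1 = i a \in \mathbb{C}^+ and q_2 = -i b \in \mathbb{C}^-, with a, b \ge 0.
\[
  \beta q_1 + (1-\beta) q_2 = i\bigl(\beta a - (1-\beta) b\bigr) = 0
  \quad\Longleftrightarrow\quad
  \beta (a + b) = b .
\]
% If a + b > 0, this gives
\[
  \beta = \frac{b}{a+b} \in [0,1],
  \qquad
  1 - \beta = \frac{a}{a+b},
\]
% and if a = b = 0 every \beta \in [0,1] works.
% The resulting vector is a unit vector, since |z_1| = |z_2| = 1:
\[
  |x|^2 = |x_1|^2 + |x_2|^2 = \beta\,|z_1|^2 + (1-\beta)\,|z_2|^2 = 1 .
\]
```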
Let q_1, q_2 ∈ S_P. We say that an element a_1 ∈ span{1, q_1} is ∼̇-similar to a_2 ∈ span{1, q_2} if, and only if, for some r, s ∈ R, a_1 = r + s q_1 and a_2 = r + s q_2, in which case we write a_1 ∼̇ a_2. We say that A_1 ⊆ span{1, q_1} and A_2 ⊆ span{1, q_2} are ∼̇-similar, and denote it by A_1 ∼̇ A_2, if, and only if, for any a_1 ∈ A_1 there is an a_2 ∈ A_2 such that a_1 ∼̇ a_2, and vice versa. When two sets are ∼̇-similar they share some properties, namely convexity. In fact, if A_1 is convex we can conclude that A_2 is convex. Take any a_2, ã_2 ∈ A_2. Then there are a_1, ã_1 ∈ A_1 such that a_1 ∼̇ a_2 and ã_1 ∼̇ ã_2. For any α ∈ [0, 1] it is a matter of simple calculations to note that αa_1 + (1 − α)ã_1 ∼̇ αa_2 + (1 − α)ã_2. Therefore A_2 is also convex. A similar argument proves that the centers are ∼̇-similar for any two ∼̇-similar sets A_1 and A_2, since whenever a segment is in A_1 the ∼̇-similar segment must be in A_2. That is,
Lemma 3.3. For any q_1, q_2 ∈ S_P we have:
Proof. The numerical range is such that, by similarity, W_(q_1) ∼̇ W_(q_2), for any q_1, q_2 ∈ S_P. It is also an immediate consequence of the closedness of the numerical range under similarity that W_(q_1)^+ ∼̇ W_(q_2)^+, for any q_1, q_2 ∈ S_P. It is known that the upper bild W_(i)^+ = B^+ is convex, thus from the previous discussion we have that W_(q)^+ is also convex for any q ∈ S_P. Moreover from
¹ This result has apparently been known for some time, as it appears in the thesis [Siu], supervised by Au-Yeung; however, to the best of our knowledge, it has never been published. In spite of this, Au-Yeung in [AY1, corollary 1], apropos of the connectedness of W_H ∩ R and citing a result from [J], states the possibility of W_H ∩ R = ∅. This possibility is also stated by [Zh, theorem 9.2] and [K, corollary 2.10], repeating again the same result by [J] (although [K] does not cite it).
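The "simple calculations" behind the transfer of convexity between ∼̇-similar sets can be written out as follows. This is a sketch that uses only the definition of ∼̇-similarity given in the text; the tilded coefficients are illustrative names for the coordinates of the second pair of elements.

```latex
% Write a_j = r + s q_j and \tilde a_j = \tilde r + \tilde s q_j for j = 1, 2,
% with r, s, \tilde r, \tilde s \in \mathbb{R}. For \alpha \in [0,1]:
\[
  \alpha a_j + (1-\alpha)\tilde a_j
  = \bigl(\alpha r + (1-\alpha)\tilde r\bigr)
  + \bigl(\alpha s + (1-\alpha)\tilde s\bigr)\, q_j ,
  \qquad j = 1, 2 .
\]
% Both combinations have the same real coefficients, hence
\[
  \alpha a_1 + (1-\alpha)\tilde a_1 \;\dot{\sim}\; \alpha a_2 + (1-\alpha)\tilde a_2 .
\]
% If A_1 is convex, the left-hand side lies in A_1. Since 1 and q_2 are linearly
% independent over \mathbb{R}, the element of \mathrm{span}\{1, q_2\} that is
% \dot{\sim}-similar to it is unique, so it is the right-hand side, which therefore lies in A_2.
```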
As a consequence of this lemma we only need to study the center of one of the sets W_(q), and the natural choice is to take q = i, that is, we only need to study the center of the bild B = W_(i).
Theorem 3.4. The quaternionic numerical range W is star-shaped and its real elements belong to its center.
The numerical range W(A) is contained in R if, and only if, A is hermitian, see [R, corollary 3.5.3]. The next result follows trivially from theorem 3.4.
Lemma 3.6. The center of the bild is closed under conjugation, i.e. if c ∈ C(W ∩ C), then c* ∈ C(W ∩ C).
Let ω be any element of the bild, ω ∈ W ∩ C.
We now establish the equality between the center of the bild and the complex part of the center of the numerical range.
We can assume, without loss of generality, that c = c_(i) ∈ C^+. Since any quaternion y can be written as the sum of a real with a pure quaternion, we may write y = y_r + |y_v| q, with q ∈ S_P. We have: By similarity, it is enough to prove that y_(i) ∈ B^+. With this purpose, we will find two elements a, b ∈ B^+ such that In this case, by convexity of the upper bild, The conclusion that b ∈ B^+ follows from the fact that w_(i), c_(i) ∈ B^+, which is a convex set. If . We now need to check that a and b are in W and satisfy conditions (3.3). It is trivial to conclude that the real parts are all equal. On the other hand, To conclude that |a_v| ≤ |y_v| we will use the Cauchy-Schwarz inequality. If we regard a quaternion q ∈ H as a vector in R^4, its norm is given by ⟨q, q⟩ = |q|^2, where ⟨·, ·⟩ is the usual inner product in real vector spaces. Then we have: Since |c_v| = |c_(i),v| and |w_v| = |w_(i),v|, we have: Using the equality (αc_(i) + (1 − α)w* It remains to prove that a ∈ W. If a = αc_(i) + (1 − α)w*_(i), by hypothesis c_(i) ∈ C(W ∩ C) and w*_(i) ∈ W ∩ C, then any convex combination of them is also in W ∩ C. If a = αc*_(i) + (1 − α)w_(i) then a ∈ W, because c*_(i) ∈ C(W ∩ C) by lemma 3.6, and w_(i) ∈ W ∩ C.
The next result establishes the relation between the center of the numerical range C(W) and the center of the bild C(W ∩ C).
Theorem 3.8. The center of the numerical range is such that C(W) = [C(W ∩ C)].
Proof. Let c ∈ C(W). For some q ∈ S_P, we have c ∈ C(W) ∩ span{1, q}. Using reasoning similar to that of the proof of proposition 3.7, we can show that C(W) ∩ span{1, q} = C(W_(q)). Now, c ∈ C(W) if and only if c ∈ C(W_(q)) for some q ∈ S_P. By lemma 3.3, C(W_(q)) ∼̇ C(W_(i)). We conclude that C(W) = [C(W ∩ C)].
If we use the fact that W is the set of all elements similar to those in W ∩ C, that is, W = [W ∩ C], the above result can be written in the following way: C([W ∩ C]) = [C(W ∩ C)]. In other words, the operations of taking the center and of taking the equivalence classes of a numerical range commute.

Characterization of the center of the bild
We now know that it is possible to characterize the center of the numerical range from the center of the bild. On the other hand, lemma 3.6 guarantees that the lower part of the center of the bild is the conjugate of the upper part,
(4.1) C^− = C(W) ∩ C^− = (C^+)* = (C(W) ∩ C^+)*,
and we conclude that to determine C(W) we only need to know C^+. From corollary 3.5, we may focus only on non-hermitian matrices.
By the convexity of the upper bild, the segment joining any two elements in the upper bild is contained in it. Therefore an element of the upper bild is not in the center if and only if a convex combination with an element in the lower bild is not in the bild. That is, an element ω ∈ W ∩ C^+ is not in the center of the bild, ω ∉ C(W ∩ C), if and only if there is z ∈ W ∩ C^− such that the segment connecting the two is not contained in the bild, i.e. [ω, z] ⊄ W ∩ C.
The argument we will use is built upon the fact that a segment joining two elements of the bild is not totally contained in the bild if and only if it crosses the reals outside of it. Thus, either an element ω of the upper bild has all its segments [ω, z], for z ∈ W ∩ C^−, crossing the real line inside the bild, that is, [ω, z] ∩ R ⊆ B, in which case ω is in the center, or there is one of these segments that crosses the real line outside the bild, and the element ω is not in the center.
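The crossing test described above reduces to a one-line computation. The sketch below assumes points of the bild are given as coordinate pairs (x, y) and that the real elements of the bild form the interval between two reals m ≤ M; the function names are hypothetical.

```python
def real_axis_crossing(omega, z):
    """x-coordinate where the segment [omega, z] meets the real axis.

    omega = (w1, w2) is assumed to lie strictly above the axis and
    z = (z1, z2) strictly below it, so the crossing point is unique.
    """
    (w1, w2), (z1, z2) = omega, z
    if not (w2 > 0 and z2 < 0):
        raise ValueError("omega must have y > 0 and z must have y < 0")
    t = w2 / (w2 - z2)          # parameter at which the y-coordinate vanishes
    return w1 + t * (z1 - w1)

def crosses_inside(omega, z, m, M):
    """True when [omega, z] meets the real axis inside the interval [m, M]."""
    return m <= real_axis_crossing(omega, z) <= M
```

For instance, with m = 0 and M = 3, the segment from (0, 1) to (2, −1) crosses the axis at x = 1, inside the interval, whereas the segment from (0, 1) to (−4, −1) crosses at x = −2, outside it.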
For the rest of this section we will slightly change notation and write z = x + iy as (x, y). Let m = min W ∩ R and M = max W ∩ R be the minimum and maximum of the real elements in the bild. Using the previous reasoning, but from a dual perspective, to find out if an element ω in the upper bild is in the center, we only need to see if the segments joining ω to (M, 0) and to (m, 0) intersect the interior of the lower bild or not. If one of them does, the element is not in the center. For instance, if the segment joining ω ∈ B^+ to (m, 0) intersects B^− at z in the lower part of the interior of the bild, then there is an element z̃ to the left of z such that the segment [ω, z̃] will cross the reals to the left of (m, 0), and therefore outside of the bild.
The next results formalize this intuitive argument. To reach this we will need to define for each ω ∈ C^+ two lines, one denoted l_ω connecting ω = (ω_1, ω_2) to (m, 0), and the other denoted L_ω connecting ω to (M, 0). Since the real points of the numerical range belong to the center (see theorem 3.4), it is enough to consider points ω = (ω_1, ω_2) with ω_2 > 0. The lines are given by l_ω : x = m + ((ω_1 − m)/ω_2) y and L_ω : x = M + ((ω_1 − M)/ω_2) y. Let y_m = min{π_{Span{i}}(B)} and y_M = max{π_{Span{i}}(B)}. By symmetry of the bild, y_M = −y_m. Since the matrix is non-hermitian, y_M > 0.
In the case where When m = M and π m < π M , we know that x 1 (·) ≤ x 2 (·) and x 1 (0) = In the case where a > b we have l(y) = m + ay > m + by = L(y), for y > 0.
Therefore, {(x, y) ∈ B^+ : l(y) ≤ x ≤ L(y)} = {(m, 0)} and this is, in fact, the upper center of B. We now consider a = b ≠ 0 (the case where a = b = 0 is the one where B = {m} × [y_m, y_M]). Since x_1(·) is convex and x_2(·) is concave we know that, using again [Roc, theorem 25.1], l(y) ≤ x_1(y) ≤ x_2(y) ≤ L(y). As a consequence of a = b we have that l = L and thus l(y) = x_1(y) = x_2(y), that is, the lower bild is a line segment, and we can write it as the set B^− = {(x, y) ∈ R^2 : x = m + ay, y_m ≤ y ≤ 0}. Since the upper bild is the conjugate of the lower bild, B^+ = {(x, y) ∈ R^2 : x = m − ay, 0 ≤ y ≤ y_M}. Then the intersection of B^+ and l = {(x, y) ∈ R^2 : x = m + ay, y ∈ R}, when a ≠ 0, is just {(m, 0)}. That is, {(x, y) ∈ B^+ : x = l(y)} = {(m, 0)} = C^+(B).
A simple observation on the slope of the lines l and L allows us to give a different proof of the known result of Au-Yeung (see [AY1, theorem 3]), which establishes an equivalent condition for the convexity of the quaternionic numerical range.
It is well known [Roc, theorem 23.1] that for any convex function f of a real variable and any fixed element y_1 in the domain of f, the function defined by y → (f(y_1) − f(y))/(y_1 − y) is increasing with y. Then the slope of any line that joins (f(y), y) and (f(y_1), y_1) in the graph of f, with y < y_1, does not exceed f′(y_1^−). Notice now that there is an element (π_m, y_{π_m}) in the lower bild. Using the previous conclusion when the convex function is x_1, the reference point is y_1 = 0 and x_1(y_{π_m}) = π_m < x_1(0) = m, we conclude from [Roc, theorem 23.1] that l has positive slope. For the case when π_m = m, since ε < 0 and m ≤ x_1(ε), the difference quotients (x_1(0) − x_1(ε))/(0 − ε) are nonpositive. Thus l has nonpositive slope.
Analogously, it can be shown that when M < π_M, L has negative slope and when M = π_M, L has nonnegative slope.
Proof. We begin by proving that if π_m ≠ m or π_M ≠ M, then the numerical range is non-convex. Suppose π_m < m (the case M < π_M is analogous). Let x = l(y) = m + ay be the left tangent line to x_1(·) at 0 as in (4.5).
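The monotonicity of the difference quotient can be seen on a small example. The sketch below uses an illustrative convex function f(y) = y^2 + |y| (not taken from the paper), which has a kink at the reference point y_1 = 0 with left derivative −1.

```python
def difference_quotient(f, y1, y):
    """Slope (f(y1) - f(y)) / (y1 - y) of the chord between y and y1."""
    return (f(y1) - f(y)) / (y1 - y)

def f(y):
    """Illustrative convex function with left derivative -1 at y = 0."""
    return y * y + abs(y)

ys = [-2.0, -1.5, -1.0, -0.5]          # points approaching y1 = 0 from the left
quotients = [difference_quotient(f, 0.0, y) for y in ys]
# quotients == [-3.0, -2.5, -2.0, -1.5]
```

The quotients increase as y approaches 0 from the left and never exceed the left derivative −1, which is the property used to determine the sign of the slope of l.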
Analogously, we can show that x ≤ M ≤ L(y). From theorem 4.3 we have that (x, y) ∈ C (B). Since (x, y) is arbitrary, we have that C (B) = B is convex and from theorem 3.1, W is convex.
An interesting case, where the center is a kite, is when π m < m and M < π M . The next corollary proves this result.
Proof. When π_m < m, as we have noticed in (4.6), l has positive slope. Similarly, we can show that L has negative slope. Since l passes through (m, 0) and L through (M, 0), l and L must cross at a point in C^+. Let this point be ω̃. The result follows from theorem 4.3.
The following example illustrates a case where the center is a kite.
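The crossing point of the two lines can be computed directly. The sketch below assumes, as a reading of the surrounding text, that the lines are l(y) = m + ay and L(y) = M + by with a > 0 > b and m < M; the function names are hypothetical. The wedge test only checks the condition l(y) ≤ x ≤ L(y); membership in the upper bild has to be verified separately.

```python
from fractions import Fraction

def kite_apex(m, M, a, b):
    """Intersection of x = m + a*y and x = M + b*y, assuming a > 0 > b and m < M."""
    if not (a > 0 > b and m < M):
        raise ValueError("expected a > 0 > b and m < M")
    y = (M - m) / (a - b)      # solves m + a*y = M + b*y, so y > 0
    return (m + a * y, y)

def in_wedge(point, m, M, a, b):
    """True when (x, y) has y >= 0 and lies between the two lines."""
    x, y = point
    return y >= 0 and m + a * y <= x <= M + b * y

apex = kite_apex(Fraction(0), Fraction(2), Fraction(1), Fraction(-1))
```

With the illustrative values m = 0, M = 2, a = 1 and b = −1, the apex is (1, 1); reflecting the wedge in the real axis gives the lower half of the kite, by the conjugation symmetry of the center.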
Example. Following [ST, page 318], let A be the 2 × 2 matrix with rows (k_1 i, α) and (−α, 1 + k_2 i), with α, k_1, k_2 ∈ R^+ and α^2 > k_1 k_2. In this case, the boundary of the lower bild B^− consists of an ellipse E and the segment [m, M] × {0}, where (m, 0) and (M, 0) are the points where E intersects the real axis (the notation in [ST] is m = T_1 and M = T_2). Our aim is to describe the center of the bild of A.
Moreover, we know that the vertical lines x = 0 and x = 1 are tangent to the ellipse at (0, −k_1) and (1, −k_2), respectively. These data fully characterize the ellipse E. Therefore, if we substitute those points in the general equation Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, we obtain a homogeneous system of six linear equations with six unknowns. From formulas (4.8) one concludes that the linear system's matrix has rank 5. Solving the linear system leads to the following characterization of E:
(4.9) x^2 + 2(k_2 − k_1)(mM/k_1^2) xy + (mM/k_1^2) y^2 − (M + m)x + (2mM/k_1) y + mM = 0.
It is now possible, albeit through a tedious computation, to define the lines l and L as in theorem 4.3 and characterize C(B).
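Some of the defining data of the ellipse can be checked directly against equation (4.9). The sketch below reads the fractions in (4.9) as mM/k_1^2 and 2mM/k_1, and uses arbitrary illustrative values for m, M, k_1 and k_2 (they are not computed from formulas (4.8), so only the conditions at the real axis and at x = 0 are tested). The function names are hypothetical.

```python
from fractions import Fraction

def conic_4_9(x, y, m, M, k1, k2):
    """Left-hand side of equation (4.9)."""
    c = m * M / k1 ** 2
    return (x ** 2 + 2 * (k2 - k1) * c * x * y + c * y ** 2
            - (M + m) * x + 2 * m * M / k1 * y + m * M)

def vertical_slice(x0, m, M, k1, k2):
    """Coefficients (a, b, c) of (4.9) as a quadratic a*y^2 + b*y + c on x = x0."""
    c = m * M / k1 ** 2
    return (c,
            2 * (k2 - k1) * c * x0 + 2 * m * M / k1,
            x0 ** 2 - (M + m) * x0 + m * M)

# Illustrative values only; exact rational arithmetic avoids rounding issues.
m, M, k1, k2 = Fraction(1, 5), Fraction(4, 5), Fraction(1), Fraction(2)
a, b, c = vertical_slice(Fraction(0), m, M, k1, k2)
discriminant = b * b - 4 * a * c   # zero exactly when x = 0 is tangent to the conic
```

On y = 0 the equation reduces to (x − m)(x − M) = 0, and on x = 0 it reduces to mM(y/k_1 + 1)^2 = 0, a double root at y = −k_1, which is the tangency at (0, −k_1) stated in the text.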
We conclude that the center of the bild of A = |y| .