1 Introduction
Let T be an invertible bounded operator on a complex Banach space X. In numerical analysis, it is a matter of interest to estimate the quantity $CN(T):=\Vert T\Vert \Vert T^{-1}\Vert $, called the condition number of T. Roughly speaking, the condition number of T measures the greatest loss of precision that the linear system $Tx=y$ can exhibit over all inputs and their potential errors. The condition numbers of matrices are also intimately related to many problems from matrix analysis, such as the study of the distribution of the eigenvalues of classical random matrices appearing in mathematical statistics. We refer the reader to, e.g., [Reference Quarteroni, Sacco and Saleri10, Chapter 3] or [Reference Eidelman3, Reference Smale15], and the references therein.
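As a concrete illustration of the definition of $CN(T)$, the following minimal Python sketch computes the condition number of a $2\times 2$ matrix with respect to the operator norm induced by the $\ell ^{1}$-norm (the largest absolute column sum). The matrix and the helper names `l1_operator_norm`, `inverse_2x2`, and `condition_number` are illustrative choices made here, not objects taken from the references.

```python
from fractions import Fraction


def l1_operator_norm(m):
    """Operator norm induced by the l^1 vector norm: largest absolute column sum."""
    rows, cols = len(m), len(m[0])
    return max(sum(abs(m[i][j]) for i in range(rows)) for j in range(cols))


def inverse_2x2(m):
    """Exact inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def condition_number(m):
    """CN(T) = ||T|| * ||T^{-1}|| for the l^1-induced operator norm."""
    return l1_operator_norm(m) * l1_operator_norm(inverse_2x2(m))


# Illustrative matrix (an arbitrary choice), with exact rational arithmetic.
T = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
cn = condition_number(T)
print(cn)  # 21
```

Here $\Vert T\Vert =6$ and $\Vert T^{-1}\Vert =7/2$, so $CN(T)=21$. Exact fractions are used only to keep the arithmetic free of rounding.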
Let us denote by $\mathcal {L}(X)$ the algebra of all bounded operators on X, and for $n\in {\mathbb {N}}$ , let the notation $\mathcal {B}(n)$ (resp. $\mathcal {H}(n)$ ) stand for the set of all n-dimensional complex Banach spaces (resp. Hilbert spaces). When X belongs to $\mathcal {H}(n)$ , one can easily deduce from the polar decomposition that
where $\lambda _{i}(T)$, $i=1,\ldots ,n$, are the eigenvalues of T counting multiplicity. This inequality is clearly sharp since equality occurs if T is the $n\times n$ identity matrix. When turning to the Banach setting, the analogous problem becomes more involved and was first considered in the 70s by B. L. Van der Waerden, W. A. Coppel, and J. J. Schäffer. In view of (1.1), it is natural to seek estimates of the quantity
A normalization reduces this problem to that of estimating the norm of the inverse of contractive invertible operators (that is, those with norm less than or equal to $1$ ). Namely, denoting by $\mathcal {C}(X)\subset \mathcal {L}(X)$ the class of contractive matrices on X, $C(n)$ coincides with the quantity
which turns out to be easier to handle. In 1970, Schäffer [Reference Schäffer14] proved the general estimate $C(n)\leq \sqrt {en}$ , and showed that the inequality
holds for invertible contractions T acting on ${\mathbb {C}}^{n}$ endowed with the $\ell ^{1}$-norm or with the $\ell ^{\infty }$-norm (and that (1.4) is sharp in both cases). This led him to make the conjecture, nowadays known as Schäffer’s conjecture, that $C(n)=2$, $n\in {\mathbb {N}}$. The latter was disproved first by Gluskin, Meyer, and Pajor [Reference Gluskin, Meyer and Pajor4], J. Bourgain [Reference Gluskin, Meyer and Pajor4], and later by Queffélec [Reference Queffélec11], who obtained that $C(n)\geq c\sqrt {n}$ for some absolute constant $c>0$, proving that the initial upper estimate given by Schäffer is optimal (see below for more precise statements with respect to our motivations).
In order to introduce our motivations and for the sake of general interest, we shall briefly survey recent extensions and refinements of the solutions to Schäffer’s conjecture that have been developed in the light of new theoretical approaches, and that are natural to the study of the condition number of matrices. This requires introducing some notation, which we will keep to a minimum.
First of all, it is natural to ask how the constant $C(n)$ behaves if, in its equivalent definition (1.3), one replaces the class of all invertible contractions by some other classical set of invertible operators. Generally speaking, let $\mathcal {P}$ be a property, and for $T\in \mathcal {L}(X)$, let the notation $T\in \mathcal {P}$ mean that the operator T satisfies the property $\mathcal {P}$. We set
Thus, for example, $C(n)=C(n,\mathcal {C})$ if $T\in \mathcal {C}$ means “T is a contraction.” For properties $\mathcal {P}$ that define too large classes of operators, it may happen that $C(n,\mathcal {P})=\infty $. It is obviously the case if $\mathcal {P}$ is satisfied by any invertible operator acting on any Banach space (indeed, consider all the matrices $\lambda I_{n}$, $|\lambda |>1$, where $I_{n}$ is the identity matrix of size n). In this situation, studying $C(n,\mathcal {P})$ is not relevant for estimating condition numbers. By contrast, if $\mathcal {P}$ is a property for which $C(n,\mathcal {P})<\infty $, then by definition we have
for any $T\in \mathcal {P}$ . In this case, estimating $C(n,\mathcal {P})$ becomes especially relevant for properties $\mathcal {P}$ that are satisfied by operators that are not necessarily contractions since, for such operators, it may give upper estimates of condition numbers that are better than that given by Schäffer’s inequality (the latter is valid for any invertible operator).
In 2005, Nikolski [Reference Nikolski8] developed a new approach to achieve estimates of $C(n,\mathcal {P})$ by using functional calculus on spaces of functions holomorphic on the unit disk ${\mathbb {D}}:=\{z\in {\mathbb {C}}:\,|z|<1\}$. The idea is that for certain properties $\mathcal {P}$ and any Banach space X, an invertible operator $T\in \mathcal {L}(X)$ satisfies $\mathcal {P}$ if and only if $\Vert f(T)\Vert \leq K\Vert f\Vert _{A_{\mathcal {P}}}$ for any f in some algebra $A_{\mathcal {P}}$ of holomorphic functions on $\mathbb {D}$ and for some absolute constant $K>0$. For brevity, we will say in this case that property $\mathcal {P}$ obeys an $A_{\mathcal {P}}$-functional calculus. Then, for such properties $\mathcal {P}$, estimating $\Vert T^{-1}\Vert $ from above for $T\in \mathcal {P}$ reduces to estimating $\Vert f\Vert _{A_{\mathcal {P}}}$ or $\Vert f\Vert _{A_{\mathcal {P}}/m_{T}A_{\mathcal {P}}}$ from above for any $f\in A_{\mathcal {P}}$ satisfying the Bézout identity $zf+m_{T}h=1$, where $m_{T}$ is the minimal polynomial of T and h is any function in $A_{\mathcal {P}}$. For example, it is well known and easily seen that the property $\mathcal {C}$ of being a contraction obeys a W-functional calculus (with constant 1) where W refers to the Wiener space consisting of all functions f analytic in ${\mathbb {D}}$ such that $\Vert f\Vert _{W}:=\sum _{k\geq 0}|\hat {f}(k)|<\infty $, and where $\hat {f}(k)$ denotes the kth Taylor coefficient of f. It is also readily checked that given $K\geq 1$, the property $\mathcal {PB}_{K}$ satisfied by all Banach space power bounded operators T such that $\sup _{n}\Vert T^{n}\Vert \leq K$ obeys a W-functional calculus (with constant K), which leads to the same estimates for $C(n,\mathcal {PB}_{K})$ and $C(n,\mathcal {C})$, up to an absolute constant. We also refer the reader to [Reference Szehr18, Paragraph 2.3] for more explanations of Nikolski’s strategy described above.
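The W-functional calculus for contractions can be checked numerically on polynomials: if $\Vert T\Vert \leq 1$, then $\Vert p(T)\Vert \leq \sum _{k}|\hat {p}(k)|=\Vert p\Vert _{W}$ by the triangle inequality and submultiplicativity of the norm. The sketch below does this for the operator norm induced by the $\ell ^{1}$-norm. The matrix `T`, the coefficient list `coeffs`, and all helper names are hypothetical choices made for this illustration.

```python
def l1_operator_norm(m):
    """Operator norm induced by the l^1 vector norm: largest absolute column sum."""
    n = len(m)
    return max(sum(abs(m[i][j]) for i in range(n)) for j in range(n))


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def poly_of_matrix(coeffs, t):
    """Evaluate p(T) = sum_k coeffs[k] * T^k with Horner's scheme."""
    n = len(t)
    result = [[0.0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, t)
        for i in range(n):
            result[i][i] += c
    return result


# Illustrative contraction for the l^1 operator norm: each absolute column sum is at most 1.
T = [[0.5, -0.3, 0.0],
     [0.2, 0.4, 0.6],
     [-0.3, 0.3, -0.4]]
coeffs = [1.0, -2.0, 0.5, 3.0]  # p(z) = 1 - 2z + 0.5 z^2 + 3 z^3 (hypothetical test polynomial)

norm_T = l1_operator_norm(T)
wiener_norm = sum(abs(c) for c in coeffs)  # ||p||_W, the l^1 norm of the Taylor coefficients
norm_pT = l1_operator_norm(poly_of_matrix(coeffs, T))
print(norm_T, norm_pT, wiener_norm)
```

The same computation with a power bounded matrix in place of `T` would illustrate the constant K appearing in the $\mathcal {PB}_{K}$ case.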
Another standard class of operators obeying a functional calculus over a function algebra is that of Kreiss operators. It is defined, for $K\geq 1$, by the property $\mathcal {K}_{K}$ given for any Banach space operator T by $T\in \mathcal {K}_{K}$ if and only if the following resolvent estimate holds:
$$ \Vert (\zeta -T)^{-1}\Vert \leq \frac {K}{|\zeta |-1},\qquad |\zeta |>1. \tag{1.6} $$
For a given operator T, the infimum of all constants K satisfying (1.6) is called the Kreiss constant of T. This class of operators satisfying $\mathcal {K}_{K}$ is an important one in numerical analysis [Reference Nikolski9, Reference Spijker, Tracogna and Welfert17]. Note that every contraction is a Kreiss operator with Kreiss constant less than or equal to $1$ and that there exist Kreiss operators that are not contractive. In [Reference Nikolski8], Nikolski made use of the fact that $\mathcal {K}_{K}$ obeys a Besov functional calculus, according to a result by Vitse [Reference Vitse22], to obtain the following theorem.
Theorem 1.1
1. [Reference Nikolski8, Theorem 3.26] Let X be a complex Banach space, and let $T\in \mathcal {L}(X)$ be a Kreiss operator with Kreiss constant $K\geq 1$. Then
$$ \Vert T^{-1}\Vert \leq CK\frac {n}{\prod _{i=1}^{n}|\lambda _{i}|}, \tag{1.7} $$
where $(\lambda _{1},\ldots ,\lambda _{n})\in {\mathbb {D}}^n$ denotes the eigenvalues of T and C is an absolute constant.
2. (Particular case of [Reference Nikolski8, Theorem 3.31]) For any $(\lambda _{1},\ldots ,\lambda _{n})\in {\mathbb {D}}^{n}$, there exists an invertible Kreiss operator T with eigenvalues $(\lambda _{1},\ldots ,\lambda _{n})$ such that
$$ \left \Vert T^{-1}\right \Vert \geq c\frac {n}{\prod _{i=1}^{n}|\lambda _{i}|}\left (c'-\prod _{j=1}^{n}\left (1+\left |\lambda _{j}\right |\right )\right ), \tag{1.8} $$
where $c>0$ and $c'> 1$ are absolute constants.
The first part of this theorem implies
for any $T\in \mathcal {K}_{K}$ . In passing, we point out that for any given $K\geq 1$ , there exists an absolute constant $C'$ such that $\Vert T\Vert \leq C'K$ for any $T\in \mathcal {K}_{K}$ (to see it, one can use the Riesz–Dunford functional calculus). Moreover, letting the $\lambda _{i}$ ’s tend to $0$ sufficiently fast in (1.8), one can deduce that for fixed K, the inequality $C(n,\mathcal {K}_{K})\leq CKn$ is made sharp with respect to $n\rightarrow \infty $ for sequences of operators with spectrum shrinking to $0$ . It follows from the approach used by Nikolski that these extremal operators are model operators acting on model spaces (with adequate norms). Then three questions naturally arise.
Question 1 Is it possible to find operators with fixed degenerate spectrum that are extremal for $C(n,\mathcal {K}_{K})\leq CKn$ , $n\to \infty $ ?
Question 2 Can one choose these operators among classes of structured matrices (e.g., Toeplitz matrices)?
Question 3 Is the inequality $C(n,\mathcal {K}_{K})\leq CKn$ sharp simultaneously when K and n go to $\infty $ ?
The aim of this note is to propose a solution to these three questions. For contractions, the same questions have already been investigated for a long time. A brief exposition of these investigations and a statement of our results require the introduction of the following natural quantities:
• $C_{n}^{\mathcal {H}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {H}(n),\:r_{\min }(T)\geq r\right \} $;
• $C_{n}^{\mathcal {B}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {B}(n),\:r_{\min }(T)\geq r\right \} $;
• $\tau _{n}^{\mathcal {H}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P}\cap \mathcal {T},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {H}(n),\:r_{\min }(T)\geq r\right \} $;
• $\tau _{n}^{\mathcal {B}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P}\cap \mathcal {T},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {B}(n),\:r_{\min }(T)\geq r\right \} $,
where $r_{\min }(T)$ stands for the smallest modulus of the eigenvalues of T, where $\mathcal {T}$ denotes the property of being a Toeplitz operator, and where the notation $\mathcal {P}\cap \mathcal {P}'$ means that $\mathcal {P}$ and $\mathcal {P}'$ are simultaneously satisfied.
If $\mathcal {P}=\mathcal {C}$, and in the Hilbert setting, it is known that for any $r\in (0,1)$, $C_{n}^{\mathcal {H}}(r,\mathcal {C})=r^{-n}$. This result probably dates back to the 19th century and is sometimes attributed to L. Kronecker. A description of matrices achieving the supremum was obtained by Nikolski [Reference Nikolski8, Theorem 2.1], using the abovementioned functional calculus approach. In the Banach setting, none of the results given in [Reference Gluskin, Meyer and Pajor4, Reference Queffélec12] leads to sharp estimates of $C_{n}^{\mathcal {B}}(r,\mathcal {C})$. Indeed, while Schäffer’s inequality [Reference Schäffer14] gives the upper estimates
the lower estimates obtained in [Reference Gluskin, Meyer and Pajor4, Reference Queffélec11] are not enough to prove that (1.10) is sharp with respect to $r,n$ : Gluskin–Meyer–Pajor [Reference Gluskin, Meyer and Pajor4], Bourgain (see [Reference Gluskin, Meyer and Pajor4]), and Queffélec [Reference Queffélec11], respectively, stated
where c is an absolute constant. In all these estimates, it appears that r and n are correlated (in particular, the spectrum of the extremal operators is not fixed).
When $\mathcal {P}=\mathcal {K}_{K}$ , we recall that the first point of Theorem 1.1 gives that for any $r\in (0,1)$ ,
where C is an absolute constant, whereas the second one yields the asymptotic sharpness of (1.11) as $n\rightarrow \infty $ , only for $r\ll 1/n$ .
A somewhat natural approach to attack the question of exhibiting operators with fixed spectrum that are asymptotically extremal for (1.10) and (1.11) may consist in looking for such operators in classes of structured matrices such as Jordan blocks, which are special cases of Toeplitz matrices. More generally, Toeplitz or Hankel matrices, which play a crucial role in matrix analysis and operator theory, are natural candidates. Yet, in the Banach setting, the proofs given in [Reference Gluskin, Meyer and Pajor4, Reference Queffélec11] are far from providing explicit examples achieving $C(n,\mathcal {C})$, and a fortiori extremal Toeplitz matrices with degenerate spectrum. In the Hilbert setting, we recall that Nikolski characterized matrices that are extremal for $C_{n}^{\mathcal {H}}(r,\mathcal {C})$ for any r in [Reference Nikolski8] and asked for the existence of Toeplitz matrices satisfying this characterization. Szehr and the last author obtained in [Reference Szehr and Zarouf19] the equalities
for any $r\in (0,1)$ (see also [Reference Zarouf23] where the weaker estimate $\tau _{n}^{\mathcal {H}}(r,\mathcal {C})\geq 2^{-1}r^{-n}$ is obtained) and, in 2021, they proved that for some absolute constant $c>0$ ,
for any $r\in (0,1)$ and any n large enough, so that (1.10) is sharp. Moreover, they provided explicit examples of extremal Banach space Toeplitz operators with arbitrary degenerate spectrum [Reference Szehr and Zarouf20]. It is worth emphasizing that these extremal Toeplitz operators may be chosen with spectrum equal to $\left \{ \lambda \right \}$, with $\lambda $ arbitrary in $\mathbb {D}\setminus \left \{ 0\right \}$.
Our main contribution in this note is to follow their strategy and to obtain an analogous result for Kreiss operators, giving a solution to Questions 1 and 2. More precisely, we will prove the following result.
Theorem 1.2 (For a more precise statement, see Theorem 2.1)
For a fixed $K\geq 1$ , there exists an absolute constant $c>0$ such that
for any $r\in (0,1)$ and n large enough.
Moreover, the extremal Toeplitz matrices in the second inequality can be chosen with a degenerate spectrum arbitrary in $\mathbb {D}\setminus \left \{ 0\right \} $ .
This implies the sharpness of (1.11) as $n\to \infty $ for any $r\in (0,1)$ . With respect to numerical analysis, this theorem says that degenerate Toeplitz matrices may be ill-conditioned in high dimensions. The outline of the proof will be similar to that proposed by Szehr and Zarouf in [Reference Szehr and Zarouf20] for contractions, but the techniques will differ. We mention that Theorem 2.1 was announced in [Reference Charpentier, Fouchet, Szehr and Zarouf2], without a proof.
Our second result concerns the asymptotic sharpness of Nikolski’s upper bound (5.1) when K is permitted to grow unboundedly as $n\rightarrow \infty $ . It is indeed a natural question to ask whether the dependency on K in (5.1) can be improved or not. We will show that (5.1) is also sharp in the following sense: there exists a sequence of Jordan blocks $T_{n}\in \mathcal {L}(X),$ where $X=\left (\mathbb {C}^{n},\,\left |\!\left |\cdot \right |\!\right |{}_{l^{1}}\right )$ , such that $K\left (T_{n}\right )\rightarrow \infty $ as $n\to \infty $ and
for some absolute constant $c>0$ (see Proposition 4.1). We mention that the use of Jordan blocks as extremal matrices in the context of Kreiss matrices is rather classical (see, for example, [Reference van Dorsselaer, Kraaijevanger and Spijker21]).
The organization of the paper is as follows. The next section contains the prerequisites and the statement of Theorem 2.1. The latter is proved in the third section. Section 4 contains the proof of the asymptotic sharpness of (5.1) with respect to K and n (Proposition 4.1). In the last section, we present—for the interested reader—a short and simple proof of the upper estimate (5.1) in Theorem 1.1.
Notation Throughout the paper, we will use the notation $f\lesssim g$ meaning that $f\leq cg$ , where c is some absolute constant. The notation $f\simeq g$ will mean that $f\lesssim g$ and $g\lesssim f$ .
2 Background and statement of the main result
Let us denote by $H(\mathbb {D})$ the space of analytic functions in $\mathbb {D}$ and by $H^{\infty }$ the Banach algebra of bounded analytic functions in $\mathbb {D}$ , endowed with the supremum norm on $\mathbb {D}$ . The standard Hardy space $H^{2}$ is defined as the subspace of $H(\mathbb {D})$ consisting of those functions f for which
where m is the normalized Lebesgue measure on the unit circle $\mathbb {T}:=\left \{ z\in \mathbb {C}:\,|z|=1\right \} $ . Endowed with $\Vert \cdot \Vert _{H^{2}}$ , $H^{2}$ is a Hilbert space with inner product given by
By Fatou’s theorem, $H^{2}$ can be identified with the (Hilbert) subspace of $L^{2}(\partial \mathbb {D})$ so that $\left \langle f,g\right \rangle =\frac {1}{2\pi }\int _{-\pi }^{\pi }f(e^{it})\overline {g(e^{it})}dt$ , where f and g in the right-hand side of the last equality denote, without possible confusion, the almost everywhere radial limits of f and g, respectively.
Given $\lambda \in {\mathbb {D}}$, we denote by $b_{\lambda }$ the Blaschke factor associated with $\lambda $, namely
$$ b_{\lambda }(z):=\frac {z-\lambda }{1-\overline {\lambda }z}. $$
Let $\sigma =\{\lambda _{1},\ldots ,\lambda _{n}\}\subset {\mathbb {D}}$ be a finite sequence. We shall consider the model space $K_{B_{\sigma }}$ given by
where
is the finite Blaschke product associated with $\sigma =\{\lambda _{1},\ldots ,\lambda _{n}\}$ . Any such space $K_{B_{\sigma }}$ is an n-dimensional subspace of $H^{2}$ . Now, for $1\leq k\leq n$ , let $f_{k}:=\frac {1}{1-\overline {\lambda _{k}}z}$ and observe that $\Vert f_{k}\Vert _{H^{2}}=\left (1-\vert \lambda _{k}\vert ^{2}\right )^{-1/2}$ . Then set
It is known that $\left (e_{k}\right )_{1\leq k\leq n}$ defines an orthonormal basis of $K_{B_{\sigma }}$ , called the Malmquist–Walsh basis [Reference Nikolski6, p. 117].
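The normalization $\Vert f_{k}\Vert _{H^{2}}=\left (1-\vert \lambda _{k}\vert ^{2}\right )^{-1/2}$ used above can be verified from Taylor coefficients: the $H^{2}$ norm of an analytic function equals the $\ell ^{2}$ norm of its coefficient sequence, and the coefficients of $f_{k}$ are $\overline {\lambda _{k}}^{\,j}$, $j\geq 0$. The following sketch sums the resulting geometric series numerically. The sample value of `lam`, the truncation length `terms`, and the function name are assumptions made for this illustration.

```python
def h2_norm_squared(lam, terms=400):
    """Partial sum of sum_j |lam|^(2j), i.e. the squared H^2 norm of 1/(1 - conj(lam) z)."""
    ratio = abs(lam) ** 2
    total, term = 0.0, 1.0
    for _ in range(terms):
        total += term
        term *= ratio
    return total


lam = 0.6 + 0.3j  # illustrative point of the unit disk
approx = h2_norm_squared(lam)
exact = 1.0 / (1.0 - abs(lam) ** 2)
print(approx, exact, approx ** 0.5)
```

Dividing $f_{k}$ by this norm is what makes the first Malmquist–Walsh vector a unit vector.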
The central object of Nikolski’s approach in [Reference Nikolski8] is the model operator $M_{B_{\sigma }}$ defined by
where $P_{B_{\sigma }}$ denotes the orthogonal projection on $K_{B_{\sigma }}$ . The matrix representation $\widehat {M_{B_{\sigma }}}$ of $M_{B_{\sigma }}$ with respect to the Malmquist–Walsh basis $\left (e_{k}\right )_{1\leq k\leq n}$ is as follows (see [Reference Szehr18, Proposition III.4]):
where $\left (\widehat {M_{B_{\sigma }}}\right )_{ij}$ stands for the $i,j$ entry of $\widehat {M_{B_{\sigma }}}$ . The reader may consult [Reference Nikolski6, Reference Nikolski7] for a complete description of model spaces and model operators.
For the proof of Theorem 2.1, we shall focus on the case where $\sigma =\left \{ \lambda ,\ldots ,\lambda \right \} $ (namely, $\lambda _{1}=\cdots =\lambda _{n}=\lambda $ with the previous notations). In this case,
and the Malmquist–Walsh basis $\beta _{\lambda }:=\left (e_{k}\right )_{1\leq k\leq n}$ is given by
Moreover, one can also show that $K_{B_{\sigma }}$ coincides as a set with the n-dimensional Banach space consisting of all rational functions of degree at most n with poles located at $1/\overline {\lambda }$ . We may and shall equip $K_{B_{\sigma }}$ with the Banach norm
Then $\left (K_{B_{\sigma }},\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}\right )$ is a (Banach) subspace of the Besov space $\mathcal {B}_{\infty }$ defined as
Our main theorem states as follows.
Theorem 2.1 Let $\lambda \in \mathbb {D}\setminus \{0\}$ be fixed, and let $T_{\lambda }$ denote the operator acting on $\left (K_{B_{\sigma }},\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}\right )$ whose matrix with respect to $\beta _{\lambda }$ is given by
Then $T_{\lambda }$ satisfies the Kreiss condition and the inequality
where $c(\lambda )>0$ and where $|\!|\cdot |\!|_{*}$ is the operator norm induced by $\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}$ . In particular, for any $r\in (0,1)$ ,
as $n\to \infty $ .
The proof of Theorem 2.1—displayed in the next section—is based on a duality argument already used in [Reference Szehr and Zarouf20]. Considering Kreiss operators, we will here make use of the Besov functional calculus developed in [Reference Vitse22] and the duality between the Besov spaces $\mathcal {B}_{\infty }$ and $\mathcal {B}_{1}$ , where
where A stands for the normalized Lebesgue measure on $\mathbb {D}$ . This duality is given by the relation (with equivalence of the norms, see [Reference Vitse22, p. 1815] for details)
for the Cauchy duality (2.1).
3 Proof of Theorem 2.1
Proof of Theorem 2.1
In the whole section, $\sigma =\left \{ \lambda ,\ldots ,\lambda \right \} \subset \mathbb {D}\setminus \left \{ 0\right \} $ is fixed. First, using the orthonormality of the Malmquist–Walsh basis, we can see that the adjoint of $M_{B_{\sigma }}$ acting on $\left (K_{B_{\sigma }},\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}\right )$ coincides with $T_{\bar {\lambda }}$ . So, in order to prove that $T_{\lambda }$ is a Kreiss operator, it is enough to show that $\widehat {M_{B_{\sigma }}}^{*}$ —the adjoint of $\widehat {M_{B_{\sigma }}}$ —satisfies the Kreiss condition, namely that there exists some constant $K>0$ such that for any $|\zeta |>1$ ,
Let us fix $\zeta $ , $|\zeta |>1$ . By definition, the adjoint of $M_{B_{\sigma }}$ coincides with the backward shift operator
acting on $K_{B_{\sigma }}$ .
Thus, using the Cauchy duality (2.5), we further have $\left \Vert \left (\zeta -\widehat {M_{B_{\sigma }}}^{*}\right )^{-1}\right \Vert _{*}:= \left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{*}$ . Now, considering the shift operator $(Sf)(z)=zf(z)$ acting on the whole space $\mathcal {B}_{1}$ (and recalling that S is the adjoint of the operator $S^{*}$ acting on $\mathcal {B}_{\infty }$ ), we have $\left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{\mathcal {B}_{\infty }\rightarrow \mathcal {B}_{\infty }}=\left \Vert \left (\zeta -S\right )^{-1}\right \Vert _{\mathcal {B}_{1}\rightarrow \mathcal {B}_{1}}$ . The $\mathcal {B}_{1}$ -functional calculus [Reference Vitse22] then tells us that for any $f\in \mathcal {B}_{1}$ ,
for some constant $K_{1}>0$ (see also [Reference Nikolski8, p. 143] and the references therein for more details on the last inequality). It remains to observe that
whence there exists $K_{3}>0$ such that
Since $\left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{*}\leq \left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{\mathcal {B}_{\infty }\rightarrow \mathcal {B}_{\infty }}$ , this yields (3.1) and the fact that $T_{\lambda }$ is a Kreiss operator.
In order to derive (2.3), it is enough to show that $\left \Vert \left (\widehat {M_{B_{\sigma }}}^{*}\right )^{-1}\right \Vert _{*}\geq c(\lambda )\frac {n}{|\lambda |^{n}}$ for some constant $c(\lambda )>0$ . To do so, we apply $\left (\widehat {M_{B_{\sigma }}}^{*}\right )^{-1}$ to the test vector $X_{0}=(0,\dots ,0,1)$ , i.e., to the rational function $R(z)=e_{n}(z)$ . We set
where $X_{0}^{\top }$ is the transpose of $X_{0}$ . We have
which means that
The condition $g\in K_{B_{\sigma }}$ imposes that g is a rational function with $\lim _{|z|\to \infty }g(z)=0$ , and hence
It follows that
Let us now estimate $\left |\!\left |e_{n}\right |\!\right |{}_{\mathcal {B}_{\infty }}$ from above. Defining $\widetilde {b_{\lambda }}(z)=-b_{\lambda }(z)=\frac {\lambda -z}{1-\overline {\lambda }z}$, we have $\widetilde {b_{\lambda }}\circ \widetilde {b_{\lambda }}=\text {id}$, where $\text {id}$ denotes the identity function on ${\mathbb {D}}$, and the $H^{\infty }$ norm is invariant under composition with $\widetilde {b_{\lambda }}$. Therefore,
Since
and $|1-\overline {\lambda }\widetilde {b_{\lambda }}(z)|\geq (1-|\lambda |)$ , we get
Studying the function $r\mapsto (1-r)r^{n-1}$ for $r\in (0,1)$ , we can check that the supremum in the last inequality is attained at $|z|=1-1/n$ . We conclude that
In particular,
which completes the proof.
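The elementary maximization used in the proof, namely that $r\mapsto (1-r)r^{n-1}$ attains its supremum over $(0,1)$ at $r=1-1/n$, can be checked by a grid search. The sketch below also checks that n times the maximal value stays above $1/e$, which is the source of the factor of order $1/n$. The grid size `steps` and the sample dimensions are arbitrary choices made here.

```python
import math


def phi(r, n):
    """The function r -> (1 - r) * r^(n - 1) studied in the proof."""
    return (1.0 - r) * r ** (n - 1)


def grid_argmax(n, steps=100000):
    """Locate the maximizer of phi(., n) on a uniform grid of (0, 1)."""
    best_r, best_val = 0.0, 0.0
    for k in range(1, steps):
        r = k / steps
        val = phi(r, n)
        if val > best_val:
            best_r, best_val = r, val
    return best_r, best_val


results = {}
for n in (5, 20, 100):
    r_star, val = grid_argmax(n)
    results[n] = (r_star, val)
    # Compare the grid maximizer with 1 - 1/n, and n * max with 1/e.
    print(n, r_star, 1 - 1 / n, n * val, 1 / math.e)
```

The maximal value equals $\frac {1}{n}\left (1-\frac {1}{n}\right )^{n-1}$, and $\left (1-\frac {1}{n}\right )^{n-1}$ decreases to $1/e$ as $n\to \infty $.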
4 On the sharpness of (5.1) with respect to K and n
In the previous sections, we were interested in the sharpness of the inequality $|\det (T)|\,\Vert T^{-1}\Vert \leq CK(T)n$ (see (5.1)) for Kreiss matrices T with Kreiss constant $K(T)$ less than some constant K. In this section, we will consider this inequality for all Kreiss matrices and prove that (5.1) is sharp as n and K tend to $\infty $. Relaxing the bound on the Kreiss constants will allow us to exhibit sequences of Jordan blocks that are extremal for (5.1).
For $a>1$ and $\lambda \in {\mathbb {D}}$ fixed, let $J_{\lambda }$ stand for the Jordan block of size n
From now on, we denote by $\Vert \cdot \Vert $ the operator norm induced by the $\ell ^{1}$ - or the $\ell ^{\infty }$ -norm of $\mathbb {C}^n$ . Observe that for any $z\in \mathbb {C}$ , $z\neq \lambda $ , the matrix $\left (zI_{n}-J_{\lambda }\right )^{-1}$ is well defined and is the Toeplitz matrix given by
Then
In the following proposition, we exhibit sequences of Jordan blocks that are extremal for (5.1), with respect to the dimension n and the Kreiss constant K.
Proposition 4.1 There exists $\left (\lambda _{n}\right )_{n}$, $\lambda _{n}\in (0,1)$, such that $\lim _{n}K\left (J_{\lambda _{n}}\right )=\infty $ and
Proof By (4.3) with $z=0$ and since $\text {det}\left (J_{\lambda }\right )=\lambda ^{n}$ , we have $\left |\text {det}\left (J_{\lambda }\right )\right |\left \Vert J_{\lambda }^{-1}\right \Vert =\frac {\left |\lambda \right |{}^{n}-\left |a\right |{}^{n}}{\left |\lambda \right |-\left |a\right |}$ . We need to check that
for some $\left (\lambda _{n}\right )_{n}$. From now on, we assume that $\lambda _{n}=1/n$. Since the function $x\mapsto \frac {x^{-n}a^{n}-1}{a-x}$ is decreasing on $(0,+\infty )$ and since $\left |z-\lambda _n\right |\geq \left |z\right |-\lambda _n$ for any $\left |z\right |>1$, we have
Setting $x:=a/(t-1/n)$ and
gives $K\left (J_{\lambda _{n}}\right )=\sup _{0<x<\frac {a}{1-1/n}}g(x)$. Studying the first and second derivatives of g shows that $g'$ vanishes exactly once in the interval $[0,\frac {a}{1-1/n}]$. Moreover, a computation shows that $g'\left (a\right )>0$, where $a=\frac {a}{1+1/n-1/n}$, while $g'\left (\frac {na}{n+1}\right )<0$, where $\frac {na}{n+1}=\frac {a}{1+2/n-1/n}$, for n large enough. Thus, there exists $\delta _{n}\in (1/n,2/n)$ such that g admits a maximum at $x_{n}:=\frac {a}{1+\delta _{n}-1/n}$. Now,
as $n\to \infty $ , as desired.
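The two computations behind this proof can be explored numerically. The sketch below assumes that $J_{\lambda }$ is the $n\times n$ matrix with $\lambda $ on the diagonal and a on the superdiagonal (the displayed definition is not reproduced here, so this layout is an assumption). Under that assumption it verifies the identity $\left |\det (J_{\lambda })\right |\Vert J_{\lambda }^{-1}\Vert =\frac {|\lambda |^{n}-|a|^{n}}{|\lambda |-|a|}$ in exact arithmetic for the $\ell ^{1}$-induced norm, and estimates $K(J_{1/n})$ by a grid search over real $t>1$, following the reduction to the real axis used in the proof. The value $a=2$, the grid parameters, and all function names are illustrative.

```python
from fractions import Fraction


def jordan_block(lam, a, n):
    """n x n matrix with lam on the diagonal and a on the superdiagonal (assumed layout)."""
    return [[lam if i == j else (a if j == i + 1 else 0) for j in range(n)] for i in range(n)]


def inverse_upper_bidiagonal(m):
    """Invert an upper bidiagonal matrix column by column via back substitution."""
    n = len(m)
    inv = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        for i in range(n - 1, -1, -1):
            rhs = Fraction(1 if i == j else 0)
            if i + 1 < n:
                rhs -= m[i][i + 1] * inv[i + 1][j]
            inv[i][j] = rhs / m[i][i]
    return inv


def l1_operator_norm(m):
    """Operator norm induced by the l^1 vector norm: largest absolute column sum."""
    n = len(m)
    return max(sum(abs(m[i][j]) for i in range(n)) for j in range(n))


def det_times_inverse_norm(lam, a, n):
    inv = inverse_upper_bidiagonal(jordan_block(lam, a, n))
    return abs(lam) ** n * l1_operator_norm(inv)  # det of a triangular matrix is lam^n


def closed_form(lam, a, n):
    return (abs(lam) ** n - abs(a) ** n) / (abs(lam) - abs(a))


def kreiss_estimate(lam, a, n, steps=20000, span=2.0):
    """Grid lower bound for sup_{t>1} (t - 1) * ||(tI - J)^{-1}|| along the real axis."""
    best = 0.0
    for k in range(1, steps + 1):
        t = 1.0 + span * k / steps
        s = t - lam
        resolvent_norm = sum(a ** j / s ** (j + 1) for j in range(n))
        best = max(best, (t - 1.0) * resolvent_norm)
    return best


a = 2  # illustrative superdiagonal entry with a > 1
identity_holds = all(
    det_times_inverse_norm(Fraction(1, n), Fraction(a), n) == closed_form(Fraction(1, n), Fraction(a), n)
    for n in (3, 6, 9)
)
kreiss = {n: kreiss_estimate(1.0 / n, float(a), n) for n in (5, 10, 20)}
ratios = {n: float(closed_form(Fraction(1, n), Fraction(a), n)) / (n * kreiss[n]) for n in kreiss}
print(identity_holds, kreiss, ratios)
```

In this experiment the estimated Kreiss constants grow with n while the ratio $\left |\det (J_{\lambda _{n}})\right |\Vert J_{\lambda _{n}}^{-1}\Vert /\left (nK(J_{\lambda _{n}})\right )$ stays bounded away from zero, which is the behaviour the proposition describes. Since the grid only gives a lower bound for the supremum, this is a sanity check rather than a proof.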
Remark 4.2 In contrast with the estimate $C(n,\mathcal {K}_{K})\simeq Kn$ (as $n\to \infty $ ) considered in the previous sections, it follows from the previous proposition that
Now, Proposition 4.1 shows that some sequence of Jordan blocks asymptotically achieves, up to a numerical factor independent of K and n, the supremum
and provides an elementary proof of the sharpness of the estimate $M(n,\mathcal {K})\lesssim n$ as $n\to \infty $ (which is a consequence of the first part of Theorem 1.1). Note that Theorem 2.1 obviously also leads to the latter assertion, but not for Jordan blocks and with more sophisticated arguments.
We also note that, by the Kreiss Matrix Theorem [Reference Kreiss5, Reference Richtmyer and Morton13, Reference Sod16], properties $\mathcal {PB}$ and $\mathcal {K}$ are equivalent, so $C(n,\mathcal {PB})=C(n,\mathcal {K})=\infty $. One may wonder whether an estimate similar to (4.4) can be obtained with $\sup _k\Vert T_n^k \Vert $ instead of $K(T_n)$, for some (simple) sequence $(T_n)_n$ instead of $(J_{\lambda _n})_n$.
5 A short and simple proof of Inequality (5.1)
In fact, the statement of Theorem 3.26 in [Reference Nikolski8] is slightly stronger than the first part of Theorem 1.1. More precisely, Nikolski proves the following.
Theorem 5.1 Let X be a complex Banach space, and let $T\in \mathcal {L}(X)$ be a Kreiss operator with Kreiss constant $K\geq 1$ . Let us denote by $m_{T}=\prod _{i=1}^{d}(z-\lambda _{i})$ its minimal polynomial, and assume that $(\lambda _{1},\ldots ,\lambda _{d})\in {\mathbb {D}}^{d}$ . Then
where C is an absolute constant.
We will display below a short proof of this more precise result that follows Nikolski’s approach and is obtained as a combination of Vitse’s [Reference Vitse22] functional calculus for Kreiss operators and the Bonsall–Walsh inequality [Reference Bonsall and Walsh1] for rational functions. In comparison with the proof of [Reference Nikolski8, Theorem 3.26], its simplicity lies in the choice of a very simple test function.
Proof A short proof of Inequality (5.1)
Let T, $(\lambda _{1},\ldots ,\lambda _{d})$, $m_{T}$, and K be as in the statement of Theorem 5.1. We denote by B the Blaschke product associated with the sequence $(\lambda _{1},\ldots ,\lambda _{d})$ and introduce the test function f given by
Observe that f is a rational function (analytic in $\overline {\mathbb {D}}$ ) that interpolates the function $1/z$ on the set $(\lambda _{1},\ldots ,\lambda _{d})$ . More precisely,
where $h(z)=\frac {1}{B(0)}\prod _{i=1}^{d}(\overline {\lambda _{i}}z-1)^{-1}$ , and therefore $Tf(T)$ is the identity matrix. Now, by [Reference Vitse22, Theorem 2.4(3)],
Note that the above proof gives the explicit constant $\frac {32}{\pi }$ in (5.1). Yet we expect that it is not optimal.
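The interpolation property and the Bézout identity used in this proof can be checked numerically. The sketch below assumes that the test function is $f(z)=\left (1-B(z)/B(0)\right )/z$ and that B is the product of the factors $\frac {\lambda _{i}-z}{1-\overline {\lambda _{i}}z}$. Both are assumptions, chosen so that $m_{T}h=B/B(0)$ for the function h given in the proof. The sample points `lams` and `z0` and all function names are hypothetical.

```python
def blaschke_product(z, lams):
    """Finite Blaschke product with factors (lam - z) / (1 - conj(lam) z) (assumed sign convention)."""
    out = 1.0 + 0.0j
    for lam in lams:
        out *= (lam - z) / (1 - lam.conjugate() * z)
    return out


def interpolant(z, lams):
    """Assumed test function f(z) = (1 - B(z)/B(0)) / z, evaluated at z != 0."""
    b0 = blaschke_product(0.0, lams)
    return (1 - blaschke_product(z, lams) / b0) / z


def minimal_polynomial(z, lams):
    """m_T(z) = prod_i (z - lam_i) for distinct points lam_i."""
    out = 1.0 + 0.0j
    for lam in lams:
        out *= z - lam
    return out


def h(z, lams):
    """h(z) = (1/B(0)) * prod_i (conj(lam_i) z - 1)^(-1), as in the proof."""
    out = 1.0 / blaschke_product(0.0, lams)
    for lam in lams:
        out /= lam.conjugate() * z - 1
    return out


lams = [0.5 + 0.2j, -0.3 + 0.4j, 0.1 - 0.6j]  # illustrative points of the punctured disk
interp_errors = [abs(interpolant(lam, lams) - 1 / lam) for lam in lams]

z0 = 0.3 + 0.4j  # arbitrary evaluation point
bezout_residual = abs(z0 * interpolant(z0, lams) + minimal_polynomial(z0, lams) * h(z0, lams) - 1)
print(interp_errors, bezout_residual)
```

Since $m_{T}(T)=0$, the identity $zf+m_{T}h=1$ evaluated at T gives $Tf(T)=I$, which is how f enters the estimate of $\Vert T^{-1}\Vert $.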
Remark 5.2 For completeness, let us give an insight into the first inequality of (5.2). It is obtained as a combination of Vitse’s functional calculus and the Bonsall–Walsh inequality: applying [Reference Vitse22, Theorem 2.4(1)] to the function f, we get
where $\mathcal {B}_{1}$ is the analytic Besov algebra defined in Section 2, and it remains to apply the Bonsall–Walsh inequality [Reference Bonsall and Walsh1] to the rational function f:
where $\deg f$ stands for the degree of f.