On the condition number of a Kreiss matrix

Stéphane Charpentier; Karine Fouchet; Rachid Zarouf

doi:10.4153/S0008439523000437

Part of: Special matrices; Basic linear algebra; Function theory on the disc

Published online by Cambridge University Press: 29 May 2023

Stéphane Charpentier*: Affiliation:
Institut de Mathématiques de Marseille, UMR 7373, Aix-Marseille Université, 39 rue F. Joliot Curie, 13453 Marseille Cedex 13, France e-mail: karine.isambard@univ-amu.fr
Karine Fouchet: Affiliation:
Institut de Mathématiques de Marseille, UMR 7373, Aix-Marseille Université, 39 rue F. Joliot Curie, 13453 Marseille Cedex 13, France e-mail: karine.isambard@univ-amu.fr
Rachid Zarouf: Affiliation:
Laboratoire ADEF, Aix-Marseille Université, Campus Universitaire de Saint-Jérôme, 52 Avenue Escadrille Normandie Niemen, 13013 Marseille, France e-mail: rachid.zarouf@univ-amu.fr
*: e-mail: stephane.charpentier.1@univ-amu.fr

Abstract

In 2005, N. Nikolski proved, among other things, that for any $r\in (0,1)$ and any $K\geq 1$, the condition number $CN(T)=\Vert T\Vert \cdot \Vert T^{-1}\Vert $ of any invertible n-dimensional complex Banach space operator T satisfying the Kreiss condition, with spectrum contained in $\left \{ r\leq |z|<1\right \}$, satisfies the inequality $CN(T)\leq CK(T)\Vert T \Vert n/r^{n}$, where $K(T)$ denotes the Kreiss constant of T and $C>0$ is an absolute constant. He also proved that for $r\ll 1/n,$ the latter bound is asymptotically sharp as $n\rightarrow \infty $. In this note, we prove that this bound is actually achieved by a family of explicit $n\times n$ Toeplitz matrices with arbitrary singleton spectrum $\{\lambda \}\subset \mathbb {D}\setminus \{0\}$ and uniformly bounded Kreiss constant. Independently, we exhibit a sequence of Jordan blocks with Kreiss constants tending to $\infty $, showing that Nikolski’s inequality is still asymptotically sharp as K and n go to $\infty $.

Keywords

Condition numbers; Kreiss condition; Toeplitz matrices; model operator; Blaschke product; Besov spaces

MSC classification

Primary: 15A60: Norms of matrices, numerical range, applications of functional analysis to matrix theory

Secondary: 15B05: Toeplitz, Cauchy, and related matrices; 30J10: Blaschke products

Type: Article
Information: Canadian Mathematical Bulletin, Volume 66, Issue 4, December 2023, pp. 1376-1390

DOI: https://doi.org/10.4153/S0008439523000437
Copyright: © The Author(s), 2023. Published by Cambridge University Press on behalf of The Canadian Mathematical Society

1 Introduction

Let T be an invertible bounded operator on a complex Banach space X. In numerical analysis, it is a matter of interest to estimate the quantity $CN(T):=\Vert T\Vert \Vert T^{-1}\Vert $ , called the condition number of T. Roughly speaking, the condition number of T measures the greatest loss of precision that the linear system $Ty=x$ can exhibit over all inputs and their potential errors. The condition numbers of matrices are also intimately related to many problems from matrix analysis, such as the study of the distribution of the eigenvalues of classical random matrices appearing in mathematical statistics. We refer the reader to, e.g., [Reference Quarteroni, Sacco and Saleri10, Chapter 3] or [Reference Eidelman3, Reference Smale15], and the references therein.
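As a purely illustrative sketch (not part of the original article), the condition number can be computed for small matrices with NumPy. The helper name `condition_number` is hypothetical, and only the operator norms induced by the $\ell^{1}$, $\ell^{2}$, and $\ell^{\infty}$ norms on ${\mathbb {C}}^{n}$ are covered, since these are the ones `numpy.linalg.norm` provides for matrices.

```python
import numpy as np

def condition_number(T, p=2):
    """CN(T) = ||T|| * ||T^{-1}|| for the operator norm induced by l^p, p in {1, 2, inf}."""
    T = np.asarray(T, dtype=complex)
    return np.linalg.norm(T, p) * np.linalg.norm(np.linalg.inv(T), p)

# A small upper-triangular example (chosen arbitrarily for illustration).
T = np.array([[1.0, 1.0],
              [0.0, 0.5]])
for p in (1, 2, np.inf):
    print(p, condition_number(T, p))
```

The value depends on the chosen norm, which is why the Hilbert and Banach settings are treated separately in what follows.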

Let us denote by $\mathcal {L}(X)$ the algebra of all bounded operators on X, and for $n\in {\mathbb {N}}$ , let the notation $\mathcal {B}(n)$ (resp. $\mathcal {H}(n)$ ) stand for the set of all n-dimensional complex Banach spaces (resp. Hilbert spaces). When X belongs to $\mathcal {H}(n)$ , one can easily deduce from the polar decomposition that

(1.1)

$$ \begin{align} CN(T)\leq\prod_{i}|\lambda_{i}(T)|^{-1}\Vert T\Vert^{n},\quad T\in\mathcal{L}(X), \end{align} $$

where $\lambda _{i}(T)$ , $i=1,\ldots ,n$ , are the eigenvalues of T counted with multiplicity. This inequality is clearly sharp since equality occurs if T is the $n\times n$ identity matrix. In the Banach setting, the analogous problem becomes more involved and was first considered in the 1970s by B. L. Van der Waerden, W. A. Coppel, and J. J. Schäffer. In view of (1.1), it is natural to seek estimates of the quantity

(1.2)

$$ \begin{align} C(n):=\inf\left\{C>0:\,CN(T)\leq C\prod_{i}|\lambda_{i}(T)|^{-1}\Vert T\Vert^{n},\,T\in\mathcal{L}(X)\text{ invertible},\,X\in\mathcal{B}(n)\right\}. \end{align} $$
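Inequality (1.1) in the Hilbert setting can be sanity-checked numerically. The sketch below assumes the Euclidean norm on ${\mathbb {C}}^{n}$ and uses $\prod _{i}|\lambda _{i}(T)|=|\det T|$; the function name `hilbert_bound_gap` and the random test matrices are illustrative choices, not taken from the article. The underlying reason is that the singular values $s_{1}\geq \cdots \geq s_{n}$ of T satisfy $|\det T|=\prod _{i}s_{i}\leq s_{n}s_{1}^{n-1}$ .

```python
import numpy as np

def hilbert_bound_gap(T):
    """Return (CN(T), ||T||^n / |det T|) in the spectral norm."""
    n = T.shape[0]
    norm_T = np.linalg.norm(T, 2)                       # largest singular value
    cn = norm_T * np.linalg.norm(np.linalg.inv(T), 2)   # ||T|| * ||T^{-1}||
    bound = norm_T ** n / abs(np.linalg.det(T))         # right-hand side of (1.1)
    return cn, bound

rng = np.random.default_rng(0)
for n in (2, 4, 6):
    T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    cn, bound = hilbert_bound_gap(T)
    print(n, cn, bound)   # cn should not exceed bound
```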

A normalization reduces this problem to that of estimating the norm of the inverse of contractive invertible operators (that is, those with norm less than or equal to $1$ ). Namely, denoting by $\mathcal {C}(X)\subset \mathcal {L}(X)$ the class of contractive matrices on X, $C(n)$ coincides with the quantity

(1.3)

$$ \begin{align} &\inf\left\{ C>0:\,\Vert T^{-1}\Vert\leq C\prod_{i}|\lambda_{i}(T)|^{-1},\,T\in\mathcal{C}(X),\,T\:\mathrm{invertible},\,X\in\mathcal{B}(n)\right\} \nonumber\\ &\quad=\sup\left\{ \prod_{i}|\lambda_{i}(T)|\Vert T^{-1}\Vert:\,T\in\mathcal{C}(X),\,T\:\mathrm{invertible},\,X\in\mathcal{B}(n)\right\}, \end{align} $$

which turns out to be easier to handle. In 1970, Schäffer [Reference Schäffer14] proved the general estimate $C(n)\leq \sqrt {en}$ , and showed that the inequality

(1.4)

$$ \begin{align} \prod_{i}|\lambda_{i}(T)|\Vert T^{-1}\Vert\leq2 \end{align} $$

holds for invertible contractions T acting on ${\mathbb {C}}^{n}$ endowed with the $\ell ^{1}$ -norm or with the $\ell ^{\infty }$ -norm (and that (1.4) is sharp in both cases). This led him to conjecture that $C(n)=2$ , $n\in {\mathbb {N}}$ , a statement nowadays known as Schäffer’s conjecture. The latter was disproved first by Gluskin, Meyer, and Pajor [Reference Gluskin, Meyer and Pajor4], J. Bourgain [Reference Gluskin, Meyer and Pajor4] (see Footnote ¹), and later by Queffélec [Reference Queffélec11], who obtained that $C(n)\geq c\sqrt {n}$ for some absolute constant $c>0$ , proving that the initial upper estimate given by Schäffer is optimal (see below for more precise statements with respect to our motivations).
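A quick numerical illustration of (1.4) for the $\ell ^{1}$ -norm, under the assumption that the induced operator norm is the maximal absolute column sum (which is what `numpy.linalg.norm(A, 1)` returns). The helper `schaeffer_quantity`, the matrix size, and the random sampling are hypothetical choices for this sketch; random sampling only probes the inequality and says nothing about its sharpness.

```python
import numpy as np

def schaeffer_quantity(T, p=1):
    """|det T| * ||T^{-1}|| for the operator norm induced by l^p (p = 1 or inf)."""
    T = np.asarray(T, dtype=complex)
    return abs(np.linalg.det(T)) * np.linalg.norm(np.linalg.inv(T), p)

rng = np.random.default_rng(1)
worst = 0.0
for _ in range(200):
    A = rng.standard_normal((4, 4))
    T = A / np.linalg.norm(A, 1)        # rescale so that the l^1 operator norm equals 1
    worst = max(worst, schaeffer_quantity(T, 1))
print(worst)   # expected to stay below 2 according to (1.4)
```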

In order to introduce our motivations and for the sake of general interest, we shall briefly survey recent extensions and refinements of the solutions to Schäffer’s conjecture that have been developed in the light of new theoretical approaches, and that are natural to the study of the condition number of matrices. This requires introducing some notation, which we keep as limited as possible.

First of all, it is natural to ask how the constant $C(n)$ behaves if, in its equivalent definition (1.3), one replaces the class of all invertible contractions by some other classical set of invertible operators. Generally speaking, let $\mathcal {P}$ be a property, and for $T\in \mathcal {L}(X)$ , let the notation $T\in \mathcal {P}$ mean that the operator T satisfies the property $\mathcal {P}$ . We set

(1.5)

$$ \begin{align} C(n,\mathcal{P}):=\sup\left\{ \prod_{i}|\lambda_{i}(T)|\Vert T^{-1}\Vert:\,T\in\mathcal{P},\,T\in\mathcal{L}(X)\text{ invertible},\,X\in\mathcal{B}(n)\right\}. \end{align} $$

Thus, for example, $C(n)=C(n,\mathcal {C})$ if $T\in \mathcal {C}$ means “T is a contraction.” For properties $\mathcal {P}$ that define too large classes of operators, it may happen that $C(n,\mathcal {P})=\infty $ . It is obviously the case if $\mathcal {P}$ is satisfied by any invertible operator acting on any Banach space (indeed, consider all the matrices $\lambda I_{n}$ , $|\lambda |>1$ , where $I_{n}$ is the identity matrix of size n). In this situation, studying $C(n,\mathcal {P})$ is not relevant for estimating condition numbers. On the contrary, if $\mathcal {P}$ is a property for which $C(n,\mathcal {P})<\infty $ , then by definition it holds

$$\begin{align*}CN(T)\leq C(n,\mathcal{P})\prod_{i}|\lambda_{i}(T)|^{-1}\Vert T\Vert \end{align*}$$

for any $T\in \mathcal {P}$ . In this case, estimating $C(n,\mathcal {P})$ becomes especially relevant for properties $\mathcal {P}$ that are satisfied by operators that are not necessarily contractions since, for such operators, it may give upper estimates of condition numbers that are better than that given by Schäffer’s inequality (the latter is valid for any invertible operator).

In 2005, Nikolski [Reference Nikolski8] developed a new approach to achieve estimates of $C(n,\mathcal {P})$ by using functional calculus on spaces of functions holomorphic on the unit disk ${\mathbb {D}}:=\{z\in {\mathbb {C}}:\,|z|<1\}$ . The idea is that for certain properties $\mathcal {P}$ and any Banach space X, an invertible operator $T\in \mathcal {L}(X)$ satisfies $\mathcal {P}$ if and only if $\Vert f(T)\Vert \leq K\Vert f\Vert _{A_{\mathcal {P}}}$ for any f in some algebra $A_{\mathcal {P}}$ of holomorphic functions on $\mathbb {D}$ and for some absolute constant $K>0$ . For brevity, we will say in this case that property $\mathcal {P}$ obeys an $A_{\mathcal {P}}$ -functional calculus. Then, for such properties $\mathcal {P}$ , estimating $\Vert T^{-1}\Vert $ from above for $T\in \mathcal {P}$ reduces to estimating $\Vert f\Vert _{A_{\mathcal {P}}}$ or $\Vert f\Vert _{A_{\mathcal {P}}/m_{T}A_{\mathcal {P}}}$ from above for any $f\in A_{\mathcal {P}}$ satisfying the Bézout identity $zf+m_{T}h=1$ , where $m_{T}$ is the minimal polynomial of T and h is any function in $A_{\mathcal {P}}$ . For example, it is well known and easily seen that the property $\mathcal {C}$ of being a contraction obeys a W-functional calculus (with constant 1) where W refers to the Wiener space consisting of all functions f analytic in ${\mathbb {D}}$ , such that $\Vert f\Vert _{W}:=\sum _{k\geq 0}|\hat {f}(k)|<\infty $ , and where $\hat {f}(k)$ denotes the kth Taylor coefficient of f. It is also readily checked that given $K\geq 1$ , the property $\mathcal {PB}_{K}$ satisfied by all Banach space power bounded operators T such that $\sup _{n}\Vert T^{n}\Vert \leq K$ obeys a W-functional calculus (with constant K), which leads to the same estimates for $C(n,\mathcal {PB}_{K})$ and $C(n,\mathcal {C})$ , up to an absolute constant. We also refer the reader to [Reference Szehr18, Paragraph 2.3] for more explanations of Nikolski’s strategy described above.

Another standard class of operators obeying a functional calculus over a function algebra is that of Kreiss operators. It is defined, for $K\geq 1$ , by the property $\mathcal {K}_{K}$ given for any Banach operator T by $T\in \mathcal {K}_{K}$ if and only if the following resolvent estimate holds:

(1.6)

$$ \begin{align} \left\Vert (\zeta-T)^{-1}\right\Vert \leq K(|\zeta|-1)^{-1},\quad|\zeta|>1. \end{align} $$

For a given operator T, the infimum of all constants K satisfying (1.6) is called the Kreiss constant of T. This class of operators satisfying $\mathcal {K}_{K}$ is an important one in numerical analysis [Reference Nikolski9, Reference Spijker, Tracogna and Welfert17]. Note that every contraction is a Kreiss operator with Kreiss constant less than or equal to $1$ and that there exist Kreiss operators that are not contractive. In [Reference Nikolski8], Nikolski made use of the fact that $\mathcal {K}_{K}$ obeys a Besov functional calculus, according to a result by Vitse [Reference Vitse22], to obtain the following theorem.

Theorem 1.1 $\quad $

1. [Reference Nikolski8, Theorem 3.26] Let X be a complex Banach space, and let $T\in \mathcal {L}(X)$ be a Kreiss operator with Kreiss constant $K\geq 1$ . Then
(1.7) $$ \begin{align} \Vert T^{-1}\Vert\leq CK\frac{n}{\prod_{i=1}^{n}|\lambda_{i}|}, \end{align} $$
where $(\lambda _{1},\ldots ,\lambda _{n})\in {\mathbb {D}}^n$ denotes the eigenvalues of T and C an absolute constant.
2. (Particular case of [Reference Nikolski8, Theorem 3.31]) For any $(\lambda _{1},\ldots ,\lambda _{n})\in {\mathbb {D}}^{n}$ , there exists an invertible Kreiss operator T with eigenvalues $(\lambda _{1},\ldots ,\lambda _{n})$ such that
(1.8) $$ \begin{align} \left\Vert T^{-1}\right\Vert \geq c\frac{n}{\prod_{i=1}^{n}|\lambda_{i}|}\left(c'-\prod_{j=1}^{n}\left(1+\left|\lambda_{j}\right|\right)\right), \end{align} $$
where $c>0$ and $c'> 1$ are absolute constants.
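The Kreiss constant defined through the resolvent condition (1.6) can be estimated from below numerically by sampling $\zeta $ on a grid outside the closed unit disk. The following is a minimal sketch under stated assumptions: the function name `kreiss_lower_estimate`, the grid of radii and angles, and the use of an $\ell ^{p}$ -induced matrix norm are all illustrative choices, and a finite grid can only yield a lower estimate of the supremum.

```python
import numpy as np

def kreiss_lower_estimate(T, radii=None, n_angles=180, p=2):
    """Grid lower estimate of sup_{|zeta|>1} (|zeta| - 1) * ||(zeta - T)^{-1}||."""
    T = np.asarray(T, dtype=complex)
    n = T.shape[0]
    if radii is None:
        radii = 1.0 + np.logspace(-4, 2, 40)      # radii between 1 + 1e-4 and 101
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    best = 0.0
    for rho in radii:
        for theta in angles:
            zeta = rho * np.exp(1j * theta)
            resolvent = np.linalg.inv(zeta * np.eye(n) - T)
            best = max(best, (rho - 1.0) * np.linalg.norm(resolvent, p))
    return best

contraction = np.diag([0.5, -0.3])                 # spectral norm 0.5
jordan_like = np.array([[0.0, 4.0], [0.0, 0.0]])   # nilpotent, but not a contraction
print(kreiss_lower_estimate(contraction), kreiss_lower_estimate(jordan_like))
```

For a contraction the estimate stays at or below 1, in line with the remark preceding the theorem, while the non-contractive example gives a value larger than 1.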

The first part of this theorem implies

(1.9)

$$ \begin{align} C(n,\mathcal{K}_{K})\leq CKn\quad\text{and}\quad CN(T)\leq CKn\prod_{i}|\lambda_{i}(T)|^{-1}\Vert T\Vert \end{align} $$

for any $T\in \mathcal {K}_{K}$ . In passing, we point out that for any given $K\geq 1$ , there exists an absolute constant $C'$ such that $\Vert T\Vert \leq C'K$ for any $T\in \mathcal {K}_{K}$ (to see it, one can use the Riesz–Dunford functional calculus). Moreover, letting the $\lambda _{i}$ ’s tend to $0$ sufficiently fast in (1.8), one can deduce that for fixed K, the inequality $C(n,\mathcal {K}_{K})\leq CKn$ is made sharp with respect to $n\rightarrow \infty $ for sequences of operators with spectrum shrinking to $0$ . It follows from the approach used by Nikolski that these extremal operators are model operators acting on model spaces (with adequate norms). Then three questions naturally arise.

Question 1 Is it possible to find operators with fixed degenerate spectrum that are extremal for $C(n,\mathcal {K}_{K})\leq CKn$ , $n\to \infty $ ?

Question 2 Can one choose these operators among classes of structured matrices (e.g., Toeplitz matrices)?

Question 3 Is the inequality $C(n,\mathcal {K}_{K})\leq CKn$ sharp simultaneously when K and n go to $\infty $ ?

The aim of this note is to propose a solution to these three questions. For contractions, the same questions have already been investigated for a long time. A brief exposition of these investigations and a statement of our results requires the introduction of the following natural quantities:

• $C_{n}^{\mathcal {H}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {H}(n),\:r_{\min }(T)\geq r\right \} $ ;
• $C_{n}^{\mathcal {B}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {B}(n),\:r_{\min }(T)\geq r\right \} $ ;
• $\tau _{n}^{\mathcal {H}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P}\cap \mathcal {T},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {H}(n),\:r_{\min }(T)\geq r\right \} $ ;
• $\tau _{n}^{\mathcal {B}}(r,\mathcal {P}):=\sup \left \{ \left \Vert T^{-1}\right \Vert :\:T\in \mathcal {P}\cap \mathcal {T},\,T\in \mathcal {L}(X)\text { invertible},\,X\in \mathcal {B}(n),\:r_{\min }(T)\geq r\right \} $ ,

where $r_{\min }(T)$ stands for the smallest modulus of the eigenvalues of T, where $\mathcal {T}$ denotes the property of being a Toeplitz operator, and where the notation $\mathcal {P}\cap \mathcal {P}'$ means that $\mathcal {P}$ and $\mathcal {P}'$ are simultaneously satisfied.

If $\mathcal {P}=\mathcal {C}$ , and in the Hilbert setting, it is known that for any $r\in (0,1), C_{n}^{\mathcal {H}}(r,\mathcal {C})=r^{-n}$ . This result probably dates back to the 19th century and is sometimes attributed to L. Kronecker. A description of matrices achieving the supremum was obtained by Nikolski [Reference Nikolski8, Theorem 2.1], using the abovementioned functional calculus approach. In the Banach setting, none of the results given in [Reference Gluskin, Meyer and Pajor4, Reference Queffélec12] lead to sharp estimates of $C_{n}^{\mathcal {B}}(r,\mathcal {C})$ . Indeed, while Schäffer’s inequality [Reference Schäffer14] gives the upper estimates

(1.10)

$$ \begin{align} C_{n}^{\mathcal{B}}(r,\mathcal{C})\leq\sqrt{en}r^{-n}, \end{align} $$

the lower estimates obtained in [Reference Gluskin, Meyer and Pajor4, Reference Queffélec11] are not enough to prove that (1.10) is sharp with respect to $r,n$ : Gluskin–Meyer–Pajor [Reference Gluskin, Meyer and Pajor4], Bourgain (see [Reference Gluskin, Meyer and Pajor4]), and Queffélec [Reference Queffélec11], respectively, stated

$$\begin{align*}C_{n}^{\mathcal{B}}\left(1-\frac{1}{n},\mathcal{C}\right)\geq c\sqrt{\frac{n}{\log n}}\frac{1}{\log\log n},\quad C_{n}^{\mathcal{B}}\left(1-\frac{1}{n},\mathcal{C}\right)\geq c\sqrt{\frac{n}{\log n}},\quad C_{n}^{\mathcal{B}}\left(1-\frac{1}{n},\mathcal{C}\right)\geq c\sqrt{n}, \end{align*}$$

where c is an absolute constant. In all these estimates, it appears that r and n are correlated (in particular, the spectrum of the extremal operators is not fixed).

When $\mathcal {P}=\mathcal {K}_{K}$ , we recall that the first point of Theorem 1.1 gives that for any $r\in (0,1)$ ,

(1.11)

$$ \begin{align} C_{n}^{\mathcal{B}}(r,\mathcal{K}_{K})\leq CK\frac{n}{r^{n}}, \end{align} $$

where C is an absolute constant, whereas the second one yields the asymptotic sharpness of (1.11) as $n\rightarrow \infty $ , only for $r\ll 1/n$ .

A somewhat natural approach to the question of exhibiting operators with fixed spectrum that are asymptotically extremal for (1.10) and (1.11) consists in looking for such operators in classes of structured matrices such as Jordan blocks, which are special cases of Toeplitz matrices. More generally, Toeplitz or Hankel matrices, which play a crucial role in matrix analysis and operator theory, are natural candidates. Yet, in the Banach setting, the proofs given in [Reference Gluskin, Meyer and Pajor4, Reference Queffélec11] are far from providing explicit examples achieving $C(n,\mathcal {C})$ , and a fortiori extremal Toeplitz matrices with degenerate spectrum. In the Hilbert setting, we recall that Nikolski characterized matrices that are extremal for $C_{n}^{\mathcal {H}}(r,\mathcal {C})$ for any r in [Reference Nikolski8] and asked for the existence of Toeplitz matrices satisfying this characterization. Szehr and the last author obtained in [Reference Szehr and Zarouf19] the equalities

$$\begin{align*}\tau_{n}^{\mathcal{H}}(r,\mathcal{C})=C_{n}^{\mathcal{H}}(r,\mathcal{C})=\frac{1}{r^{n}}, \end{align*}$$

for any $r\in (0,1)$ (see also [Reference Zarouf23] where the weaker estimate $\tau _{n}^{\mathcal {H}}(r,\mathcal {C})\geq 2^{-1}r^{-n}$ is obtained) and, in 2021, they proved that for some absolute constant $c>0$ ,

$$\begin{align*}C_{n}^{\mathcal{B}}(r,\mathcal{C})\geq\tau_{n}^{\mathcal{B}}(r,\mathcal{C})\geq c\frac{\sqrt{n}}{r^{n}} \end{align*}$$

for any $r\in (0,1)$ and any n large enough, so that (1.10) is sharp. Moreover, they provided explicit examples of extremal Banach space Toeplitz operators with arbitrary degenerate spectrum [Reference Szehr and Zarouf20]. It is worth emphasizing that these extremal Toeplitz operators may be chosen with spectrum equal to $\left \{ \lambda \right \} ,$ with $\lambda $ arbitrary in $\mathbb {D}\setminus \left \{ 0\right \}$ .

Our main contribution in this note is to follow their strategy and to obtain an analogous result for Kreiss operators, giving a solution to Questions 1 and 2. More precisely, we will prove the following result.

Theorem 1.2 (For a more precise statement, see Theorem 2.1)

For a fixed $K\geq 1$ , there exists an absolute constant $c>0$ such that

$$\begin{align*}C_{n}^{\mathcal{B}}(r,\mathcal{K}_{K})\geq\tau_{n}^{\mathcal{B}}(r,\mathcal{K}_{K})\geq c\frac{n}{r^{n}} \end{align*}$$

for any $r\in (0,1)$ and n large enough.

Moreover, the extremal Toeplitz matrices in the second inequality can be chosen with a degenerate spectrum arbitrary in $\mathbb {D}\setminus \left \{ 0\right \} $ .

This implies the sharpness of (1.11) as $n\to \infty $ for any $r\in (0,1)$ . With respect to numerical analysis, this theorem says that degenerate Toeplitz matrices may be ill-conditioned in high dimensions. The outline of the proof will be similar to that proposed by Szehr and Zarouf in [Reference Szehr and Zarouf20] for contractions, but the techniques will differ. We mention that Theorem 2.1 was announced in [Reference Charpentier, Fouchet, Szehr and Zarouf2], without a proof.

Our second result concerns the asymptotic sharpness of Nikolski’s upper bound (5.1) when K is permitted to grow unboundedly as $n\rightarrow \infty $ . It is indeed a natural question to ask whether the dependency on K in (5.1) can be improved or not. We will show that (5.1) is also sharp in the following sense: there exists a sequence of Jordan blocks $T_{n}\in \mathcal {L}(X),$ where $X=\left (\mathbb {C}^{n},\,\left |\!\left |\cdot \right |\!\right |{}_{l^{1}}\right )$ , such that $K\left (T_{n}\right )\rightarrow \infty $ as $n\to \infty $ and

$$\begin{align*}\frac{\left|\text{det}\left(T_{n}\right)\right|\left\Vert T_{n}^{-1}\right\Vert }{K\left(T_{n}\right)}\geq cn, \end{align*}$$

for some absolute constant $c>0$ (see Proposition 4.1). We mention that the use of Jordan blocks as extremal matrices in the context of Kreiss matrices is rather classical (see, for example, [Reference van Dorsselaer, Kraaijevanger and Spijker21]).

The organization of the paper is as follows. The next section contains the prerequisites and the statement of Theorem 2.1. The latter is proved in the third section. Section 4 contains the proof of the asymptotic sharpness of (5.1) with respect to K and n (Proposition 4.1). In the last section, we present—for the interested reader—a short and simple proof of the upper estimate (5.1) in Theorem 1.1.

Notation Throughout the paper, we will use the notation $f\lesssim g$ meaning that $f\leq cg$ , where c is some absolute constant. The notation $f\simeq g$ will mean that $f\lesssim g$ and $g\lesssim f$ .

2 Background and statement of the main result

Let us denote by $H(\mathbb {D})$ the space of analytic functions in $\mathbb {D}$ and by $H^{\infty }$ the Banach algebra of bounded analytic functions in $\mathbb {D}$ , endowed with the supremum norm on $\mathbb {D}$ . The standard Hardy space $H^{2}$ is defined as the subspace of $H(\mathbb {D})$ consisting of those functions f for which

$$\begin{align*}\Vert f\Vert_{H^{2}}^{2}:=\sup_{0\leq r<1}\int_{\mathbb{T}}\left|f(rz)\right|{}^{2}\mathrm{d}m(z)<\infty, \end{align*}$$

where m is the normalized Lebesgue measure on the unit circle $\mathbb {T}:=\left \{ z\in \mathbb {C}:\,|z|=1\right \} $ . Endowed with $\Vert \cdot \Vert _{H^{2}}$ , $H^{2}$ is a Hilbert space with inner product given by

(2.1)

$$ \begin{align} \left\langle f,g\right\rangle =\sum_{k\geq0}\hat{f}(k)\overline{\hat{g}(k)}. \end{align} $$

By Fatou’s theorem, $H^{2}$ can be identified with the (Hilbert) subspace of $L^{2}(\partial \mathbb {D})$ so that $\left \langle f,g\right \rangle =\frac {1}{2\pi }\int _{-\pi }^{\pi }f(e^{it})\overline {g(e^{it})}dt$ , where f and g in the right-hand side of the last equality denote, without possible confusion, the almost everywhere radial limits of f and g, respectively.

Given $\lambda \in {\mathbb {D}}$ , we denote by $b_{\lambda }$ the Blaschke factor associated with $\lambda $ , namely

$$\begin{align*}b_{\lambda}(z):=\frac{z-\lambda}{1-\overline{\lambda}z}. \end{align*}$$

Let $\sigma =\{\lambda _{1},\ldots ,\lambda _{n}\}\subset {\mathbb {D}}$ be a finite sequence. We shall consider the model space $K_{B_{\sigma }}$ given by

$$\begin{align*}K_{B_{\sigma}}:=\left(B_{\sigma}H^{2}\right)^{\perp}=H^{2}\ominus B_{\sigma}H^{2}, \end{align*}$$

where

$$\begin{align*}B_{\sigma}:=\prod_{k=1}^{n}b_{\lambda_{k}} \end{align*}$$

is the finite Blaschke product associated with $\sigma =\{\lambda _{1},\ldots ,\lambda _{n}\}$ . Any such space $K_{B_{\sigma }}$ is an n-dimensional subspace of $H^{2}$ . Now, for $1\leq k\leq n$ , let $f_{k}:=\frac {1}{1-\overline {\lambda _{k}}z}$ and observe that $\Vert f_{k}\Vert _{H^{2}}=\left (1-\vert \lambda _{k}\vert ^{2}\right )^{-1/2}$ . Then set

$$\begin{align*}e_{1}=\frac{f_{1}}{\Vert f_{1}\Vert_{H^{2}}}\,\quad\mbox{and}\quad e_{k}={\displaystyle \prod_{j=1}^{k-1}}b_{\lambda_{j}}\frac{f_{k}}{\Vert f_{k}\Vert_{H^{2}}},\quad k=2,\ldots,n. \end{align*}$$

It is known that $\left (e_{k}\right )_{1\leq k\leq n}$ defines an orthonormal basis of $K_{B_{\sigma }}$ , called the Malmquist–Walsh basis [Reference Nikolski6, p. 117].
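The orthonormality of the Malmquist–Walsh family can be checked numerically from the boundary-integral form of the $H^{2}$ inner product. The sketch below is an illustration only: the function names, the sample points $\lambda _{k}$ , and the use of an equispaced quadrature rule on $\mathbb {T}$ with `n_nodes` points are assumptions made here, not part of the article.

```python
import numpy as np

def malmquist_walsh(lams, z):
    """Evaluate e_1, ..., e_n at the points z; returns an array of shape (n, len(z))."""
    z = np.asarray(z, dtype=complex)
    basis = []
    blaschke = np.ones_like(z)                     # running product b_{lambda_1} ... b_{lambda_{k-1}}
    for lam in lams:
        f = 1.0 / (1.0 - np.conj(lam) * z)         # f_k
        basis.append(blaschke * np.sqrt(1.0 - abs(lam) ** 2) * f)   # f_k / ||f_k|| times the product
        blaschke = blaschke * (z - lam) / (1.0 - np.conj(lam) * z)
    return np.array(basis)

def gram_matrix(lams, n_nodes=4096):
    """Approximate <e_i, e_j> = (1/2pi) * integral of e_i * conj(e_j) over the unit circle."""
    t = 2.0 * np.pi * np.arange(n_nodes) / n_nodes
    E = malmquist_walsh(lams, np.exp(1j * t))
    return E @ E.conj().T / n_nodes

lams = [0.5, -0.3 + 0.4j, 0.2j]
G = gram_matrix(lams)
print(np.max(np.abs(G - np.eye(len(lams)))))       # deviation from the identity
```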

The central object of Nikolski’s approach in [Reference Nikolski8] is the model operator $M_{B_{\sigma }}$ defined by

$$\begin{align*}M_{B_{\sigma}}:\left\{ \begin{array}{ccc} K_{B_{\sigma}} & \rightarrow & K_{B_{\sigma}},\\ f & \mapsto & P_{B_{\sigma}}(zf), \end{array} \right. \end{align*}$$

where $P_{B_{\sigma }}$ denotes the orthogonal projection on $K_{B_{\sigma }}$ . The matrix representation $\widehat {M_{B_{\sigma }}}$ of $M_{B_{\sigma }}$ with respect to the Malmquist–Walsh basis $\left (e_{k}\right )_{1\leq k\leq n}$ is as follows (see [Reference Szehr18, Proposition III.4]):

(2.2)

$$ \begin{align} \left(\widehat{M_{B_{\sigma}}}\right)_{ij}=\left\{ \begin{array}{ll} 0, & \text{if }i<j,\\ \lambda_{i}, & \text{if }i=j,\\ (1-|\lambda_{i}|^{2})^{1/2}(1-|\lambda_{j}|^{2})^{1/2}\prod_{\mu=j+1}^{i-1}\left(-\bar{\lambda}_{\mu}\right), & \text{if }i>j, \end{array} \right. \end{align} $$

where $\left (\widehat {M_{B_{\sigma }}}\right )_{ij}$ stands for the $i,j$ entry of $\widehat {M_{B_{\sigma }}}$ . The reader may consult [Reference Nikolski6, Reference Nikolski7] for a complete description of model spaces and model operators.
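Formula (2.2) translates directly into code. The following sketch builds the matrix for a given list of points $\lambda _{1},\ldots ,\lambda _{n}$ ; the name `model_matrix` and the sample points are hypothetical. Since $M_{B_{\sigma }}$ is a compression of the shift on $H^{2}$ and the basis is orthonormal, the spectral norm of the resulting matrix should not exceed 1, which gives a simple consistency check.

```python
import numpy as np

def model_matrix(lams):
    """Matrix of the model operator in the Malmquist-Walsh basis, following (2.2)."""
    lams = np.asarray(lams, dtype=complex)
    n = len(lams)
    d = np.sqrt(1.0 - np.abs(lams) ** 2)           # (1 - |lambda_k|^2)^{1/2}
    M = np.zeros((n, n), dtype=complex)
    for i in range(n):
        M[i, i] = lams[i]                          # diagonal entries
        for j in range(i):                         # entries below the diagonal (i > j)
            # product of (-conj(lambda_mu)) for j < mu < i; an empty product equals 1
            M[i, j] = d[i] * d[j] * np.prod(-np.conj(lams[j + 1:i]))
    return M

M = model_matrix([0.5, -0.3 + 0.4j, 0.2j])
print(np.diag(M), np.linalg.norm(M, 2))
```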

For the proof of Theorem 2.1, we shall focus on the case where $\sigma =\left \{ \lambda ,\ldots ,\lambda \right \} $ (namely, $\lambda _{1}=\cdots =\lambda _{n}=\lambda $ with the previous notations). In this case,

$$\begin{align*}K_{B_{\sigma}}=\text{span}\left\{\frac{z^{k-1}}{(1-\bar{\lambda}z)^n}:\, k=1,\dots,n \right\} \end{align*}$$

and the Malmquist–Walsh basis $\beta _{\lambda }:=\left (e_{k}\right )_{1\leq k\leq n}$ is given by

$$\begin{align*}e_{k}(z):=\frac{(1-|\lambda|^{2})^{1/2}}{1-\bar{\lambda}z}\left(\frac{z-\lambda}{1-\bar{\lambda}z}\right)^{k-1},\quad k=1,\dots,n. \end{align*}$$

Moreover, one can also show that $K_{B_{\sigma }}$ coincides as a set with the n-dimensional Banach space consisting of all rational functions of degree at most n with poles located at $1/\overline {\lambda }$ . We may and shall equip $K_{B_{\sigma }}$ with the Banach norm

$$\begin{align*}\Vert f\Vert_{\mathcal{B}_{\infty}}:=\sup_{z\in\mathbb{D}}\left(1-\left|z\right|{}^{2}\right)\left|f(z)\right|<\infty,\quad f\in K_{B_{\sigma}}. \end{align*}$$

Then $\left (K_{B_{\sigma }},\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}\right )$ is a (Banach) subspace of the Besov space $\mathcal {B}_{\infty }$ defined as

$$\begin{align*}\mathcal{B}_{\infty}=\left\{ f\in H(\mathbb{D}):\:\left|\!\left|f\right|\!\right|{}_{\mathcal{B}_{\infty}}=\sup_{z\in\mathbb{D}}\left(1-\left|z\right|{}^{2}\right)\left|f(z)\right|<\infty\right\}. \end{align*}$$

Our main theorem reads as follows.

Theorem 2.1 Let $\lambda \in \mathbb {D}\setminus \{0\}$ be fixed, and let $T_{\lambda }$ denote the operator acting on $\left (K_{B_{\sigma }},\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}\right )$ whose matrix with respect to $\beta _{\lambda }$ is given by

$$\begin{align*}M_{\lambda}:=\left(\begin{array}{ccccc} \lambda & 1-\left|\lambda\right|{}^{2} & -\bar{\lambda}(1-\left|\lambda\right|{}^{2}) & \ldots & (-\bar{\lambda})^{n-2}(1-\left|\lambda\right|{}^{2})\\ 0 & \lambda & 1-\left|\lambda\right|{}^{2} & \ddots & \vdots\\ 0 & \ddots & \lambda & \ddots & -\bar{\lambda}(1-\left|\lambda\right|{}^{2})\\ \vdots & \ddots & \ddots & \ddots & 1-\left|\lambda\right|{}^{2}\\ 0 & \ldots & 0 & 0 & \lambda \end{array}\right). \end{align*}$$

Then $T_{\lambda }$ satisfies the Kreiss condition and the inequality

(2.3)

$$ \begin{align} \left|\!\left|T_{\lambda}^{-1}\right|\!\right|{}_{*}\geq c(\lambda)\frac{n}{|\lambda|^{n}}, \end{align} $$

where $c(\lambda )>0$ and where $|\!|\cdot |\!|_{*}$ is the operator norm induced by $\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}$ . In particular, for any $r\in (0,1)$ ,

$$\begin{align*}C_{n}^{\mathcal{B}}(r,\mathcal{K}_{K})\geq\tau_{n}^{\mathcal{B}}(r,\mathcal{K}_{K})\gtrsim\frac{n}{r^{n}}, \end{align*}$$

as $n\to \infty $ .
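For experimentation, the Toeplitz matrix $M_{\lambda }$ of Theorem 2.1 can be assembled as follows. This is only a sketch: the name `toeplitz_M` and the sample values are illustrative, and the code does not compute the operator norm $|\!|\cdot |\!|_{*}$ induced by $\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}$ , so it does not by itself reproduce (2.3). The spectral norm printed at the end is the Hilbert-space norm, shown for comparison only.

```python
import numpy as np

def toeplitz_M(lam, n):
    """Upper-triangular n x n Toeplitz matrix M_lambda from Theorem 2.1."""
    lam = complex(lam)
    first_row = np.zeros(n, dtype=complex)
    first_row[0] = lam
    for k in range(1, n):
        # k-th superdiagonal entry: (-conj(lam))^(k-1) * (1 - |lam|^2)
        first_row[k] = (1.0 - abs(lam) ** 2) * (-np.conj(lam)) ** (k - 1)
    M = np.zeros((n, n), dtype=complex)
    for i in range(n):
        M[i, i:] = first_row[: n - i]              # constant along each diagonal
    return M

M = toeplitz_M(0.5 + 0.2j, 5)
print(np.diag(M))                                  # singleton spectrum {lambda}
print(np.linalg.norm(M, 2))                        # Euclidean operator norm, for comparison only
```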

The proof of Theorem 2.1—displayed in the next section—is based on a duality argument already used in [Reference Szehr and Zarouf20]. Considering Kreiss operators, we will here make use of the Besov functional calculus developed in [Reference Vitse22] and the duality between the Besov spaces $\mathcal {B}_{\infty }$ and $\mathcal {B}_{1}$ , where

(2.4)

$$ \begin{align} \mathcal{B}_{1}=\left\{ f\in H(\mathbb{D}):\:\left|\!\left|f\right|\!\right|{}_{\mathcal{B}_{1}}:=\int_{\mathbb{D}}\left|(z^{2}f(z))''\right| \mathrm{d}A(z) <\infty\right\}, \end{align} $$

where A stands for the normalized Lebesgue measure on $\mathbb {D}$ . This duality is given by the relation (with equivalence of the norms, see [Reference Vitse22, p. 1815] for details)

(2.5)

$$ \begin{align} \mathcal{B}_{1}^{\star}=\mathcal{B}_{\infty} \end{align} $$

for the Cauchy duality (2.1).

3 Proof of Theorem 2.1

Proof of Theorem 2.1

In the whole section, $\sigma =\left \{ \lambda ,\ldots ,\lambda \right \} \subset \mathbb {D}\setminus \left \{ 0\right \} $ is fixed. First, using the orthonormality of the Malmquist–Walsh basis, we can see that the adjoint of $M_{B_{\sigma }}$ acting on $\left (K_{B_{\sigma }},\left |\!\left |\cdot \right |\!\right |{}_{\mathcal {B}_{\infty }}\right )$ coincides with $T_{\bar {\lambda }}$ . So, in order to prove that $T_{\lambda }$ is a Kreiss operator, it is enough to show that $\widehat {M_{B_{\sigma }}}^{*}$ —the adjoint of $\widehat {M_{B_{\sigma }}}$ —satisfies the Kreiss condition, namely that there exists some constant $K>0$ such that for any $|\zeta |>1$ ,

(3.1)

$$ \begin{align} \left\Vert \left(\zeta-\widehat{M_{B_{\sigma}}}^{*}\right)^{-1}\right\Vert _{*}\leq K\left(\left|\zeta\right|-1\right)^{-1}. \end{align} $$

Let us fix $\zeta $ , $|\zeta |>1$ . By definition, the adjoint of $M_{B_{\sigma }}$ coincides with the backward shift operator

$$\begin{align*}S^{*}:f\mapsto\frac{f-f(0)}{z} \end{align*}$$

acting on $K_{B_{\sigma }}$ .

Thus, using the Cauchy duality (2.5), we further have $\left \Vert \left (\zeta -\widehat {M_{B_{\sigma }}}^{*}\right )^{-1}\right \Vert _{*}= \left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{*}$ . Now, considering the shift operator $(Sf)(z)=zf(z)$ acting on the whole space $\mathcal {B}_{1}$ (and recalling that the operator $S^{*}$ acting on $\mathcal {B}_{\infty }$ is the adjoint of S with respect to this duality), we have $\left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{\mathcal {B}_{\infty }\rightarrow \mathcal {B}_{\infty }}=\left \Vert \left (\zeta -S\right )^{-1}\right \Vert _{\mathcal {B}_{1}\rightarrow \mathcal {B}_{1}}$ . The $\mathcal {B}_{1}$ -functional calculus [Reference Vitse22] then tells us that for any $f\in \mathcal {B}_{1}$ ,

$$\begin{align*}\left\Vert \left(\zeta-S\right)^{-1}f\right\Vert _{\mathcal{B}_{1}} \leq K_{1}\left\Vert f\right\Vert _{\mathcal{B}_{1}}\left\Vert \frac{1}{\zeta-z}\right\Vert _{\mathcal{B}_{1}} \end{align*}$$

for some constant $K_{1}>0$ (see also [Reference Nikolski8, p. 143] and the references therein for more details on the last inequality). It remains to observe that

$$\begin{align*}\left\Vert \frac{1}{\zeta-z}\right\Vert _{\mathcal{B}_{1}}\leq K_{2}\left(\left|\zeta\right|-1\right)^{-1}, \end{align*}$$

whence there exists $K_{3}>0$ such that

$$\begin{align*}\left\Vert \left(\zeta-S^{*}\right)^{-1}\right\Vert _{\mathcal{B}_{\infty}\rightarrow\mathcal{B}_{\infty}}=\left\Vert \left(\zeta-S\right)^{-1}\right\Vert _{\mathcal{B}_{1}\rightarrow\mathcal{B}_{1}}\leq K_{3}\left(\left|\zeta\right|-1\right)^{-1}. \end{align*}$$

Since $\left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{*}\leq \left \Vert \left (\zeta -S^{*}\right )^{-1}\right \Vert _{\mathcal {B}_{\infty }\rightarrow \mathcal {B}_{\infty }}$ , this yields (3.1) and the fact that $T_{\lambda }$ is a Kreiss operator.

In order to derive (2.3), it is enough to show that $\left \Vert \left (\widehat {M_{B_{\sigma }}}^{*}\right )^{-1}\right \Vert _{*}\geq c(\lambda )\frac {n}{|\lambda |^{n}}$ for some constant $c(\lambda )>0$ . To do so, we apply $\left (\widehat {M_{B_{\sigma }}}^{*}\right )^{-1}$ to the test vector $X_{0}=(0,\dots ,0,1)$ , i.e., to the rational function $R(z)=e_{n}(z)$ . We set

$$\begin{align*}g:=\left(\widehat{M_{B_{\sigma}}}^{*}\right)^{-1}X_{0}^{\top}=(S^{*})^{-1}e_{n}, \end{align*}$$

where $X_{0}^{\top }$ is the transpose of $X_{0}$ . We have

$$\begin{align*}S^{*}g=\frac{g-g(0)}{z}=e_{n}, \end{align*}$$

which means that

$$ \begin{align*} g & =ze_{n}+g(0)\\ & =\frac{(1-|\lambda|^{2})^{1/2}}{1-\overline{\lambda}z}zb_{\lambda}^{n-1}+g(0)\\ & =(1-|\lambda|^{2})^{1/2}\frac{z(z-\lambda)^{n-1}}{(1-\overline{\lambda}z)^{n}}+g(0). \end{align*} $$

The condition $g\in K_{B_{\sigma }}$ imposes that g is a rational function with $\lim _{|z|\to \infty }g(z)=0$ , and hence

$$ \begin{align*} g(0) & =-(1-|\lambda|^{2})^{1/2}\lim_{|z|\rightarrow\infty}\frac{z(z-\lambda)^{n-1}}{(1-\overline{\lambda}z)^{n}}\\ & =(-1)^{n+1}\frac{(1-|\lambda|^{2})^{1/2}}{\overline{\lambda}^{n}}. \end{align*} $$
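The closed form for $g(0)$ can be sanity-checked numerically. The following is only an illustrative sketch: the values of `lam` and `n`, the sample points, and the helper names `e_n`, `g0`, and `g` are our own choices and not part of the argument. It checks that $g=ze_{n}+g(0)$ decays at infinity and that the backward shift maps it back to $e_{n}$.

```python
import cmath

def e_n(z, lam, n):
    """e_n as written in the text: sqrt(1-|lam|^2)/(1-conj(lam) z) * b_lam(z)^(n-1)."""
    b = (z - lam) / (1 - lam.conjugate() * z)
    return (1 - abs(lam) ** 2) ** 0.5 / (1 - lam.conjugate() * z) * b ** (n - 1)

def g0(lam, n):
    """Closed form for g(0) taken from the display above."""
    return (-1) ** (n + 1) * (1 - abs(lam) ** 2) ** 0.5 / lam.conjugate() ** n

def g(z, lam, n):
    """Candidate preimage g = z * e_n + g(0)."""
    return z * e_n(z, lam, n) + g0(lam, n)

# Illustrative parameters (assumptions, not taken from the paper).
lam, n = 0.5 + 0.3j, 6

# |g| along a fixed ray going to infinity; it should shrink roughly like 1/R.
tail = [abs(g(R * cmath.exp(0.7j), lam, n)) for R in (1e3, 1e5, 1e7)]

# Backward shift applied to g at a sample point of the disc should give e_n.
z = 0.2 - 0.4j
shift_error = abs((g(z, lam, n) - g(0, lam, n)) / z - e_n(z, lam, n))

print(tail, shift_error)
```

A different sign or power of $\overline{\lambda}$ in `g0` would leave a non-vanishing constant at infinity, which is what the `tail` values are meant to detect.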

It follows that

$$ \begin{align*} \Vert g\Vert_{\mathcal{B}_{\infty}} & =\sup_{z\in\mathbb{D}}\left(1-\left|z\right|{}^{2}\right)\left|g(z)\right|\\ & \geq|g(0)|\\ & =\frac{(1-|\lambda|^{2})^{1/2}}{|\lambda|^{n}}. \end{align*} $$

Let us now estimate $\left |\!\left |e_{n}\right |\!\right |{}_{\mathcal {B}_{\infty }}$ from above. Defining $\widetilde {b_{\lambda }}(z)=-b_{\lambda }(z)=\frac {\lambda -z}{1-\overline {\lambda }z}$ , we have $\widetilde {b_{\lambda }}\circ \widetilde {b_{\lambda }}=\text {id}$ , where $\text {id}$ denotes the identity function on ${\mathbb {D}}$ , so that a supremum over $\mathbb {D}$ is unchanged under the change of variable $z\mapsto \widetilde {b_{\lambda }}(z)$ . Therefore,

$$ \begin{align*} \left|\!\left|e_{n}\right|\!\right|{}_{\mathcal{B}_{\infty}} & =\sup_{z\in\mathbb{D}}\left(1-\left|z\right|{}^{2}\right)\left|\frac{(1-|\lambda|^{2})^{1/2}}{1-\overline{\lambda}z}(b_{\lambda}(z))^{n-1}\right|\\ & =\sup_{z\in\mathbb{D}}\left(1-\left|z\right|{}^{2}\right)\left|\frac{(1-|\lambda|^{2})^{1/2}}{1-\overline{\lambda}z}\widetilde{b_{\lambda}}(z)^{n-1}\right|\\ & =(1-|\lambda|^{2})^{1/2}\sup_{z\in\mathbb{D}}\left(1-\left|\widetilde{b_{\lambda}}(z)\right|{}^{2}\right)\left|\frac{1}{1-\overline{\lambda}\widetilde{b_{\lambda}}(z)}z^{n-1}\right|. \end{align*} $$

Since

$$ \begin{align*} 1-\left|\widetilde{b_{\lambda}}(z)\right|{}^{2} & =\frac{(1-|\lambda|^{2})(1-|z|^{2})}{(1-\overline{\lambda}z)(1-\lambda\overline{z})}\\ & \leq\frac{1+|\lambda|}{1-|\lambda|}(1-|z|^{2}) \end{align*} $$

and $|1-\overline {\lambda }\widetilde {b_{\lambda }}(z)|\geq (1-|\lambda |)$ , we get

$$ \begin{align*} \left|\!\left|e_{n}\right|\!\right|{}_{\mathcal{B}_{\infty}} & \leq\left(\frac{1+|\lambda|}{1-|\lambda|}\right)^{3/2}\sup_{z\in\mathbb{D}}\left(1-\left|z\right|{}^{2}\right)\left|z\right|{}^{n-1}\\ & \leq2\left(\frac{1+|\lambda|}{1-|\lambda|}\right)^{3/2}\sup_{z\in\mathbb{D}}\left(1-\left|z\right|\right)\left|z\right|{}^{n-1}. \end{align*} $$
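The two elementary facts about $\widetilde{b_{\lambda}}$ used above (it is an involution, and $1-|\widetilde{b_{\lambda}}(z)|^{2}$ has the stated closed form and bound) can be checked at a few points of the disc. This is a minimal sketch; `lam` and the sample `points` are arbitrary choices made for illustration.

```python
def b_tilde(z, lam):
    """The involutive disc automorphism (lam - z) / (1 - conj(lam) z)."""
    return (lam - z) / (1 - lam.conjugate() * z)

# Illustrative parameter and sample points inside the unit disc (assumptions).
lam = 0.4 - 0.5j
points = [0.1 + 0.2j, -0.7 + 0.1j, 0.3 - 0.85j]

# b_tilde composed with itself should be the identity.
involution_err = max(abs(b_tilde(b_tilde(z, lam), lam) - z) for z in points)

# 1 - |b_tilde(z)|^2 = (1-|lam|^2)(1-|z|^2) / |1 - conj(lam) z|^2
identity_err = max(
    abs(
        (1 - abs(b_tilde(z, lam)) ** 2)
        - (1 - abs(lam) ** 2) * (1 - abs(z) ** 2) / abs(1 - lam.conjugate() * z) ** 2
    )
    for z in points
)

# Upper bound by (1+|lam|)/(1-|lam|) * (1-|z|^2), and lower bound on |1 - conj(lam) b_tilde(z)|.
bound_ok = all(
    1 - abs(b_tilde(z, lam)) ** 2 <= (1 + abs(lam)) / (1 - abs(lam)) * (1 - abs(z) ** 2)
    for z in points
)
lower_ok = all(
    abs(1 - lam.conjugate() * b_tilde(z, lam)) >= 1 - abs(lam) for z in points
)

print(involution_err, identity_err, bound_ok, lower_ok)
```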

Studying the function $r\mapsto (1-r)r^{n-1}$ for $r\in (0,1)$ , we can check that the supremum in the last inequality is attained at $|z|=1-1/n$ . We conclude that

$$ \begin{align*} \left|\!\left|e_{n}\right|\!\right|{}_{\mathcal{B}_{\infty}} & \leq\frac{2}{n-1}\left(\frac{1+|\lambda|}{1-|\lambda|}\right)^{3/2}\left(1-\frac{1}{n}\right)^{n}\\ & \leq\frac{2}{e(n-1)}\left(\frac{1+|\lambda|}{1-|\lambda|}\right)^{3/2}. \end{align*} $$
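The one-variable optimization behind this bound is easy to confirm numerically. The sketch below (grid size and the tested values of `n` are arbitrary choices) compares a grid maximum of $r\mapsto (1-r)r^{n-1}$ with the value $\frac{1}{n-1}\left(1-\frac{1}{n}\right)^{n}$ at $r=1-1/n$ and with the bound $\frac{1}{e(n-1)}$.

```python
import math

def phi(r, n):
    """The function r -> (1 - r) * r^(n-1) studied in the text."""
    return (1 - r) * r ** (n - 1)

def grid_sup(n, steps=100_000):
    """Maximum of phi over a uniform grid of (0, 1); a lower bound for the true supremum."""
    return max(phi(k / steps, n) for k in range(1, steps))

results = []
for n in (5, 20, 100):
    closed = (1 - 1 / n) ** n / (n - 1)   # value of phi at r = 1 - 1/n
    bound = 1 / (math.e * (n - 1))        # uses (1 - 1/n)^n <= 1/e
    results.append((n, grid_sup(n), closed, bound))

for row in results:
    print(row)
```

For the three values of `n` used here the point $r=1-1/n$ lies on the grid, so the grid maximum and the closed form should agree up to rounding.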

In particular,

$$ \begin{align*} \frac{|\!|\left(\widehat{M_{B_{\sigma}}}^{*}\right)^{-1} X_{0}^{\top}|\!|_{\mathcal{B}_{\infty}}}{|\!|X_{0}^{\top}|\!|_{\mathcal{B}_{\infty}}} & =\frac{|\!|g|\!|_{\mathcal{B}_{\infty}}}{|\!|e_{n}|\!|_{\mathcal{B}_{\infty}}}\\ & \geq\frac{e(1-|\lambda|)^{2}}{2^{5/2}}\frac{(n-1)}{|\lambda|^{n}}, \end{align*} $$

which completes the proof.
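As a numerical illustration of the two estimates just combined, one can approximate $\sup_{z\in\mathbb{D}}(1-|z|^{2})|e_{n}(z)|$ on a polar grid and compare the resulting quotient with the stated lower bound. This is a sketch under explicit assumptions: the grid resolution, the values `lam = 0.5` and `n = 10`, and the helper names are ours, and a grid maximum only bounds the true supremum from below (which is the safe direction for both comparisons).

```python
import cmath
import math

def e_n(z, lam, n):
    """e_n as written in the text."""
    b = (z - lam) / (1 - lam.conjugate() * z)
    return math.sqrt(1 - abs(lam) ** 2) / (1 - lam.conjugate() * z) * b ** (n - 1)

def weighted_sup(f, radial=300, angular=240):
    """Grid approximation of sup over the disc of (1 - |z|^2) |f(z)|."""
    best = 0.0
    for i in range(radial):
        r = i / radial
        for k in range(angular):
            z = r * cmath.exp(2j * math.pi * k / angular)
            best = max(best, (1 - r * r) * abs(f(z)))
    return best

# Illustrative parameters (assumptions).
lam, n = complex(0.5, 0.0), 10
m = abs(lam)

norm_grid = weighted_sup(lambda z: e_n(z, lam, n))
upper = 2 / (math.e * (n - 1)) * ((1 + m) / (1 - m)) ** 1.5   # bound on the norm of e_n

g0_abs = math.sqrt(1 - m ** 2) / m ** n                        # |g(0)|, a lower bound for the norm of g
ratio = g0_abs / norm_grid
lower = math.e * (1 - m) ** 2 / 2 ** 2.5 * (n - 1) / m ** n   # final lower bound of the proof

print(norm_grid, upper, ratio, lower)
```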

4 On the sharpness of (5.1) with respect to K and n

In the previous sections, we were interested in the sharpness of the inequality $|\det (T)|\Vert T^{-1}\Vert \leq CK(T)n$ (see (5.1)) for Kreiss matrices T with Kreiss constant $K(T)$ less than some constant K. In this section, we will consider this inequality for all Kreiss matrices and prove that (5.1) is sharp as n and K tend to $\infty $ . Relaxing the bound on the Kreiss constants will allow us to exhibit sequences of Jordan blocks that are extremal for (5.1).

For $a>1$ and $\lambda \in {\mathbb {D}}$ fixed, let $J_{\lambda }$ stand for the Jordan block of size n

(4.1)

$$ \begin{align} J_{\lambda}:=\left(\begin{array}{ccccc} \lambda & a & 0 & \ldots & 0\\ 0 & \lambda & a & \ddots & \vdots\\ 0 & \ddots & \lambda & \ddots & 0\\ \vdots & \ddots & \ddots & \ddots & a\\ 0 & \ldots & 0 & 0 & \lambda \end{array}\right). \end{align} $$

From now on, we denote by $\Vert \cdot \Vert $ the operator norm induced by the $\ell ^{1}$ - or the $\ell ^{\infty }$ -norm of $\mathbb {C}^n$ . Observe that for any $z\in \mathbb {C}$ , $z\neq \lambda $ , the matrix $\left (zI_{n}-J_{\lambda }\right )^{-1}$ is well defined and is the Toeplitz matrix given by

(4.2)

$$ \begin{align} \left(zI_{n}-J_{\lambda}\right)^{-1}=\left(\begin{array}{ccccc} \frac{1}{z-\lambda} & \frac{a}{\left(z-\lambda\right)^{2}} & \frac{a^{2}}{\left(z-\lambda\right)^{3}} & \ldots & \frac{a^{n-1}}{\left(z-\lambda\right)^{n}}\\ 0 & \frac{1}{z-\lambda} & \frac{a}{\left(z-\lambda\right)^{2}} & \ddots & \vdots\\ 0 & \ddots & \frac{1}{z-\lambda} & \ddots & \frac{a^{2}}{\left(z-\lambda\right)^{3}}\\ \vdots & \ddots & \ddots & \ddots & \frac{a}{\left(z-\lambda\right)^{2}}\\ 0 & \ldots & 0 & 0 & \frac{1}{z-\lambda} \end{array}\right). \end{align} $$

Then

(4.3)

$$ \begin{align} \left\Vert \left(zI_{n}-J_{\lambda}\right)^{-1}\right\Vert =\frac{1}{\left|z-\lambda\right|{}^{n}}\frac{\left|z-\lambda\right|{}^{n}-\left|a\right|{}^{n}}{\left|z-\lambda\right|-\left|a\right|}. \end{align} $$
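Formulas (4.2) and (4.3) can be verified directly for a small block. The sketch below uses only the standard library; the parameters `z`, `lam`, `a`, `n` and all helper names are illustrative assumptions. It builds the matrix in (4.2), checks that it inverts $zI_{n}-J_{\lambda}$, and compares its $\ell^{\infty}$ (maximal row sum) and $\ell^{1}$ (maximal column sum) operator norms with the right-hand side of (4.3).

```python
def jordan(lam, a, n):
    """The n x n block of (4.1): lam on the diagonal, a on the superdiagonal."""
    return [[lam if i == j else a if j == i + 1 else 0 for j in range(n)] for i in range(n)]

def resolvent_closed_form(z, lam, a, n):
    """Upper triangular Toeplitz matrix of (4.2)."""
    return [
        [a ** (j - i) / (z - lam) ** (j - i + 1) if j >= i else 0 for j in range(n)]
        for i in range(n)
    ]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def norm_inf(A):
    """Operator norm induced by the l^infinity norm: maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def norm_one(A):
    """Operator norm induced by the l^1 norm: maximal absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A)))

def formula_43(z, lam, a, n):
    """Right-hand side of (4.3); requires |z - lam| != |a|."""
    d = abs(z - lam)
    return (d ** n - abs(a) ** n) / (d ** n * (d - abs(a)))

# Illustrative parameters (assumptions).
z, lam, a, n = 1.3 + 0.4j, 0.25, 1.5, 6

J = jordan(lam, a, n)
R = resolvent_closed_form(z, lam, a, n)
zI_minus_J = [[(z if i == j else 0) - J[i][j] for j in range(n)] for i in range(n)]
P = matmul(zI_minus_J, R)
residual = max(abs(P[i][j] - (1 if i == j else 0)) for i in range(n) for j in range(n))

print(residual, norm_inf(R), norm_one(R), formula_43(z, lam, a, n))
```

The maximal row sum is attained on the first row and the maximal column sum on the last column; both equal $\sum_{k=0}^{n-1}|a|^{k}/|z-\lambda|^{k+1}$, which is the geometric sum that (4.3) writes in closed form.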

In the following proposition, we exhibit sequences of Jordan blocks that are extremal for (5.1), with respect to the dimension n and the Kreiss constant K.

Proposition 4.1 There exists $\left (\lambda _{n}\right )_{n}$ , $\lambda _{n}\in (0,1)$ , such that $\lim _{n}K\left (J_{\lambda _{n}}\right )=\infty $ and

(4.4)

$$ \begin{align} \frac{\left|\text{det}\left(J_{\lambda_{n}}\right)\right|\left\Vert J_{\lambda_{n}}^{-1}\right\Vert }{K\left(J_{\lambda_{n}}\right)}\simeq n\quad\text{as }n\to\infty. \end{align} $$

Proof By (4.3) with $z=0$ and since $\text {det}\left (J_{\lambda }\right )=\lambda ^{n}$ , we have $\left |\text {det}\left (J_{\lambda }\right )\right |\left \Vert J_{\lambda }^{-1}\right \Vert =\frac {\left |\lambda \right |{}^{n}-\left |a\right |{}^{n}}{\left |\lambda \right |-\left |a\right |}$ . We need to check that

$$\begin{align*}K\left(J_{\lambda_{n}}\right)=\sup_{|z|>1}\frac{\left|z\right|-1}{\left|z-\lambda_{n}\right|{}^{n}}\frac{\left|z-\lambda_{n}\right|{}^{n}-\left|a\right|{}^{n}}{\left|z-\lambda_{n}\right|-\left|a\right|}\simeq\frac{1}{n}\frac{\left|\lambda_{n}\right|{}^{n}-\left|a\right|{}^{n}}{\left|\lambda_{n}\right|-\left|a\right|} \end{align*}$$

for some $\left (\lambda _{n}\right )_{n}$ . From now on, we assume that $\lambda _{n}=1/n$ . Since the function $x\mapsto \frac {x^{-n}a^{n}-1}{a-x}$ is decreasing on $[0,+\infty [$ and since $\left |z-\lambda _n\right |\geq \left |z\right |-\lambda _n$ for any $\left |z\right |>1$ , we have

$$\begin{align*}K\left(J_{\lambda_{n}}\right)=\sup_{t>1}\frac{t-1}{\left(t-1/n\right)^{n}}\frac{\left(t-1/n\right)^{n}-\left|a\right|{}^{n}}{\left(t-1/n\right)-\left|a\right|}. \end{align*}$$

Setting $x:=a/(t-1/n)$ and

$$\begin{align*}g(x):=\frac{\left(a+(1/n-1)x\right)\left(x^{n}-1\right)}{a(x-1)} \end{align*}$$

gives $K\left (J_{\lambda _{n}}\right )=\sup _{0<x<\frac {a}{1-1/n}}g(x)$ . Studying the first and second derivatives of g shows that $g'$ vanishes exactly once in the interval $[0,\frac {a}{1-1/n}]$ . Moreover, a computation shows that $g'(x)>0$ at $x=a=\frac {a}{1+1/n-1/n}$ , while $g'(x)<0$ at $x=\frac {na}{n+1}=\frac {a}{1+2/n-1/n}$ for n large enough. Thus, there exists $\delta _{n}\in (1/n,2/n)$ such that g admits a maximum at $x_{n}:=\frac {a}{1+\delta _{n}-1/n}$ . Now,

$$ \begin{align*} K\left(J_{\lambda_{n}}\right)=g\left(x_{n}\right) & =\frac{\delta_{n}\left(\left(\frac{a}{1+\delta_{n}-1/n}\right)^{n}-1\right)}{a-(1+\delta_{n}-1/n)}\\ & \simeq\frac{a^{n}}{n}\\ & \simeq\frac{1}{n}\frac{\left|\lambda_{n}\right|{}^{n}-\left|a\right|{}^{n}}{\left|\lambda_{n}\right|-\left|a\right|}, \end{align*} $$

as $n\to \infty $ , as desired.
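The asymptotics of $K(J_{1/n})$ can be illustrated numerically. The sketch below relies on the reduction to real $t>1$ made in the proof and evaluates $(t-1)\Vert (tI_{n}-J_{1/n})^{-1}\Vert$ on a grid $t=1+c/n$. Several things here are assumptions made for illustration only: the values `a = 2` and `n = 30`, the grid resolution, and the restriction of the search to $0<c\leq 5$, so the computed maximum is a lower estimate of the supremum rather than its exact value.

```python
def resolvent_norm(t, lam, a, n):
    """Geometric sum behind (4.3) at a real point t > lam: sum_k a^k / (t - lam)^(k+1)."""
    s = t - lam
    return sum(a ** k / s ** (k + 1) for k in range(n))

def kreiss_grid_sup(a, n, grid=1000, c_max=5.0):
    """Grid maximum of (t - 1) * ||(t - J)^(-1)|| over t = 1 + c/n, 0 < c <= c_max."""
    lam_n = 1 / n
    best, best_c = 0.0, None
    for k in range(1, int(c_max * grid) + 1):
        c = k / grid
        delta = c / n
        val = delta * resolvent_norm(1 + delta, lam_n, a, n)
        if val > best:
            best, best_c = val, c
    return best, best_c

# Illustrative parameters (assumptions).
a, n = 2.0, 30
K_est, c_star = kreiss_grid_sup(a, n)

ratio_an = K_est * n / a ** n                                          # compare with a^n / n
ratio_prop = K_est * n * (a - 1 / n) / (a ** n - (1 / n) ** n)         # compare with the last line of the display

print(K_est, c_star, ratio_an, ratio_prop)
```

In this run the maximizing `c_star` is expected to fall between 1 and 2, in line with $\delta_{n}\in(1/n,2/n)$, and both ratios stay of order one, which is all that the symbol $\simeq$ asserts.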

Remark 4.2 In contrast with the estimate $C(n,\mathcal {K}_{K})\simeq Kn$ (as $n\to \infty $ ) considered in the previous sections, it follows from the previous proposition that

$$\begin{align*}C(n,\mathcal{K}):=\sup\left\{ \prod_{i}|\lambda_{i}(T)|\Vert T^{-1}\Vert:\,T\in\mathcal{K},\,T\in\mathcal{L}(X)\text{ invertible},\,X\in\mathcal{B}(n)\right\} =\infty. \end{align*}$$

Now, Proposition 4.1 shows that some sequence of Jordan blocks asymptotically achieves, up to a numerical factor independent of K and n, the supremum

$$\begin{align*}M(n,\mathcal{K}):=\sup\left\{ \frac{\prod_{i}|\lambda_{i}(T)|\Vert T^{-1}\Vert}{K(T)}:\,T\in\mathcal{K},\,T\in\mathcal{L}(X)\text{ invertible},\,X\in\mathcal{B}(n)\right\}, \end{align*}$$

and provides an elementary proof of the sharpness of the estimate $M(n,\mathcal {K})\lesssim n$ as $n\to \infty $ (which is a consequence of the first part of Theorem 1.1). Note that Theorem 2.1 obviously also leads to the latter assertion, but not for Jordan blocks and with more sophisticated arguments.

Let us also note that, by the Kreiss Matrix Theorem [Reference Kreiss5, Reference Richtmyer and Morton13, Reference Sod16], properties $\mathcal {PB}$ and $\mathcal {K}$ are equivalent, so $C(n,\mathcal {PB})=C(n,\mathcal {K})=\infty $ . One can wonder whether an estimate similar to (4.4) can be obtained with $\sup _k\Vert T_n^k \Vert $ instead of $K(T_n)$ , for some (simple) sequence $(T_n)_n$ instead of $(J_{\lambda _n})_n$ .

5 A short and simple proof of Inequality (5.1)

In fact, the statement of Theorem 3.26 in [Reference Nikolski8] is slightly stronger than the first part of Theorem 1.1. More precisely, Nikolski proves the following.

Theorem 5.1 Let X be a complex Banach space, and let $T\in \mathcal {L}(X)$ be a Kreiss operator with Kreiss constant $K\geq 1$ . Let us denote by $m_{T}=\prod _{i=1}^{d}(z-\lambda _{i})$ its minimal polynomial, and assume that $(\lambda _{1},\ldots ,\lambda _{d})\in {\mathbb {D}}^{d}$ . Then

(5.1)

$$ \begin{align} \Vert T^{-1}\Vert\leq CK\frac{d}{\prod_{i=1}^{d}|\lambda_{i}|}, \end{align} $$

where C is an absolute constant.

We will display below a short proof of this more precise result that follows Nikolski’s approach and is obtained as a combination of Vitse’s [Reference Vitse22] functional calculus for Kreiss operators and the Bonsall–Walsh inequality [Reference Bonsall and Walsh1] for rational functions. In comparison with the proof of [Reference Nikolski8, Theorem 3.26], its simplicity lies in the choice of a very simple test function.

Proof (a short proof of Inequality (5.1))

Let T, $(\lambda _{1},\ldots ,\lambda _{d})$ , $m_{T}$ , and K be as in the statement of Theorem 5.1. We denote by B the Blaschke product associated with the sequence $(\lambda _{1},\ldots ,\lambda _{d})$ and introduce the test function f given by

$$\begin{align*}f(z)=\frac{B(0)-B(z)}{zB(0)}. \end{align*}$$

Observe that f is a rational function (analytic in $\overline {\mathbb {D}}$ ) that interpolates the function $1/z$ on the set $(\lambda _{1},\ldots ,\lambda _{d})$ . More precisely,

$$\begin{align*}zf(z)-1=h(z)m_{T}(z),\quad z\in{\mathbb{D}}, \end{align*}$$

where $h(z)=\frac {1}{B(0)}\prod _{i=1}^{d}(\overline {\lambda _{i}}z-1)^{-1}$ , and therefore, since $m_{T}(T)=0$ , $Tf(T)$ is the identity operator, that is, $f(T)=T^{-1}$ . Now, by [Reference Vitse22, Theorem 2.4(3)],

(5.2)

$$ \begin{align} \Vert T^{-1}\Vert =\left\Vert f(T)\right\Vert & \leq\frac{16Kd}{\pi}\left|\!\left|\frac{B(0)-B(z)}{zB(0)}\right|\!\right|{}_{H^{\infty}}\\ & \leq\frac{16Kd}{\pi\prod_{i=1}^{d}\left|\lambda_{i}\right|}\max_{z\in{\mathbb{T}}}|B(0)-B(z)|\nonumber \\ & \leq\frac{32K}{\pi}\frac{d}{\prod_{i=1}^{d}\left|\lambda_{i}\right|}.\nonumber \end{align} $$

Note that the above proof gives the explicit constant $\frac {32}{\pi }$ in (5.1). Yet we expect that it is not optimal.
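The two properties of the test function f used in the proof (it takes the value $1/\lambda_{i}$ at each $\lambda_{i}$, and $|f|\leq 2/\prod_{i}|\lambda_{i}|$ on the unit circle) can be illustrated on an example. In this sketch the points `lams` are arbitrary, and the Blaschke product is taken to be the plain product of the factors $(z-\lambda_{i})/(1-\overline{\lambda_{i}}z)$; this normalization is an assumption on our part, but f only depends on the quotient $B(z)/B(0)$, so a unimodular constant factor would not change it.

```python
import cmath
import math

def blaschke(z, lams):
    """Finite Blaschke product with zeros lams (plain product, no normalizing constants)."""
    p = 1
    for lam in lams:
        p *= (z - lam) / (1 - lam.conjugate() * z)
    return p

def f(z, lams):
    """Test function f(z) = (B(0) - B(z)) / (z B(0)) from the proof."""
    B0 = blaschke(0, lams)
    return (B0 - blaschke(z, lams)) / (z * B0)

# Illustrative zeros in the punctured disc, one of them repeated (assumptions).
lams = [0.5 + 0.2j, -0.3 + 0.6j, 0.7j, 0.5 + 0.2j]

# Interpolation of 1/z on the zeros: lam * f(lam) should equal 1.
interp_err = max(abs(lam * f(lam, lams) - 1) for lam in lams)

# Maximum of |f| over a sample of the unit circle versus the bound 2 / prod |lam_i|.
sup_circle = max(
    abs(f(cmath.exp(2j * math.pi * k / 2000), lams)) for k in range(2000)
)
bound = 2 / math.prod(abs(lam) for lam in lams)

print(interp_err, sup_circle, bound)
```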

Remark 5.1 For completeness, let us give an insight into the first inequality of (5.2). It is obtained as a combination of Vitse’s functional calculus and the Bonsall–Walsh inequality: applying [Reference Vitse22, Theorem 2.4(1)] to the function f, we get

$$\begin{align*}\Vert T^{-1}\Vert=\left\Vert f(T)\right\Vert \leq2K\left|\!\left|f\right|\!\right|{}_{\mathcal{B}_{1}}, \end{align*}$$

where $\mathcal {B}_{1}$ is the analytic Besov algebra defined in Section 2, and it remains to apply the Bonsall–Walsh inequality [Reference Bonsall and Walsh1] to the rational function f:

$$\begin{align*}\left|\!\left|f\right|\!\right|{}_{\mathcal{B}_{1}}\leq\frac{8}{\pi}\deg f\left|\!\left|f\right|\!\right|{}_{H^{\infty}}, \end{align*}$$

where $\deg f$ stands for the degree of f.

Footnotes

Charpentier was partly supported by the grant ANR-17-CE40-0021 of the Agence Nationale pour la Recherche ANR. Zarouf acknowledges financial support by the Agence Nationale pour la Recherche grant ANR-18-CE40-0035.

1 The same article [Reference Gluskin, Meyer and Pajor4] contains an appendix with a stronger estimate due to Bourgain.

References

Bonsall, F. and Walsh, D., Symbols for trace class Hankel operators with good estimates for norms. Glasg. Math. J. 28(1986), 47–54.

Charpentier, S., Fouchet, K., Szehr, O., and Zarouf, R., Condition numbers of matrices with given spectrum. Anal. Math. Phys. 9(2019), no. 3, 971–990.

Edelman, A., Eigenvalues and condition numbers of random matrices. SIAM J. Matrix Anal. Appl. 9(1988), no. 4, 543–560.

Gluskin, E., Meyer, M., and Pajor, A., Zeros of analytic functions and norms of inverse matrices. Israel J. Math. 87(1994), 225–242.

Kreiss, H.-O., Über die Stabilitätsdefinition für Differenzengleichungen die partielle Differentialgleichungen approximieren. BIT 2(1962), 153–181.

Nikolski, N., Treatise on the shift operator, Springer, Berlin, 1986 (Translated from Russian, Lekzii ob operatore sdviga, “Nauka”, Moskva, 1980).

Nikolski, N., Operators, functions, and systems: an easy reading. Vol. 1, Monographs and Surveys, American Mathematical Society, Providence, RI, 2002.

Nikolski, N., Condition numbers of large matrices and analytic capacities. St. Petersburg Math. J. 17(2006), 641–682.

Nikolski, N., Sublinear dimension growth in the Kreiss matrix theorem. Algebra i Analiz. 25(2013), no. 3, 3–51.

Quarteroni, A., Sacco, R., and Saleri, F., Numerical mathematics, Springer, Berlin, 2000.

Queffélec, H., Sur un théorème de Gluskin–Meyer–Pajor. C. R. Acad. Sci. Paris Sér. 1 Math. 317(1993), 155–158.

Queffélec, H., Norm of the inverse of a matrix; solution to a problem of Schäffer. In: Harmonic analysis from the Pichorides viewpoint, Publications Mathématiques d’Orsay, University of Paris XI, Orsay, 1995, pp. 68–87.

Richtmyer, R. D. and Morton, K. W., Difference methods for initial-value problems, 2nd ed., Wiley, New York–London–Sydney, 1967.

Schäffer, J. J., Norms and determinants of linear mappings. Math. Z. 118(1970), 331–339.

Smale, S., On the efficiency of algorithms of analysis. Bull. Amer. Math. Soc. (N.S.) 13(1985), 87–121.

Sod, G. A., Numerical methods in fluid dynamics, Cambridge University Press, Cambridge, 1985.

Spijker, M. N., Tracogna, S., and Welfert, B., About the sharpness of the stability estimates in the Kreiss matrix theorem. Math. of Comp. 72(2003), 697–713.

Szehr, O., Eigenvalue estimates for the resolvent of a non-normal matrix. J. Spectr. Theory 4(2014), no. 4, 783–813.

Szehr, O. and Zarouf, R., Maximum of the resolvent over matrices with given spectrum. J. Funct. Anal. 272(2017), no. 2, 819–847.

Szehr, O. and Zarouf, R., Explicit counterexamples to Schäffer’s conjecture. J. Math. Pures Appl. 146(2021), no. 9, 1–30.

van Dorsselaer, J. L. M., Kraaijevanger, J. F. B. M., and Spijker, M. N., Linear stability analysis in the numerical solution of initial value problems. Acta Numer. (1993), 199–237.

Vitse, P., Functional calculus under Kreiss type conditions. Math. Nachr. 278(2005), no. 15, 1811–1822.

Zarouf, R., Toeplitz condition numbers as an ${\mathrm{H}}^{\infty }$ interpolation problem. J. Math. Sci. 156(2009), no. 5, 819–823.
