
On solution and perturbation estimates for the nonlinear matrix equation  \(X-A^{*}e^{X}A=I\)

Abstract

This work incorporates an efficient inversion-free iterative scheme into Newton’s method to solve Newton’s step regardless of the singularity of the Fréchet derivative. The proposed iterative scheme is constructed by extending the basic form of the conjugate gradient method. Moreover, the resulting scheme is refined and employed to obtain a symmetric solution of the nonlinear matrix equation \(X-A^{*}e^{X}A=I.\) Furthermore, explicit expressions for the perturbation and residual bound estimates of the approximate positive definite solution are derived. Finally, five numerical case studies confirm both the accuracy of the theoretical results and the effectiveness of the proposed iterative method.

Introduction

We consider the nonlinear matrix equation

$$\begin{aligned} X-A^{*}e^{X}A=I, \end{aligned}$$
(1)

where A and X are real or complex square matrices of the same size and I is the identity matrix. This nonlinear matrix equation has important applications in structural dynamics, numerical analysis theory, and the stability and robust stability analysis of control theory ([1,2,3,4,5,6]).

In the literature, various iterative methods and solutions to the matrix equations of the form \(X\pm A^{*}\mathfrak {F} (X)A =Q\) have been extensively investigated (see [11,12,13,14,15]). In [28], Hajarian developed the matrix form of the biconjugate residual (BCR) algorithm for finding the generalized reflexive solution and the generalized anti-reflexive solution of the generalized Sylvester matrix equation. It was further proven that the suggested BCR algorithm scheme converges within a finite number of iterations in the absence of round-off errors.

Zhang et al. [20] derived necessary and sufficient conditions for the existence of a Hermitian positive definite solution of the nonlinear matrix equation \(X-A^{*}X^{q}A=Q~(q>1)\) and proposed two fixed point iterative methods for obtaining the solution. Peng et al. [21] applied Newton’s method to solve the nonlinear matrix equation \(X+A^{*}X^{-n}A =Q\) and provided sufficient conditions for its convergence. For \(\mathfrak {F}(X)=- X^{n},\) where \(n\ge 2,\) the authors of [22] proved that, under mild conditions, the iterations converge monotonically to the elementwise minimal nonnegative solutions. Chacha and Naqvi [23] derived explicit expressions for the mixed and componentwise condition numbers of the nonlinear matrix equation \(X^{p}-A^{*}e^{X}A=I,\) where p is a positive integer.

This work is inspired by Gao [16], who explored the solution of (1) and proposed a fixed point method to obtain the Hermitian positive definite solution. However, to the best of our knowledge, no study has explored the symmetric solution and perturbation estimates of Eq. (1). This motivates us to study a new solution and iterative method for Eq. (1).

This paper makes the following contributions. First, an inversion-free iterative method that can be incorporated into Newton’s method to find a symmetric solution of Eq. (1) is presented, and necessary conditions for the existence of a symmetric solution of (1) based on the proposed Algorithm 2 are derived. Algorithm 2 computes Newton’s step even if the Fréchet derivative is singular, and it ensures the existence of a symmetric solution of (1). Algorithm 2 is developed by extending the variant of the conjugate gradient method presented by Hajarian and Deghan in [27]. Second, the fixed point method proposed in [16] is used to obtain the solution, and explicit expressions for the perturbation and error bound estimates of the approximate positive definite solution of Eq. (1) are derived. The study of symmetric solutions of Eq. (1) is motivated by their many practical applications, which have attracted the attention of many researchers (see [17, 19, 24] and the references therein).

This paper is organized as follows. In the “Methods” section, we first introduce some notation, definitions and lemmas that will be applied in our proofs. We then present Newton’s method and propose an inversion-free iterative method to solve Newton’s step. Necessary conditions for the existence of a symmetric solution, together with perturbation and error estimates for the symmetric positive definite solution of Eq. (1), are also derived. In the “Results and discussion” section, the proposed method is examined experimentally to illustrate the accuracy of the established theoretical results. Finally, a brief conclusion is presented in the “Conclusion” section.

Methods

In this section, we derive Newton’s method and propose an inversion-free method to solve Eq. (1).

Preliminaries

In this subsection, we provide some important notation, definitions and lemmas that will be used in our proofs.

The notation \(\rho (\bullet )\) stands for the spectral radius; \(A^{T}\) and \(A^{*}\) denote the transpose and conjugate transpose of the matrix A, respectively; \(\Vert A\Vert _{F}=\sqrt{\mathrm {trace}(A^{T}A)}\) denotes the Frobenius norm of the matrix A, induced by the trace inner product; for \(A=[a_{ij}]\in \mathbb {C}^{m\times n}\) and \(B\in \mathbb {C}^{p\times q},\) \(A\otimes B=[a_{ij}B]\in \mathbb {C}^{mp\times nq}\) denotes the Kronecker product of the matrices A and B; \(\mathrm {vec}(A)=[a_{1}^{T},a_{2}^{T},\cdots , a_{n}^{T}]^{T}\) stands for the vec operator applied to the matrix A, where \(a_{i}\) is the ith column of A.

Definition 1

[7, 8] Let \(f:\mathbb {C}^{n\times n}\rightarrow \mathbb {C}^{n \times n}\) be a matrix function. The Fréchet derivative of the matrix function f at A in the direction E is the unique linear operator \(L_{f}\) that maps E to \(L_ {f} (A, E)\) such that

$$\begin{aligned} f(A+E)-f(A)- L_{f}(A,E)=O(\Vert E\Vert ^{2}),~~ \text {for all}~~ A,E\in \mathbb {C}^{n\times n}. \end{aligned}$$

Definition 2

[9, 10] The Fréchet derivative of the matrix function \(f(X)=e^{X}\) at \(X_{0}\) in the direction Z is

$$\begin{aligned} L_{f}(X_{0},Z)=\int _{0}^{1} e^{tX_{0}}Ze^{(1-t)X_{0}} dt\approx e^{X_{0}/2}Z e^{X_{0}/2}. \end{aligned}$$
(2)
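As a numerical sanity check of Definition 2, the following sketch compares the integral in (2), evaluated by Gauss-Legendre quadrature, with a central finite difference of \(e^{X}\) and with the midpoint form \(e^{X_{0}/2}Ze^{X_{0}/2}.\) It is an illustration only: the helper names (`expm_sym`, `frechet_exp_integral`, `frechet_exp_midpoint`), the test matrices and the restriction to real symmetric \(X_{0}\) and Z are assumptions made here and do not come from the original text.

```python
import numpy as np

def expm_sym(S):
    """Exponential of a real symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.conj().T

def frechet_exp_integral(X0, Z, nodes=20):
    """Gauss-Legendre quadrature of the integral in (2) over [0, 1]."""
    t, wts = np.polynomial.legendre.leggauss(nodes)
    t = 0.5 * (t + 1.0)   # map nodes from [-1, 1] to [0, 1]
    wts = 0.5 * wts       # rescale weights accordingly
    return sum(w * expm_sym(s * X0) @ Z @ expm_sym((1.0 - s) * X0)
               for s, w in zip(t, wts))

def frechet_exp_midpoint(X0, Z):
    """Midpoint form e^{X0/2} Z e^{X0/2} appearing on the right of (2)."""
    E = expm_sym(0.5 * X0)
    return E @ Z @ E

# Illustrative symmetric data (assumed, not from the paper).
X0 = np.array([[1.0, 0.2], [0.2, 0.5]])
Z = np.array([[0.3, -0.1], [-0.1, 0.4]])

h = 1e-6
finite_diff = (expm_sym(X0 + h * Z) - expm_sym(X0 - h * Z)) / (2 * h)
err_int = np.linalg.norm(frechet_exp_integral(X0, Z) - finite_diff)
err_mid = np.linalg.norm(frechet_exp_midpoint(X0, Z) - finite_diff)
print(f"integral vs finite difference: {err_int:.2e}")
print(f"midpoint vs finite difference: {err_mid:.2e}")
```

In this example the quadrature value matches the difference quotient closely, whereas the midpoint form is only an approximation; its error depends on how far apart the eigenvalues of \(X_{0}\) are.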

Definition 3

Let A be an \(m\times m\) square matrix. A is a Z-matrix if all its off-diagonal elements are non-positive.

Definition 4

A matrix \(A\in \mathbb {R}^{n\times n}\) is an M-matrix if \(A=sI-B\) for some nonnegative B and s with \(s>\rho (B).\)

Lemma 1

[2] For a Z-matrix A the following are equivalent:

  (i) A is a nonsingular M-matrix.

  (ii) \(A^{-1}\) is nonnegative.

  (iii) \(Av>0~(\ge 0)\) for some vector \(v>0~(\ge 0).\)

  (iv) All eigenvalues of A have positive real parts.
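The equivalences in Lemma 1 can be observed on a small example. The sketch below builds \(A=sI-B\) as in Definition 4 from a nonnegative matrix B and checks items (ii) to (iv) numerically. The matrix B and all variable names are chosen here purely for illustration and are not taken from the paper.

```python
import numpy as np

# Illustrative nonnegative matrix B (an assumption, not from the paper).
B = np.array([[0.0, 1.0, 0.5],
              [0.2, 0.0, 1.0],
              [0.5, 0.3, 0.0]])
rho_B = max(abs(np.linalg.eigvals(B)))   # spectral radius of B
s = rho_B + 1.0                          # any s > rho(B) would do
A = s * np.eye(3) - B                    # M-matrix in the sense of Definition 4

off_diag = ~np.eye(3, dtype=bool)
is_z_matrix = bool(np.all(A[off_diag] <= 0))                  # Definition 3
A_inv = np.linalg.inv(A)
inverse_nonnegative = bool(np.all(A_inv >= -1e-12))           # item (ii)
v = A_inv @ np.ones(3)                                        # gives A v = (1, 1, 1)^T
positive_vector = bool(np.all(v > 0) and np.all(A @ v > 0))   # item (iii)
eigs_positive = bool(np.all(np.linalg.eigvals(A).real > 0))   # item (iv)

print(is_z_matrix, inverse_nonnegative, positive_vector, eigs_positive)
```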

Lemma 2

[17] For any symmetric matrix X it holds that

$$\begin{aligned} \mathrm {trace}\left[ \frac{1}{2}\left( Y+Y^{T}\right) ^{T}X\right] =\mathrm {trace}(Y^{T}X), \end{aligned}$$
(3)

where Y is any arbitrary \(n\times n\) real matrix.
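Identity (3) is easy to confirm numerically. The following minimal check uses randomly generated matrices; the seed, the size n = 4 and the variable names are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
Y = rng.standard_normal((n, n))   # arbitrary real matrix
S = rng.standard_normal((n, n))
X = S + S.T                       # symmetric matrix

# Left and right sides of (3).
lhs = np.trace((0.5 * (Y + Y.T)).T @ X)
rhs = np.trace(Y.T @ X)
print(abs(lhs - rhs))
```

The identity holds because \(\mathrm {trace}(YX)=\mathrm {trace}(Y^{T}X)\) whenever X is symmetric, so the symmetric part of Y carries the whole inner product with X.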

Lemma 3

[18] Let \(A,B\in \mathbb {C}^{n\times n},\) then \(\left\| e^{A}-e^{B}\right\| \le \Vert A-B\Vert e^{\mathrm {max}(\Vert A\Vert ,~\Vert B\Vert )}.\)
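The bound of Lemma 3 can be illustrated as follows. For simplicity, this sketch assumes real symmetric A and B (so that the exponential can be formed from an eigendecomposition) and uses the spectral norm; both choices, as well as the helper `expm_sym` and the matrices, are assumptions of this example.

```python
import numpy as np

def expm_sym(S):
    """Exponential of a real symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.conj().T

def spec(M):
    """Spectral norm (largest singular value)."""
    return np.linalg.norm(M, 2)

# Illustrative symmetric matrices (assumed, not from the paper).
A = np.array([[0.5, 0.1], [0.1, -0.3]])
B = np.array([[0.4, 0.2], [0.2, 0.1]])

lhs = spec(expm_sym(A) - expm_sym(B))
rhs = spec(A - B) * np.exp(max(spec(A), spec(B)))
print(f"||e^A - e^B|| = {lhs:.4f}, bound = {rhs:.4f}")
```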

Newton’s method for Eq. (1)

In this subsection, we derive Newton’s method for Eq. (1). We rewrite Eq. (1) as

$$\begin{aligned} F(X)=X-A^{*}e^{X}A-I=0. \end{aligned}$$
(4)

Before applying Newton’s method, we need to evaluate the Fréchet derivative of F(X). From (2) and (4), we have

$$\begin{aligned} \begin{aligned} F(X+Z)&=X+Z-\left[ A^{*}\left( e^{X+Z}-e^{X}\right) A+A^{*}e^{X}A\right] -I\\&=X- A^{*}e^{X}A -I+ Z-\left[ A^{*}\left( e^{X+Z}-e^{X}\right) A\right] \\&=F(X)+ Z-A^{*}e^{X/2}Ze^{X/2}A+O(\Vert Z\Vert ^{2}). \end{aligned} \end{aligned}$$
(5)

We see that the Fréchet derivative of F at X is the linear operator \(F_{X}^{'}:\mathbb {C}^{n\times n}\rightarrow \mathbb {C}^{n\times n}\) defined by

$$\begin{aligned} F_{X}^{'}(Z)=Z-A^{*}e^{X/2}Ze^{X/2}A. \end{aligned}$$
(6)

Applying the vec operator in (6) we have

$$\begin{aligned} \mathrm {vec}(F_{X}^{'}(Z))=\mathcal {D}_{X}\mathrm {vec}(Z), \end{aligned}$$
(7)

where \(\mathcal {D}_{X}=I_{n^{2}}-\left( e^{X/2}A\right) ^{T}\otimes \left( A^{*}e^{X/2}\right)\) is the Kronecker Fréchet derivative of F(X).
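The passage from (6) to (7) rests on the identity \(\mathrm {vec}(PZQ)=(Q^{T}\otimes P)\mathrm {vec}(Z)\) with \(P=A^{*}e^{X/2}\) and \(Q=e^{X/2}A.\) The sketch below checks (7) numerically with a column-stacking vec. The function names (`frechet_apply`, `kron_frechet`, `vec`) and the random test data are illustrative assumptions, and X is taken symmetric so that a simple eigendecomposition-based exponential suffices.

```python
import numpy as np

def expm_sym(S):
    """Exponential of a symmetric (Hermitian) matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.conj().T

def frechet_apply(A, X, Z):
    """Operator form (6): Z - A^* e^{X/2} Z e^{X/2} A."""
    E = expm_sym(0.5 * X)
    return Z - A.conj().T @ E @ Z @ E @ A

def kron_frechet(A, X):
    """Matrix D_X of (7): I - (e^{X/2} A)^T kron (A^* e^{X/2})."""
    n = A.shape[0]
    E = expm_sym(0.5 * X)
    return np.eye(n * n) - np.kron((E @ A).T, A.conj().T @ E)

def vec(M):
    """Stack the columns of M into one vector."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(1)
A = 0.2 * rng.standard_normal((3, 3))
X = np.eye(3)
Z = rng.standard_normal((3, 3))

lhs = vec(frechet_apply(A, X, Z))
rhs = kron_frechet(A, X) @ vec(Z)
print(np.max(np.abs(lhs - rhs)))
```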

Lemma 4

Suppose that \(0\le \left( e^{X/2}A\right) ^{T}\otimes \left( A^{*}e^{X/2}\right) <I_{n^{2}}.\) Then,

$$\begin{aligned} I_{n^{2}}-\left( e^{X/2}A\right) ^{T}\otimes \left( A^{*}e^{X/2}\right) \quad \text {is a nonsingular} ~ M\text {-matrix.} \end{aligned}$$

Proof

The proof is straightforward from Definitions 3, 4 and Lemma 1, and is therefore omitted. \(\square\)
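Lemma 4 can be illustrated on a small nonnegative example. Here the hypothesis is read as: the Kronecker term \(K=\left( e^{X/2}A\right) ^{T}\otimes \left( A^{*}e^{X/2}\right)\) is entrywise nonnegative and \(\rho (K)<1\); this reading, the choice \(X=I,\) the matrix A and the variable names are assumptions of this sketch.

```python
import numpy as np

# Illustrative nonnegative A and X = I (assumed, not from the paper).
A = np.array([[0.10, 0.20],
              [0.05, 0.10]])
n = A.shape[0]
E = np.exp(0.5) * np.eye(n)          # e^{X/2} for X = I

K = np.kron((E @ A).T, A.T @ E)      # Kronecker term, entrywise nonnegative
rho_K = max(abs(np.linalg.eigvals(K)))
D = np.eye(n * n) - K                # candidate nonsingular M-matrix

off_diag = ~np.eye(n * n, dtype=bool)
z_pattern = bool(np.all(D[off_diag] <= 1e-12))          # Z-matrix sign pattern
inv_nonneg = bool(np.all(np.linalg.inv(D) >= -1e-12))   # Lemma 1, item (ii)
print(rho_K, z_pattern, inv_nonneg)
```

Since \(\rho (K)<1,\) the inverse of \(I-K\) equals the convergent series \(\sum _{j\ge 0}K^{j},\) which is entrywise nonnegative; this is what the last check observes.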

Since \(I_{n^{2}}-\left( e^{X/2}A\right) ^{T}\otimes \left( A^{*}e^{X/2}\right)\) is invertible under the assumptions of Lemma 4, Newton’s step Z is computed from the equation

$$\begin{aligned} Z-A^{*}e^{X/2}Ze^{X/2}A=-F(X) \end{aligned}$$
(8)

and the solution of (1) is obtained by the Newton’s iteration

$$\begin{aligned} X_{i+1}=X_{i}-\left[ F'_{X_{i}}\right] ^{-1}F(X_{i})\qquad \text {for all}\quad i=0,1,2\cdots . \end{aligned}$$
(9)

This analysis leads to Algorithm 1.

figure a
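Since the listing of Algorithm 1 is not reproduced in the text, the following is only a minimal sketch of a Newton iteration built from (7), (8) and (9): each step solves the Kronecker system \(\mathcal {D}_{X}\mathrm {vec}(Z)=-\mathrm {vec}(F(X))\) and updates \(X\leftarrow X+Z.\) The function names, the stopping rule, the initial guess \(X_{0}=I\) and the test matrix are assumptions, and the iterates are assumed to stay symmetric (true for real A and symmetric \(X_{0}\) when the step is unique).

```python
import numpy as np

def expm_sym(S):
    """Exponential of a real symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.conj().T

def residual(A, X):
    """F(X) = X - A^* e^X A - I from (4)."""
    return X - A.conj().T @ expm_sym(X) @ A - np.eye(A.shape[0])

def newton_kron(A, X0, tol=1e-12, max_iter=50):
    """Newton-type iteration (9) using the Kronecker form (7)."""
    n = A.shape[0]
    X = X0.copy()
    for _ in range(max_iter):
        F = residual(A, X)
        if np.linalg.norm(F) < tol:
            break
        E = expm_sym(0.5 * X)
        D = np.eye(n * n) - np.kron((E @ A).T, A.conj().T @ E)
        z = np.linalg.solve(D, -F.reshape(-1, order="F"))  # Newton step (8)
        X = X + z.reshape(n, n, order="F")
    return X

# Illustrative real coefficient matrix (assumed, not from the paper).
A = np.array([[0.10, 0.05],
              [0.02, 0.15]])
X = newton_kron(A, np.eye(2))
res_norm = np.linalg.norm(residual(A, X))
print(res_norm)
```

Forming \(\mathcal {D}_{X}\) costs \(O(n^{4})\) storage and the dense solve costs \(O(n^{6})\) operations, which is the cost that motivates an inversion-free alternative for large A.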

Remark 1

Newton’s method for (1) is not applicable if the Kronecker Fréchet derivative \(F'_{X}\) in step 3 of Algorithm 1 is singular. Also, Algorithm 1 does not ensure the existence of a symmetric solution. Moreover, when the coefficient matrix A in Eq. (1) is large, Algorithm 1 consumes more computer time and memory. To overcome these drawbacks, we extend the idea of the conjugate gradient method to obtain Algorithm 2, which works even if the Kronecker Fréchet derivative \(F'_{X}\) is singular and ensures the existence of a symmetric solution of (1).

Consider the linear algebraic system

$$\begin{aligned} { Ax=b,} \end{aligned}$$
(10)

where A is a real square matrix, b is a real vector and x is an unknown vector. For solving system (10), we have the following conjugate gradient method.

Conjugate gradient algorithm [27]

  (i) Choose an initial vector \(x_0\) and set \(r_0=b-Ax_0,~\alpha _0=\Vert r_0\Vert ^2,~ d_0=r_0\);

  (ii) for \(i=0, 1, \cdots\) until convergence do:

  (iii) \(s_i=Ad_i\);

  (iv) \(t_i=\alpha _i/(d_i^T s_i);~ x_{i+1}=x_i +t_id_i;~ r_{i+1}=r_i-t_is_i;~\alpha _{i+1}=\Vert r_{i+1}\Vert ^2;~\beta _{i+1}=\Vert r_{i+1}\Vert ^2/\Vert r_i\Vert ^2; ~d_{i+1}=r_{i+1}+\beta _{i+1}d_i\);

  (v) end for.
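The steps above can be sketched in a few lines. This is a plain illustration of the classical conjugate gradient recurrence, assuming that A is symmetric positive definite (the standard setting in which the method is guaranteed to work); the function name, the tolerance and the 2 by 2 test system are choices made for this example.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-12, max_iter=None):
    """Conjugate gradient iteration following steps (i)-(v)."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    r = b - A @ x            # r_0
    d = r.copy()             # d_0
    alpha = r @ r            # alpha_0 = ||r_0||^2
    for _ in range(max_iter or 10 * n):
        if np.sqrt(alpha) < tol:
            break
        s = A @ d                    # step (iii)
        t = alpha / (d @ s)          # step length t_i
        x = x + t * d
        r = r - t * s
        alpha_new = r @ r            # ||r_{i+1}||^2
        beta = alpha_new / alpha
        d = r + beta * d
        alpha = alpha_new
    return x

# Illustrative symmetric positive definite system (assumed data).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
print(x)
```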

Generally, the conjugate gradient method is not suitable for solving the system \(Bx=c\) when the matrix B is non-square. This motivates us to explore new iterative methods that, like the conjugate gradient algorithm, can be represented as

$$\begin{aligned} x_{i+1}=x_i+t_id_i, \end{aligned}$$
(11)

where the parameter \(t_i\) and the vector \(d_i\) are to be determined. Clearly, (11) cannot be applied directly, in its present form, to solve Newton’s step Z. Thus, the conjugate gradient method is refined and extended to compute a symmetric Newton’s step Z. The details of the algorithm are presented as follows.

figure b
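The listing of Algorithm 2 is not reproduced in the text, so the sketch below is a reconstruction inferred from the recurrences used in the proofs of Lemmas 5 and 6: \(R_{k}=-F(X_{p})-\left[ Z_{pk}-A^{*}e^{X_{p}/2}Z_{pk}e^{X_{p}/2}A\right] ,\) \(\mathcal {M}_{k}=R_{k}-\left( A^{*}e^{X_{p}/2}\right) ^{T}R_{k}\left( e^{X_{p}/2}A\right) ^{T},\) \(\alpha _{k}=\Vert R_{k}\Vert ^{2}/\Vert \mathcal {Q}_{k}\Vert ^{2},\) \(\beta _{k}=\Vert R_{k+1}\Vert ^{2}/\Vert R_{k}\Vert ^{2}\) and \(\mathcal {Q}_{k+1}=\frac{1}{2}\left( \mathcal {M}_{k+1}+\mathcal {M}_{k+1}^{T}\right) +\beta _{k}\mathcal {Q}_{k}.\) It should be read as a hedged illustration for real A, not as the authors’ exact algorithm; the function name, stopping rule and test data are assumptions.

```python
import numpy as np

def expm_sym(S):
    """Exponential of a real symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def symmetric_newton_step(A, X, Z0=None, tol=1e-12, max_iter=200):
    """Inversion-free, CG-type solve of (8) for a symmetric step Z (real A)."""
    n = A.shape[0]
    E = expm_sym(0.5 * X)
    P, Q_right = A.T @ E, E @ A
    op = lambda Z: Z - P @ Z @ Q_right          # operator in (6)
    adj = lambda R: R - P.T @ R @ Q_right.T     # its transpose, gives M_k
    rhs = -(X - A.T @ expm_sym(X) @ A - np.eye(n))   # -F(X)
    Z = np.zeros((n, n)) if Z0 is None else Z0.copy()
    R = rhs - op(Z)
    M = adj(R)
    Q = 0.5 * (M + M.T)                          # symmetric search direction
    residuals = [R]
    for _ in range(max_iter):
        rr = np.sum(R * R)                       # ||R_k||_F^2
        if np.sqrt(rr) < tol:
            break
        alpha = rr / np.sum(Q * Q)
        Z = Z + alpha * Q
        R = R - alpha * op(Q)
        beta = np.sum(R * R) / rr
        M = adj(R)
        Q = 0.5 * (M + M.T) + beta * Q
        residuals.append(R)
    return Z, rhs, op, residuals

# Illustrative data (assumed): real A and current iterate X_p = I.
A = np.array([[0.10, 0.05],
              [0.02, 0.15]])
Z, rhs, op, residuals = symmetric_newton_step(A, np.eye(2))
true_res = np.linalg.norm(op(Z) - rhs)
ortho = abs(np.sum(residuals[1] * residuals[0]))   # trace(R_1^T R_0)
print(true_res, ortho)
```

Only products with \(A^{*}e^{X_{p}/2}\) and \(e^{X_{p}/2}A\) are needed, so no \(n^{2}\times n^{2}\) matrix is formed or inverted, and every iterate \(Z_{pk}\) stays symmetric because each search direction is symmetrized.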

Remark 2

In Algorithm 2, the matrices \(\mathcal {Q}_{k}\) and \(Z_{pk}\) are symmetric for all \(k=0,1,\cdots .\)

We have the following results from Algorithm 2.

Lemma 5

Let \(Z_{p}\) be a symmetric solution of pth Newton’s iteration (8), and the sequences \(\left\{ \mathcal {M}_{k}\right\} ,\)   \(\left\{ R_{k}\right\} ,\)   \(\left\{ Z_{pk}\right\}\) be generated by Algorithm 2. Then,

$$\begin{aligned} \mathrm {trace}\left[ \mathcal {M}_{k}^{T}\left( Z_{p}-Z_{pk}\right) \right] =\left\| R_{k}\right\| ^{2}, \quad \text {for all}\quad k=0,1,\cdots . \end{aligned}$$

Proof

From Algorithm 2, we have

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ \mathcal {M}_{k}^{T}\left( Z_{p}-Z_{pk}\right) \right]&=\mathrm {trace}\left\{ \left[ R_{k}-\left( A^{*}e^{X_{p}/2}\right) ^{T}R_{k}\left( e^{X_{p}/2}A\right) ^{T}\right] ^{T} \left( Z_{p}-Z_{pk}\right) \right\} \\&=\mathrm {trace}\left\{ R_{k}^{T}\left[ Z_{p}-Z_{pk}-\left( A^{*}e^{X_{p}/2}\right) \left( Z_{p}-Z_{pk}\right) \left( e^{X_{p}/2}A\right) \right] \right\} \\&=\mathrm {trace}\left\{ R_{k}^{T}\left[ -F(X_{p})-\left[ Z_{pk}-\left( A^{*}e^{X_{p}/2}\right) Z_{pk}\left( e^{X_{p}/2}A\right) \right] \right] \right\} \\&=\mathrm {trace}\left\{ R_{k}^{T}R_{k}\right\} =\left\| R_{k}\right\| ^{2}. \end{aligned} \end{aligned}$$
(12)

Hence the proof is completed. \(\square\)

Lemma 6

Suppose that \(Z_{p}\) is a symmetric solution of pth Newton’s iteration (8) and the sequences \(R_{k},~\mathcal {Q}_{k}\) are generated by Algorithm 2. Then, it holds that  \(\mathrm {trace}\left[ \mathcal {Q}_{k}^{T}\left( Z_{p}-Z_{pk}\right) \right] =\left\| R_{k}\right\| ^{2},~ \text {for all}\quad k=0,1,\cdots ;~ \mathrm {trace}(R_{k}^{T}R_{j}) =0~ \text {and}~~ \mathrm {trace}(\mathcal {Q}_{k}^{T}\mathcal {Q}_{j}) =0,~~ \text {for} ~~ k>j=0,1,\cdots ,l,\quad l\ge 1.\)

Proof

We prove via mathematical induction. For \(k=0,\) it follows from Algorithm 2, Lemma 2 and Lemma 5 that

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ \mathcal {Q}_{0}^{T}\left( Z_{p}-Z_{p0}\right) \right]&=\mathrm {trace}\left[ \frac{1}{2}\left( \mathcal {M}_{0}+\mathcal {M}_{0}^{T}\right) ^{T} \left( Z_{p}-Z_{p0}\right) \right] \\&=\mathrm {trace}\left[ \mathcal {M}_{0}^{T} \left( Z_{p}-Z_{p0}\right) \right] \\&=\left\| R_{0}\right\| ^{2}.\\ \end{aligned} \end{aligned}$$
(13)

Now assume that \(\mathrm {trace}\left[ \mathcal {Q}_{k}^{T}\left( Z_{p}-Z_{pk}\right) \right] =\left\| R_{k}\right\| ^{2}\) holds for \(k=h\in \mathbb {N};\) we need to show that it also holds for \(k=h+1.\) From Algorithm 2, Lemma 2 and Lemma 5, we have

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ \mathcal {Q}_{h+1}^{T}\left( Z_{p}-Z_{ph+1}\right) \right]&=\mathrm {trace}\left\{ \left[ \frac{1}{2}\left( \mathcal {M}_{h+1}+\mathcal {M}_{h+1}^{T}\right) ^{T} +\beta _{h}\mathcal {Q}_{h} \right] ^{T}\left( Z_{p}-Z_{ph+1}\right) \right\} \\&=\mathrm {trace}\left[ \mathcal {M}_{h+1}^{T}\left( Z_{p}-Z_{ph+1}\right) \right] +\beta _{h}\mathrm {trace}\left[ \mathcal {Q}_{h}^{T}\left( Z_{p}-Z_{ph+1}\right) \right] \\&=\left\| R_{h+1}\right\| ^{2}+\beta _{h}\mathrm {trace}\left[ \mathcal {Q}_{h}^{T}\left( Z_{p}-Z_{ph}-\alpha _{h}\mathcal {Q}_{h}\right) \right] \\&=\left\| R_{h+1}\right\| ^{2}+\beta _{h}\mathrm {trace}\left[ \mathcal {Q}_{h}^{T}\left( Z_{p}-Z_{ph}\right) \right] -\beta _{h}\alpha _{h}\left\| \mathcal {Q}_{h}\right\| ^{2}\\&=\left\| R_{h+1}\right\| ^{2}+\beta _{h}\left\| R_{h}\right\| ^{2}-\beta _{h}\left\| R_{h}\right\| ^{2}\\&=\left\| R_{h+1}\right\| ^{2}+\left\| R_{h+1}\right\| ^{2}-\left\| R_{h+1}\right\| ^{2}=\left\| R_{h+1}\right\| ^{2}.\\ \end{aligned} \end{aligned}$$
(14)

This proves the first assertion of the lemma.

Similarly, we prove that \(\mathrm {trace}(R_{k}^{T}R_{j}) =0~~ \text {and}~~ \mathrm {trace}(\mathcal {Q}_{k}^{T}\mathcal {Q}_{j}) =0,~~ \text {for} ~~ k>j=0,1,\cdots ,l,\quad l\ge 1\) via mathematical induction.

Step 1: For \(l=1,\) it follows that

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ R_{1}^{T}R_{0} \right]&=\mathrm {trace}\left\{ \left[ -F(X_{p})-\left[ Z_{p1}-A^{*}e^{X_{p}/2}Z_{p1}e^{X_{p}/2}A\right] \right] ^{T}R_{0}\right\} \\&=\mathrm {trace}\left\{ \left[ -F(X_{p})-\left[ Z_{p0}-A^{*}e^{X_{p}/2}Z_{p0}e^{X_{p}/2}A\right. \right. \right. \\&+ \left. \left. \left. \alpha _{0}(\mathcal {Q}_{0}-A^{*}e^{X_{p}/2}\mathcal {Q}_{0}e^{X_{p}/2}A)\right] \right] ^{T}R_{0}\right\} \\&=\mathrm {trace}\left\{ \left[ R_{0}- \alpha _{0}\left( \mathcal {Q}_{0}-A^{*}e^{X_{p}/2}\mathcal {Q}_{0}e^{X_{p}/2}A\right) \right] ^{T}R_{0}\right\} \\&=\left\| R_{0}\right\| ^{2}-\mathrm {trace}\left\{ \alpha _{0}\left( \mathcal {Q}_{0}^{T}\left[ R_{0}-\left( A^{*}e^{X_{p}/2}\right) ^{T}R_{0}\left( e^{X_{p}/2}A\right) ^{T}\right] \right) \right\} \\&=\left\| R_{0}\right\| ^{2}-\alpha _{0}\mathrm {trace}\left[ \mathcal {Q}_{0}^{T}\mathcal {M}_{0}\right] \\&=\left\| R_{0}\right\| ^{2}-\alpha _{0}\mathrm {trace}\left[ \mathcal {Q}_{0}^{T}\frac{1}{2} \left( \mathcal {M}_{0}+\mathcal {M}_{0}^{T}\right) \right] \\&=\left\| R_{0}\right\| ^{2}-\alpha _{0}\mathrm {trace}\left[ \mathcal {Q}_{0}^{T}\mathcal {Q}_{0}\right] =0,\\ \end{aligned} \end{aligned}$$
(15)

and

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ \mathcal {Q}_{1}^{T}\mathcal {Q}_{0} \right]&=\mathrm {trace}\left[ \left[ \frac{1}{2}\left( \mathcal {M}_{1}+\mathcal {M}_{1}^{T}\right) +\beta _{0}\mathcal {Q}_{0}\right] ^{T} \mathcal {Q}_{0}\right] \\&=\mathrm {trace}\left( \mathcal {M}_{1}^{T}\mathcal {Q}_{0}\right) +\beta _{0}\mathrm {trace}\left( \mathcal {Q}_{0}^{T}\mathcal {Q}_{0}\right) \\&=\mathrm {trace}\left[ \left[ R_{1}-\left( A^{*}e^{X_{p}/2}\right) ^{T}R_{1}\left( e^{X_{p}/2}A\right) ^{T}\right] ^{T}\mathcal {Q}_{0}\right] +\beta _{0}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=\mathrm {trace}\left[ R_{1}^{T}\left[ \mathcal {Q}_{0}-\left( A^{*}e^{X_{p}/2}\right) \mathcal {Q}_{0}\left( e^{X_{p}/2}A\right) \right] \right] +\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=\mathrm {trace}\left[ R_{1}^{T}\left[ \frac{1}{\alpha _{0}}(Z_{p1}-Z_{p0})-\frac{1}{\alpha _{0}}\left( A^{*}e^{X_{p}/2}\right) (Z_{p1}-Z_{p0})\left( e^{X_{p}/2}A\right) \right] \right] \\&+\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=\frac{1}{\alpha _{0}}\mathrm {trace}\left[ R_{1}^{T}\left[ (Z_{p1}-Z_{p0})-\left( A^{*}e^{X_{p}/2}\right) (Z_{p1}-Z_{p0})\left( e^{X_{p}/2}A\right) \right] \right] \\&+\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=\frac{1}{\alpha _{0}}\mathrm {trace}\left[ R_{1}^{T}(R_{0}-R_{1})\right] +\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=\frac{1}{\alpha _{0}}\left( \mathrm {trace}\left[ R_{1}^{T}R_{0} \right] -\mathrm {trace}\left[ R_{1}^{T}R_{1}\right] \right) +\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=-\frac{1}{\alpha _{0}}\mathrm {trace}\left[ R_{1}^{T}R_{1} \right] +\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}\\&=-\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}+\frac{\Vert R_{1}\Vert ^{2}}{\Vert R_{0}\Vert ^{2}}\left\| \mathcal {Q}_{0}\right\| ^{2}=0. \end{aligned} \end{aligned}$$
(16)

Now, assume that \(\mathrm {trace}(R_{k}^{T}R_{j}) =0~ \text {and}~~ \mathrm {trace}(\mathcal {Q}_{k}^{T}\mathcal {Q}_{j}) =0,~~ \text {for} ~~ k>j=0,1,\cdots ,l,\quad l\ge 1\) holds for \(l=s\in \mathbb {N}.\) We show that it holds for \(l=s+1\in \mathbb {N}.\) From Algorithm 2, we have

$$\begin{aligned}&\mathrm {trace}\left[ R_{s+1}^{T}R_{s} \right] \\&=\mathrm {trace}\left[ \left[ R_{s}-\alpha _{s}\left( \mathcal {Q}_{s}-A^{*}e^{X_{p}/2}\mathcal {Q}_{s}e^{X_{p}/2}A\right) \right] ^{T}R_{s} \right] \\&=\mathrm {trace}\left[ R_{s}^{T}R_{s} \right] -\alpha _{s}\mathrm {trace}\left[ \left[ \left( \mathcal {Q}_{s}-A^{*}e^{X_{p}/2}\mathcal {Q}_{s}e^{X_{p}/2}A\right) \right] ^{T}R_{s} \right] \\&=\Vert R_{s}\Vert ^{2}-\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T} \left( R_{s}-(A^{*}e^{X_{p}/2})^{T}R_{s}(e^{X_{p}/2}A)^{T}\right) \right] \\&=\Vert R_{s}\Vert ^{2}-\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T} \mathcal {M}_{s} \right] \\&=\Vert R_{s}\Vert ^{2}-\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T}\frac{1}{2}( \mathcal {M}_{s}+ \mathcal {M}_{s}^{T})\right] \\&=\Vert R_{s}\Vert ^{2}-\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T}( \mathcal {Q}_{s}- \beta _{s-1}\mathcal {Q}_{s-1})\right] \\&=\Vert R_{s}\Vert ^{2}-\alpha _{s}\Vert \mathcal {Q}_{s}\Vert ^{2}+\alpha _{s}\beta _{s-1}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T}\mathcal {Q}_{s-1}\right] \\&=\Vert R_{s}\Vert ^{2}-\Vert R_{s}\Vert ^{2}+0=0.\\ \end{aligned}$$
(17)

Similarly, we have

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ \mathcal {Q}_{s+1}^{T}\mathcal {Q}_{s} \right]&=\mathrm {trace}\left[ \left[ \frac{1}{2}\left( \mathcal {M}_{s+1}+\mathcal {M}_{s+1}^{T}\right) +\beta _{s}\mathcal {Q}_{s}\right] ^{T}\mathcal {Q}_{s} \right] \\&=\mathrm {trace}\left[ \mathcal {M}_{s+1}^{T}\mathcal {Q}_{s} \right] +\beta _{s}\Vert \mathcal {Q}_{s}\Vert ^{2}\\&=\mathrm {trace}\left[ \left[ R_{s+1}-\left( A^{*}e^{X_{p}/2}\right) ^{T}R_{s+1}\left( e^{X_{p}/2}A\right) ^{T}\right] ^{T}\mathcal {Q}_{s} \right] +\beta _{s}\Vert \mathcal {Q}_{s}\Vert ^{2}\\&=\mathrm {trace}\left[ R_{s+1}^{T} \left[ \mathcal {Q}_{s}-\left( A^{*}e^{X_{p}/2}\right) \mathcal {Q}_{s}\left( e^{X_{p}/2}A\right) \right] \right] +\beta _{s}\Vert \mathcal {Q}_{s}\Vert ^{2}\\&=\mathrm {trace}\left[ R_{s+1}^{T} \frac{1}{\alpha _{s}} (R_{s}-R_{s+1})\right] +\beta _{s}\Vert \mathcal {Q}_{s}\Vert ^{2}\\&= -\frac{1}{\alpha _{s}} \Vert R_{s+1}\Vert ^{2}+\beta _{s}\Vert \mathcal {Q}_{s}\Vert ^{2}\\&= -\frac{\Vert \mathcal {Q}_{s}\Vert ^{2}}{\Vert R_{s}\Vert ^{2}} \Vert R_{s+1}\Vert ^{2}+\frac{\Vert R_{s+1}\Vert ^{2}}{\Vert R_{s}\Vert ^{2}}\Vert \mathcal {Q}_{s}\Vert ^{2}=0.\\ \end{aligned} \end{aligned}$$
(18)

Thus, we have seen that \(\mathrm {trace}\left[ R_{k}^{T}R_{k-1} \right] =0\) and \(\mathrm {trace}\left[ \mathcal {Q}_{k}^{T}\mathcal {Q}_{k-1} \right] =0,\) for all \(k=0,1,\cdots , l.\)

Step 2: We assume that \(\mathrm {trace}\left[ R_{s}^{T}R_{j} \right] =0\) and \(\mathrm {trace}\left[ \mathcal {Q}_{s}^{T}\mathcal {Q}_{j} \right] =0,\) for all \(j=0,1,\cdots , l-1.\) By Algorithm 2 and Lemma 2, together with the assumptions made, it follows that

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ R_{s+1}^{T}R_{j} \right]&=\mathrm {trace}\left[ \left[ R_{s}-\alpha _{s}\left( \mathcal {Q}_{s}-A^{*}e^{X_{p}/2}\mathcal {Q}_{s}e^{X_{p}/2}A\right) \right] ^{T}R_{j} \right] \\&=\mathrm {trace}\left[ R_{s}^{T}R_{j} \right] -\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T} \left( R_{j}-(A^{*}e^{X_{p}/2})^{T}R_{j}(e^{X_{p}/2}A)^{T}\right) \right] \\&=\mathrm {trace}\left[ R_{s}^{T}R_{j} \right] -\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T} \mathcal {M}_{j} \right] \\&=0-\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T} \frac{1}{2}(\mathcal {M}_{j}+\mathcal {M}_{j}^{T}) \right] \\&=-\alpha _{s}\mathrm {trace}\left[ \mathcal {Q}_{s}^{T} (\mathcal {Q}_{j}-\beta _{j-1}\mathcal {Q}_{j-1}) \right] =0.\\ \end{aligned} \end{aligned}$$
(19)

Finally, we prove that   \(\mathrm {trace}\left[ \mathcal {Q}_{s+1}^{T}\mathcal {Q}_{j} \right] =0.\)

$$\begin{aligned} \begin{aligned} \mathrm {trace}\left[ \mathcal {Q}_{s+1}^{T}\mathcal {Q}_{j} \right]&=\mathrm {trace}\left[ \left[ \frac{1}{2}\left( \mathcal {M}_{s+1}+\mathcal {M}_{s+1}^{T}\right) +\beta _{s}\mathcal {Q}_{s}\right] ^{T}\mathcal {Q}_{j} \right] \\&=\mathrm {trace}\left[ \mathcal {M}_{s+1}^{T}\mathcal {Q}_{j} \right] \\&=\mathrm {trace}\left[ \left[ R_{s+1}-\left( A^{*}e^{X_{p}/2}\right) ^{T}R_{s+1}\left( e^{X_{p}/2}A\right) ^{T}\right] ^{T}\mathcal {Q}_{j} \right] \\&=\mathrm {trace}\left[ R_{s+1}^{T} \left[ \mathcal {Q}_{j}-\left( A^{*}e^{X_{p}/2}\right) \mathcal {Q}_{j}\left( e^{X_{p}/2}A\right) \right] \right] \\&=\mathrm {trace}\left[ R_{s+1}^{T} \frac{1}{\alpha _{j}} (R_{j}-R_{j+1})\right] \\&= \frac{1}{\alpha _{j}}\mathrm {trace}\left[ R_{s+1}^{T} R_{j}\right] - \frac{1}{\alpha _{j}}\mathrm {trace}\left[ R_{s+1}^{T} R_{j+1}\right] =0,\\ \end{aligned} \end{aligned}$$
(20)

for all \(j=0,1,\cdots ,s-1.\) The proof is completed. \(\square\)

From Lemma 6, we see that if \(k>0\) and \(R_{i}\ne 0\) for all \(i=0,1,\cdots ,k,\) then the matrices \(R_{i}\) and \(R_{j}\) generated by Algorithm 2 are orthogonal for all \(j\ne i.\) We give the following remark for later use.

Remark 3

From Lemma 6, for the Newton’s iteration (8) to have a symmetric solution, the sequences \(\left\{ R_{k}\right\}\) and \(\left\{ \mathcal {Q}_{k}\right\}\) generated by Algorithm 2 should be nonzero.

If there exists a positive integer k such that \(R_{i} \ne 0\) for all \(i=0,1,\cdots ,k\) in Algorithm 2, then the matrices \(R_{i}\) and \(R_{j}\) are orthogonal for all \(i\ne j.\)

Theorem 4

Assume that the pth Newton’s iteration (8) has a symmetric solution. Then, for any symmetric initial guess \(Z_{p0},\) its symmetric solution can be obtained with finite iterative steps.

Proof

From Lemma 6, suppose that \(R_{k}\ne 0\) for \(k=0,1,\cdots ,n^{2}-1.\) Since the pth Newton’s iteration (8) has a symmetric solution, it follows from Remark 3 that there exists a positive integer k such that \(\mathcal {Q}_{k}\ne 0.\) Thus, we can compute \(Z_{pn^{2}}\) and \(R _{n^{2}}\) by Algorithm 2. Also, from Lemma 6, we know that \(\mathrm {trace}(R_{n^{2}}^{T}R_{k})=0\) for all \(k=0,1,\cdots ,n^{2}-1\) and \(\mathrm {trace}(R_{i}^{T}R_{j})=0\) for all \(i,j=0,1,\cdots ,n^{2}-1\) with \(i\ne j.\) Hence the matrices \(R_{0},R_{1},\cdots ,R_{n^{2}-1}\) form an orthogonal basis of the matrix space \(\mathbb {R}^{n\times n}.\) Since \(R_{n^{2}}\) is orthogonal to every element of this basis, \(\mathrm {trace}(R_{n^{2}}^{T}R_{k})=0\) can hold for all k only if \(R_{n^{2}}=0,\) which implies that \(Z_{p{n^{2}}}\) is a solution of the pth Newton’s iteration (8). \(\square\)

Now, we prove the convergence of Algorithm 1 to a symmetric solution.

Theorem 5

Assume that (1) has a symmetric solution and that each Newton’s iteration is consistent for a symmetric initial guess \(X_{0}.\) If the sequence \(\left\{ X_{k}\right\}\) generated by Algorithm 1 from \(X_{0}\) satisfies \(\lim _{k\rightarrow \infty } X_{k}=X_{*}\) and the matrix \(X_{*}\) satisfies \(F(X_{*})=0,\) then \(X_{*}\) is a symmetric solution of (1).

Proof

Since every Newton’s iteration has a symmetric solution, Theorem 4 and Newton’s method yield a sequence \(\{X_{k}\}\) of symmetric matrices. Furthermore, the Newton’s sequence converges to a solution \(X_{*},\) which, as a limit of symmetric matrices, is a symmetric solution of (1). \(\square\)

Perturbation and error bound estimate for the approximate symmetric positive definite solution of Eq. (1)

In this subsection, we investigate perturbation and error estimates for the approximate symmetric positive definite solution of the nonlinear matrix Eq. (1). We use a fixed point method to find the approximate symmetric solution.

figure c
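The listing of the fixed point algorithm is not reproduced in the text. A minimal sketch, assuming that it iterates \(X_{k+1}=I+A^{*}e^{X_{k}}A\) (Eq. (1) rearranged) starting from \(X_{0}=I,\) is given below; the function name, the stopping rule and the test matrix are assumptions of this illustration.

```python
import numpy as np

def expm_sym(S):
    """Exponential of a symmetric (Hermitian) matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.conj().T

def fixed_point_solve(A, tol=1e-13, max_iter=500):
    """Iterate X_{k+1} = I + A^* e^{X_k} A from X_0 = I."""
    n = A.shape[0]
    X = np.eye(n)
    for k in range(max_iter):
        X_new = np.eye(n) + A.conj().T @ expm_sym(X) @ A
        if np.linalg.norm(X_new - X) < tol:
            return X_new, k + 1
        X = X_new
    return X, max_iter

# Illustrative coefficient matrix with small norm (assumed data).
A = np.array([[0.10, 0.05],
              [0.02, 0.15]])
X, iters = fixed_point_solve(A)
res_norm = np.linalg.norm(X - A.T @ expm_sym(X) @ A - np.eye(2))
eigs = np.linalg.eigvalsh(X)
print(iters, res_norm, eigs)
```

For this example the computed solution is symmetric positive definite with eigenvalues between 1 and 2, in line with the interval [I, 2I] used in the analysis that follows.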

Lemma 7

Suppose A is a nonsingular matrix with \(\rho (A)\le 1/e\) and X is the symmetric positive definite solution of (1). Then, \(\Vert A\Vert ^{2}\Vert e^{X}\Vert \le 1.\)

Proof

Define the map \(G(X)=I+A^{*}e^{X}A.\) G(X) has a fixed point in [I, 2I] (see [16]). Thus, from the assumptions \(\rho (A)\le 1/e\) and \(X\le 2I,\) it follows that

$$\begin{aligned} I\le I+A^{*}e^{X}A\le \left( 1+\Vert A\Vert ^{2}e^{\Vert X\Vert }\right) I=2I. \end{aligned}$$

\(\square\)

Theorem 6

Suppose that \(X^{\mathrm {sol.}}\) is the symmetric positive definite solution of (1) such that \(\displaystyle {\Vert A\Vert ^{2}\left\| e^{\widetilde{X^{\mathrm {sol.}}}}\right\| \le 1}\)    and   \(\displaystyle {\frac{1}{\left\| X^{\mathrm {sol.}}\right\| }\le 1}.\) Then,

$$\begin{aligned} \frac{\left\| \triangle X^{\mathrm {sol.}}\right\| }{\left\| X^{\mathrm {sol.}}\right\| }\le \frac{1}{\theta }\left( \frac{\Vert \triangle I\Vert }{\Vert I\Vert }+\frac{2\Vert \triangle A\Vert }{\Vert A\Vert }\right) , \end{aligned}$$
(21)

where

$$\begin{aligned} \theta = 1-\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }>0. \end{aligned}$$

Proof

Consider the equations

$$\begin{aligned} X^{\mathrm {sol.}}-A^{*} e^{X^{\mathrm {sol.}}}A=I\ \end{aligned}$$
(22)

and

$$\begin{aligned} \widetilde{ X^{\mathrm {sol.}}}-\widetilde{A^{*}}\widetilde{ e^{X^{\mathrm {sol.}}}}{\widetilde{A}}={\widetilde{I}}. \end{aligned}$$
(23)

Let    \(\triangle A={\widetilde{A}}-A,\)   \(\triangle X^{\mathrm {sol.}}=\widetilde{ X^{\mathrm {sol.}}}- X^{\mathrm {sol.}},\) and    \(\triangle I={\widetilde{I}}-I.\) Then, we have

$$\begin{aligned} \begin{aligned} \triangle I&={\widetilde{I}}-I\\&=\widetilde{ X^{\mathrm {sol.}}}-\widetilde{A^{*}}\widetilde{ e^{X^{\mathrm {sol.}}}}{\widetilde{A}}-\left( X^{\mathrm {sol.}}-A^{*} e^{X^{\mathrm {sol.}}}A\right) \\&=\triangle X^{\mathrm {sol.}}-\widetilde{A^{*}}\widetilde{ e^{X^{\mathrm {sol.}}}}{\widetilde{A}}+A^{*} e^{X^{\mathrm {sol.}}}A\\&=\triangle X^{\mathrm {sol.}}-(A+\triangle A)^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}(A+\triangle A)+A^{*} e^{X^{\mathrm {sol.}}}A\\&=\triangle X^{\mathrm {sol.}}-A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}A-A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}\triangle A-\triangle A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}} A\\&-\triangle A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}\triangle A+A^{*} e^{X^{\mathrm {sol.}}}A\\&\approx \triangle X^{\mathrm {sol.}}-A^{*}\left( \widetilde{ e^{X^{\mathrm {sol.}}}}- e^{X^{\mathrm {sol.}}}\right) A-A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}\triangle A-\triangle A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}} A.\\ \end{aligned} \end{aligned}$$
(24)

In the last line of (24), the second-order term \(\triangle A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}\triangle A\) is neglected, since both \(\triangle A^{*} \rightarrow 0\) and \(\triangle A\rightarrow 0.\)

For convenience, let \(N=A^{*}\left( \widetilde{ e^{X^{\mathrm {sol.}}}}- e^{X^{\mathrm {sol.}}}\right) A\) and \(H=A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}\triangle A+\triangle A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}} A.\) Then \(\triangle I=\triangle X^{\mathrm {sol.}}-N-H\) and we have

$$\begin{aligned} \Vert \triangle I\Vert \ge \Vert \triangle X^{\mathrm {sol.}}\Vert -\Vert N\Vert -\Vert H\Vert . \end{aligned}$$
(25)

It follows that

$$\begin{aligned} \begin{aligned} \Vert N\Vert&=\left\| A^{*}\left( \widetilde{ e^{X^{\mathrm {sol.}}}}- e^{X^{\mathrm {sol.}}}\right) A\right\| \\&\le \Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }\left\| \triangle X^{\mathrm {sol.}}\right\| \\ \end{aligned} \end{aligned}$$
(26)

and

$$\begin{aligned} \begin{aligned} \Vert H\Vert&\le \Vert A^{*}\Vert \left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| \Vert \triangle A\Vert +\Vert \triangle A^{*}\Vert \left\| \widetilde{ e^{X^{\mathrm {sol.}}}} \right\| \Vert A\Vert \\&= \Vert A\Vert \left( \left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| +\left\| \widetilde{ e^{X^{\mathrm {sol.}}}} \right\| \right) \Vert \triangle A\Vert \\&= 2\Vert A\Vert \Vert \triangle A\Vert \left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| .\\ \end{aligned} \end{aligned}$$
(27)

Now, from (25) we have,

$$\begin{aligned}&\Vert \triangle I\Vert \ge \left\| \triangle X^{\mathrm {sol.}}\right\| -\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }\Vert \triangle X^{\mathrm {sol.}}\Vert -2\Vert A\Vert \Vert \triangle A\Vert \left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| \end{aligned}$$
(28)
$$\begin{aligned}&=\Vert \triangle X^{\mathrm {sol.}}\Vert \left( 1-\Vert A\Vert ^{2}e^{\mathrm {max}\left( \Vert X^{\mathrm {sol.}}\Vert ,~\Vert \widetilde{X^{\mathrm {sol.}}}\Vert \right) }\right) -2\Vert A\Vert \Vert \triangle A\Vert \Vert \widetilde{ e^{X^{\mathrm {sol.}}}}\Vert \end{aligned}$$
(29)
$$\begin{aligned}&\Vert \triangle X^{\mathrm {sol.}}\Vert \le \frac{1}{1-\Vert A\Vert ^{2}e^{\mathrm {max}(\Vert X^{\mathrm {sol.}}\Vert ,~\Vert \widetilde{X^{\mathrm {sol.}}}\Vert )}}(\Vert \triangle I\Vert +2\Vert A\Vert \Vert \triangle A\Vert \Vert \widetilde{ e^{X^{\mathrm {sol.}}}}\Vert ) \end{aligned}$$
(30)
$$\begin{aligned}&\frac{\left\| \triangle X^{\mathrm {sol.}}\right\| }{\left\| X^{\mathrm {sol.}}\right\| }\nonumber \\&\quad \le \frac{1}{1-\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }}\left( \frac{\Vert \triangle I\Vert }{\Vert I\Vert }\frac{\Vert I\Vert }{\Vert X^{\mathrm {sol.}}\Vert }+\frac{2\Vert \triangle A\Vert \left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| }{\Vert A\Vert }\frac{\Vert A\Vert ^{2}}{\left\| X^{\mathrm {sol.}}\right\| }\right) \end{aligned}$$
(31)

It follows from \(\Vert A\Vert ^{2}\le \frac{\Vert I\Vert }{\left\| \widetilde{e^{X^{\mathrm {sol.}}}}\right\| }\) and \(\frac{1}{\left\| X^{\mathrm {sol.}}\right\| }\le 1\) that

$$\begin{aligned} \frac{\Vert \triangle X^{\mathrm {sol.}}\Vert }{\Vert X^{\mathrm {sol.}}\Vert }\le \frac{1}{\theta }\left( \frac{\Vert \triangle I\Vert }{\Vert I\Vert }+\frac{2\Vert \triangle A\Vert }{\Vert A\Vert }\right) , \end{aligned}$$
(32)

where

$$\begin{aligned} \theta = 1-\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }>0. \end{aligned}$$

This completes the proof. \(\square\)
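The passage from (31) to (32) can be spelled out term by term. The following is a sketch under the assumption, made here for illustration, that the norm is normalised so that \(\Vert I\Vert =1\) (as is the case for the spectral norm). Applying \(\frac{1}{\left\| X^{\mathrm {sol.}}\right\| }\le 1\) and then \(\Vert A\Vert ^{2}\left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| \le \Vert I\Vert\) to the two terms in the bracket of (31) gives

$$\begin{aligned} \frac{\Vert \triangle I\Vert }{\Vert I\Vert }\frac{\Vert I\Vert }{\left\| X^{\mathrm {sol.}}\right\| }&\le \frac{\Vert \triangle I\Vert }{\Vert I\Vert }\Vert I\Vert =\frac{\Vert \triangle I\Vert }{\Vert I\Vert },\\ \frac{2\Vert \triangle A\Vert \left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| }{\Vert A\Vert }\frac{\Vert A\Vert ^{2}}{\left\| X^{\mathrm {sol.}}\right\| }&\le \frac{2\Vert \triangle A\Vert }{\Vert A\Vert }\Vert A\Vert ^{2}\left\| \widetilde{ e^{X^{\mathrm {sol.}}}}\right\| \le \frac{2\Vert \triangle A\Vert }{\Vert A\Vert }\Vert I\Vert =\frac{2\Vert \triangle A\Vert }{\Vert A\Vert }. \end{aligned}$$

Adding the two estimates and multiplying by \(\frac{1}{\theta }>0\) yields (32). For a norm with \(\Vert I\Vert \ne 1\), the same steps leave an extra factor \(\Vert I\Vert\) on the right-hand side.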

In Theorem 7, we derive the error estimate for \(\widetilde{X^{\mathrm {sol.}}}.\)

Theorem 7

Let \(\widetilde{X^{\mathrm {sol.}}}\) be an approximation to the symmetric positive definite solution \(X^{\mathrm {sol.}}\) of (1), with residual \(R\left( \widetilde{X^{\mathrm {sol.}}}\right) =\widetilde{ X^{\mathrm {sol.}}}-A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}A-I.\) Then,

$$\begin{aligned} \left\| R\left( {\widetilde{X^{\mathrm {sol.}}}}\right) \right\| \le \theta _{1}\left\| \widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right\| , \quad \text { where} \quad \theta _{1}= 1+\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }. \end{aligned}$$

Proof

Suppose that \(\widetilde{X^{\mathrm {sol.}}}\) approximates the symmetric positive definite solution \(X^{\mathrm {sol.}}\) of (1). Since \(X^{\mathrm {sol.}}\) satisfies (1), we have \(I=X^{\mathrm {sol.}}-A^{*}e^{X^{\mathrm {sol.}}}A,\) and it follows that

$$\begin{aligned} \begin{aligned} R\left( \widetilde{X^{\mathrm {sol.}}}\right)&=\widetilde{ X^{\mathrm {sol.}}}-A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}A-I\\&=\widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}} -A^{*}\widetilde{ e^{X^{\mathrm {sol.}}}}A+A^{*} e^{X^{\mathrm {sol.}}}A\\&=\left( \widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right) -A^{*}\left( \widetilde{ e^{X^{\mathrm {sol.}}}}-e^{X^{\mathrm {sol.}}}\right) A\\&=\left( \widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right) -A^{*}\left( \int _{0}^{1}e^{(1-s)X^{\mathrm {sol.}}}\left( \widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right) e^{s\widetilde{X^{\mathrm {sol.}}}}ds\right) A,\\ \end{aligned} \end{aligned}$$
(33)

by Lemma 3. From (33) we see that

$$\begin{aligned} \left\| R\left( {\widetilde{X^{\mathrm {sol.}}}}\right) \right\| \le \left\| \left( \widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right) \right\| \left( 1+\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }\right) . \end{aligned}$$

Then, we have \(\left\| R\left( {\widetilde{X^{\mathrm {sol.}}}}\right) \right\| \le \theta _{1}\left\| \widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right\| ,\) where

$$\begin{aligned} \theta _{1}= 1+\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }. \end{aligned}$$

Hence, the proof is completed. \(\square\)
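The norm estimate used to pass from (33) to the bound can be sketched as follows. The sketch assumes a submultiplicative norm with \(\Vert I\Vert =1\) and \(\Vert A^{*}\Vert =\Vert A\Vert\) (for example the spectral norm), so that \(\Vert e^{M}\Vert \le e^{\Vert M\Vert }\) for any square matrix M; these assumptions, and the shorthand \(E=\widetilde{ X^{\mathrm {sol.}}}-X^{\mathrm {sol.}},\) are introduced here for illustration only.

$$\begin{aligned} \left\| \int _{0}^{1}e^{(1-s)X^{\mathrm {sol.}}}E\,e^{s\widetilde{X^{\mathrm {sol.}}}}ds\right\|&\le \Vert E\Vert \int _{0}^{1}e^{(1-s)\left\| X^{\mathrm {sol.}}\right\| }e^{s\left\| \widetilde{X^{\mathrm {sol.}}}\right\| }ds\\&\le \Vert E\Vert \,e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) }, \end{aligned}$$

since \((1-s)\left\| X^{\mathrm {sol.}}\right\| +s\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \le \mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right)\) for \(0\le s\le 1.\) Applying the triangle inequality to (33) and bounding the second term by \(\Vert A\Vert ^{2}\) times this estimate gives the factor \(\theta _{1}.\)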

Results and discussion

In this section, we give some numerical examples to illustrate our results. All tests were performed in MATLAB R2015a. To allow for round-off error, we regard a matrix A as the zero matrix if \(\Vert A\Vert _{F} < 10^{-07}.\)

Example 1

We consider (1) with

$$\begin{aligned} A=(a_{ij}),\quad a_{ij} = {\left\{ \begin{array}{ll} \frac{1}{400(Ni-1)}, &{} \text {if}\quad i=j\\ \frac{1}{400(i+j+1)}, &{} \text {if}\quad i\ne j,\quad i,j=1,2,\cdots ,N,\\ \end{array}\right. } \end{aligned}$$

where N is the size of the matrix A. Then, using Algorithms 1 and 2 with \(N=4\), \(X_{0}=I\) and \(Z_{0}=0,\) and iterating one step, we obtain the approximate symmetric solution of (1)

$$\begin{aligned} X=\begin{pmatrix} 1.0000065807096&{} 0.0000049771825&{} 0.0000040315387 &{} 0.0000036592124\\ 0.0000049771825 &{} 1.0000038264297 &{} 0.0000030655566&{} 0.0000027986304\\ 0.0000040315387&{} 0.0000030655566&{} 1.0000025349499 &{} 0.0000022569069\\ 0.0000036592124&{} 0.0000027986304&{} 0.0000022569069 &{} 1.0000020783084\\ \end{pmatrix} \end{aligned}$$

with a corresponding residual \(7.34\times 10^{-10}.\)

Example 2

We consider (1) with \(A=10^{-02} \begin{pmatrix} 0.191&{} 0.0785 &{} 0.1975\\ 0.0785 &{} 0 &{} 0.239\\ 0.1975 &{} 0.239 &{} 0.5325 \end{pmatrix}.\) Using Algorithms 1 and 2 with \(X_{0}=I\) and \(Z_{0}=0\) and iterating one step, we obtain a symmetric solution of (1)

$$\begin{aligned} X_{1}= \begin{pmatrix} 1.000035856379445 &{} 0.000025526838279 &{} 0.000073063791221\\ 0.000025526838279 &{} 1.000020966091056 &{} 0.000055325826041\\ 0.000073063791221 &{} 0.000055325826041 &{} 1.000159215242941\\ \end{pmatrix} \end{aligned}$$

with a corresponding residual \(\Vert X_{1}-A^{*}e^{X_{1}}A-I\Vert _{F}= 8.32\times 10^{-08}.\)
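As a rough illustration of how a residual of this kind can be evaluated, the following sketch solves (1) for the matrix A of Example 2 by a plain fixed-point iteration \(X_{k+1}=I+A^{*}e^{X_{k}}A\) and reports the Frobenius norm of the residual. This is not Algorithms 1 and 2: the iteration, the helper names `expm_taylor`, `residual` and `solve_fixed_point`, the tolerance, and the truncated Taylor series used for the matrix exponential are all assumptions made for the sketch.

```python
import numpy as np


def expm_taylor(M, terms=40):
    """Approximate e^M by a truncated Taylor series (adequate for small ||M||)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k          # term now holds M^k / k!
        result = result + term
    return result


def residual(X, A):
    """Frobenius norm of X - A^* e^X A - I."""
    n = A.shape[0]
    return np.linalg.norm(X - A.conj().T @ expm_taylor(X) @ A - np.eye(n), "fro")


def solve_fixed_point(A, tol=1e-13, max_iter=200):
    """Iterate X <- I + A^* e^X A from X_0 = I until the residual drops below tol."""
    n = A.shape[0]
    X = np.eye(n)
    for k in range(max_iter):
        if residual(X, A) < tol:
            return X, k
        X = np.eye(n) + A.conj().T @ expm_taylor(X) @ A
    return X, max_iter


# Coefficient matrix of Example 2.
A = 1e-2 * np.array([[0.191, 0.0785, 0.1975],
                     [0.0785, 0.0, 0.239],
                     [0.1975, 0.239, 0.5325]])

X, iterations = solve_fixed_point(A)
print("iterations:", iterations)
print("residual  :", residual(X, A))
```

Because A is real and symmetric here, every iterate stays symmetric, so the computed X can be compared directly with a symmetric solution of (1).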

Example 3

We consider equation (1) with

$$\begin{aligned} A=10^{-03}\begin{pmatrix} 0.039184486647583&{} 0.752572770157521&{} 0.640759461948906\\ 0.752572770157521&{} 0.183842944465775&{} 0.746095912831499\\ 0.640759461948906 &{} 0.746095912831499&{} 0.854851683090675 \end{pmatrix}. \end{aligned}$$

Then, using Algorithms 1 and 2 with \(X_{0}=\begin{pmatrix} 1.0000001 &{} 0&{} 0\\ 0 &{} 1.0000008 &{} 0\\ 0 &{} 0 &{} 1.000005 \end{pmatrix}\) and \(Z_{0}=0\) and iterating one step, we get a symmetric solution of (1)

$$\begin{aligned} X =\begin{pmatrix} 1.000003733574993&{} 0.000003520228949&{} 0.000005160654545\\ 0.000003520228949&{} 1.000004818752929&{} 0.000005932157008\\ 0.000005160654545 &{} 0.000005932157008&{} 1.000007943209756\\ \end{pmatrix} \end{aligned}$$

with a corresponding residual \(5.72\times 10^{-10}.\)

Example 4

We now consider a matrix that arises in a model of the population of the bilby, an endangered Australian marsupial, used to study the quasi-stationary behaviour of quasi-birth-death processes ([25, 26]). Define the \(5\times 5\) matrix \(B=\beta A_{2}^{T},\) where \(\beta =0.5,\)

$$\begin{aligned} A_{2}=Q(g,d)=\begin{pmatrix} gd_{1}&{}(1-g)d_{1}&{}0&{}0&{}0\\ gd_{2}&{}0&{}(1-g)d_{2}&{}0&{}0\\ gd_{3}&{}0&{}0&{}(1-g)d_{3}&{}0\\ gd_{4}&{}0&{}0&{}0&{}(1-g)d_{4}\\ gd_{5}&{}0&{}0&{}0&{} (1-g)d_{5}\\ \end{pmatrix}, \end{aligned}$$

\(d=[0,0.5,0.55,0.8,1]\) is the vector of probabilities that the population moves down a level given phase j, and \(g=0.2.\) We now consider equation (1) with the symmetric matrix given by

$$\begin{aligned} A=\delta \left( \frac{B^{T}+B}{2}\right) =\delta \begin{pmatrix} 0 &{} 0.0250 &{} 0.0275 &{} 0.0400 &{} 0.0500\\ 0.0250 &{} 0 &{} 0.1000 &{} 0 &{} 0\\ 0.0275 &{} 0.1000 &{} 0 &{} 0.1100 &{} 0\\ 0.0400 &{} 0 &{} 0.1100 &{} 0 &{} 0.1600\\ 0.0500 &{} 0 &{} 0 &{} 0.1600 &{} 0.4000 \end{pmatrix}. \end{aligned}$$

Employing Algorithms 1 and 2, with \(\delta =0.001\), \(X_{0}=I\) and \(Z_{0}=0\), the solution of equation (1)

$$\begin{aligned} X=\begin{pmatrix} 1.0000000146 &{} 0.0000000169&{} 0.0000000350 &{} 0.0000000367 &{} 0.0000000695\\ 0.0000000169 &{} 1.0000000338 &{} 0.0000000308 &{} 0.0000000593 &{}0.0000000708\\ 0.0000000350 &{} 0.0000000308&{} 1.0000000956&{} 0.0000000755&{} 0.0000001646\\ 0.0000000367 &{} 0.0000000593&{} 0.0000000755 &{} 1.0000001636 &{} 0.0000002854\\ 0.0000000695 &{} 0.0000000708 &{} 0.0000001646 &{} 0.0000002854 &{} 1.0000006381\\ \end{pmatrix} \end{aligned}$$

is obtained in one iterative step, with a residual of \(2.03\times 10^{-12}.\)

The influence of \(\delta\) on the convergence of the proposed algorithm is summarized in Table 1.

Table 1 Summary of Results for Example 4 for different \(\delta\) with \(X_{0}=I\) and \(Z_{0}=0\)

Table 1 shows that reducing the spectral radius of the coefficient matrix A, here by decreasing \(\delta ,\) significantly improves the convergence of the proposed algorithm.
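The dependence on \(\delta\) can be explored with a short sketch that builds A from the definitions \(B=\beta A_{2}^{T}\) and \(A=\delta \left( B^{T}+B\right) /2\) and counts the steps of a plain fixed-point iteration \(X_{k+1}=I+A^{T}e^{X_{k}}A.\) The fixed-point iteration is a stand-in for the proposed algorithm, and the \(\delta\) values, the tolerance, and the helper names `expm_taylor`, `bilby_coefficient` and `iterations_to_converge` are assumptions made here; the numbers it prints are not those of Table 1.

```python
import numpy as np


def expm_taylor(M, terms=40):
    """Truncated Taylor series for e^M; a simple stand-in for a library routine."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k          # term now holds M^k / k!
        result = result + term
    return result


def bilby_coefficient(delta, beta=0.5, g=0.2, d=(0.0, 0.5, 0.55, 0.8, 1.0)):
    """Return A = delta * (B^T + B) / 2 with B = beta * A_2^T and A_2 = Q(g, d)."""
    n = len(d)
    A2 = np.zeros((n, n))
    for i, d_i in enumerate(d):
        A2[i, 0] = g * d_i                          # first column: g * d_i
        A2[i, min(i + 1, n - 1)] = (1 - g) * d_i    # one column right; last row stays put
    B = beta * A2.T
    return delta * (B.T + B) / 2


def iterations_to_converge(A, tol=1e-12, max_iter=500):
    """Count steps of X <- I + A^T e^X A (X_0 = I) until the residual is below tol."""
    I = np.eye(A.shape[0])
    X = I.copy()
    for k in range(max_iter):
        G = I + A.T @ expm_taylor(X) @ A
        if np.linalg.norm(X - G, "fro") < tol:   # X - G is the residual of (1) at X
            return k
        X = G
    return max_iter


deltas = [0.5, 0.1, 0.01, 0.001]   # illustrative values only
counts = []
for delta in deltas:
    A = bilby_coefficient(delta)
    rho = np.max(np.abs(np.linalg.eigvalsh(A)))   # spectral radius of symmetric A
    counts.append(iterations_to_converge(A))
    print(f"delta={delta:<6} rho(A)={rho:.3e} iterations={counts[-1]}")
```

Since A scales linearly with \(\delta ,\) so does its spectral radius, which makes \(\delta\) a convenient single parameter for this kind of experiment.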

Example 5

In this example, we consider (1) with the symmetric matrix \(A=\left( \begin{array}{ccc} 0.0382 &{} 0.0157 &{} 0.0395\\ 0.0157 &{} 0&{} 0.0478\\ 0.0395 &{} 0.0478&{} 0.1065\\ \end{array}\right) .\) We suppose that the perturbations in the matrices A and I are \(\triangle A= 10^{-h}\times \left( \begin{array}{ccc} -0.2 &{}-0.3&{} 0.1\\ 0.1&{} -0.1&{} 0.1\\ -0.1&{} 0.1&{} 0.2\\ \end{array}\right)\) and \(\triangle I = 10^{-h}\times \left( \begin{array}{ccc} -0.3 &{}0.2&{} 0.1\\ 0.1 &{}-0.2&{} 0.3\\ 0.1&{} 0.1&{} -0.3\\ \end{array}\right) ,\) respectively, where h is a positive integer. Let \({\widetilde{A}}=A+\triangle A,\) \({\widetilde{I}}=I+\triangle I\) and \(\widetilde{X^{\mathrm {sol.}}}=X^{\mathrm {sol.}}+\triangle X^{\mathrm {sol.}},\) where \(X^{\mathrm {sol.}}\) and \(\widetilde{X^{\mathrm {sol.}}}\) are the positive definite solutions of (23) and (24), respectively, computed by Algorithm 3 with initial solution \(X_{0}=I.\) A summary of the results for Theorems 6 and 7 is recorded in Table 2, where we denote \(\theta = 1-\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) },\) \(\theta _{1}= 1+\Vert A\Vert ^{2}e^{\mathrm {max}\left( \left\| X^{\mathrm {sol.}}\right\| ,~\left\| \widetilde{X^{\mathrm {sol.}}}\right\| \right) },\) \(RE=\left\| R\left( \widetilde{X^{\mathrm {sol.}}}\right) \right\| ,\) \(C1=\theta _{1}\left\| \widetilde{X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right\| ,\) \(C2=\dfrac{\left\| \widetilde{X^{\mathrm {sol.}}}-X^{\mathrm {sol.}}\right\| }{\left\| X^{\mathrm {sol.}}\right\| }\) and \(C3=\dfrac{1}{\theta }\left( \dfrac{\Vert \triangle I\Vert }{\Vert I\Vert }+\dfrac{2\Vert \triangle A\Vert }{\Vert A\Vert } \right) .\)

Table 2 Summary Results for Example 5 on Theorems 6 and 7

Remark 8

Table 2 shows the numerical results for the computed parameters. The computed values confirm the accuracy of the theoretical results in Theorems 6 and 7. The estimates are relatively sharp, and the bounds decrease as the perturbations become smaller.
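The quantities of Example 5 can be evaluated along the following lines. This is a minimal sketch, not the computation behind Table 2. It assumes that the perturbed equation has the form \(\widetilde{X}-{\widetilde{A}}^{*}e^{\widetilde{X}}{\widetilde{A}}={\widetilde{I}},\) that all norms are spectral norms, and that a plain fixed-point iteration may stand in for Algorithm 3. The value \(h=4\) and the helper names `expm_taylor`, `solve_fixed_point` and `norm2` are likewise illustrative.

```python
import numpy as np


def expm_taylor(M, terms=40):
    """Truncated Taylor series for e^M; a simple stand-in for a library routine."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k          # term now holds M^k / k!
        result = result + term
    return result


def solve_fixed_point(A, Q, tol=1e-14, max_iter=500):
    """Iterate X <- Q + A^* e^X A from X_0 = I until successive iterates agree."""
    X = np.eye(A.shape[0])
    for _ in range(max_iter):
        X_next = Q + A.conj().T @ expm_taylor(X) @ A
        if np.linalg.norm(X_next - X, 2) < tol:
            return X_next
        X = X_next
    return X


def norm2(M):
    """Spectral norm, for which ||I|| = 1."""
    return np.linalg.norm(M, 2)


h = 4  # illustrative perturbation exponent
A = np.array([[0.0382, 0.0157, 0.0395],
              [0.0157, 0.0, 0.0478],
              [0.0395, 0.0478, 0.1065]])
dA = 10.0 ** (-h) * np.array([[-0.2, -0.3, 0.1],
                              [0.1, -0.1, 0.1],
                              [-0.1, 0.1, 0.2]])
dI = 10.0 ** (-h) * np.array([[-0.3, 0.2, 0.1],
                              [0.1, -0.2, 0.3],
                              [0.1, 0.1, -0.3]])
I = np.eye(3)

X = solve_fixed_point(A, I)                # unperturbed solution
Xt = solve_fixed_point(A + dA, I + dI)     # perturbed solution

growth = norm2(A) ** 2 * np.exp(max(norm2(X), norm2(Xt)))
theta, theta1 = 1 - growth, 1 + growth

RE = norm2(Xt - A.T @ expm_taylor(Xt) @ A - I)   # residual of Xt in the unperturbed equation
C1 = theta1 * norm2(Xt - X)                      # bound of Theorem 7
C2 = norm2(Xt - X) / norm2(X)                    # relative perturbation of the solution
C3 = (norm2(dI) / norm2(I) + 2 * norm2(dA) / norm2(A)) / theta   # bound of Theorem 6

print(f"theta={theta:.6f} theta1={theta1:.6f}")
print(f"RE={RE:.3e} C1={C1:.3e} C2={C2:.3e} C3={C3:.3e}")
```

Under these assumptions one expects \(RE\le C1\) and \(C2\le C3\); repeating the run for several values of h shows how the bounds shrink with the perturbation.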

Conclusion

In this paper, an efficient inversion-free iterative method is developed by extending the conjugate gradient method and incorporating it into Newton’s method; after some refinements, it is employed to compute a symmetric solution of Eq. (1). Moreover, the necessary conditions for the existence of a symmetric solution for the proposed iterative method are derived. The fixed point method proposed in [16] is applied to find the symmetric positive definite solution of Eq. (1). Finally, explicit expressions for the perturbation and error bound estimates of the obtained solution are derived. The numerical experiments provided support the derived theoretical results.

Availability of data and materials

Not applicable; all details are provided within the paper.

References

  1. Huang, N., Ma, C.-F.: Two structure-preserving-doubling like algorithms for obtaining the positive definite solution to a class of nonlinear matrix equation. J. Comput. Math. Appl. 69, 494–502 (2015)

  2. Guo, C.-H., Higham, N.J.: Iterative solution of a nonsymmetric algebraic Riccati equation. SIAM J. Matrix Anal. Appl. 29, 396–412 (2007)

  3. Peng, Z.-H., Hu, X.Y., Zhang, L.: An iteration method for the symmetric solutions and the optimal approximation solution of the matrix equation \(AXB=C\). Appl. Math. Comput. 160, 763–777 (2005)

  4. Ramadan, M.A., El-Shazly, N.M.: On the maximal positive definite solution of the nonlinear matrix equation \(X-\sum _{j=1}^{n}B_{j}^{*}X^{-1}B_j-\sum _{i=1}^{m}A_{i}^{*}X^{-1}A_i=I\). Appl. Math. Inf. Sci. 14(2), 349–354 (2020)

  5. Ramadan, M.A.: Necessary and sufficient conditions for the existence of positive definite solutions of the matrix equation. Int. J. Comput. Math. 82(7), 865–870 (2005). https://doi.org/10.1080/00207160412331336107

  6. Liu, P., Zhang, S., Li, Q.: On the positive definite solutions of a nonlinear matrix equation. 2013, 1–6 (2013). https://doi.org/10.1155/2013/676978

  7. S, R., NJ, H.: Higher Order Fréchet Derivative of Matrix Functions and Their Applications. University of Manchester, Manchester (2013)

  8. Al-Mohy, A.H.: Algorithms for the Matrix Exponential and its Fréchet Derivative. PhD thesis, University of Manchester, UK (2010)

  9. Higham, N., Al-Mohy, A.: Computing the Fréchet derivative of \({e}^{A}\) with an application to condition number estimation. SIAM J. Matrix Anal. Appl. 30, 1639–1657 (2009)

  10. Mathias, R.: Evaluating the Fréchet derivative of the matrix exponential. Numer. Math. 62, 213–226 (1992)

  11. Gao, D.J.: Existence and uniqueness of the positive definite solution for the matrix equation \(X=Q+A^{*}({\hat{X}}-C)A\). J. Abstr. Appl. Anal. 1–4 (2013)

  12. He, Y.-M., Long, J.-H.: On the Hermitian positive definite solution of the nonlinear matrix equation. J. Appl. Math. Comput. 216(12), 3480–3485 (2010)

  13. El-Sayed, S.M., Ramadan, M.A.: On the existence of a positive definite solution of the matrix equation. Int. J. Comput. Math. 76, 331–338 (2001)

  14. Ramadan, M.A., El-Shazly, N.M.: On the matrix equation. Appl. Math. Comput. 173, 992–1013 (2006)

  15. Hasanov, V.I., Ivanov, I.G.: Solutions and perturbation estimates for the matrix equations \({X\pm A^{*}X^{-n}A =Q}\). J. Appl. Math. Comput. 156, 513–525 (2004)

  16. Gao, D.: On Hermitian positive definite solution of the nonlinear matrix equation \(X-A^{*}e^{X}A=I\). J. Appl. Math. Comput. 50, 109–116 (2016)

  17. Huan, H.Y.: Finding Special Solvents to Some Nonlinear Matrix Equations. Ph.D. thesis, Pusan National University (2011)

  18. Higham, N.J.: Functions of matrices: theory and computation. SIAM, Philadelphia, PA 19104-2688, USA (2008) a linear matrix equation \(AXB+CYD=E\). J. Comput. Appl. Math. 223, 3030–3040 (2010)

  19. Dehghan, M., Hajarian, M.: The \((R, S)\)-Symmetric and \((R, S)\)-Skew Symmetric Solutions of the pair of matrix equations \(A_{1}XB_{1} = C_{1}\) and \(A_{2}XB_{2} = C_{2}\). Bull. Iranian Math. Soc. 37(3), 269–279 (2011)

  20. Zhang, G.-F., Xie, W.-W., Zhao, J.-Y.: Positive definite solution of the nonlinear matrix equation \(X-A^{*}X^{q}A=Q(q>1)\). Appl. Math. Comput. 217, 9182–9188 (2011)

  21. Peng, J., Liao, A., Peng, Z.: An iterative method to solve a nonlinear matrix equation. Electron. J. Linear Algebra 31, 620–632 (2016)

  22. Chacha, C.S., Kim, H.-M.: Elementwise minimal nonnegative solutions for a class of nonlinear matrix equations. East Asian J. Appl. Math. 9, 665–682 (2019). https://doi.org/10.4208/eajam.300518.120119

  23. Chacha, C.S., Naqvi, S.M.R.S.: Condition numbers of the nonlinear matrix equation \(X^{p}-Ae^{X}A^{*}=I\). J. Funct. Spaces 2018, 1–8 (2018). https://doi.org/10.1155/2018/3291867

  24. Chacha, C.S., Kim, H.M.: An efficient iterative algorithm for finding a nontrivial symmetric solution of the Yang-Baxter-like matrix equation. J. Nonlinear Sci. Appl. 12, 21–29 (2019)

  25. Chacha, C.S.: On iterative algorithm and perturbation analysis for the nonlinear matrix equation. Commun. Appl. Math. Comput. 4, 1158–1174 (2022). https://doi.org/10.1007/s42967-021-00152-3

  26. Bean, N.G., Bright, L., Latouche, G., Pearce, P.K.P., Taylor, P.G.: The quasi-stationary behaviour of quasi-birth-death processes. Annal. Appl. Prob. 7(1), 134–155 (1997)

  27. Dehghan, M., Hajarian, M.: The general coupled matrix equations over generalized bisymmetric matrices. Linear Algebra Appl. 432(6), 1531–1552 (2010). https://doi.org/10.1016/j.laa.2009.11.014

  28. Hajarian, M.: Convergence properties of BCR method for generalized Sylvester matrix equation over generalized reflexive and anti-reflexive matrices. Linear Multilinear Algebra 66(10), 1975–1990 (2018). https://doi.org/10.1080/03081087.2017.1382441

Acknowledgements

The author would like to thank the anonymous reviewers for providing very useful comments and suggestions, which greatly improved the original manuscript of this paper. The author is also very much indebted to Professor Eid H. Doha (Editor-in-Chief) for his valuable suggestions, generous encouragement and concern during the review process of this paper.

Funding

This work is funded by the author.

Author information

Contributions

All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chacha S. Chacha.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The author declares that he has no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chacha, C.S. On solution and perturbation estimates for the nonlinear matrix equation  \(X-A^{*}e^{X}A=I\). J Egypt Math Soc 30, 18 (2022). https://doi.org/10.1186/s42787-022-00152-z

  • DOI: https://doi.org/10.1186/s42787-022-00152-z

Keywords

  • Newton’s method
  • Iterative method
  • Perturbation estimate
  • Symmetric solution
  • Nonlinear matrix equation