In mathematics, more specifically in numerical linear algebra, the biconjugate gradient method is an algorithm to solve systems of linear equations

$Ax = b.$

Unlike the conjugate gradient method, this algorithm does not require the matrix $A$ to be self-adjoint, but instead one needs to perform multiplications by the conjugate transpose $A^*$.
Choose initial guess $x_0$, two other vectors $x_0^*$ and $b^*$, and a preconditioner $M$:

$r_0 \leftarrow b - A x_0$
$r_0^* \leftarrow b^* - x_0^* A^*$
$p_0 \leftarrow M^{-1} r_0$
$p_0^* \leftarrow r_0^* M^{-1}$

for $k = 0, 1, \ldots$ do

  $\alpha_k \leftarrow \frac{r_k^* M^{-1} r_k}{p_k^* A p_k}$
  $x_{k+1} \leftarrow x_k + \alpha_k \cdot p_k$
  $x_{k+1}^* \leftarrow x_k^* + \overline{\alpha_k} \cdot p_k^*$
  $r_{k+1} \leftarrow r_k - \alpha_k \cdot A p_k$
  $r_{k+1}^* \leftarrow r_k^* - \overline{\alpha_k} \cdot p_k^* A^*$
  $\beta_k \leftarrow \frac{r_{k+1}^* M^{-1} r_{k+1}}{r_k^* M^{-1} r_k}$
  $p_{k+1} \leftarrow M^{-1} r_{k+1} + \beta_k \cdot p_k$
  $p_{k+1}^* \leftarrow r_{k+1}^* M^{-1} + \overline{\beta_k} \cdot p_k^*$
In the above formulation, the computed $r_k$ and $r_k^*$ satisfy

$r_k = b - A x_k,$
$r_k^* = b^* - x_k^* A^*$

and thus are the respective residuals corresponding to $x_k$ and $x_k^*$, as approximate solutions to the systems

$A x = b,$
$x^* A^* = b^*;$

$x^*$ is the adjoint, and $\overline{\alpha}$ is the complex conjugate.
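For concreteness, the recurrences above can be transcribed almost line for line into NumPy. The following is only a minimal sketch, not a production solver: the function name `bicg`, the stopping rule, and the decision to pass the preconditioner as an explicit dense inverse `Minv` are assumptions made for this example, and the starred quantities are kept as 1-D arrays acting as row vectors. The iteration can break down when $r_k^* M^{-1} r_k$ or $p_k^* A p_k$ vanishes; the sketch does not guard against that.

```python
import numpy as np

def bicg(A, b, x0, xhat0, bhat, Minv, tol=1e-10, maxiter=1000):
    """Minimal preconditioned BiCG sketch following the recurrences above.

    A     : (n, n) real or complex matrix of the system A x = b
    b     : (n,) right-hand side
    x0    : (n,) initial guess for x
    xhat0 : (n,) initial guess for the shadow iterate x* (treated as a row vector)
    bhat  : (n,) shadow right-hand side b*
    Minv  : (n, n) explicit inverse of the preconditioner M (dense, for illustration)
    """
    x, xs = x0.astype(complex), xhat0.astype(complex)
    r  = b - A @ x                                 # r_0  = b  - A x_0
    rs = bhat - xs @ A.conj().T                    # r*_0 = b* - x*_0 A*
    p  = Minv @ r                                  # p_0  = M^{-1} r_0
    ps = rs @ Minv                                 # p*_0 = r*_0 M^{-1}
    for _ in range(maxiter):
        z = Minv @ r                               # M^{-1} r_k
        rho = rs @ z                               # r*_k M^{-1} r_k (breakdown if zero)
        alpha = rho / (ps @ (A @ p))               # alpha_k
        x  = x  + alpha * p                        # x_{k+1}
        xs = xs + np.conj(alpha) * ps              # x*_{k+1}
        r  = r  - alpha * (A @ p)                  # r_{k+1}
        rs = rs - np.conj(alpha) * (ps @ A.conj().T)   # r*_{k+1}
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z_new = Minv @ r                           # M^{-1} r_{k+1}
        beta = (rs @ z_new) / rho                  # beta_k
        p  = z_new + beta * p                      # p_{k+1}
        ps = rs @ Minv + np.conj(beta) * ps        # p*_{k+1}
    return x, xs
```

In exact arithmetic $x_k$ converges to the solution of $Ax=b$ and $x_k^*$ to the solution of $x^*A^*=b^*$ within $n$ steps; in floating point the behaviour is less predictable, which motivates the stabilized variants mentioned below.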
Unpreconditioned version of the algorithm
Choose initial guess $x_0$:

$r_0 \leftarrow b - A x_0$
$\hat{r}_0 \leftarrow \hat{b} - \hat{x}_0 A^*$
$p_0 \leftarrow r_0$
$\hat{p}_0 \leftarrow \hat{r}_0$

for $k = 0, 1, \ldots$ do

  $\alpha_k \leftarrow \frac{\hat{r}_k r_k}{\hat{p}_k A p_k}$
  $x_{k+1} \leftarrow x_k + \alpha_k \cdot p_k$
  $\hat{x}_{k+1} \leftarrow \hat{x}_k + \alpha_k \cdot \hat{p}_k$
  $r_{k+1} \leftarrow r_k - \alpha_k \cdot A p_k$
  $\hat{r}_{k+1} \leftarrow \hat{r}_k - \alpha_k \cdot \hat{p}_k A^*$
  $\beta_k \leftarrow \frac{\hat{r}_{k+1} r_{k+1}}{\hat{r}_k r_k}$
  $p_{k+1} \leftarrow r_{k+1} + \beta_k \cdot p_k$
  $\hat{p}_{k+1} \leftarrow \hat{r}_{k+1} + \beta_k \cdot \hat{p}_k$
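Setting $M = I$ in the preconditioned sketch above recovers this unpreconditioned iteration, with the hatted quantities playing the role of the starred ones. A quick sanity check on generic random data (reusing the hypothetical `bicg` from the previous sketch) might look as follows; for such a well-conditioned test matrix both residuals should become small.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned, non-self-adjoint test matrix
b = rng.standard_normal(n)
bhat = rng.standard_normal(n)                     # shadow right-hand side b-hat

# Minv = identity gives the unpreconditioned iteration.
x, xhat = bicg(A, b, np.zeros(n), np.zeros(n), bhat, np.eye(n), tol=1e-10, maxiter=500)

print(np.linalg.norm(A @ x - b))                  # residual of A x = b
print(np.linalg.norm(xhat @ A.conj().T - bhat))   # residual of the shadow system x-hat A* = b-hat
```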
The biconjugate gradient method is numerically unstable[citation needed] (compare to the biconjugate gradient stabilized method), but very important from a theoretical point of view. Define the iteration steps by
$x_k := x_j + P_k A^{-1}\left(b - A x_j\right),$
$x_k^* := x_j^* + \left(b^* - x_j^* A\right) P_k A^{-1},$

where $j < k$, using the related projection

$P_k := \mathbf{u}_k \left(\mathbf{v}_k^* A \mathbf{u}_k\right)^{-1} \mathbf{v}_k^* A,$

with

$\mathbf{u}_k = \left[u_0, u_1, \dots, u_{k-1}\right],$
$\mathbf{v}_k = \left[v_0, v_1, \dots, v_{k-1}\right].$
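A one-line calculation from this definition confirms that $P_k$ is indeed a (generally oblique) projection: it is idempotent and acts as the identity on the columns of $\mathbf{u}_k$:

$P_k^2 = \mathbf{u}_k \left(\mathbf{v}_k^* A \mathbf{u}_k\right)^{-1} \left(\mathbf{v}_k^* A \mathbf{u}_k\right) \left(\mathbf{v}_k^* A \mathbf{u}_k\right)^{-1} \mathbf{v}_k^* A = P_k, \qquad P_k \mathbf{u}_k = \mathbf{u}_k \left(\mathbf{v}_k^* A \mathbf{u}_k\right)^{-1} \left(\mathbf{v}_k^* A \mathbf{u}_k\right) = \mathbf{u}_k.$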
These related projections may themselves be iterated as
$P_{k+1} = P_k + \left(1 - P_k\right) u_k \otimes \frac{v_k^* A \left(1 - P_k\right)}{v_k^* A \left(1 - P_k\right) u_k}.$
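The recursion can be checked numerically against the closed-form definition of $P_k$. The script below is a small illustration with hypothetical names and generic random data, reading $\otimes$ as the outer product of a column and a row vector; for such data the two constructions should agree to roughly machine precision.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 4                                     # dimension and number of direction pairs
A = rng.standard_normal((n, n)) + n * np.eye(n)
U = rng.standard_normal((n, m))                 # columns u_0, ..., u_{m-1}
V = rng.standard_normal((n, m))                 # columns v_0, ..., v_{m-1}
I = np.eye(n)

# Closed form: P_k = u_k (v_k^* A u_k)^{-1} v_k^* A with u_k = U[:, :k], v_k = V[:, :k]
def P_closed(k):
    Uk, Vk = U[:, :k], V[:, :k]
    return Uk @ np.linalg.solve(Vk.conj().T @ A @ Uk, Vk.conj().T @ A)

# Rank-one recursion starting from P_0 = 0
P = np.zeros((n, n))
for k in range(m):
    u, v = U[:, k], V[:, k]
    w = (v.conj() @ A) @ (I - P)                # row vector v_k^* A (1 - P_k)
    P = P + np.outer((I - P) @ u, w) / (w @ u)  # P_{k+1}

print(np.linalg.norm(P - P_closed(m)))          # should be tiny for generic data
```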
A relation to Quasi-Newton methods is given by
$P_k = A_k^{-1} A$ and $x_{k+1} = x_k - A_{k+1}^{-1}\left(A x_k - b\right)$, where

$A_{k+1}^{-1} = A_k^{-1} + \left(1 - A_k^{-1} A\right) u_k \otimes \frac{v_k^* \left(1 - A A_k^{-1}\right)}{v_k^* A \left(1 - A_k^{-1} A\right) u_k}.$
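The identification is consistent with the two update formulas: substituting $A_k^{-1} = P_k A^{-1}$ into the quasi-Newton update gives back the projection recursion, since

$1 - A_k^{-1} A = 1 - P_k, \qquad v_k^*\left(1 - A A_k^{-1}\right) = v_k^* A \left(1 - P_k\right) A^{-1},$

so that

$A_{k+1}^{-1} = \left[ P_k + \left(1 - P_k\right) u_k \otimes \frac{v_k^* A \left(1 - P_k\right)}{v_k^* A \left(1 - P_k\right) u_k} \right] A^{-1} = P_{k+1} A^{-1}.$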
The new directions

$p_k = \left(1 - P_k\right) u_k,$
$p_k^* = v_k^* A \left(1 - P_k\right) A^{-1}$

are then orthogonal to the residuals:

$v_i^* r_k = p_i^* r_k = 0,$
$r_k^* u_j = r_k^* p_j = 0,$

which themselves satisfy

$r_k = A \left(1 - P_k\right) A^{-1} r_j,$
$r_k^* = r_j^* \left(1 - P_k\right),$

where $i, j < k$.
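These orthogonality relations follow from two elementary properties of the projection: $\left(1 - P_k\right) u_j = 0$ and $v_i^* A \left(1 - P_k\right) = 0$ for $i, j < k$, since $u_j$ is a column of $\mathbf{u}_k$ and $v_i$ is a column of $\mathbf{v}_k$. Combined with the residual representations above this gives, for example,

$v_i^* r_k = v_i^* A \left(1 - P_k\right) A^{-1} r_j = 0, \qquad r_k^* u_j = r_j^* \left(1 - P_k\right) u_j = 0;$

the relations involving $p_i^*$ and $p_j$ follow in the same way, using the readily checked identities $P_i P_k = P_k P_i = P_i$ for $i < k$.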
The biconjugate gradient method now makes a special choice and uses the setting

$u_k = M^{-1} r_k,$
$v_k^* = r_k^* M^{-1}.$

With this particular choice, explicit evaluations of $P_k$ and $A^{-1}$ are avoided, and the algorithm takes the form stated above.
If $A = A^*$ is self-adjoint, $x_0^* = x_0$ and $b^* = b$, then $r_k = r_k^*$, $p_k = p_k^*$, and the conjugate gradient method produces the same sequence $x_k = x_k^*$ at half the computational cost.
The sequences produced by the algorithm are biorthogonal, i.e., $p_i^* A p_j = r_i^* M^{-1} r_j = 0$ for $i \neq j$.
If $P_{j'}$ is a polynomial with $\deg\left(P_{j'}\right) + j < k$, then $r_k^* P_{j'}\left(M^{-1} A\right) u_j = 0$. The algorithm thus produces projections onto the Krylov subspace.
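With the choice $u_j = M^{-1} r_j$ made above, the vectors $P_{j'}\left(M^{-1} A\right) u_j$ with $\deg\left(P_{j'}\right) + j < k$ span, in the generic breakdown-free case, the Krylov subspace

$\mathcal{K}_k\left(M^{-1}A, M^{-1}r_0\right) = \operatorname{span}\left\{M^{-1}r_0, \left(M^{-1}A\right)M^{-1}r_0, \ldots, \left(M^{-1}A\right)^{k-1}M^{-1}r_0\right\},$

so the statement says that $r_k^*$ annihilates this entire subspace; the analogous statement below expresses the same for $r_k$ and the shadow Krylov subspace generated by $v_i^* = r_i^* M^{-1}$ and $A M^{-1}$.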
If $P_{i'}$ is a polynomial with $i + \deg\left(P_{i'}\right) < k$, then $v_i^* P_{i'}\left(A M^{-1}\right) r_k = 0$.