OpenRadioss 2025.1.11
OpenRadioss project
Loading...
Searching...
No Matches
sgesvj.f
Go to the documentation of this file.
1*> \brief \b SGESVJ
2*
3* =========== DOCUMENTATION ===========
4*
5* Online html documentation available at
6* http://www.netlib.org/lapack/explore-html/
7*
8*> \htmlonly
9*> Download SGESVJ + dependencies
10*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/sgesvj.f">
11*> [TGZ]</a>
12*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/sgesvj.f">
13*> [ZIP]</a>
14*> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/sgesvj.f">
15*> [TXT]</a>
16*> \endhtmlonly
17*
18* Definition:
19* ===========
20*
21* SUBROUTINE SGESVJ( JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V,
22* LDV, WORK, LWORK, INFO )
23*
24* .. Scalar Arguments ..
25* INTEGER INFO, LDA, LDV, LWORK, M, MV, N
26* CHARACTER*1 JOBA, JOBU, JOBV
27* ..
28* .. Array Arguments ..
29* REAL A( LDA, * ), SVA( N ), V( LDV, * ),
30* $ WORK( LWORK )
31* ..
32*
33*
34*> \par Purpose:
35* =============
36*>
37*> \verbatim
38*>
39*> SGESVJ computes the singular value decomposition (SVD) of a real
40*> M-by-N matrix A, where M >= N. The SVD of A is written as
41*> [++] [xx] [x0] [xx]
42*> A = U * SIGMA * V^t, [++] = [xx] * [ox] * [xx]
43*> [++] [xx]
44*> where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
45*> matrix, and V is an N-by-N orthogonal matrix. The diagonal elements
46*> of SIGMA are the singular values of A. The columns of U and V are the
47*> left and the right singular vectors of A, respectively.
48*> SGESVJ can sometimes compute tiny singular values and their singular vectors much
49*> more accurately than other SVD routines, see below under Further Details.
50*> \endverbatim
51*
52* Arguments:
53* ==========
54*
55*> \param[in] JOBA
56*> \verbatim
57*> JOBA is CHARACTER*1
58*> Specifies the structure of A.
59*> = 'L': The input matrix A is lower triangular;
60*> = 'U': The input matrix A is upper triangular;
61*> = 'G': The input matrix A is general M-by-N matrix, M >= N.
62*> \endverbatim
63*>
64*> \param[in] JOBU
65*> \verbatim
66*> JOBU is CHARACTER*1
67*> Specifies whether to compute the left singular vectors
68*> (columns of U):
69*> = 'U': The left singular vectors corresponding to the nonzero
70*> singular values are computed and returned in the leading
71*> columns of A. See more details in the description of A.
72*> The default numerical orthogonality threshold is set to
73*> approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E').
74*> = 'C': Analogous to JOBU='U', except that user can control the
75*> level of numerical orthogonality of the computed left
76*> singular vectors. TOL can be set to TOL = CTOL*EPS, where
77*> CTOL is given on input in the array WORK.
78*> No CTOL smaller than ONE is allowed. CTOL greater
79*> than 1 / EPS is meaningless. The option 'C'
80*> can be used if M*EPS is satisfactory orthogonality
81*> of the computed left singular vectors, so CTOL=M could
82*> save few sweeps of Jacobi rotations.
83*> See the descriptions of A and WORK(1).
84*> = 'N': The matrix U is not computed. However, see the
85*> description of A.
86*> \endverbatim
87*>
88*> \param[in] JOBV
89*> \verbatim
90*> JOBV is CHARACTER*1
91*> Specifies whether to compute the right singular vectors, that
92*> is, the matrix V:
93*> = 'V': the matrix V is computed and returned in the array V
94*> = 'A': the Jacobi rotations are applied to the MV-by-N
95*> array V. In other words, the right singular vector
96*> matrix V is not computed explicitly; instead it is
97*> applied to an MV-by-N matrix initially stored in the
98*> first MV rows of V.
99*> = 'N': the matrix V is not computed and the array V is not
100*> referenced
101*> \endverbatim
102*>
103*> \param[in] M
104*> \verbatim
105*> M is INTEGER
106*> The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
107*> \endverbatim
108*>
109*> \param[in] N
110*> \verbatim
111*> N is INTEGER
112*> The number of columns of the input matrix A.
113*> M >= N >= 0.
114*> \endverbatim
115*>
116*> \param[in,out] A
117*> \verbatim
118*> A is REAL array, dimension (LDA,N)
119*> On entry, the M-by-N matrix A.
120*> On exit,
121*> If JOBU = 'U' .OR. JOBU = 'C':
122*> If INFO = 0:
123*> RANKA orthonormal columns of U are returned in the
124*> leading RANKA columns of the array A. Here RANKA <= N
125*> is the number of computed singular values of A that are
126*> above the underflow threshold SLAMCH('S'). The singular
127*> vectors corresponding to underflowed or zero singular
128*> values are not computed. The value of RANKA is returned
129*> in the array WORK as RANKA=NINT(WORK(2)). Also see the
130*> descriptions of SVA and WORK. The computed columns of U
131*> are mutually numerically orthogonal up to approximately
132*> TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'),
133*> see the description of JOBU.
134*> If INFO > 0,
135*> the procedure SGESVJ did not converge in the given number
136*> of iterations (sweeps). In that case, the computed
137*> columns of U may not be orthogonal up to TOL. The output
138*> U (stored in A), SIGMA (given by the computed singular
139*> values in SVA(1:N)) and V is still a decomposition of the
140*> input matrix A in the sense that the residual
141*> ||A-SCALE*U*SIGMA*V^T||_2 / ||A||_2 is small.
142*> If JOBU = 'N':
143*> If INFO = 0:
144*> Note that the left singular vectors are 'for free' in the
145*> one-sided Jacobi SVD algorithm. However, if only the
146*> singular values are needed, the level of numerical
147*> orthogonality of U is not an issue and iterations are
148*> stopped when the columns of the iterated matrix are
149*> numerically orthogonal up to approximately M*EPS. Thus,
150*> on exit, A contains the columns of U scaled with the
151*> corresponding singular values.
152*> If INFO > 0:
153*> the procedure SGESVJ did not converge in the given number
154*> of iterations (sweeps).
155*> \endverbatim
156*>
157*> \param[in] LDA
158*> \verbatim
159*> LDA is INTEGER
160*> The leading dimension of the array A. LDA >= max(1,M).
161*> \endverbatim
162*>
163*> \param[out] SVA
164*> \verbatim
165*> SVA is REAL array, dimension (N)
166*> On exit,
167*> If INFO = 0 :
168*> depending on the value SCALE = WORK(1), we have:
169*> If SCALE = ONE:
170*> SVA(1:N) contains the computed singular values of A.
171*> During the computation SVA contains the Euclidean column
172*> norms of the iterated matrices in the array A.
173*> If SCALE .NE. ONE:
174*> The singular values of A are SCALE*SVA(1:N), and this
175*> factored representation is due to the fact that some of the
176*> singular values of A might underflow or overflow.
177*>
178*> If INFO > 0 :
179*> the procedure SGESVJ did not converge in the given number of
180*> iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
181*> \endverbatim
182*>
183*> \param[in] MV
184*> \verbatim
185*> MV is INTEGER
186*> If JOBV = 'A', then the product of Jacobi rotations in SGESVJ
187*> is applied to the first MV rows of V. See the description of JOBV.
188*> \endverbatim
189*>
190*> \param[in,out] V
191*> \verbatim
192*> V is REAL array, dimension (LDV,N)
193*> If JOBV = 'V', then V contains on exit the N-by-N matrix of
194*> the right singular vectors;
195*> If JOBV = 'A', then V contains the product of the computed right
196*> singular vector matrix and the initial matrix in
197*> the array V.
198*> If JOBV = 'N', then V is not referenced.
199*> \endverbatim
200*>
201*> \param[in] LDV
202*> \verbatim
203*> LDV is INTEGER
204*> The leading dimension of the array V, LDV >= 1.
205*> If JOBV = 'V', then LDV >= max(1,N).
206*> If JOBV = 'A', then LDV >= max(1,MV) .
207*> \endverbatim
208*>
209*> \param[in,out] WORK
210*> \verbatim
211*> WORK is REAL array, dimension (LWORK)
212*> On entry,
213*> If JOBU = 'C' :
214*> WORK(1) = CTOL, where CTOL defines the threshold for convergence.
215*> The process stops if all columns of A are mutually
216*> orthogonal up to CTOL*EPS, EPS=SLAMCH('E').
217*> It is required that CTOL >= ONE, i.e. it is not
218*> allowed to force the routine to obtain orthogonality
219*> below EPSILON.
220*> On exit,
221*> WORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
222*> are the computed singular vcalues of A.
223*> (See description of SVA().)
224*> WORK(2) = NINT(WORK(2)) is the number of the computed nonzero
225*> singular values.
226*> WORK(3) = NINT(WORK(3)) is the number of the computed singular
227*> values that are larger than the underflow threshold.
228*> WORK(4) = NINT(WORK(4)) is the number of sweeps of Jacobi
229*> rotations needed for numerical convergence.
230*> WORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
231*> This is useful information in cases when SGESVJ did
232*> not converge, as it can be used to estimate whether
233*> the output is still useful and for post festum analysis.
234*> WORK(6) = the largest absolute value over all sines of the
235*> Jacobi rotation angles in the last sweep. It can be
236*> useful for a post festum analysis.
237*> \endverbatim
238*>
239*> \param[in] LWORK
240*> \verbatim
241*> LWORK is INTEGER
242*> length of WORK, WORK >= MAX(6,M+N)
243*> \endverbatim
244*>
245*> \param[out] INFO
246*> \verbatim
247*> INFO is INTEGER
248*> = 0: successful exit.
249*> < 0: if INFO = -i, then the i-th argument had an illegal value
250*> > 0: SGESVJ did not converge in the maximal allowed number (30)
251*> of sweeps. The output may still be useful. See the
252*> description of WORK.
253*> \endverbatim
254*
255* Authors:
256* ========
257*
258*> \author Univ. of Tennessee
259*> \author Univ. of California Berkeley
260*> \author Univ. of Colorado Denver
261*> \author NAG Ltd.
262*
263*> \ingroup realGEcomputational
264*
265*> \par Further Details:
266* =====================
267*>
268*> The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
269*> rotations. The rotations are implemented as fast scaled rotations of
270*> Anda and Park [1]. In the case of underflow of the Jacobi angle, a
271*> modified Jacobi transformation of Drmac [4] is used. Pivot strategy uses
272*> column interchanges of de Rijk [2]. The relative accuracy of the computed
273*> singular values and the accuracy of the computed singular vectors (in
274*> angle metric) is as guaranteed by the theory of Demmel and Veselic [3].
275*> The condition number that determines the accuracy in the full rank case
276*> is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
277*> spectral condition number. The best performance of this Jacobi SVD
278*> procedure is achieved if used in an accelerated version of Drmac and
279*> Veselic [5,6], and it is the kernel routine in the SIGMA library [7].
280*> Some tuning parameters (marked with [TP]) are available for the
281*> implementer. \n
282*> The computational range for the nonzero singular values is the machine
283*> number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
284*> denormalized singular values can be computed with the corresponding
285*> gradual loss of accurate digits.
286*>
287*> \par Contributors:
288* ==================
289*>
290*> Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
291*>
292*> \par References:
293* ================
294*>
295*> [1] A. A. Anda and H. Park: Fast plane rotations with dynamic scaling. \n
296*> SIAM J. matrix Anal. Appl., Vol. 15 (1994), pp. 162-174. \n\n
297*> [2] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
298*> singular value decomposition on a vector computer. \n
299*> SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. \n\n
300*> [3] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. \n
301*> [4] Z. Drmac: Implementation of Jacobi rotations for accurate singular
302*> value computation in floating point arithmetic. \n
303*> SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. \n\n
304*> [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. \n
305*> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. \n
306*> LAPACK Working note 169. \n\n
307*> [6] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. \n
308*> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. \n
309*> LAPACK Working note 170. \n\n
310*> [7] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
311*> QSVD, (H,K)-SVD computations.\n
312*> Department of Mathematics, University of Zagreb, 2008.
313*>
314*> \par Bugs, Examples and Comments:
315* =================================
316*>
317*> Please report all bugs and send interesting test examples and comments to
318*> drmac@math.hr. Thank you.
319*
320* =====================================================================
321 SUBROUTINE sgesvj( JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V,
322 $ LDV, WORK, LWORK, INFO )
323*
324* -- LAPACK computational routine --
325* -- LAPACK is a software package provided by Univ. of Tennessee, --
326* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
327*
328* .. Scalar Arguments ..
329 INTEGER INFO, LDA, LDV, LWORK, M, MV, N
330 CHARACTER*1 JOBA, JOBU, JOBV
331* ..
332* .. Array Arguments ..
333 REAL A( LDA, * ), SVA( N ), V( LDV, * ),
334 $ work( lwork )
335* ..
336*
337* =====================================================================
338*
339* .. Local Parameters ..
340 REAL ZERO, HALF, ONE
341 parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
342 INTEGER NSWEEP
343 parameter( nsweep = 30 )
344* ..
345* .. Local Scalars ..
346 REAL AAPP, AAPP0, AAPQ, AAQQ, APOAQ, AQOAP, BIG,
347 $ bigtheta, cs, ctol, epsln, large, mxaapq,
348 $ mxsinj, rootbig, rooteps, rootsfmin, roottol,
349 $ skl, sfmin, small, sn, t, temp1, theta,
350 $ thsign, tol
351 INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
352 $ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
353 $ n4, nbl, notrot, p, pskipped, q, rowskip,
354 $ swband
355 LOGICAL APPLV, GOSCALE, LOWER, LSVEC, NOSCALE, ROTOK,
356 $ rsvec, uctol, upper
357* ..
358* .. Local Arrays ..
359 REAL FASTR( 5 )
360* ..
361* .. Intrinsic Functions ..
362 INTRINSIC abs, max, min, float, sign, sqrt
363* ..
364* .. External Functions ..
365* ..
366* from BLAS
367 REAL SDOT, SNRM2
368 EXTERNAL sdot, snrm2
369 INTEGER ISAMAX
370 EXTERNAL isamax
371* from LAPACK
372 REAL SLAMCH
373 EXTERNAL slamch
374 LOGICAL LSAME
375 EXTERNAL lsame
376* ..
377* .. External Subroutines ..
378* ..
379* from BLAS
380 EXTERNAL saxpy, scopy, srotm, sscal, sswap
381* from LAPACK
382 EXTERNAL slascl, slaset, slassq, xerbla
383*
384 EXTERNAL sgsvj0, sgsvj1
385* ..
386* .. Executable Statements ..
387*
388* Test the input arguments
389*
390 lsvec = lsame( jobu, 'U' )
391 uctol = lsame( jobu, 'C' )
392 rsvec = lsame( jobv, 'V' )
393 applv = lsame( jobv, 'A' )
394 upper = lsame( joba, 'U' )
395 lower = lsame( joba, 'L' )
396*
397 IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
398 info = -1
399 ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
400 info = -2
401 ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'n' ) ) ) THEN
402 INFO = -3
403.LT. ELSE IF( M0 ) THEN
404 INFO = -4
405.LT..OR..GT. ELSE IF( ( N0 ) ( NM ) ) THEN
406 INFO = -5
407.LT. ELSE IF( LDAM ) THEN
408 INFO = -7
409.LT. ELSE IF( MV0 ) THEN
410 INFO = -9
411.AND..LT..OR. ELSE IF( ( RSVEC ( LDVN ) )
412.AND..LT. $ ( APPLV ( LDVMV ) ) ) THEN
413 INFO = -11
414.AND..LE. ELSE IF( UCTOL ( WORK( 1 )ONE ) ) THEN
415 INFO = -12
416.LT. ELSE IF( LWORKMAX( M+N, 6 ) ) THEN
417 INFO = -13
418 ELSE
419 INFO = 0
420 END IF
421*
422* #:(
423.NE. IF( INFO0 ) THEN
424 CALL XERBLA( 'sgesvj', -INFO )
425 RETURN
426 END IF
427*
428* #:) Quick return for void matrix
429*
430.EQ..OR..EQ. IF( ( M0 ) ( N0 ) )RETURN
431*
432* Set numerical parameters
433* The stopping criterion for Jacobi rotations is
434*
435* max_{i<>j}|A(:,i)^T * A(:,j)|/(||A(:,i)||*||A(:,j)||) < CTOL*EPS
436*
437* where EPS is the round-off and CTOL is defined as follows:
438*
439 IF( UCTOL ) THEN
440* ... user controlled
441 CTOL = WORK( 1 )
442 ELSE
443* ... default
444.OR..OR. IF( LSVEC RSVEC APPLV ) THEN
445 CTOL = SQRT( FLOAT( M ) )
446 ELSE
447 CTOL = FLOAT( M )
448 END IF
449 END IF
450* ... and the machine dependent parameters are
451*[!] (Make sure that SLAMCH() works properly on the target machine.)
452*
453 EPSLN = SLAMCH( 'epsilon' )
454 ROOTEPS = SQRT( EPSLN )
455 SFMIN = SLAMCH( 'safeminimum' )
456 ROOTSFMIN = SQRT( SFMIN )
457 SMALL = SFMIN / EPSLN
458 BIG = SLAMCH( 'overflow' )
459* BIG = ONE / SFMIN
460 ROOTBIG = ONE / ROOTSFMIN
461 LARGE = BIG / SQRT( FLOAT( M*N ) )
462 BIGTHETA = ONE / ROOTEPS
463*
464 TOL = CTOL*EPSLN
465 ROOTTOL = SQRT( TOL )
466*
467.GE. IF( FLOAT( M )*EPSLNONE ) THEN
468 INFO = -4
469 CALL XERBLA( 'sgesvj', -INFO )
470 RETURN
471 END IF
472*
473* Initialize the right singular vector matrix.
474*
475 IF( RSVEC ) THEN
476 MVL = N
477 CALL SLASET( 'a', MVL, N, ZERO, ONE, V, LDV )
478 ELSE IF( APPLV ) THEN
479 MVL = MV
480 END IF
481.OR. RSVEC = RSVEC APPLV
482*
483* Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
484*(!) If necessary, scale A to protect the largest singular value
485* from overflow. It is possible that saving the largest singular
486* value destroys the information about the small ones.
487* This initial scaling is almost minimal in the sense that the
488* goal is to make sure that no column norm overflows, and that
489* SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
490* in A are detected, the procedure returns with INFO=-6.
491*
492 SKL = ONE / SQRT( FLOAT( M )*FLOAT( N ) )
493 NOSCALE = .TRUE.
494 GOSCALE = .TRUE.
495*
496 IF( LOWER ) THEN
497* the input matrix is M-by-N lower triangular (trapezoidal)
498 DO 1874 p = 1, N
499 AAPP = ZERO
500 AAQQ = ONE
501 CALL SLASSQ( M-p+1, A( p, p ), 1, AAPP, AAQQ )
502.GT. IF( AAPPBIG ) THEN
503 INFO = -6
504 CALL XERBLA( 'sgesvj', -INFO )
505 RETURN
506 END IF
507 AAQQ = SQRT( AAQQ )
508.LT..AND. IF( ( AAPP( BIG / AAQQ ) ) NOSCALE ) THEN
509 SVA( p ) = AAPP*AAQQ
510 ELSE
511 NOSCALE = .FALSE.
512 SVA( p ) = AAPP*( AAQQ*SKL )
513 IF( GOSCALE ) THEN
514 GOSCALE = .FALSE.
515 DO 1873 q = 1, p - 1
516 SVA( q ) = SVA( q )*SKL
517 1873 CONTINUE
518 END IF
519 END IF
520 1874 CONTINUE
521 ELSE IF( UPPER ) THEN
522* the input matrix is M-by-N upper triangular (trapezoidal)
523 DO 2874 p = 1, N
524 AAPP = ZERO
525 AAQQ = ONE
526 CALL SLASSQ( p, A( 1, p ), 1, AAPP, AAQQ )
527.GT. IF( AAPPBIG ) THEN
528 INFO = -6
529 CALL XERBLA( 'sgesvj', -INFO )
530 RETURN
531 END IF
532 AAQQ = SQRT( AAQQ )
533.LT..AND. IF( ( AAPP( BIG / AAQQ ) ) NOSCALE ) THEN
534 SVA( p ) = AAPP*AAQQ
535 ELSE
536 NOSCALE = .FALSE.
537 SVA( p ) = AAPP*( AAQQ*SKL )
538 IF( GOSCALE ) THEN
539 GOSCALE = .FALSE.
540 DO 2873 q = 1, p - 1
541 SVA( q ) = SVA( q )*SKL
542 2873 CONTINUE
543 END IF
544 END IF
545 2874 CONTINUE
546 ELSE
547* the input matrix is M-by-N general dense
548 DO 3874 p = 1, N
549 AAPP = ZERO
550 AAQQ = ONE
551 CALL SLASSQ( M, A( 1, p ), 1, AAPP, AAQQ )
552.GT. IF( AAPPBIG ) THEN
553 INFO = -6
554 CALL XERBLA( 'sgesvj', -INFO )
555 RETURN
556 END IF
557 AAQQ = SQRT( AAQQ )
558.LT..AND. IF( ( AAPP( BIG / AAQQ ) ) NOSCALE ) THEN
559 SVA( p ) = AAPP*AAQQ
560 ELSE
561 NOSCALE = .FALSE.
562 SVA( p ) = AAPP*( AAQQ*SKL )
563 IF( GOSCALE ) THEN
564 GOSCALE = .FALSE.
565 DO 3873 q = 1, p - 1
566 SVA( q ) = SVA( q )*SKL
567 3873 CONTINUE
568 END IF
569 END IF
570 3874 CONTINUE
571 END IF
572*
573 IF( NOSCALE )SKL = ONE
574*
575* Move the smaller part of the spectrum from the underflow threshold
576*(!) Start by determining the position of the nonzero entries of the
577* array SVA() relative to ( SFMIN, BIG ).
578*
579 AAPP = ZERO
580 AAQQ = BIG
581 DO 4781 p = 1, N
582.NE. IF( SVA( p )ZERO )AAQQ = MIN( AAQQ, SVA( p ) )
583 AAPP = MAX( AAPP, SVA( p ) )
584 4781 CONTINUE
585*
586* #:) Quick return for zero matrix
587*
588.EQ. IF( AAPPZERO ) THEN
589 IF( LSVEC )CALL SLASET( 'g', M, N, ZERO, ONE, A, LDA )
590 WORK( 1 ) = ONE
591 WORK( 2 ) = ZERO
592 WORK( 3 ) = ZERO
593 WORK( 4 ) = ZERO
594 WORK( 5 ) = ZERO
595 WORK( 6 ) = ZERO
596 RETURN
597 END IF
598*
599* #:) Quick return for one-column matrix
600*
601.EQ. IF( N1 ) THEN
602 IF( LSVEC )CALL SLASCL( 'g', 0, 0, SVA( 1 ), SKL, M, 1,
603 $ A( 1, 1 ), LDA, IERR )
604 WORK( 1 ) = ONE / SKL
605.GE. IF( SVA( 1 )SFMIN ) THEN
606 WORK( 2 ) = ONE
607 ELSE
608 WORK( 2 ) = ZERO
609 END IF
610 WORK( 3 ) = ZERO
611 WORK( 4 ) = ZERO
612 WORK( 5 ) = ZERO
613 WORK( 6 ) = ZERO
614 RETURN
615 END IF
616*
617* Protect small singular values from underflow, and try to
618* avoid underflows/overflows in computing Jacobi rotations.
619*
620 SN = SQRT( SFMIN / EPSLN )
621 TEMP1 = SQRT( BIG / FLOAT( N ) )
622.LE..OR..GE..OR. IF( ( AAPPSN ) ( AAQQTEMP1 )
623.LE..AND..LE. $ ( ( SNAAQQ ) ( AAPPTEMP1 ) ) ) THEN
624 TEMP1 = MIN( BIG, TEMP1 / AAPP )
625* AAQQ = AAQQ*TEMP1
626* AAPP = AAPP*TEMP1
627.LE..AND..LE. ELSE IF( ( AAQQSN ) ( AAPPTEMP1 ) ) THEN
628 TEMP1 = MIN( SN / AAQQ, BIG / ( AAPP*SQRT( FLOAT( N ) ) ) )
629* AAQQ = AAQQ*TEMP1
630* AAPP = AAPP*TEMP1
631.GE..AND..GE. ELSE IF( ( AAQQSN ) ( AAPPTEMP1 ) ) THEN
632 TEMP1 = MAX( SN / AAQQ, TEMP1 / AAPP )
633* AAQQ = AAQQ*TEMP1
634* AAPP = AAPP*TEMP1
635.LE..AND..GE. ELSE IF( ( AAQQSN ) ( AAPPTEMP1 ) ) THEN
636 TEMP1 = MIN( SN / AAQQ, BIG / ( SQRT( FLOAT( N ) )*AAPP ) )
637* AAQQ = AAQQ*TEMP1
638* AAPP = AAPP*TEMP1
639 ELSE
640 TEMP1 = ONE
641 END IF
642*
643* Scale, if necessary
644*
645.NE. IF( TEMP1ONE ) THEN
646 CALL SLASCL( 'g', 0, 0, ONE, TEMP1, N, 1, SVA, N, IERR )
647 END IF
648 SKL = TEMP1*SKL
649.NE. IF( SKLONE ) THEN
650 CALL SLASCL( JOBA, 0, 0, ONE, SKL, M, N, A, LDA, IERR )
651 SKL = ONE / SKL
652 END IF
653*
654* Row-cyclic Jacobi SVD algorithm with column pivoting
655*
656 EMPTSW = ( N*( N-1 ) ) / 2
657 NOTROT = 0
658 FASTR( 1 ) = ZERO
659*
660* A is represented in factored form A = A * diag(WORK), where diag(WORK)
661* is initialized to identity. WORK is updated during fast scaled
662* rotations.
663*
664 DO 1868 q = 1, N
665 WORK( q ) = ONE
666 1868 CONTINUE
667*
668*
669 SWBAND = 3
670*[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
671* if SGESVJ is used as a computational routine in the preconditioned
672* Jacobi SVD algorithm SGESVJ. For sweeps i=1:SWBAND the procedure
673* works on pivots inside a band-like region around the diagonal.
674* The boundaries are determined dynamically, based on the number of
675* pivots above a threshold.
676*
677 KBL = MIN( 8, N )
678*[TP] KBL is a tuning parameter that defines the tile size in the
679* tiling of the p-q loops of pivot pairs. In general, an optimal
680* value of KBL depends on the matrix dimensions and on the
681* parameters of the computer's memory.
682*
683 NBL = N / KBL
684.NE. IF( ( NBL*KBL )N )NBL = NBL + 1
685*
686 BLSKIP = KBL**2
687*[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
688*
689 ROWSKIP = MIN( 5, KBL )
690*[TP] ROWSKIP is a tuning parameter.
691*
692 LKAHEAD = 1
693*[TP] LKAHEAD is a tuning parameter.
694*
695* Quasi block transformations, using the lower (upper) triangular
696* structure of the input matrix. The quasi-block-cycling usually
697* invokes cubic convergence. Big part of this cycle is done inside
698* canonical subspaces of dimensions less than M.
699*
700.OR..AND..GT. IF( ( LOWER UPPER ) ( NMAX( 64, 4*KBL ) ) ) THEN
701*[TP] The number of partition levels and the actual partition are
702* tuning parameters.
703 N4 = N / 4
704 N2 = N / 2
705 N34 = 3*N4
706 IF( APPLV ) THEN
707 q = 0
708 ELSE
709 q = 1
710 END IF
711*
712 IF( LOWER ) THEN
713*
714* This works very well on lower triangular matrices, in particular
715* in the framework of the preconditioned Jacobi SVD (xGEJSV).
716* The idea is simple:
717* [+ 0 0 0] Note that Jacobi transformations of [0 0]
718* [+ + 0 0] [0 0]
719* [+ + x 0] actually work on [x 0] [x 0]
720* [+ + x x] [x x]. [x x]
721*
722 CALL SGSVJ0( JOBV, M-N34, N-N34, A( N34+1, N34+1 ), LDA,
723 $ WORK( N34+1 ), SVA( N34+1 ), MVL,
724 $ V( N34*q+1, N34+1 ), LDV, EPSLN, SFMIN, TOL,
725 $ 2, WORK( N+1 ), LWORK-N, IERR )
726*
727 CALL SGSVJ0( JOBV, M-N2, N34-N2, A( N2+1, N2+1 ), LDA,
728 $ WORK( N2+1 ), SVA( N2+1 ), MVL,
729 $ V( N2*q+1, N2+1 ), LDV, EPSLN, SFMIN, TOL, 2,
730 $ WORK( N+1 ), LWORK-N, IERR )
731*
732 CALL SGSVJ1( JOBV, M-N2, N-N2, N4, A( N2+1, N2+1 ), LDA,
733 $ WORK( N2+1 ), SVA( N2+1 ), MVL,
734 $ V( N2*q+1, N2+1 ), LDV, EPSLN, SFMIN, TOL, 1,
735 $ WORK( N+1 ), LWORK-N, IERR )
736*
737 CALL SGSVJ0( JOBV, M-N4, N2-N4, A( N4+1, N4+1 ), LDA,
738 $ WORK( N4+1 ), SVA( N4+1 ), MVL,
739 $ V( N4*q+1, N4+1 ), LDV, EPSLN, SFMIN, TOL, 1,
740 $ WORK( N+1 ), LWORK-N, IERR )
741*
742 CALL SGSVJ0( JOBV, M, N4, A, LDA, WORK, SVA, MVL, V, LDV,
743 $ EPSLN, SFMIN, TOL, 1, WORK( N+1 ), LWORK-N,
744 $ IERR )
745*
746 CALL SGSVJ1( JOBV, M, N2, N4, A, LDA, WORK, SVA, MVL, V,
747 $ LDV, EPSLN, SFMIN, TOL, 1, WORK( N+1 ),
748 $ LWORK-N, IERR )
749*
750*
751 ELSE IF( UPPER ) THEN
752*
753*
754 CALL SGSVJ0( JOBV, N4, N4, A, LDA, WORK, SVA, MVL, V, LDV,
755 $ EPSLN, SFMIN, TOL, 2, WORK( N+1 ), LWORK-N,
756 $ IERR )
757*
758 CALL SGSVJ0( JOBV, N2, N4, A( 1, N4+1 ), LDA, WORK( N4+1 ),
759 $ SVA( N4+1 ), MVL, V( N4*q+1, N4+1 ), LDV,
760 $ EPSLN, SFMIN, TOL, 1, WORK( N+1 ), LWORK-N,
761 $ IERR )
762*
763 CALL SGSVJ1( JOBV, N2, N2, N4, A, LDA, WORK, SVA, MVL, V,
764 $ LDV, EPSLN, SFMIN, TOL, 1, WORK( N+1 ),
765 $ LWORK-N, IERR )
766*
767 CALL SGSVJ0( JOBV, N2+N4, N4, A( 1, N2+1 ), LDA,
768 $ WORK( N2+1 ), SVA( N2+1 ), MVL,
769 $ V( N2*q+1, N2+1 ), LDV, EPSLN, SFMIN, TOL, 1,
770 $ WORK( N+1 ), LWORK-N, IERR )
771
772 END IF
773*
774 END IF
775*
776* .. Row-cyclic pivot strategy with de Rijk's pivoting ..
777*
778 DO 1993 i = 1, NSWEEP
779*
780* .. go go go ...
781*
782 MXAAPQ = ZERO
783 MXSINJ = ZERO
784 ISWROT = 0
785*
786 NOTROT = 0
787 PSKIPPED = 0
788*
789* Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
790* 1 <= p < q <= N. This is the first step toward a blocked implementation
791* of the rotations. New implementation, based on block transformations,
792* is under development.
793*
794 DO 2000 ibr = 1, NBL
795*
796 igl = ( ibr-1 )*KBL + 1
797*
798 DO 1002 ir1 = 0, MIN( LKAHEAD, NBL-ibr )
799*
800 igl = igl + ir1*KBL
801*
802 DO 2001 p = igl, MIN( igl+KBL-1, N-1 )
803*
804* .. de Rijk's pivoting
805*
806 q = ISAMAX( N-p+1, SVA( p ), 1 ) + p - 1
807.NE. IF( pq ) THEN
808 CALL SSWAP( M, A( 1, p ), 1, A( 1, q ), 1 )
809 IF( RSVEC )CALL SSWAP( MVL, V( 1, p ), 1,
810 $ V( 1, q ), 1 )
811 TEMP1 = SVA( p )
812 SVA( p ) = SVA( q )
813 SVA( q ) = TEMP1
814 TEMP1 = WORK( p )
815 WORK( p ) = WORK( q )
816 WORK( q ) = TEMP1
817 END IF
818*
819.EQ. IF( ir10 ) THEN
820*
821* Column norms are periodically updated by explicit
822* norm computation.
823* Caveat:
824* Unfortunately, some BLAS implementations compute SNRM2(M,A(1,p),1)
825* as SQRT(SDOT(M,A(1,p),1,A(1,p),1)), which may cause the result to
826* overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
827* underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
828* Hence, SNRM2 cannot be trusted, not even in the case when
829* the true norm is far from the under(over)flow boundaries.
830* If properly implemented SNRM2 is available, the IF-THEN-ELSE
831* below should read "AAPP = SNRM2( M, A(1,p), 1 ) * WORK(p)".
832*
833.LT..AND. IF( ( SVA( p )ROOTBIG )
834.GT. $ ( SVA( p )ROOTSFMIN ) ) THEN
835 SVA( p ) = SNRM2( M, A( 1, p ), 1 )*WORK( p )
836 ELSE
837 TEMP1 = ZERO
838 AAPP = ONE
839 CALL SLASSQ( M, A( 1, p ), 1, TEMP1, AAPP )
840 SVA( p ) = TEMP1*SQRT( AAPP )*WORK( p )
841 END IF
842 AAPP = SVA( p )
843 ELSE
844 AAPP = SVA( p )
845 END IF
846*
847.GT. IF( AAPPZERO ) THEN
848*
849 PSKIPPED = 0
850*
851 DO 2002 q = p + 1, MIN( igl+KBL-1, N )
852*
853 AAQQ = SVA( q )
854*
855.GT. IF( AAQQZERO ) THEN
856*
857 AAPP0 = AAPP
858.GE. IF( AAQQONE ) THEN
859.LE. ROTOK = ( SMALL*AAPP )AAQQ
860.LT. IF( AAPP( BIG / AAQQ ) ) THEN
861 AAPQ = ( SDOT( M, A( 1, p ), 1, A( 1,
862 $ q ), 1 )*WORK( p )*WORK( q ) /
863 $ AAQQ ) / AAPP
864 ELSE
865 CALL SCOPY( M, A( 1, p ), 1,
866 $ WORK( N+1 ), 1 )
867 CALL SLASCL( 'g', 0, 0, AAPP,
868 $ WORK( p ), M, 1,
869 $ WORK( N+1 ), LDA, IERR )
870 AAPQ = SDOT( M, WORK( N+1 ), 1,
871 $ A( 1, q ), 1 )*WORK( q ) / AAQQ
872 END IF
873 ELSE
874.LE. ROTOK = AAPP( AAQQ / SMALL )
875.GT. IF( AAPP( SMALL / AAQQ ) ) THEN
876 AAPQ = ( SDOT( M, A( 1, p ), 1, A( 1,
877 $ q ), 1 )*WORK( p )*WORK( q ) /
878 $ AAQQ ) / AAPP
879 ELSE
880 CALL SCOPY( M, A( 1, q ), 1,
881 $ WORK( N+1 ), 1 )
882 CALL SLASCL( 'g', 0, 0, AAQQ,
883 $ WORK( q ), M, 1,
884 $ WORK( N+1 ), LDA, IERR )
885 AAPQ = SDOT( M, WORK( N+1 ), 1,
886 $ A( 1, p ), 1 )*WORK( p ) / AAPP
887 END IF
888 END IF
889*
890 MXAAPQ = MAX( MXAAPQ, ABS( AAPQ ) )
891*
892* TO rotate or NOT to rotate, THAT is the question ...
893*
894.GT. IF( ABS( AAPQ )TOL ) THEN
895*
896* .. rotate
897*[RTD] ROTATED = ROTATED + ONE
898*
899.EQ. IF( ir10 ) THEN
900 NOTROT = 0
901 PSKIPPED = 0
902 ISWROT = ISWROT + 1
903 END IF
904*
905 IF( ROTOK ) THEN
906*
907 AQOAP = AAQQ / AAPP
908 APOAQ = AAPP / AAQQ
909 THETA = -HALF*ABS( AQOAP-APOAQ ) / AAPQ
910*
911.GT. IF( ABS( THETA )BIGTHETA ) THEN
912*
913 T = HALF / THETA
914 FASTR( 3 ) = T*WORK( p ) / WORK( q )
915 FASTR( 4 ) = -T*WORK( q ) /
916 $ WORK( p )
917 CALL SROTM( M, A( 1, p ), 1,
918 $ A( 1, q ), 1, FASTR )
919 IF( RSVEC )CALL SROTM( MVL,
920 $ V( 1, p ), 1,
921 $ V( 1, q ), 1,
922 $ FASTR )
923 SVA( q ) = AAQQ*SQRT( MAX( ZERO,
924 $ ONE+T*APOAQ*AAPQ ) )
925 AAPP = AAPP*SQRT( MAX( ZERO,
926 $ ONE-T*AQOAP*AAPQ ) )
927 MXSINJ = MAX( MXSINJ, ABS( T ) )
928*
929 ELSE
930*
931* .. choose correct signum for THETA and rotate
932*
933 THSIGN = -SIGN( ONE, AAPQ )
934 T = ONE / ( THETA+THSIGN*
935 $ SQRT( ONE+THETA*THETA ) )
936 CS = SQRT( ONE / ( ONE+T*T ) )
937 SN = T*CS
938*
939 MXSINJ = MAX( MXSINJ, ABS( SN ) )
940 SVA( q ) = AAQQ*SQRT( MAX( ZERO,
941 $ ONE+T*APOAQ*AAPQ ) )
942 AAPP = AAPP*SQRT( MAX( ZERO,
943 $ ONE-T*AQOAP*AAPQ ) )
944*
945 APOAQ = WORK( p ) / WORK( q )
946 AQOAP = WORK( q ) / WORK( p )
947.GE. IF( WORK( p )ONE ) THEN
948.GE. IF( WORK( q )ONE ) THEN
949 FASTR( 3 ) = T*APOAQ
950 FASTR( 4 ) = -T*AQOAP
951 WORK( p ) = WORK( p )*CS
952 WORK( q ) = WORK( q )*CS
953 CALL SROTM( M, A( 1, p ), 1,
954 $ A( 1, q ), 1,
955 $ FASTR )
956 IF( RSVEC )CALL SROTM( MVL,
957 $ V( 1, p ), 1, V( 1, q ),
958 $ 1, FASTR )
959 ELSE
960 CALL SAXPY( M, -T*AQOAP,
961 $ A( 1, q ), 1,
962 $ A( 1, p ), 1 )
963 CALL SAXPY( M, CS*SN*APOAQ,
964 $ A( 1, p ), 1,
965 $ A( 1, q ), 1 )
966 WORK( p ) = WORK( p )*CS
967 WORK( q ) = WORK( q ) / CS
968 IF( RSVEC ) THEN
969 CALL SAXPY( MVL, -T*AQOAP,
970 $ V( 1, q ), 1,
971 $ V( 1, p ), 1 )
972 CALL SAXPY( MVL,
973 $ CS*SN*APOAQ,
974 $ V( 1, p ), 1,
975 $ V( 1, q ), 1 )
976 END IF
977 END IF
978 ELSE
979.GE. IF( WORK( q )ONE ) THEN
980 CALL SAXPY( M, T*APOAQ,
981 $ A( 1, p ), 1,
982 $ A( 1, q ), 1 )
983 CALL SAXPY( M, -CS*SN*AQOAP,
984 $ A( 1, q ), 1,
985 $ A( 1, p ), 1 )
986 WORK( p ) = WORK( p ) / CS
987 WORK( q ) = WORK( q )*CS
988 IF( RSVEC ) THEN
989 CALL SAXPY( MVL, T*APOAQ,
990 $ V( 1, p ), 1,
991 $ V( 1, q ), 1 )
992 CALL SAXPY( MVL,
993 $ -CS*SN*AQOAP,
994 $ V( 1, q ), 1,
995 $ V( 1, p ), 1 )
996 END IF
997 ELSE
998.GE. IF( WORK( p )WORK( q ) )
999 $ THEN
1000 CALL SAXPY( M, -T*AQOAP,
1001 $ A( 1, q ), 1,
1002 $ A( 1, p ), 1 )
1003 CALL SAXPY( M, CS*SN*APOAQ,
1004 $ A( 1, p ), 1,
1005 $ A( 1, q ), 1 )
1006 WORK( p ) = WORK( p )*CS
1007 WORK( q ) = WORK( q ) / CS
1008 IF( RSVEC ) THEN
1009 CALL SAXPY( MVL,
1010 $ -T*AQOAP,
1011 $ V( 1, q ), 1,
1012 $ V( 1, p ), 1 )
1013 CALL SAXPY( MVL,
1014 $ CS*SN*APOAQ,
1015 $ V( 1, p ), 1,
1016 $ V( 1, q ), 1 )
1017 END IF
1018 ELSE
1019 CALL SAXPY( M, T*APOAQ,
1020 $ A( 1, p ), 1,
1021 $ A( 1, q ), 1 )
1022 CALL SAXPY( M,
1023 $ -CS*SN*AQOAP,
1024 $ A( 1, q ), 1,
1025 $ A( 1, p ), 1 )
1026 WORK( p ) = WORK( p ) / CS
1027 WORK( q ) = WORK( q )*CS
1028 IF( RSVEC ) THEN
1029 CALL SAXPY( MVL,
1030 $ T*APOAQ, V( 1, p ),
1031 $ 1, V( 1, q ), 1 )
1032 CALL SAXPY( MVL,
1033 $ -CS*SN*AQOAP,
1034 $ V( 1, q ), 1,
1035 $ V( 1, p ), 1 )
1036 END IF
1037 END IF
1038 END IF
1039 END IF
1040 END IF
1041*
1042 ELSE
1043* .. have to use modified Gram-Schmidt like transformation
1044 CALL SCOPY( M, A( 1, p ), 1,
1045 $ WORK( N+1 ), 1 )
1046 CALL SLASCL( 'g', 0, 0, AAPP, ONE, M,
1047 $ 1, WORK( N+1 ), LDA,
1048 $ IERR )
1049 CALL SLASCL( 'g', 0, 0, AAQQ, ONE, M,
1050 $ 1, A( 1, q ), LDA, IERR )
1051 TEMP1 = -AAPQ*WORK( p ) / WORK( q )
1052 CALL SAXPY( M, TEMP1, WORK( N+1 ), 1,
1053 $ A( 1, q ), 1 )
1054 CALL SLASCL( 'g', 0, 0, ONE, AAQQ, M,
1055 $ 1, A( 1, q ), LDA, IERR )
1056 SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1057 $ ONE-AAPQ*AAPQ ) )
1058 MXSINJ = MAX( MXSINJ, SFMIN )
1059 END IF
1060* END IF ROTOK THEN ... ELSE
1061*
1062* In the case of cancellation in updating SVA(q), SVA(p)
1063* recompute SVA(q), SVA(p).
1064*
1065.LE. IF( ( SVA( q ) / AAQQ )**2ROOTEPS )
1066 $ THEN
1067.LT..AND. IF( ( AAQQROOTBIG )
1068.GT. $ ( AAQQROOTSFMIN ) ) THEN
1069 SVA( q ) = SNRM2( M, A( 1, q ), 1 )*
1070 $ WORK( q )
1071 ELSE
1072 T = ZERO
1073 AAQQ = ONE
1074 CALL SLASSQ( M, A( 1, q ), 1, T,
1075 $ AAQQ )
1076 SVA( q ) = T*SQRT( AAQQ )*WORK( q )
1077 END IF
1078 END IF
1079.LE. IF( ( AAPP / AAPP0 )ROOTEPS ) THEN
1080.LT..AND. IF( ( AAPPROOTBIG )
1081.GT. $ ( AAPPROOTSFMIN ) ) THEN
1082 AAPP = SNRM2( M, A( 1, p ), 1 )*
1083 $ WORK( p )
1084 ELSE
1085 T = ZERO
1086 AAPP = ONE
1087 CALL SLASSQ( M, A( 1, p ), 1, T,
1088 $ AAPP )
1089 AAPP = T*SQRT( AAPP )*WORK( p )
1090 END IF
1091 SVA( p ) = AAPP
1092 END IF
1093*
1094 ELSE
1095* A(:,p) and A(:,q) already numerically orthogonal
1096.EQ. IF( ir10 )NOTROT = NOTROT + 1
1097*[RTD] SKIPPED = SKIPPED + 1
1098 PSKIPPED = PSKIPPED + 1
1099 END IF
1100 ELSE
1101* A(:,q) is zero column
1102.EQ. IF( ir10 )NOTROT = NOTROT + 1
1103 PSKIPPED = PSKIPPED + 1
1104 END IF
1105*
1106.LE..AND. IF( ( iSWBAND )
1107.GT. $ ( PSKIPPEDROWSKIP ) ) THEN
1108.EQ. IF( ir10 )AAPP = -AAPP
1109 NOTROT = 0
1110 GO TO 2103
1111 END IF
1112*
1113 2002 CONTINUE
1114* END q-LOOP
1115*
1116 2103 CONTINUE
1117* bailed out of q-loop
1118*
1119 SVA( p ) = AAPP
1120*
1121 ELSE
1122 SVA( p ) = AAPP
1123.EQ..AND..EQ. IF( ( ir10 ) ( AAPPZERO ) )
1124 $ NOTROT = NOTROT + MIN( igl+KBL-1, N ) - p
1125 END IF
1126*
1127 2001 CONTINUE
1128* end of the p-loop
1129* end of doing the block ( ibr, ibr )
1130 1002 CONTINUE
1131* end of ir1-loop
1132*
1133* ... go to the off diagonal blocks
1134*
1135 igl = ( ibr-1 )*KBL + 1
1136*
1137 DO 2010 jbc = ibr + 1, NBL
1138*
1139 jgl = ( jbc-1 )*KBL + 1
1140*
1141* doing the block at ( ibr, jbc )
1142*
1143 IJBLSK = 0
1144 DO 2100 p = igl, MIN( igl+KBL-1, N )
1145*
1146 AAPP = SVA( p )
1147.GT. IF( AAPPZERO ) THEN
1148*
1149 PSKIPPED = 0
1150*
1151 DO 2200 q = jgl, MIN( jgl+KBL-1, N )
1152*
1153 AAQQ = SVA( q )
1154.GT. IF( AAQQZERO ) THEN
1155 AAPP0 = AAPP
1156*
1157* .. M x 2 Jacobi SVD ..
1158*
1159* Safe Gram matrix computation
1160*
1161.GE. IF( AAQQONE ) THEN
1162.GE. IF( AAPPAAQQ ) THEN
1163.LE. ROTOK = ( SMALL*AAPP )AAQQ
1164 ELSE
1165.LE. ROTOK = ( SMALL*AAQQ )AAPP
1166 END IF
1167.LT. IF( AAPP( BIG / AAQQ ) ) THEN
1168 AAPQ = ( SDOT( M, A( 1, p ), 1, A( 1,
1169 $ q ), 1 )*WORK( p )*WORK( q ) /
1170 $ AAQQ ) / AAPP
1171 ELSE
1172 CALL SCOPY( M, A( 1, p ), 1,
1173 $ WORK( N+1 ), 1 )
1174 CALL SLASCL( 'g', 0, 0, AAPP,
1175 $ WORK( p ), M, 1,
1176 $ WORK( N+1 ), LDA, IERR )
1177 AAPQ = SDOT( M, WORK( N+1 ), 1,
1178 $ A( 1, q ), 1 )*WORK( q ) / AAQQ
1179 END IF
1180 ELSE
1181.GE. IF( AAPPAAQQ ) THEN
1182.LE. ROTOK = AAPP( AAQQ / SMALL )
1183 ELSE
1184.LE. ROTOK = AAQQ( AAPP / SMALL )
1185 END IF
1186.GT. IF( AAPP( SMALL / AAQQ ) ) THEN
1187 AAPQ = ( SDOT( M, A( 1, p ), 1, A( 1,
1188 $ q ), 1 )*WORK( p )*WORK( q ) /
1189 $ AAQQ ) / AAPP
1190 ELSE
1191 CALL SCOPY( M, A( 1, q ), 1,
1192 $ WORK( N+1 ), 1 )
1193 CALL SLASCL( 'g', 0, 0, AAQQ,
1194 $ WORK( q ), M, 1,
1195 $ WORK( N+1 ), LDA, IERR )
1196 AAPQ = SDOT( M, WORK( N+1 ), 1,
1197 $ A( 1, p ), 1 )*WORK( p ) / AAPP
1198 END IF
1199 END IF
1200*
1201 MXAAPQ = MAX( MXAAPQ, ABS( AAPQ ) )
1202*
1203* TO rotate or NOT to rotate, THAT is the question ...
1204*
1205.GT. IF( ABS( AAPQ )TOL ) THEN
1206 NOTROT = 0
1207*[RTD] ROTATED = ROTATED + 1
1208 PSKIPPED = 0
1209 ISWROT = ISWROT + 1
1210*
1211 IF( ROTOK ) THEN
1212*
1213 AQOAP = AAQQ / AAPP
1214 APOAQ = AAPP / AAQQ
1215 THETA = -HALF*ABS( AQOAP-APOAQ ) / AAPQ
1216.GT. IF( AAQQAAPP0 )THETA = -THETA
1217*
1218.GT. IF( ABS( THETA )BIGTHETA ) THEN
1219 T = HALF / THETA
1220 FASTR( 3 ) = T*WORK( p ) / WORK( q )
1221 FASTR( 4 ) = -T*WORK( q ) /
1222 $ WORK( p )
1223 CALL SROTM( M, A( 1, p ), 1,
1224 $ A( 1, q ), 1, FASTR )
1225 IF( RSVEC )CALL SROTM( MVL,
1226 $ V( 1, p ), 1,
1227 $ V( 1, q ), 1,
1228 $ FASTR )
1229 SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1230 $ ONE+T*APOAQ*AAPQ ) )
1231 AAPP = AAPP*SQRT( MAX( ZERO,
1232 $ ONE-T*AQOAP*AAPQ ) )
1233 MXSINJ = MAX( MXSINJ, ABS( T ) )
1234 ELSE
1235*
1236* .. choose correct signum for THETA and rotate
1237*
1238 THSIGN = -SIGN( ONE, AAPQ )
1239.GT. IF( AAQQAAPP0 )THSIGN = -THSIGN
1240 T = ONE / ( THETA+THSIGN*
1241 $ SQRT( ONE+THETA*THETA ) )
1242 CS = SQRT( ONE / ( ONE+T*T ) )
1243 SN = T*CS
1244 MXSINJ = MAX( MXSINJ, ABS( SN ) )
1245 SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1246 $ ONE+T*APOAQ*AAPQ ) )
1247 AAPP = AAPP*SQRT( MAX( ZERO,
1248 $ ONE-T*AQOAP*AAPQ ) )
1249*
1250 APOAQ = WORK( p ) / WORK( q )
1251 AQOAP = WORK( q ) / WORK( p )
1252.GE. IF( WORK( p )ONE ) THEN
1253*
1254.GE. IF( WORK( q )ONE ) THEN
1255 FASTR( 3 ) = T*APOAQ
1256 FASTR( 4 ) = -T*AQOAP
1257 WORK( p ) = WORK( p )*CS
1258 WORK( q ) = WORK( q )*CS
1259 CALL SROTM( M, A( 1, p ), 1,
1260 $ A( 1, q ), 1,
1261 $ FASTR )
1262 IF( RSVEC )CALL SROTM( MVL,
1263 $ V( 1, p ), 1, V( 1, q ),
1264 $ 1, FASTR )
1265 ELSE
1266 CALL SAXPY( M, -T*AQOAP,
1267 $ A( 1, q ), 1,
1268 $ A( 1, p ), 1 )
1269 CALL SAXPY( M, CS*SN*APOAQ,
1270 $ A( 1, p ), 1,
1271 $ A( 1, q ), 1 )
1272 IF( RSVEC ) THEN
1273 CALL SAXPY( MVL, -T*AQOAP,
1274 $ V( 1, q ), 1,
1275 $ V( 1, p ), 1 )
1276 CALL SAXPY( MVL,
1277 $ CS*SN*APOAQ,
1278 $ V( 1, p ), 1,
1279 $ V( 1, q ), 1 )
1280 END IF
1281 WORK( p ) = WORK( p )*CS
1282 WORK( q ) = WORK( q ) / CS
1283 END IF
1284 ELSE
1285.GE. IF( WORK( q )ONE ) THEN
1286 CALL SAXPY( M, T*APOAQ,
1287 $ A( 1, p ), 1,
1288 $ A( 1, q ), 1 )
1289 CALL SAXPY( M, -CS*SN*AQOAP,
1290 $ A( 1, q ), 1,
1291 $ A( 1, p ), 1 )
1292 IF( RSVEC ) THEN
1293 CALL SAXPY( MVL, T*APOAQ,
1294 $ V( 1, p ), 1,
1295 $ V( 1, q ), 1 )
1296 CALL SAXPY( MVL,
1297 $ -CS*SN*AQOAP,
1298 $ V( 1, q ), 1,
1299 $ V( 1, p ), 1 )
1300 END IF
1301 WORK( p ) = WORK( p ) / CS
1302 WORK( q ) = WORK( q )*CS
1303 ELSE
1304.GE. IF( WORK( p )WORK( q ) )
1305 $ THEN
1306 CALL SAXPY( M, -T*AQOAP,
1307 $ A( 1, q ), 1,
1308 $ A( 1, p ), 1 )
1309 CALL SAXPY( M, CS*SN*APOAQ,
1310 $ A( 1, p ), 1,
1311 $ A( 1, q ), 1 )
1312 WORK( p ) = WORK( p )*CS
1313 WORK( q ) = WORK( q ) / CS
1314 IF( RSVEC ) THEN
1315 CALL SAXPY( MVL,
1316 $ -T*AQOAP,
1317 $ V( 1, q ), 1,
1318 $ V( 1, p ), 1 )
1319 CALL SAXPY( MVL,
1320 $ CS*SN*APOAQ,
1321 $ V( 1, p ), 1,
1322 $ V( 1, q ), 1 )
1323 END IF
1324 ELSE
1325 CALL SAXPY( M, T*APOAQ,
1326 $ A( 1, p ), 1,
1327 $ A( 1, q ), 1 )
1328 CALL SAXPY( M,
1329 $ -CS*SN*AQOAP,
1330 $ A( 1, q ), 1,
1331 $ A( 1, p ), 1 )
1332 WORK( p ) = WORK( p ) / CS
1333 WORK( q ) = WORK( q )*CS
1334 IF( RSVEC ) THEN
1335 CALL SAXPY( MVL,
1336 $ T*APOAQ, V( 1, p ),
1337 $ 1, V( 1, q ), 1 )
1338 CALL SAXPY( MVL,
1339 $ -CS*SN*AQOAP,
1340 $ V( 1, q ), 1,
1341 $ V( 1, p ), 1 )
1342 END IF
1343 END IF
1344 END IF
1345 END IF
1346 END IF
1347*
1348 ELSE
1349.GT. IF( AAPPAAQQ ) THEN
1350 CALL SCOPY( M, A( 1, p ), 1,
1351 $ WORK( N+1 ), 1 )
1352 CALL SLASCL( 'g', 0, 0, AAPP, ONE,
1353 $ M, 1, WORK( N+1 ), LDA,
1354 $ IERR )
1355 CALL SLASCL( 'g', 0, 0, AAQQ, ONE,
1356 $ M, 1, A( 1, q ), LDA,
1357 $ IERR )
1358 TEMP1 = -AAPQ*WORK( p ) / WORK( q )
1359 CALL SAXPY( M, TEMP1, WORK( N+1 ),
1360 $ 1, A( 1, q ), 1 )
1361 CALL SLASCL( 'g', 0, 0, ONE, AAQQ,
1362 $ M, 1, A( 1, q ), LDA,
1363 $ IERR )
1364 SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1365 $ ONE-AAPQ*AAPQ ) )
1366 MXSINJ = MAX( MXSINJ, SFMIN )
1367 ELSE
1368 CALL SCOPY( M, A( 1, q ), 1,
1369 $ WORK( N+1 ), 1 )
1370 CALL SLASCL( 'g', 0, 0, AAQQ, ONE,
1371 $ M, 1, WORK( N+1 ), LDA,
1372 $ IERR )
1373 CALL SLASCL( 'g', 0, 0, AAPP, ONE,
1374 $ M, 1, A( 1, p ), LDA,
1375 $ IERR )
1376 TEMP1 = -AAPQ*WORK( q ) / WORK( p )
1377 CALL SAXPY( M, TEMP1, WORK( N+1 ),
1378 $ 1, A( 1, p ), 1 )
1379 CALL SLASCL( 'g', 0, 0, ONE, AAPP,
1380 $ M, 1, A( 1, p ), LDA,
1381 $ IERR )
1382 SVA( p ) = AAPP*SQRT( MAX( ZERO,
1383 $ ONE-AAPQ*AAPQ ) )
1384 MXSINJ = MAX( MXSINJ, SFMIN )
1385 END IF
1386 END IF
1387* END IF ROTOK THEN ... ELSE
1388*
1389* In the case of cancellation in updating SVA(q)
1390* .. recompute SVA(q)
1391.LE. IF( ( SVA( q ) / AAQQ )**2ROOTEPS )
1392 $ THEN
1393.LT..AND. IF( ( AAQQROOTBIG )
1394.GT. $ ( AAQQROOTSFMIN ) ) THEN
1395 SVA( q ) = SNRM2( M, A( 1, q ), 1 )*
1396 $ WORK( q )
1397 ELSE
1398 T = ZERO
1399 AAQQ = ONE
1400 CALL SLASSQ( M, A( 1, q ), 1, T,
1401 $ AAQQ )
1402 SVA( q ) = T*SQRT( AAQQ )*WORK( q )
1403 END IF
1404 END IF
1405.LE. IF( ( AAPP / AAPP0 )**2ROOTEPS ) THEN
1406.LT..AND. IF( ( AAPPROOTBIG )
1407.GT. $ ( AAPPROOTSFMIN ) ) THEN
1408 AAPP = SNRM2( M, A( 1, p ), 1 )*
1409 $ WORK( p )
1410 ELSE
1411 T = ZERO
1412 AAPP = ONE
1413 CALL SLASSQ( M, A( 1, p ), 1, T,
1414 $ AAPP )
1415 AAPP = T*SQRT( AAPP )*WORK( p )
1416 END IF
1417 SVA( p ) = AAPP
1418 END IF
1419* end of OK rotation
1420 ELSE
1421 NOTROT = NOTROT + 1
1422*[RTD] SKIPPED = SKIPPED + 1
1423 PSKIPPED = PSKIPPED + 1
1424 IJBLSK = IJBLSK + 1
1425 END IF
1426 ELSE
1427 NOTROT = NOTROT + 1
1428 PSKIPPED = PSKIPPED + 1
1429 IJBLSK = IJBLSK + 1
1430 END IF
1431*
1432.LE..AND..GE. IF( ( iSWBAND ) ( IJBLSKBLSKIP ) )
1433 $ THEN
1434 SVA( p ) = AAPP
1435 NOTROT = 0
1436 GO TO 2011
1437 END IF
1438.LE..AND. IF( ( iSWBAND )
1439.GT. $ ( PSKIPPEDROWSKIP ) ) THEN
1440 AAPP = -AAPP
1441 NOTROT = 0
1442 GO TO 2203
1443 END IF
1444*
1445 2200 CONTINUE
1446* end of the q-loop
1447 2203 CONTINUE
1448*
1449 SVA( p ) = AAPP
1450*
1451 ELSE
1452*
1453.EQ. IF( AAPPZERO )NOTROT = NOTROT +
1454 $ MIN( jgl+KBL-1, N ) - jgl + 1
1455.LT. IF( AAPPZERO )NOTROT = 0
1456*
1457 END IF
1458*
1459 2100 CONTINUE
1460* end of the p-loop
1461 2010 CONTINUE
1462* end of the jbc-loop
1463 2011 CONTINUE
1464*2011 bailed out of the jbc-loop
1465 DO 2012 p = igl, MIN( igl+KBL-1, N )
1466 SVA( p ) = ABS( SVA( p ) )
1467 2012 CONTINUE
1468***
1469 2000 CONTINUE
1470*2000 :: end of the ibr-loop
1471*
1472* .. update SVA(N)
1473.LT..AND..GT. IF( ( SVA( N )ROOTBIG ) ( SVA( N )ROOTSFMIN ) )
1474 $ THEN
1475 SVA( N ) = SNRM2( M, A( 1, N ), 1 )*WORK( N )
1476 ELSE
1477 T = ZERO
1478 AAPP = ONE
1479 CALL SLASSQ( M, A( 1, N ), 1, T, AAPP )
1480 SVA( N ) = T*SQRT( AAPP )*WORK( N )
1481 END IF
1482*
1483* Additional steering devices
1484*
1485.LT..AND..LE..OR. IF( ( iSWBAND ) ( ( MXAAPQROOTTOL )
1486.LE. $ ( ISWROTN ) ) )SWBAND = i
1487*
1488.GT..AND..LT. IF( ( iSWBAND+1 ) ( MXAAPQSQRT( FLOAT( N ) )*
1489.AND..LT. $ TOL ) ( FLOAT( N )*MXAAPQ*MXSINJTOL ) ) THEN
1490 GO TO 1994
1491 END IF
1492*
1493.GE. IF( NOTROTEMPTSW )GO TO 1994
1494*
1495 1993 CONTINUE
1496* end i=1:NSWEEP loop
1497*
1498* #:( Reaching this point means that the procedure has not converged.
1499 INFO = NSWEEP - 1
1500 GO TO 1995
1501*
1502 1994 CONTINUE
1503* #:) Reaching this point means numerical convergence after the i-th
1504* sweep.
1505*
1506 INFO = 0
1507* #:) INFO = 0 confirms successful iterations.
1508 1995 CONTINUE
1509*
1510* Sort the singular values and find how many are above
1511* the underflow threshold.
1512*
1513 N2 = 0
1514 N4 = 0
1515 DO 5991 p = 1, N - 1
1516 q = ISAMAX( N-p+1, SVA( p ), 1 ) + p - 1
1517.NE. IF( pq ) THEN
1518 TEMP1 = SVA( p )
1519 SVA( p ) = SVA( q )
1520 SVA( q ) = TEMP1
1521 TEMP1 = WORK( p )
1522 WORK( p ) = WORK( q )
1523 WORK( q ) = TEMP1
1524 CALL SSWAP( M, A( 1, p ), 1, A( 1, q ), 1 )
1525 IF( RSVEC )CALL SSWAP( MVL, V( 1, p ), 1, V( 1, q ), 1 )
1526 END IF
1527.NE. IF( SVA( p )ZERO ) THEN
1528 N4 = N4 + 1
1529.GT. IF( SVA( p )*SKLSFMIN )N2 = N2 + 1
1530 END IF
1531 5991 CONTINUE
1532.NE. IF( SVA( N )ZERO ) THEN
1533 N4 = N4 + 1
1534.GT. IF( SVA( N )*SKLSFMIN )N2 = N2 + 1
1535 END IF
1536*
1537* Normalize the left singular vectors.
1538*
1539.OR. IF( LSVEC UCTOL ) THEN
1540 DO 1998 p = 1, N2
1541 CALL SSCAL( M, WORK( p ) / SVA( p ), A( 1, p ), 1 )
1542 1998 CONTINUE
1543 END IF
1544*
1545* Scale the product of Jacobi rotations (assemble the fast rotations).
1546*
1547 IF( RSVEC ) THEN
1548 IF( APPLV ) THEN
1549 DO 2398 p = 1, N
1550 CALL SSCAL( MVL, WORK( p ), V( 1, p ), 1 )
1551 2398 CONTINUE
1552 ELSE
1553 DO 2399 p = 1, N
1554 TEMP1 = ONE / SNRM2( MVL, V( 1, p ), 1 )
1555 CALL SSCAL( MVL, TEMP1, V( 1, p ), 1 )
1556 2399 CONTINUE
1557 END IF
1558 END IF
1559*
1560* Undo scaling, if necessary (and possible).
1561.GT..AND..LT. IF( ( ( SKLONE ) ( SVA( 1 )( BIG / SKL ) ) )
1562.OR..LT..AND..GT. $ ( ( SKLONE ) ( SVA( MAX( N2, 1 ) )
1563 $ ( SFMIN / SKL ) ) ) ) THEN
1564 DO 2400 p = 1, N
1565 SVA( P ) = SKL*SVA( P )
1566 2400 CONTINUE
1567 SKL = ONE
1568 END IF
1569*
1570 WORK( 1 ) = SKL
1571* The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1572* then some of the singular values may overflow or underflow and
1573* the spectrum is given in this factored representation.
1574*
1575 WORK( 2 ) = FLOAT( N4 )
1576* N4 is the number of computed nonzero singular values of A.
1577*
1578 WORK( 3 ) = FLOAT( N2 )
1579* N2 is the number of singular values of A greater than SFMIN.
1580* If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1581* that may carry some information.
1582*
1583 WORK( 4 ) = FLOAT( i )
1584* i is the index of the last sweep before declaring convergence.
1585*
1586 WORK( 5 ) = MXAAPQ
1587* MXAAPQ is the largest absolute value of scaled pivots in the
1588* last sweep
1589*
1590 WORK( 6 ) = MXSINJ
1591* MXSINJ is the largest absolute value of the sines of Jacobi angles
1592* in the last sweep
1593*
1594 RETURN
1595* ..
1596* .. END OF SGESVJ
1597* ..
1598 END
subroutine slassq(n, x, incx, scl, sumsq)
SLASSQ updates a sum of squares represented in scaled form.
Definition slassq.f90:137
subroutine slaset(uplo, m, n, alpha, beta, a, lda)
SLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition slaset.f:110
subroutine slascl(type, kl, ku, cfrom, cto, m, n, a, lda, info)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition slascl.f:143
subroutine xerbla(srname, info)
XERBLA
Definition xerbla.f:60
subroutine sgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, work, lwork, info)
SGESVJ
Definition sgesvj.f:323
subroutine sgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work, lwork, info)
SGSVJ0 pre-processor for the routine sgesvj.
Definition sgsvj0.f:218
subroutine sgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work, lwork, info)
SGSVJ1 pre-processor for the routine sgesvj, applies Jacobi rotations targeting only particular pivot...
Definition sgsvj1.f:236
subroutine sscal(n, sa, sx, incx)
SSCAL
Definition sscal.f:79
subroutine scopy(n, sx, incx, sy, incy)
SCOPY
Definition scopy.f:82
subroutine sswap(n, sx, incx, sy, incy)
SSWAP
Definition sswap.f:82
subroutine srotm(n, sx, incx, sy, incy, sparam)
SROTM
Definition srotm.f:97
subroutine saxpy(n, sa, sx, incx, sy, incy)
SAXPY
Definition saxpy.f:89
#define min(a, b)
Definition macros.h:20
#define max(a, b)
Definition macros.h:21