Matrix Tricks for Linear Statistical Models:
Our Personal Top Twenty
by Simo Puntanen, George P.H. Styan, and Jarkko Isotalo

Figures of TRIX

In addition to photos and stamps, TRIX includes dozens of figures. The figures for scatter plots were prepared using Survo software and the other figures using PSTricks.

On this page we show a few examples of the scatter plots and their work schemes that apply the PLOT operation (as well as other operations, like the matrix interpreter and statistical operations) of Survo with various specifications. The resulting graphs (as PDF files) are ready for publication as such, i.e., no further tuning whatsoever is needed.

Similar work schemes may be used in Muste, which is an open-source implementation of Survo in a form of an R package.


[picture]

**************** Figure 0.9 is created in Survo as follows: ********************
*
*/ACTIVATE +  /  Activating this activates all lines below marked with '+'
*
*Let us generate 1000 observations from two-dimensional
*normal distribution, where the correlation matrix is
*
*MATRIX R
*/// x    y
*x   1    0.7
*y   0.7  1
*
*while the means and standard deviations are as follows:
*
*MATRIX A
*///  mean stddev
*x      0     5
*y      0     4
*
+MAT SAVE R / save the matrices defined above
+MAT SAVE A
*
*Create a Survo data file TX1000 of 1000 observations:
*
+MNSIMUL R,A,TX1000,1000,0 /  RND=rand(2009)
*
*Some *GLOBAL* specifications (shared by the PLOTs below):
*
* *pen=[move(0,0)][Swiss(8)] PEN=*pen LINETYPE=*pen
* FRAME=6 HEADER= XDIV=0,1,0 YDIV=0,1,0 SIZE=630,630
* POINT=_ YLABEL= XLABEL= FRAME=0 SCALE=-18,17
*
*...................................................................
*
+PLOT TX1000,x,y / DEVICE=PS,k1.ps  POINT=3,4 FRAME=6
*
* Fine-tuning the scale notations, labels and locations:
* YSCALE=[move(10,000)],-18:?,-15(5)-5,0:_0,5:_5,10:_10,15:_15,17:?
* XSCALE=[move(-14,-8)],-18:?,-15(5)-5,0:_0,5:_5,10:_10,15:_15,17:?
* YLABEL=[move(-5,0)],y
* XLABEL=[move(+30,+83)],x
*
*...................................................................
+PLOT TX1000,x,y / DEVICE=PS,k2.ps
* CONTOUR=[BLUE][line_width(0.30)],0.8,0.95,0.99
*   TREND=[RED][line_width(2.000)],0
*...................................................................
+PLOT TX1000,x,y / DEVICE=PS,k3.ps
* CONTOUR=[GREEN][line_width(1.0)],0
*...................................................................
+PLOT TX1000,x,y / DEVICE=PS,k4.ps
*   TREND=[CYAN][line_width(0.30)],X
*...................................................................
*
*Combine the PostScript files and create the PDF file:
+EPS JOIN fig09.ps k1 k2 k3 k4
+/GS-PDF  fig09.ps / (opens automatically with the default reader)

[picture]
Note: Figure 8.1 (referred to in above text) is much like Figure 0.9, but with standardized variables.
**************** Figure 8.2 is created in Survo as follows: ********************
*
*Let us generate 1000 observations from two-dimensional
*normal distribution, where the correlation matrix is I
*and the expectation vector is 0:
*
*/ACTIVATE +
*
+FILE CREATE NORA1
*Sample of 1000 observations from a bivariate normal distribution
*with E(X)=E(Y)=0 S(X)=S(Y)=1 kor(X,Y)=0.7 (Mustonen 1992, p.265)
*FIELDS:
*1 N 4 X
*2 N 4 Y
*3 N 4 YHAT
*END
*
+FILE INIT NORA1,1000
+VAR X,Y TO NORA1  /   roo=0.7
*           U=probit(rand(1009))   V=probit(rand(1009))
*           X=U                    Y=roo*U+sqrt(1-roo*roo)*V
*
+VAR VAKIO=0 TO NORA1 / explicit constant (needed in plots below)
*
*In the following, we use so called 'fence lines', where the command
*(here: LINREG) always makes space for its results in the edit field:
*(by default it would overwrite anything starting from line CUR+2)
*
+#LINREG NORA1,CUR+2  /  VARS=X(X),Y(Y),YHAT(P)
*
*Means, std.devs and correlations of NORA1  N=1000
*Variable  Mean        Std.dev.
*X         0.005202    1.006446
*Y        -0.009663    0.982219
*Correlations:
*             X       Y
* X            1.0000  0.7103
* Y            0.7103  1.0000
*
*Linear regression analysis: Data NORA1, Regressand Y         N=1000
*Variable Regr.coeff.    Std.dev.    t     beta
*X         0.693215       0.021745  31.88  0.710
*constant -0.013269       0.021874 -0.607
*Variance of regressand Y=0.964754677 df=999
*Residual variance=0.478471749 df=998
*R=0.7103 R^2=0.5045
*###################################################################
*
*Some *GLOBAL* specifications (shared by the PLOTs below):
*
* *pen=[move(0,0)][Swiss(8)] PEN=*pen LINETYPE=*pen FRAME=6
* HEADER= SIZE=450,450 XDIV=0,1,0 YDIV=0,1,0 POINT=0,1
* XLABEL= YLABEL=
*
*...................................................................
*
+PLOT NORA1,X,Y / DEVICE=PS,k1.ps TEXT=T4 T4=[black],(a),-40,-40
*   LINE=6 LINE2=[RED][line_width(0.30)][line_type(0)],X,VAKIO
*
* Fine-tuning the scale notations, labels and locations:
* YSCALE=[move(10,000)],-4:?,-3(1)-1,0:_0,1:_1,2:_2,3:_3,4:?
* XSCALE=[move(-14,-8)],-4:?,-3(1)-1,0:_0,1:_1,2:_2,3:_3,4:?
* YLABEL=[move(-5,0)],y
* XLABEL=[move(+30,+83)],x
*
*...................................................................
+PLOT Y(X)=0 /    DEVICE=PS,k2.ps X=[line_width(0.72)],-4,4
* SCALE=-4,4 FRAME=0
*...................................................................
*
+PLOT NORA1,X,Y / DEVICE=PS,k3.ps TEXT=T5 T5=(),(b),-40,-40
* LINE=6 LINE2=[BLUE][line_width(0.3)][line_type(0)],X,YHAT
*
* YSCALE=[move(10,000)],-4:?,-3(1)-1,0:_0,1:_1,2:_2,3:_3,4:?
* XSCALE=[move(-14,-8)],-4:?,-3(1)-1,0:_0,1:_1,2:_2,3:_3,4:?
* YLABEL=[move(-5,0)],y
* XLABEL=[move(+30,+83)],x                        TICK=5,5
*
*  TREND=[BLACK][line_type(0)][line_width(1)],0
*CONTOUR=[BLACK][line_type(0)][line_width(1)],0.99
*
*...................................................................
+EPS JOIN fig82a.ps k1 k2
+EPS JOIN fig82b.ps k3 k3
*
+/GS-PDF fig82a.ps
+/GS-PDF fig82b.ps

Reference:
Mustonen, S. (1992). Survo - An Integrated Environment for Statistical Computing and Related Areas.
Survo Systems, Helsinki. (494 pp.)

[picture]

[picture]

**************** Figures 19.2 and 19.3 are created in Survo as follows: ********
*
*Here are the twelve observations we are working with:
*
*DATA TWELVE
*nro   x y
*1     1 2
*2     1 3
*3     1 4
*4     2 1
*5     2 3
*6     2 4
*7     3 1
*8     3 2
*9     3 4
*10    4 1
*11    4 2
*12    4 3
*
*We make certain computations and calculations with matrices and scalars
*in the edit field. After some preparations we plot the three figures.
*
*CORR TWELVE / MASK=-AA  creates two matrices: CORR.M and MSN.M
*
*MAT R!=CORR.M       / *R~R(TWELVE) S2*2
*MAT STDEV!=MSN.M(*,2)
*MAT S!=DV(STDEV)*R*DV(STDEV) / covariance matrix
*MAT LOAD S
*MATRIX S
*///             x        y
*x         1.36364 -0.45455
*y        -0.45455  1.36364
*
*MAT LOAD R
*MATRIX R
*///             x        y
*x         1.00000 -0.33333
*y        -0.33333  1.00000
*
*MAT SPECTRAL DECOMPOSITION OF S TO T,L
*MAT LOAD T
*MATRIX T
*S(S)
*///           ev1      ev2
*x         0.70711  0.70711
*y        -0.70711  0.70711
*
*MAT LOAD L
*MATRIX L
*L(S)
*///      eigenval
*ev1      1.818182
*ev2      0.909091
*
*................................
*
*FILE COPY TWELVE TO NEW TWELVEF / create a data file
*
*VAR XC=x-2.5 TO TWELVEF / center x & y
*VAR YC=y-2.5 TO TWELVEF / (subtract the means)
*
*FILE LOAD  TWELVEF
*DATA TWELVEF*,A,B,C
C nro x y          XC          YC
A   1 1 2      -1.500      -0.500
*   2 1 3      -1.500       0.500
*   3 1 4      -1.500       1.500
*   4 2 1      -0.500      -1.500
*   5 2 3      -0.500       0.500
*   6 2 4      -0.500       1.500
*   7 3 1       0.500      -1.500
*   8 3 2       0.500      -0.500
*   9 3 4       0.500       1.500
*  10 4 1       1.500      -1.500
*  11 4 2       1.500      -0.500
B  12 4 3       1.500       0.500
*
*........................................
*
*MAT SAVE DATA TWELVEF TO UC / MASK=---AA
*MAT SAVE DATA TWELVEF TO U  / MASK=-AA--
*
*MAT LOAD UC
*MATRIX UC
*///            XC       YC
*  1      -1.50000 -0.50000
*  2      -1.50000  0.50000
*  3      -1.50000  1.50000
*  4      -0.50000 -1.50000
*  5      -0.50000  0.50000
*  6      -0.50000  1.50000
*  7       0.50000 -1.50000
*  8       0.50000 -0.50000
*  9       0.50000  1.50000
* 10       1.50000 -1.50000
* 11       1.50000 -0.50000
* 12       1.50000  0.50000
*
*MAT LOAD U
*MATRIX U
*///             x        y
*  1             1        2
*  2             1        3
*  3             1        4
*  4             2        1
*  5             2        3
*  6             2        4
*  7             3        1
*  8             3        2
*  9             3        4
* 10             4        1
* 11             4        2
* 12             4        3
*
*.......................................................................
*
*LINREG TWELVE,CUR+2  /  VARS=x(X),y(Y)
*
*Means, std.devs and correlations of TWELVE  N=12
*Variable  Mean        Std.dev.
*x         2.500000    1.167748
*y         2.500000    1.167748
*Correlations:
*             x       y
* x            1.0000 -0.3333
* y           -0.3333  1.0000
*
*Linear regression analysis: Data TWELVE, Regressand y         N=12
*Variable Regr.coeff.    Std.dev.    t     beta
*x        -0.333333       0.298142 -1.118 -0.333
*constant  3.333333       0.816497  4.082
*Variance of regressand y=1.363636364 df=11
*Residual variance=1.333333333 df=10
*R=0.3333 R^2=0.1111
*.......................................................................
*   sx=15/11 =1.3636363636364
*
*/RANEW / (a special sucro [Survo macro] programmed by the first author)
*..............
*  Modifying regression output based on LINREG
*
*
*  REGRESSION ANALYSIS
*             Data: TWELVE   n=12   k=1
*       Y-variable: y
*      X-variables: x
*
* SS                  df       MS=SS/df               F=MSR/MSE
* SSR=1.666666674     dfr=1    MSR=1.666666674        F=1.2500000058125
* SSE=13.33333333     dfe=10   MSE=1.333333333
* SST=15.000000004    dft=11   MST=1.363636364
*   R^2=0.111111  R=0.333333                  P-value for F: Fp=0.28969
* 1-R^2=0.888888
*
*Variable  Regr.coeff.       Std.dev.    t-value  P-value  Stcoef
*const.    b0=3.333333   se0=0.816497   t0=4.082   0.0022
*x        b1=-0.333333   se1=0.298142  t1=-1.118   0.2897  -0.333
*
*.......................................................................
*
*MAT UROTA!=UC*T  /  rotate U by T (the eigenvectors) created above
*
*MAT LOAD UROTA
*MATRIX UROTA
*///           ev1      ev2
*  1      -0.70711 -1.41421
*  2      -1.41421 -0.70711
*  3      -2.12132  0.00000
*  4       0.70711 -1.41421
*  5      -0.70711  0.00000
*  6      -1.41421  0.70711
*  7       1.41421 -0.70711
*  8       0.70711 -0.00000
*  9      -0.70711  1.41421
* 10       2.12132 -0.00000
* 11       1.41421  0.70711
* 12       0.70711  1.41421
*
*MAT TXUU=ZER(12,4)  /  combine the matrices
*MAT TXUU(1,1)=UC
*MAT TXUU(1,3)=UROTA
*MAT LOAD TXUU
*MATRIX TXUU
*0&UC&UROTA
*///            XC       YC      ev1      ev2
*  1      -1.50000 -0.50000 -0.70711 -1.41421
*  2      -1.50000  0.50000 -1.41421 -0.70711
*  3      -1.50000  1.50000 -2.12132  0.00000
*  4      -0.50000 -1.50000  0.70711 -1.41421
*  5      -0.50000  0.50000 -0.70711  0.00000
*  6      -0.50000  1.50000 -1.41421  0.70711
*  7       0.50000 -1.50000  1.41421 -0.70711
*  8       0.50000 -0.50000  0.70711 -0.00000
*  9       0.50000  1.50000 -0.70711  1.41421
* 10       1.50000 -1.50000  2.12132 -0.00000
* 11       1.50000 -0.50000  1.41421  0.70711
* 12       1.50000  0.50000  0.70711  1.41421
*
*MAT TXUU(0,1)=x   /  change the labels
*MAT TXUU(0,2)=y
*MAT TXUU(0,3)=w
*MAT TXUU(0,4)=z
*
*MAT LOAD TXUU
*MATRIX TXUU
*0&UC&UROTA
*///             x        y        w        z
*  1      -1.50000 -0.50000 -0.70711 -1.41421
*  2      -1.50000  0.50000 -1.41421 -0.70711
*  3      -1.50000  1.50000 -2.12132  0.00000
*  4      -0.50000 -1.50000  0.70711 -1.41421
*  5      -0.50000  0.50000 -0.70711  0.00000
*  6      -0.50000  1.50000 -1.41421  0.70711
*  7       0.50000 -1.50000  1.41421 -0.70711
*  8       0.50000 -0.50000  0.70711 -0.00000
*  9       0.50000  1.50000 -0.70711  1.41421
* 10       1.50000 -1.50000  2.12132 -0.00000
* 11       1.50000 -0.50000  1.41421  0.70711
* 12       1.50000  0.50000  0.70711  1.41421
*
*MAT T1!=T(*,1)
*MAT PP!=TXUU(*,1:2)*T1*T1'
*MAT LOAD PP
*MATRIX PP
*///             x        y
*  1      -0.50000  0.50000
*  2      -1.00000  1.00000
*  3      -1.50000  1.50000
*  4       0.50000 -0.50000
*  5      -0.50000  0.50000
*  6      -1.00000  1.00000
*  7       1.00000 -1.00000
*  8       0.50000 -0.50000
*  9      -0.50000  0.50000
* 10       1.50000 -1.50000
* 11       1.00000 -1.00000
* 12       0.50000 -0.50000
*
*.......................................................................
*
*Add some extra points so that connecting the points with LINE=1
*will create exactly the lines that are needed in Figure 19.3(b):
*
*MATRIX TXAA
*///                  x        y
*  1           -1.50000 -0.50000
*  2           -0.50000  0.50000
*  3           -1.00000  1.00000
*  4           -1.50000  0.50000
*  5           -1.00000  1.00000
*  6           -1.50000  1.50000
*  7           -1.50000  1.50000
*  8           -1.50000  1.50000
*  9            0.50000 -0.50000
*  0           -0.50000 -1.50000
*  1            0.50000 -0.50000
*  2           -0.50000  0.50000
*  3           -0.50000  0.50000
*  4           -0.50000  0.50000
*  5           -1.00000  1.00000
*  6           -0.50000  1.50000
*  7           -1.00000  1.00000
*  8            1.00000 -1.00000
*  9            0.50000 -1.50000
*  0            1.00000 -1.00000
*  1            0.50000 -0.50000
*  2            0.50000 -0.50000
*  3            0.50000 -0.50000
*  4           -0.50000  0.50000
*  5            0.50000  1.50000
*  6           -0.50000  0.50000
*  7            1.50000 -1.50000
*  8            1.50000 -1.50000
*  9            1.50000 -1.50000
*  0            1.00000 -1.00000
*  0            1.50000 -0.50000
*  1            1.00000 -1.00000
*  2            0.50000 -0.50000
*  3            1.50000  0.50000
*
*MAT SAVE TXAA
*
*...................................
*
*SORT z,t,cur+1 / (just sorting the lines when building the above matrix)
*  11111111
z  -1.50000 -0.50000
*  -1.50000  0.50000
*  -1.50000  1.50000
*  -1.50000  1.50000
*  -1.00000  1.00000
*  -1.00000  1.00000
*  -0.50000  0.50000
*  -0.50000 -1.50000
*  -0.50000  0.50000
*  -0.50000  0.50000
*  -0.50000  1.50000
*  -0.50000  0.50000
*   0.50000 -0.50000
*   0.50000 -1.50000
*   0.50000 -0.50000
*   0.50000 -0.50000
*   0.50000  1.50000
*   0.50000 -0.50000
*   1.00000 -1.00000
*   1.00000 -1.00000
*   1.50000 -1.50000
*   1.50000 -1.50000
*   1.50000 -0.50000
t   1.50000  0.50000
*
*.......................................................................
*
*                                          Now, let's begin plotting!
*/ACTIVATE +
+ PLOT TXUU.MAT,x,y / DEVICE=PS,k1.ps
*  CONTOUR=*Contour
*    TREND=*Trend
*   YLABEL=*ylabel
*   XLABEL=*xlabel
*
a   YSCALE=[move(10,0000)],-3:?,-2,-1,0:_0,1:_1,2:_2,3:?
*   XSCALE=[move(-14,-08)],-3:?,-2,-1,0:_0,1:_1,2:_2,3:?
*  *ylabel=[move(-5,0000)],y *zlabel=[move(-5,0000)],z
*  *xlabel=[move(+30,+83)],x *wlabel=[move(+30,+83)],w
*    FRAME=6
*       Ta=[BLACK],(a),-50,-40
*       Tb=[BLACK],(b),-50,-40
*
b *Contour=[RED][line_width(0.30)],0.8
*  *PrAxes=[GREEN][line_width(0.30)],0
*   *Trend=[CYAN][line_width(2.00)],0
*     *pen=[move(0,0)][Swiss(8)]
*      PEN=*pen
* LINETYPE=*pen      The specifications on lines a-b-c-d
*   HEADER=          may be accessed from elsewhere by the
*    POINT=3,8       new SPECS specification (ver.3.26+)
*     SIZE=450,450   They have a high precedence (only the
*     XDIV=0,1,0     activated (CUR) line precedes them).
c     YDIV=0,1,0     (cf. "GLOBAL" technique used above)
*
*   YSCALE=-3,3
*   XSCALE=-3,3
*   YLABEL=
*   XLABEL=
d    FRAME=0
*..................
+ PLOT TXUU.MAT,x,y / DEVICE=PS,k2.ps SPECS=b,d
*  CONTOUR=*PrAxes
*
+ EPS JOIN fig192.ps k1 k2
+ /GS-PDF  fig192.ps
*.......................................................................
*/ACTIVATE #
# PLOT TXUU.MAT,w,z / DEVICE=PS,k1.ps SPECS=a,c
*  CONTOUR=*Contour
*    TREND=*Trend
*   YLABEL=*zlabel
*   XLABEL=*wlabel
*     TEXT=Ta
*..................
# PLOT TXUU.MAT,w,z / DEVICE=PS,k2.ps SPECS=b,d
*  CONTOUR=*PrAxes
*
# EPS JOIN fig193a.ps k1 k2
# /GS-PDF  fig193a.ps
*.......................................................................
*/ACTIVATE $
$ PLOT TXAA.MAT,x,y / DEVICE=PS,k1.ps SPECS=a,c POINT=3,1
*   YLABEL=*ylabel
*   XLABEL=*xlabel
*     LINE=1
*     TEXT=Tb
*..................
$ PLOT TXUU.MAT,x,y / DEVICE=PS,k2.ps SPECS=b,d POINT=_
*  CONTOUR=*Contour
*..................
$ PLOT TXUU.MAT,x,y / DEVICE=PS,k3.ps SPECS=b,d
*  CONTOUR=*PrAxes
*
$ EPS JOIN fig193b.ps k1 k2 k3
$ /GS-PDF  fig193b.ps
*.......................................................................

The original Survo edit fields were created by Simo. The work schemes were fine-tuned twice by Kimmo Vehkalahti: first for the publication, and then for these examples. The last example was build merely for demonstrating the possibilities of the new SPECS specification. It is quite different from the original edit fields, but shows how the user of Survo (or Muste) may set his/her work schemes very freely, and still obtain exactly the same results as before.



Matrix Tricks for Linear Statistical Models: Our Personal Top Twenty  |  Site updated: 2012-01-03