This document outlines the basic features and syntax of Mx. It is an abbreviated version of the Mx manual.
This HTML version was created by Wayne Hadady from a hardcopy edition. This HTML version is now and will likely remain more up-to-date than the hardcopy version.
An HTML version of the main manual is in preparation.
Mx is public domain. You may download it from griffin.vcu.edu, which has executables for most platforms along with documentation and examples. The examples have been extended to include all the appendices from the Neale & Cardon book.
There is world-wide-web access to Mx:
http://griffin.vcu.edu/html/mx/mxhomepage.html is the Mx home page. With a suitable browser, you can obtain the program, documentation and examples, send comments, and see the latest developments. Email bug reports, requests for further information, and, most importantly, your comments and suggestions for improvements to neale@ruby.vcu.edu.
Mx development is supported by NIH grant RR08123. We are very grateful to all those who made comments and suggestions for improving the program. Please keep the comments coming!
Please refer to the Mx manual as:
Neale, M.C. (1994). Mx: Statistical Modeling. Box 710 MCV, Richmond, VA 23298: Department of Psychiatry. 2nd edition.
For technical support, please use one of the following methods of contact,
listed in order of preference:
Currently Mx is available for MSDOS/WINDOWS 3.1 (or higher) on IBM-compatible 386 PCs and above, several UNIX systems (e.g. Vax BSD4.3, Sun-OS, IBM (RS-6000) AIX, Dec Ultrix, OSF/1, HP), and VAX VMS.
We recommend that input files have the naming convention cutename.mx where cutename is a name of your choice. To run Mx on a PC, create an input script and type:
mx cutename.mx {cutename.mxo}
if you are running DOS. If you use WINDOWS, you can either associate files with the .mx extension with the mx.exe file and double click the input file, or launch the input file on the Mx icon. Feedback of function evaluations is printed on the screen and an output file cutename.mxo is automatically created.
Mx may be used very simply in UNIX by typing:
mx <inputfile >outputfile &
where the trailing & runs the job in the background, i.e. as a batch job. Other command-line interfaces may be available. Refer to the notes in the distribution for details.
Mx can be run either interactively or as a batch job under VMS with:
$ mx cutename
With this syntax, the output will be in a file called cutename.mxo
Mx reads three things: keywords, parameters and numbers. Keywords must start on a new line. Parameters pass information to commands, e.g. filenames or numbers. Comments can be put anywhere following !. Characters after column 1200 and blank lines are ignored. The syntax described for commands follows these conventions:
alternatives are represented by /
optional parameters or keywords are enclosed by { and }
items to be substituted according to the specific application are enclosed by < and >
Mx has been written for multiple groups, since genetically informative data generally comprise information on different types of relatives which form distinct groups. Several Mx programs can be run in one file. Minimum specifications per group are capitalized:
In multivariate modeling it is quite common that the same matrix dimensions are used in many different parts of a script. #define statements can minimize the number of changes required.
After declaring matrices, the user is free to enter values and parameter specifications for the matrices prior to giving a matrix formula for covariance structure or means.
In many cases breaking up a complicated matrix algebra expression into smaller parts can improve readability or efficiency or both. The Begin Algebra ... End Algebra section declares new matrices as matrix functions. Each matrix that appears on the left hand side of the = sign is newly defined in this group (it must not have been previously defined).
The title line is purely for the user's reference; it is printed when Mx prints the parameter specifications and the parameter estimates for a group. The title is recognized by its location rather than by a keyword at the start of a line.
The group-type line specifies the type of group: Data for a data group, Calculation for a calculation group, or Constraint for a constraint group. The first group must specify the number of groups, NG. Other parameters are the number of input variables, NI, and the number of observations, NO.
Several different commands are used to read summary statistics from the input script or, if File= appears, from an external file. Covariance matrices (CM) and correlation matrices (KM, or PM for matrices of polyserial or polychoric correlations) are by default symmetric. Full indicates that a full matrix is supplied.
To use asymptotic or diagonally weighted least squares, it is necessary to read a weight matrix. Mx expects a PRELIS output type matrix of asymptotic covariances (AC) or asymptotic variances (AV). The inverse of the asymptotic matrix (AI) could be used also.
Two types of raw data can be read. A rectangular matrix of balanced raw data (RE) uses . (dots) as missing value indicators. Unbalanced data with many missing values may be read with a variable length file (VL). A variable length file contains, per vector, the number of input variables k (on a separate line), followed by the identification codes and the observed data for those k variables.
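The variable length layout described above can be illustrated with a small Python sketch. This is not part of Mx: the function name parse_vl is hypothetical, and the sketch assumes the file is free-format (whitespace-separated numbers), so it does not depend on exactly where the line breaks fall. The sample records are the first few lines of unbalanced.raw listed with the last example in this document.

```python
def parse_vl(text):
    """Parse variable-length records: k, then k variable codes, then k values.

    Returns one dict per data vector, mapping variable code -> observed value.
    """
    tokens = text.split()
    records = []
    pos = 0
    while pos < len(tokens):
        k = int(tokens[pos])                       # number of variables in this vector
        codes = [int(t) for t in tokens[pos + 1 : pos + 1 + k]]
        values = [float(t) for t in tokens[pos + 1 + k : pos + 1 + 2 * k]]
        if len(codes) != k or len(values) != k:
            raise ValueError("truncated record at token %d" % pos)
        records.append(dict(zip(codes, values)))
        pos += 1 + 2 * k                           # jump to the next record
    return records


sample = """2
1 2
0.5550 -1.1114
3
1 2 3
1.6442 -0.1728 3.69
2
1 3
-0.2145 5.01
"""

records = parse_vl(sample)
for rec in records:
    print(rec)
```

Each vector keeps only the variables that were actually observed, which is why no missing value indicator is needed in this format.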
This feature allows new types of analysis with VL or rectangular data files. Essentially, some of the variables may be assigned as definition variables which may subsequently be used as case-specific parameters of the model. Mx uses the Definition keyword to identify variables that are to be used in the model; they are extracted from the data set so modeling is restricted to the other variables. Definition variables are referenced as -1, -2 etc. in Specification statements. A new covariance matrix is computed for each case, and then the usual raw data log-likelihood function is computed.
Frequency data are read in a two-way contingency table (CT), which requires the number of rows and columns on the same line. Mx will automatically handle incomplete ascertainment (when flagged by a negative number for cells that are not ascertained).
In the data-reading section (before the Matrices command) the Means keyword reads a vector of observed means.
Labels may be read for the input variables, and either labels or variable numbers may be used to select variables. A / must end the Select command. After matrices have been declared, either within the Matrices section or in an Algebra section, labels may be provided for the rows or columns of defined matrices.
The ICodes command may be used with the RM fit function to specify a non-standard structure of the expected covariance matrix.
Matrices are defined by their name, type and dimensions (rows & columns) for use in current or subsequent groups. Possible types, structure, shape and number of usable elements are given.
_______________________________________________________
Type    Structure         Shape    # of Usable Elements
-------------------------------------------------------
Zero    Null              Any      0
Unit    Unit              Any      0
Iden    Identity          Square   0
IZero   Identity|Zero     Any      0
ZIden   Zero|Identity     Any      0
Diag    Diagonal          Square   r
SDiag   Subdiagonal       Square   r(r-1)/2
Stand   Standardized      Square   r(r-1)/2
Symm    Symmetric         Square   r(r+1)/2
Lower   Lower triangular  Square   r(r+1)/2
Full    Full              Any      rxc
_______________________________________________________
r is the number of rows and c the number of columns of the matrix.
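As a quick check on the counts in the table above, the following Python sketch returns the number of usable elements for a given matrix type. The function name usable_elements is hypothetical and the code is only an illustration of the table, not part of Mx.

```python
def usable_elements(mtype, r, c):
    """Number of usable elements of an r x c matrix of the given Mx type."""
    square_only = {"Iden", "Diag", "SDiag", "Stand", "Symm", "Lower"}
    if mtype in square_only and r != c:
        raise ValueError(mtype + " matrices must be square")
    counts = {
        "Zero": 0,
        "Unit": 0,
        "Iden": 0,
        "IZero": 0,
        "ZIden": 0,
        "Diag": r,                   # diagonal only
        "SDiag": r * (r - 1) // 2,   # below the diagonal
        "Stand": r * (r - 1) // 2,   # off-diagonal; the diagonal is fixed
        "Symm": r * (r + 1) // 2,    # lower triangle including the diagonal
        "Lower": r * (r + 1) // 2,
        "Full": r * c,
    }
    return counts[mtype]


print(usable_elements("Symm", 3, 3))   # 6
print(usable_elements("Stand", 6, 6))  # 15
print(usable_elements("Full", 2, 6))   # 12
```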
The number of usable elements indicates how many elements should be supplied. All usable elements of matrices are initialized at zero and are fixed parameters, unless the Free keyword is used, in which case each usable element is specified as a different free parameter.
________________________________________________
Symbol  Matrix Quantity             Dimensions
------------------------------------------------
%On     Observed covariance/data    NIn x NIn
%En     Expected covariance matrix  NIn x NIn
%Mn     Expected mean vector        1 x NIn
%Pn     Expected proportions        NRn x NCn
%Fn     Function value              1 x 1
________________________________________________
NIn: number of input variables in group n following any selection
NR and NC: number of rows and columns in a contingency table
Special codes exist for constraining a matrix to equal one previously computed or defined in group n. Note that none of the equalities may refer to groups that appear after the current group.
When matrices are declared with the Matrices command, a special type, computed, may be used to equate to a matrix which was defined within the Algebra section of a previous group. Row and column dimensions are set to those of the previously calculated matrix, and may be omitted when declaring a matrix as computed.
This command equates all matrices in a group to those of a previous group.
___________________________________________________________________________
Keyword    Function                   Restrictions  Result Size
---------------------------------------------------------------------------
\tr()      Trace                      r=c           1x1
\det()     Determinant                r=c           1x1
\sum()     Sum                        None          1x1
\prod()    Product                    None          1x1
\max()     Maximum                    None          1x1
\min()     Minimum                    None          1x1
\abs()     Absolute value             None          rxc
\cos()     Cosine                     None          rxc
\cosh()    Hyperbolic cosine          None          rxc
\sin()     Sin                        None          rxc
\sinh()    Hyperbolic sin             None          rxc
\tan()     Tan                        None          rxc
\tanh()    Hyperbolic tan             None          rxc
\exp()     Exponent (e**A)            None          rxc
\ln()      Natural logarithm          None          rxc
\sqrt()    Square root                None          rxc
\d2v()     Diagonal to Vector         None          min(r,c)x1
\v2d()     Vector to Diagonal         r=1 or c=1    max(r,c)xmax(r,c)
\m2v()     Matrix to Vector           None          rcx1
\vec()     Matrix to Vector*          None          rcx1
\vech()    Lower triangle to Vector   None          rcx1
\stnd()    Standardize matrix         r=c           rxc
\eval()    Real eigenvalues           r=c           rxc
\evec()    Real eigenvectors          r=c           rxr
\ival()    Imaginary eigenvalues      r=c           rx1
\ivec()    Imaginary eigenvectors     r=c           rxr
\mean()    Mean of columns            None          1xc
\cov()     Covariance of columns      None          cxc
\mnor()    Multivariate normal integral  r=c+3      1x1
\momnor()  Moments of multiv normal   r=c+3         1x1
\aorder()  Ascending sort order       rx1           rx1
\dorder()  Descending sort order      rx1           rx1
\sortr()   Row sort                   None          rxmax(1,c-1)
\sortc()   Column sort                None          max(1,r-1)xc
\part()    Extract part of matrix**   None          Variable
___________________________________________________________________________
*vec: vectorizes by columns, whereas m2v vectorizes by rows.
**part: part(A,B) takes two arguments. Elements of the 1x4 matrix B define a rectangle in A to be extracted.
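The difference between \vec and \m2v noted above is easiest to see on a small matrix. The Python sketch below mimics the two orderings with plain lists; the function names simply echo the Mx keywords and the code is an illustration, not the Mx implementation.

```python
def vec(m):
    """Stack the columns of m into one long vector (column by column)."""
    rows, cols = len(m), len(m[0])
    return [m[i][j] for j in range(cols) for i in range(rows)]


def m2v(m):
    """Stack the rows of m into one long vector (row by row)."""
    return [x for row in m for x in row]


A = [[1, 2, 3],
     [4, 5, 6]]

print(vec(A))  # [1, 4, 2, 5, 3, 6]
print(m2v(A))  # [1, 2, 3, 4, 5, 6]
```

Both results contain rc = 6 elements; only the order differs.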
Matrix functions are defined by a keyword starting with a backslash and followed by an argument enclosed in parentheses. This argument may be a single matrix name or a complex matrix formula. The expression within parentheses will be evaluated prior to the function evaluation, e.g., \tr(A*B)
_____________________________________________
Symbol  Function             Example  Priority
---------------------------------------------
~       Inversion            A~       1
'       Transposition        A'       1
^       Element powering     A^B      2
*       Multiplication       A*B      3
.       Dot product          A.B      3
@       Kronecker product    A@B      3
&       Quadratic product    A&B      3
%       Element division     A%B      3
+       Addition             A+B      4
-       Subtraction          A-B      4
|       Horizontal adhesion  A|B      4
_       Vertical adhesion    A_B      4
_____________________________________________
Unary and binary matrix operators perform operations on or between matrices declared by the Matrices command. Operations with a lower priority number are evaluated first; operations of equal priority are carried out from left to right. Parentheses may be used to change the order of evaluation.
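Some of the less familiar operators can be sketched in a few lines of Python. This is an illustration with plain lists, not the Mx implementation. It assumes that the quadratic product A&B means A*B*A', and the numerical values of A, E and H are hypothetical. The last lines build a twin covariance matrix of the form (A+E | H@A _ H@A | A+E), with the grouping of the adhesion operators written out explicitly as nested function calls.

```python
def add(a, b):
    """A+B: element-wise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """A*B: ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """A': rows become columns."""
    return [list(col) for col in zip(*a)]


def kronecker(a, b):
    """A@B: every element of a multiplies a full copy of b."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def quadratic(a, b):
    """A&B, taken here to mean A*B*A' (an assumption of this sketch)."""
    return matmul(matmul(a, b), transpose(a))


def hadhere(a, b):
    """A|B: b is placed to the right of a (same number of rows)."""
    return [ra + rb for ra, rb in zip(a, b)]


def vadhere(a, b):
    """A_B: b is placed underneath a (same number of columns)."""
    return a + b


# Hypothetical 2x2 additive genetic (A) and environmental (E) matrices
A = [[0.5, 0.25], [0.25, 0.5]]
E = [[0.5, 0.0], [0.0, 0.5]]
H = [[0.5]]

within = add(A, E)        # A+E
between = kronecker(H, A)  # H@A
dz_cov = vadhere(hadhere(within, between), hadhere(between, within))
for row in dz_cov:
    print(row)
```

The result is a 4x4 matrix whose diagonal blocks are A+E and whose off-diagonal blocks are half of A.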
The Covariance command may be formed with any syntactically correct combination of matrices (specified by Matrices command), operators and functions. The command may extend over several lines and must end in a /. Compute is the recommended keyword for calculation groups, to make reading scripts easier for humans.
A matrix formula for the means may be supplied after the Matrices command. Means will be used in covariance analysis if both a means model and observed means are supplied. Raw data analysis requires a model for means.
Thresholds can be used only when fitting to contingency table data. Special restrictions apply to the dimensions of the matrix calculated in the threshold command. The resulting matrix must have 2 rows and at least d columns where d=max ((r-1),(c-1)) and r and c are the rows and columns of the contingency table. The elements are the predicted row and column thresholds.
The number of values in the list must equal the number of usable elements of that matrix.
The TO keyword operates differently for Start and Value; otherwise the commands are synonymous.
When a matrix is declared in the Matrices section, the Free keyword sets all usable elements free.
Specify is a convenient method of defining constraints between parameters. If two elements are given the same value, then the same free parameter is assigned to both elements. A zero indicates that the element does not have a free parameter.
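The way a specification ties matrix elements to free parameters can be sketched in Python. The specification below is the one used for matrix A in the PACE example later in this document; the function name apply_specification and the parameter values are hypothetical. The sketch ignores negative codes (definition variables) and simply keeps the fixed value wherever the specification is not a positive number.

```python
def apply_specification(spec, fixed_values, estimates):
    """Fill a matrix from a specification.

    spec number n > 0 -> the element takes the value of free parameter n
    spec number 0     -> the element keeps its fixed value
    """
    return [[estimates[n] if n > 0 else fixed
             for n, fixed in zip(spec_row, fixed_row)]
            for spec_row, fixed_row in zip(spec, fixed_values)]


spec_A = [[1, 2, 3, 0, 0, 0],
          [0, 0, 0, 1, 2, 3]]
fixed = [[0.0] * 6 for _ in range(2)]
estimates = {1: 0.6, 2: 0.3, 3: 0.7}   # hypothetical parameter values

A = apply_specification(spec_A, fixed, estimates)
for row in A:
    print(row)
```

Because parameters 1, 2 and 3 each appear twice, the two rows are constrained to share the same three values, so the matrix has 12 usable elements but only 3 free parameters.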
The pattern command requests a different free parameter for every element with a 1. A zero fixes the corresponding matrix element.
Matrix elements are referred to by group, row and column. Fix makes a parameter fixed (if it was free before) and Free makes an element a free parameter to be estimated. Equate passes the value and the parameter specification of the first matrix element in the list to the remaining elements in the list. Elements of matrices in the current group may be specified with 2 subscripts; elements in previous groups must be specified with 3 subscripts. For large models with many constraints it may be more convenient to use the Specification command. See also the Pattern command.
Free parameters in the list will be bounded to lie between low and high. Negative parameter numbers will bound non-linear constraints.
The Options command takes a wide variety of keywords to control the use of non-default fit functions, the amount of statistical output, optimization parameters, filenames for result matrices, etc.
The default fit function for a group is set according to the type of data that are read. Note that the method may change between groups.
___________________________________________________
Input data                             Default fit function
---------------------------------------------------
CMatrix, KMatrix or PMatrix            ML
CMatrix, KMatrix or PMatrix with Acov  AWLS
CMatrix, KMatrix or PMatrix with Avar  DWLS
Rawdata, Vlength or Rectangular        RM
Ctable                                 MLn
___________________________________________________
ML - maximum likelihood
AWLS - asymptotic weighted least squares
DWLS - diagonal weighted least squares
RM - raw maximum likelihood
MLn - maximum likelihood assuming bivariate normal liability
The fit function for a group may be user defined. For this, the User-defined keyword must appear on the Options line, and the matrix expression given as the model (Constraint or Covariance command) must evaluate to a scalar. There are no other rules. Use matrix functions and operators to suit. Any of the pre-defined fit functions LS, ML, AWLS, etc., could be specified as user-defined functions, but it is generally less efficient to do so. User-defined functions are recommended only when the default functions are not suitable.
!Example:
!User-defined fit function
!Fit to a correlation matrix by least squares
Data NInput_vars=3 NGroups=1
CMatrix Symm
 1
 .2 1
 .3 .4 1
Matrices
 A Symm 3 3 = %O1
 B Stan 3 3 Free
Covariances \tr((A-B)*(A-B))/
Option User RSiduals
End
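The scalar computed by the formula \tr((A-B)*(A-B)) in the example above can be reproduced outside Mx. The Python sketch below is only an illustration: the function name trace_sq_diff is hypothetical, and the trial matrix (a standardized matrix with all correlations at zero) is chosen for demonstration. For symmetric matrices the trace equals the sum of squared differences over all elements, which is why it serves as a least squares criterion.

```python
def trace_sq_diff(a, b):
    """tr((A-B)*(A-B)) for two square matrices of equal size."""
    n = len(a)
    d = [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]
    # Element (i, i) of D*D is the sum over k of d[i][k] * d[k][i]
    return sum(d[i][k] * d[k][i] for i in range(n) for k in range(n))


observed = [[1.0, 0.2, 0.3],
            [0.2, 1.0, 0.4],
            [0.3, 0.4, 1.0]]

# Trial values for B: unit diagonal, all correlations at zero
trial = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

print(trace_sq_diff(observed, trial))     # about 0.58 = 2*(0.04+0.09+0.16)
print(trace_sq_diff(observed, observed))  # 0.0 when the fit is perfect
```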
When VL or Rectangular files are read, Mx calculates twice the negative log-likelihood of the data for each case. When there are missing data, the appropriate mean vector and covariance matrix are automatically created by Mx for each observation.
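The idea of building a case-specific mean vector and covariance matrix can be sketched in Python. This is not the Mx code: it assumes the standard multivariate normal density (including the constant term), marks missing values with None, and uses a hypothetical function name minus2_loglik. The rows and columns for missing variables are simply dropped before the likelihood is evaluated.

```python
import math


def cholesky(s):
    """Lower triangular L with L*L' = s, for a positive definite matrix s."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def minus2_loglik(case, mean, cov):
    """Twice the negative normal log-likelihood of one case, skipping missing values."""
    obs = [i for i, x in enumerate(case) if x is not None]
    if not obs:
        return 0.0
    resid = [case[i] - mean[i] for i in obs]          # filtered mean vector
    sub = [[cov[i][j] for j in obs] for i in obs]     # filtered covariance matrix
    low = cholesky(sub)
    # Forward substitution: solve L z = resid, so that z'z = resid' inv(sub) resid
    z = []
    for i, r in enumerate(resid):
        z.append((r - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(obs)))
    return len(obs) * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z)


mean = [0.0, 0.0]
cov = [[1.0, 0.5],
       [0.5, 1.0]]

print(minus2_loglik([1.0, None], mean, cov))  # only the first variable contributes
print(minus2_loglik([1.0, 0.5], mean, cov))   # complete case
```

Summing this quantity over all cases gives the function that is minimized; cases with different patterns of missing data contribute terms of different dimension.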
Mx estimates thresholds and polychoric correlations from 2-way contingency tables. The covariance structure is necessarily limited to two variables. For an r by c contingency table, there are r-1 row thresholds and c-1 column thresholds that separate the observed categories of individuals. The fit function is based on twice the log-likelihood of the observed frequency data.
By default, Mx will print most numbers with three decimal places, or use exponential format. With NDecimals=n, Mx will print most numbers with n decimal places. By default, Mx prints up to 80 columns of output. With Width=m, this may be changed to m columns of output. At this time, NDecimals and Width may not be used together.
Before describing ways in which Mx output can be increased, we note the valuable keyword NO_Output which prevents printing of all output for a group.
The RSiduals keyword requests that the observed matrix, the expected matrix, and the residuals (O-E) be printed.
If a correlation matrix is read instead of a covariance matrix, the number of statistics provided is usually less than when variances are also given. This option can be used to correct such problems.
Power calculations are useful in experimental design and getting grants. If the function values computed by Mx may be considered as a Chi-squared, then the Power command will compute the power of the study to reject the hypothesis at significance level alpha for the given number of degrees of freedom (df). In addition, the program very kindly works out the total sample size that would be required, given the current proportion of subjects in each group, to reject the hypothesis at various power levels.
The difference in fit between two competing models may be assessed by examining the confidence limits on their Chi-squared statistics. These are computed using the inverse non-central Chi-squared distribution.
Numerical estimates of the Hessian matrix of the parameters are used to provide approximate standard errors on the parameters.
If the parameter THard is set using TH=n where n is a positive integer, Mx will generate random starting values for all parameters and attempt to fit the model again n times. THard can be very useful when exploring the identification of structural equation models. THard with a negative integer requests repeat fits from the solution, resetting the covariance matrix to an identity matrix.
Sometimes used with the randomizing option described above, parameter start values can be jiggled. This option can be useful to nudge Mx away from a saddle point.
Users may supply the results of fitting a null model (usually a simple diagonal model of variances) which will extend the output with other fit indices.
By default, Mx does not test identification of models via examination of the rank of the Hessian matrix of parameter estimates. Option Check does this, but it can give either false positives or false negatives. Mx computes the eigenvalues and eigenvectors of the Hessian matrix, and uses this information to assess potential areas of underidentification.
Mx uses NPSOL, obtained from Walter Murray and Philip Gill, to perform numerical optimization in the presence of general linear, non-linear and boundary constraints. The default optimization parameters are suitable for most problems. Examples:
NAG=n creates output file (NAGDUMP.OUT) if n>0
Iterations=n alters the maximum number of iterations
The multiple keyword may be given in the last group of an Mx script. Following optimization, the program is in multiple fit mode, and will accept commands to alter the model specification. Parameters of the model not changed will restart estimation at the previous solution. The only commands that may be used in multiple fit mode to modify matrices are SP, PA, MA, VA, ST, EQ, FI, FR. An End line is necessary to end the group. A number identifying the group in which a matrix is to be found must be placed directly after the keywords SP, PA and MA, before the letter indicating the matrix.
It is possible to change matrix formulae and other characteristics of a group. Options or matrix formulae supplied after this command would apply to that group.
Drop fixes all occurrences of a parameter with that number to zero or to a specified value.
When the multiple fit option is implemented, the entire input job specification, date, and estimates may be stored in binary format for rapid retrieval and estimation in subsequent fitting of submodels.
Mx will write matrices to files, including %E, expected covariance matrices, %M, expected mean matrices, %P, a series of columns of information about the likelihood of individual data vectors, and %V, a variable length file.
If you specify a RAM model with matrices A, S and F, RAMpath graphics commands may be written to a file for later input for RAMpath to draw a path diagram.
The End command signifies the end of a group.
When Mx runs into something it doesn't understand, it tries to tell you as soon as possible. Sometimes this will be after the error itself, so check the earlier input for warnings & mistakes if the cause is not immediately obvious. The Just WHAT is this command... message usually comes when Mx has been given too many numbers or labels for a matrix. You may get a "non-numeric characters" warning if you supply too few numbers, or Mx may run into the end of the file in the vain search for enough numbers. By and large, the error messages are supposed to inform and amuse a little to make them less user-hostile.
TITLE: Factor Model
DATA NGroups=1 NInput_vars=3 NObs=100
Cmatrix Symmetric
 1.0
 .5 .8
 .4 .2 .7
MATRICES
 A Full 3 1
 U Diag 3 3 Free
COVARIANCE A*A' + U /
Specification A
 4 5 0
Start .5 A 1 1 - A 2 1
Options ML
END
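To see what the covariance formula A*A' + U in the factor model above produces, the following Python sketch evaluates it for one set of values. It is an illustration only: the loadings use the start values from the script (.5, .5, and a third loading fixed at zero), while the values on the diagonal of U are hypothetical. The parameter count assumes the usual convention of observed statistics minus free parameters.

```python
def expected_covariance(loadings, uniques):
    """A*A' + U for one factor: loadings is the column A, uniques the diagonal of U."""
    n = len(loadings)
    return [[loadings[i] * loadings[j] + (uniques[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]


loadings = [0.5, 0.5, 0.0]    # A 1 1 and A 2 1 start at .5; A 3 1 is fixed at zero
uniques = [0.75, 0.55, 0.7]   # hypothetical values for the free diagonal of U

sigma = expected_covariance(loadings, uniques)
for row in sigma:
    print(row)

n_statistics = 3 * (3 + 1) // 2   # distinct elements of a 3x3 covariance matrix
n_free = 3 + 2                    # three in U, two in A (parameters 4 and 5)
print(n_statistics - n_free)      # 1
```

With the third loading fixed at zero, the expected covariances involving the third variable are zero whatever the parameter values, so the observed values .4 and .2 cannot be reproduced by this model.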
Double cholesky model
!This group calculates, no fitting
Calculation NGroups=4
Begin Matrices;
 H Lower 2 2 Free
 U Lower 2 2 Free
End Matrices;
Begin Algebra;
 A= H*H' / ! Additive Genetic
 E= U*U' / ! Specific Environment
End Algebra;
End

! Now get to actual data
Group 2: Unmatched twins
Data NInput_vars=2 NObservations=449
KMatrix Symmetric
 1.
 -.22 1.
ACov File=unm.asy
Matrices= Group 1
Covariances (A + E) /
End

Group 3: MZ twins with cotwins
Data NInput_vars=6 NObservations=456
CMatrix File=mz.cov
Select 1 2 4 5/
Matrices= Group 1
Covariances (A+E | A _
             A | A+E) /
End

Group 4: DZ twins with cotwins
Data NInput_vars=6 NObservations=357
CMatrix File=dz.cov
Labels Ex1 Ne1 De1 Ex2 Ne2 De2
Select ex1 ne1 ex2 ne2 /
Matrices= Group 1
 H Diag 1 1
Covariances (A+E | H@A _
             H@A | A+E) /
Matrix Half .5
Start .5 All
Boundary -1 1 All
Options RSiduals Iterations=200
End
PACE MODEL: MZ twins
Data Ninput_vars=2 NGroups=3
CTable 2 2
 30 20
 19 60
Matrices
 A Full 2 6
 B Full 2 2
 I Iden 2 2
 P Symm 6 6
 T Full 2 1
Thresholds T /
Covariances (I-B)~*A*P*A'*(I-B)~/
Specify A
 1 2 3 0 0 0
 0 0 0 1 2 3
Labels Row A At1 Ct1 Et1 A2 C2 E2
Labels Col A Pt1 Pt2
Start .6 A 1 1 - A 2 6
Specification T
 4 5
Pattern B
 0 1
 1 0
Equate B 2 1 B 1 2
Boundary -.99 .99 6
Matrix P
 1
 0 1
 0 0 1
 1 0 0 1
 0 1 0 0 1
 0 0 0 0 0 1
Options RSidual
End

DZ twins in PACE model
Data Ninput_vars=2
CTable 2 2
 30 30
 29 50
Matrices
 A Full 2 6 =A1
 B Full 2 2 =B1
 I Iden 2 2
 P Stand 6 6
 T Full 2 1 =T1
Thresholds T /
Covariances (I-B)~* A*P*A'* (I-B)~ /
Value .5 P 4 1
Value 1 P 5 2
Option RSiduals
End

Constraint group: a*a+c*c+e*e=1
Data Constraint Ninput=1
Matrices
 S Full 1 3
 I Identity 1 1
Constraint I - S*S' /
Specification S
 1 2 3
Options Multiple Iterations=300
End
Save pace.mxs
! No common environment
Drop 2
End
! No interaction
Get pace.mxs
Value 0 B 1 1 1 - B 1 2 2
End
Estimate means and cholesky
Data NInput_vars=3 NGroups=1
VLength File=[neale]unbalanced.raw
Matrices
 M Full 1 3 Free
 S Lower 3 3 Free
Means M / ! Means model
Covariances S*S' /
Start .7 s 1 1 - s 3 3
Options RM MX%E=unbal.cov THard=-1
End

First few lines of unbalanced.raw
2
1 2
0.5550 -1.1114
3
1 2 3
1.6442 -0.1728 3.69
2
1 3
-0.2145 5.01