Fundamentals of Programming in SAS. James Blum
2.7.3: Using a Format to Control Categories for a Variable in the TABLE Statement
First mortgage monthly payment | ||||
MortgagePayment | Frequency | Percent | CumulativeFrequency | CumulativePercent |
None | 603691 | 52.08 | 603691 | 52.08 |
$350 and Below | 59856 | 5.16 | 663547 | 57.25 |
$351 to $1000 | 283111 | 24.43 | 946658 | 81.67 |
$1001 to $1600 | 128801 | 11.11 | 1075459 | 92.79 |
Over $1600 | 83603 | 7.21 | 1159062 | 100.00 |
The FREQ procedure is not limited to one-way frequencies—special operators between variables in the TABLE statement allow for construction of multi-way tables.
2.7.2 Multi-Way Tables in PROC FREQ
The * operator constructs cross-tabular summaries for two categorical variables, which includes the following statistics:
cross-tabular and marginal frequencies
cross-tabular and marginal percentages
conditional percentages within each row and column
Program 2.7.4 summarizes all combinations of Metro and MortgagePayment, with Metro formatted to add detail and MortgagePayment formatted into the bins used in the previous example.
Program 2.7.4: Using the * Operator to Create a Cross-Tabular Summary with PROC FREQ
proc format;
value METRO
0 = “Not Identifiable”
1 = “Not in Metro Area”
2 = “Metro, Inside City”
3 = “Metro, Outside City”
4 = “Metro, City Status Unknown”
;
value Mort
0=’None’
1-350=”$350 and Below”
351-1000=”$351 to $1000”
1001-1600=”$1001 to $1600”
1601-high=”Over $1600”
;
run;
proc freq data=BookData.IPUMS2005Basic;
table Metro*MortgagePayment;
format Metro Metro. MortgagePayment Mort.;
run;
The first variable listed in any request of the form A*B is placed on the rows in the table. Requesting MortgagePayment*Metro transposes the table and the included summary statistics.
The format applied to the Metro variable is merely a change in display and has no effect on the structure of the table—it is five rows with or without the format. The format on MortgagePayment is essential to the column structure—allowing each unique value of MortgagePayment to form a column does not produce a useful summary table.
Output 2.7.4: Using the * Operator to Create a Cross-Tabular Summary with PROC FREQ
Table of METRO by MortgagePayment | ||||||
METRO(Metropolitan status) | MortgagePayment(First mortgage monthly payment) | |||||
FrequencyPercentRow PctCol Pct | None | $350 and Below | $351 to $1000 | $1001 to $1600 | Over $1600 | Total |
Not Identifiable | 493794.2653.668.18 | 69790.607.5811.66 | 254882.2027.709.00 | 73070.637.945.67 | 28750.253.123.44 | 920287.94 |
Not in Metro Area | 13431411.5958.2022.25 | 216981.879.4036.25 | 609485.2626.4121.53 | 104640.904.538.12 | 33510.291.454.01 | 23077519.91 |
Metro, Inside City | 964878.3262.5015.98 | 44100.382.867.37 | 288662.4918.7010.20 | 140491.219.1010.91 | 105560.916.8412.63 | 15436813.32 |
Metro, Outside City | 14996112.9443.9824.84 | 121481.053.5620.30 | 793886.8523.2828.04 | 563304.8616.5243.73 | 431553.7212.6651.62 | 34098229.42 |
Metro, City Status Unknown | 17355014.9750.9128.75 | 146211.264.2924.43 | 884217.6325.9431.23 | 406513.5111.9231.56 | 236662.046.9428.31 | 34090929.41 |
Total | 60369152.08 | 598565.16 | 28311124.43 | 12880111.11 | 836037.21 | 1159062100.00 |
Various options are available to control the displayed statistics. Program 2.7.5 illustrates some of these with the result shown in Output 2.7.5.
Program 2.7.5: Using Options in the TABLE Statement.
proc freq data=BookData.IPUMS2005Basic;
table Metro*MortgagePayment / nocol nopercent format=comma10.;
format Metro Metro. MortgagePayment Mort.;
run;
NOCOL and NOPERCENT suppress the column and overall percentages, respectively, with NOPERCENT also applying to the marginal totals. NOROW and NOFREQ are also available, with NOFREQ also applying to the marginal totals.
A format can be applied to the frequency statistic; however, this only applies to cross-tabular frequency tables and has no effect in one-way tables.
Output 2.7.5: Using Options in the TABLE Statement
Table of METRO by MortgagePayment | ||||||
METRO(Metropolitan status) | MortgagePayment(First mortgage monthly payment) | |||||
FrequencyRow Pct | None | $350 and Below | $351 to $1000 | $1001 to $1600 | Over $1600 | Total |
Not Identifiable | 49,37953.66 | 6,9797.58 | 25,48827.70 | 7,3077.94 | 2,8753.12 | 92,028 |
Not in Metro Area | 134,31458.20 | 21,6989.40 | 60,94826.41 | 10,4644.53 | 3,3511.45 | 230,775 |
Metro, Inside City | 96,48762.50 | 4,4102.86 | 28,86618.70 | 14,0499.10 | 10,5566.84 | 154,368 |
Metro, Outside City | 149,96143.98 | 12,1483.56 | 79,38823.28 | 56,33016.52 | 43,15512.66 | 340,982 |
Metro, City Status Unknown | 173,55050.91 | 14,6214.29 | 88,42125.94 | 40,65111.92 | 23,6666.94 | 340,909 |
Total | 603,691 | 59,856 | 283,111 | 128,801 | 83,603 | 1,159,062 |
Higher dimensional requests can be made; however, they are constructed as a series of two-dimensional tables. Therefore, a request of A*B*C in the TABLE statement creates the B*C table for each level of A, while a request of A*B*C*D makes the C*D table for each combination of A and B, and so forth. Program 2.7.6 generates a three-way table, where a cross-tabulation of Metro and HomeValue is built for each level of Mortgage Status as shown in Output 2.7.6. The VALUE statement that defines the character format $MortStatus takes advantage of the fact that value ranges are legal for character variables. Be sure to understand the difference between uppercase and lowercase letters when ordering the values of a character variable.
Program 2.7.6: A Three-Way Table in PROC FREQ
proc format;
value MetroB
0 = “Not Identifiable”
1 = “Not in Metro Area”
other = “In a Metro Area”
;
value $MortStatus
‘No’-’Nz’=’No’
‘Yes’-’Yz’=’Yes’
;
value Hvalue
0-65000=’$65,000 and Below’
65000<-110000=’$65,001 to $110,000’
110000<-225000=’$110,001 to $225,000’
225000<-500000=’$225,001 to $500,000’
500000-high=’Above $500,000’