SAS ODS to R markdown/Sweave/knitr

Using standard SAS ODS output to create beautiful reports in R

I left my R programmer position in California in lieu of a position at University of Wisconsin, Madison. I must be crazy, because not only is Madison cold as hell in the winter, but I have to use SAS instead of R in my day to day programming. Ugh.

But good news! SAS’s standard output (“ODS”, whatever the hell that stands for, I’m too lazy to Google it at the moment) is in HTML. So, using R’s wonderful XML package, all the output from your lousy SAS program can be imported straight into R for use in a beautiful markdown/Sweave/knitr report. Impress your coworkers and more importantly, your boss, with a beautiful PDF report of your SAS analysis, complete with commentary and other nonsense you choose to add in. You can even include graphics from the output as well. And the best part is, much like how your reports in R update with changes in the data or analysis, as long as you keep your SAS program in the same format, if the data changes, you need only to re-export the SAS output and re-compile your report.

I started off experimenting with the mtcars dataset. I ran PROC PRINT, PROC FREQ, PROC MEANS, and PROC REG and exported the output to a local folder. It turns out that SAS structures the output from PROC MEANS such that there is only one row in the table for the data and the values are separated by linebreaks rather than broken into individual cells. I tried doing some reformatting in R and in the raw HTML but gave up fairly quickly. This is only a proof of principle anyway. So PROC MEANS was excluded from this report.

It was all fairly easy to get the tables and figures into the report:

  1. Export your SAS output to a directory (default is in a temp directory, so you’ll need to explicity tell SAS where to put it)
  2. Read the HTML tables in via XML::readHTMLTable
  3. Pull in the figures from the directory
  4. Done. Have a beer.

The code looks something like this:

Now you can format your tables:

Ob mk prc mpg rep78 wght lngth fgn
1 AMC 4099 22 3 2930 186 0
2 AMC 4749 17 3 3350 173 0
3 AMC 3799 22 3 2640 168 0
4 Audi 9690 17 5 2830 189 1
5 Audi 6295 23 3 2070 174 1
6 BMW 9735 25 4 2650 177 1
7 Buick 4816 20 3 3250 196 0
8 Buick 7827 15 4 4080 222 0
9 Buick 5788 18 3 3670 218 0
10 Buick 4453 26 3 2230 170 0

Table 1: PROC PRINT

rep78 Freq Pct CumFreq CumPct
2 3 11.54 3 11.54
3 15 57.69 18 69.23
4 6 23.08 24 92.31
5 2 7.69 26 100.00

Table 2: PROC FREQ

Variable DF est SE t-val p-val
Intercept 1 44.96858 9.32268 4.82 <.0001
weight 1 -0.00501 0.00219 -2.29 0.0320
length 1 -0.04306 0.07693 -0.56 0.5813
foreign 1 -1.26921 1.63213 -0.78 0.4451

Table 3: PROC REG

And pull in your plots:

DiagnosticsPanel

ResidualPlot

Leave a Reply