Analyzing Statistics with GNU R

When you pass only one vector of data to the R plot() function, it makes an x/y plot using the point index as the x value and the specified data vector points as the y value. Because the data file lists the data in historical order, the plot shows the value of the S&P 500 index over time.

That's not bad for three lines of code! However, you can make a prettier, better annotated plot easily by using the plot() function's optional arguments. For example, to add a title, a subtitle, and axis labels, enter:

> plot(sp500value, type="l", main="S&P 500", 
      sub="[3-Jan-1995 thru 29-Jul-2005]", 
      xlab="Day", ylab="Value")

This produces the plot shown in Figure 4.

Figure 4. An annotated S&P 500 close price plot
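
The "Day" axis in Figure 4 is simply the point index that plot() supplies when it receives a single vector. The following minimal sketch shows that behavior in isolation. The vector y and its values are made up for illustration and are not taken from the S&P 500 file; xy.coords() is the helper that plot() uses to work out its coordinates.

```r
# Hypothetical data, not taken from the S&P 500 file
y <- c(100, 102, 101, 104, 103)

# xy.coords() resolves the x/y arguments the same way plot() does
coords <- xy.coords(y)
print(coords$x)   # the point index
print(coords$y)   # the data values themselves

# These two calls should therefore draw the same line
plot(y, type = "l")
plot(seq_along(y), y, type = "l", xlab = "Index")
```

Because the x values are only positions in the vector, the axis carries no calendar information unless you supply it yourself.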

Going further, you could use R's date classes and axis functions to label the x-axis with the dates stored in column 1 of the data frame. See the R documentation for details.
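
As a rough sketch of the date-labeled axis described above, the following uses as.Date() and axis.Date() on a small stand-in data frame. The data frame sp500, its column names, its values, and the date format string are all assumptions made for illustration; the format passed to as.Date() must match however the dates are actually written in your file.

```r
# Hypothetical stand-in for the article's data frame:
# dates in column 1, values in column 2 (values are made up)
sp500 <- data.frame(
  Date  = c("1995-01-03", "1995-01-04", "1995-01-05",
            "1995-01-06", "1995-01-09"),
  Close = c(100, 102, 101, 104, 103),
  stringsAsFactors = FALSE
)

# Convert the text in column 1 to R's Date class
dates <- as.Date(sp500[[1]], format = "%Y-%m-%d")

# Draw the line without the default x-axis, then add a date axis
plot(dates, sp500$Close, type = "l", xaxt = "n",
     xlab = "Date", ylab = "Value")
axis.Date(1, at = dates, format = "%d-%b-%Y")
```

Suppressing the default axis with xaxt="n" and drawing it separately gives control over which dates are labeled and how they are formatted.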

After you generate a plot, R lets you add new data to it. The stock index graphs published in the Wall Street Journal, for example, include a 90-day moving average. A moving average is the average value of the preceding n data items. How about displaying the moving average of the S&P 500 over the preceding 90 days?

In R's nomenclature, a "moving average" is a "filter" (an equation) applied to a "time series" (the S&P 500 index values). R's filter() function is complex, providing many different data processing options. Fortunately, the actual commands for creating a 90-day moving average data set are tiny compared with what standard programming languages might require:

> coef90 = 1/90
> ma90 = filter(sp500value, rep(coef90, 90), sides=1)

The first line defines the weight for each data point in the filter: each day's S&P 500 value contributes 1/90 of the moving average. The second line creates the moving average data set. The rep() function repeats the 1/90 coefficient 90 times, so each average covers 90 days of S&P 500 data. The sides=1 parameter tells filter() to use only the current and earlier data points, which is how financial moving averages are typically calculated, because we cannot foresee the future. As a result, the first 89 entries of ma90 are NA, because fewer than 90 values are available to average at those points.
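
To check how the filter behaves, it can help to run it on a series small enough to verify by hand. The sketch below uses a 3-point window on made-up numbers (the names x, n, and ma3 are illustrative only). It calls stats::filter() explicitly, which is the same function the article uses, to avoid picking up a same-named function from another package such as dplyr.

```r
# Made-up series, small enough to check by hand
x <- c(10, 20, 30, 40, 50, 60)
n <- 3

# Trailing moving average: each output uses the current value
# and the n - 1 values before it
ma3 <- stats::filter(x, rep(1/n, n), sides = 1)
print(ma3)

# The third output should match the plain mean of the first three values
manual <- mean(x[1:3])
print(manual)

# The first n - 1 outputs are NA because the window is not yet full
print(sum(is.na(ma3)))
```

The same reasoning scales to the 90-day case: replace n with 90 and the first 89 outputs are NA.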

Add the moving average data (variable ma90) to the existing plot as a green line using the R lines() function:

> lines(ma90, col="green")

Figure 5 shows the result.

Figure 5. S&P 500 close price and 90-day moving average
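
If you do not have the data file at hand, the overlay can be tried end to end on simulated data. This is only a sketch: sim_value is a random walk standing in for sp500value and is not real index data, and the legend() call is an optional addition that the steps above do not use.

```r
set.seed(1)

# Simulated random walk standing in for sp500value (NOT real index data)
sim_value <- 500 + cumsum(rnorm(500))

# 90-day trailing moving average, as in the article
ma90 <- stats::filter(sim_value, rep(1/90, 90), sides = 1)

# Base plot, then the moving average drawn on top of it
plot(sim_value, type = "l", main = "Simulated series",
     xlab = "Day", ylab = "Value")
lines(ma90, col = "green")

# Optional: label the two lines
legend("topleft", legend = c("Value", "90-day moving average"),
       col = c("black", "green"), lty = 1)
```

The green line starts at day 90 because the earlier entries of ma90 are NA and lines() skips them.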
