XLConnect:一个用R处理Excel文件的高效平台

XLConnect:一个用R处理Excel文件的高效平台

read.table(),read.csv(),read.delim()等函数可以直接读取EXCEl文件,但或多或少总会遇到一些问题。XLConnect函数包,是一个可以用R处理Excel文件的高效平台。利用它可以读取或创建一个XLSX文件,并对文件进行数据处理,对文本内数据进行标记,以及可视化。

创建读取xlsl文件

require("XLConnect")
## Loading required package: XLConnect
## Loading required package: XLConnectJars
## XLConnect 0.2-11 by Mirai Solutions GmbH [aut],
##   Martin Studer [cre],
##   The Apache Software Foundation [ctb, cph] (Apache POI, Apache Commons
##     Codec),
##   Stephen Colebourne [ctb, cph] (Joda-Time Java library)
## http://www.mirai-solutions.com ,
## http://miraisolutions.wordpress.com
# 读取或创建一个XLSX文件,此步相当于建立一个连接
xls <- loadWorkbook('C:/Users/ShangFR/Desktop/test.xlsx',create=TRUE) 

# 创建工作表
createSheet(xls,name='namesheet')

# 写入数据
writeWorksheet(xls,iris,'namesheet',
               startRow=5,startCol=5, # 数据出现的左上角位置
               header=TRUE)

# 存入硬盘,直到此步方才有文档生成
saveWorkbook(xls)

# 上面四个步骤是新建文档、新建工作表、写入数据、最后存盘。如果要写入数据的同时创建好区域名称,则在第三步有所不同。

# 创建区域名
createName(xls,name='nameregion',
           formula='namesheet!$C$5', #区域的左上角单元格位置
           overwrite=TRUE)

# 写入数据
writeNamedRegion(xls,iris,name='nameregion')

# 读取文档则简单的多
data <- readWorksheet(xls, 'namesheet',
              startRow=1, startCol=1,
              endRow=0,endCol=0, #取0表示自动判断
              header=TRUE)

文件内数据标记、处理和可视化

一、创建汇率excel

#一、创建汇率excel

require(XLConnect)
require(zoo)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
require(ggplot2) # >= 0.9.3
## Loading required package: ggplot2
curr = XLConnect::swissfranc
curr = curr[order(curr$Date),]
wbFilename = "swiss_franc.xlsx"
wb = loadWorkbook(wbFilename, create = TRUE)
 # Create a new sheet named 'Swiss_Franc'
 sheet = "Swiss_Franc"
createSheet(wb, name = sheet)
# Create a new Excel name referring to the top left corner
 # of the sheet 'Swiss_Franc' - this name is going to hold
 # our currency data
dataName = "currency"
nameLocation = paste(sheet, "$A$1", sep = "!")
 createName(wb, name = dataName, formula = nameLocation)
 # Instruct XLConnect to only apply a data format for a cell
   # but not to apply any other cell styling
 setStyleAction(wb, XLC$"STYLE_ACTION.DATA_FORMAT_ONLY")
 # Set the default format for numeric data to display
   # four digits after the decimal point
setDataFormatForType(wb, type = XLC$"DATA_TYPE.NUMERIC", format = "0.0000")
 # Write the currency data to the named region created above
   # Note: the named region will be automatically redefined to encompass all
   # written data
 writeNamedRegion(wb, data = curr, name = dataName, header = TRUE)
 # Save the workbook (this actually writes the file to disk)
saveWorkbook(wb)

#二、颜色标记-特殊值 

 # Load the workbook created above
 wb = loadWorkbook(wbFilename)
  # Create a cell style for the header row
 csHeader = createCellStyle(wb, name = "header")
setFillPattern(csHeader, fill = XLC$FILL.SOLID_FOREGROUND)
setFillForegroundColor(csHeader, color = XLC$COLOR.GREY_25_PERCENT)
  # Create a date cell style with a custom format for the Date column
csDate = createCellStyle(wb, name = "date")
setDataFormat(csDate, format = "yyyy-mm-dd")
  # Create a highlighting cell style
csHlight = createCellStyle(wb, name = "highlight")
setFillPattern(csHlight, fill = XLC$FILL.SOLID_FOREGROUND)
 setFillForegroundColor(csHlight, color = XLC$COLOR.CORNFLOWER_BLUE)
  # Apply header cell style to the header row
setCellStyle(wb, sheet = sheet, row = 1,
                   col = seq(length.out = ncol(curr)),
                   cellstyle = csHeader)
  # Index for all rows except header row
allRows = seq(length = nrow(curr)) + 1
  # Apply date cell style to the Date column
setCellStyle(wb, sheet = sheet, row = allRows, col = 1,cellstyle = csDate)
  # Set column width such that the full date column is visible
setColumnWidth(wb, sheet = sheet, column = 1, width = 2800)
 # Check if there was a change of more than 2% compared
    # to the previous day (per currency)
idx = rollapply(curr[, -1], width = 2,
                     FUN = function(x) abs(x[2] / x[1] - 1),
                     by.column = TRUE) > 0.02
idx = rbind(rep(FALSE, ncol(idx)), idx)
widx = lapply(as.data.frame(idx), which)
  # Apply highlighting cell style
for(i in seq(along = widx)) {
     if(length(widx[[i]]) > 0) {
        setCellStyle(wb, sheet = sheet, row = widx[[i]] + 1, col = i + 1,cellstyle = csHlight)
     }
        }
 saveWorkbook(wb)
 
#三、添加汇率趋势图
 
 wb = loadWorkbook(wbFilename)
  # Stack currencies into a currency variable (for use with ggplot2 below)
    currencies = names(curr)[-1]
  gcurr = reshape(curr, varying = currencies, direction = "long",
                    v.names = "Value", times = currencies, timevar = "Currency")
  # Create a png graph showing the currencies in the context
    # of the Swiss Franc
    png(filename = "swiss_franc.png", width = 800, height = 600)
  p = ggplot(gcurr, aes(Date, Value, colour = Currency)) +
    geom_line() + stat_smooth(method = "loess") +
    scale_y_continuous("Exchange Rate CHF/CUR") +
    labs(title = paste0("CHF vs ", paste(currencies, collapse = ", ")),
           x = "") +
    theme(axis.title.y = element_text(size = 10, angle = 90, vjust = 0.3))
  print(p)
  dev.off()
## png 
##   2
  p

  # Define where the image should be placed via a named region;
    # let's put the image two columns left to the data starting
    # in the 5th row
    createName(wb, name = "graph",
                 formula = paste(sheet, idx2cref(c(5, ncol(curr) + 2)), sep = "!"))
  # Put the image created above at the corresponding location
    addImage(wb, filename = "swiss_franc.png", name = "graph",
               originalSize = TRUE)
  saveWorkbook(wb)

XLConnect的帮助文档内有详细介绍,感兴趣的可直接参考。

反馈与建议

你可能感兴趣的:(XLConnect:一个用R处理Excel文件的高效平台)