Shortcomings of SequenceFile

SequenceFile's Reader is used to read SequenceFile files.
private Reader(FileSystem fs, Path file, int bufferSize, long start,
               long length, Configuration conf, boolean tempReader)
    throws IOException {
  this.file = file;
  this.in = openFile(fs, file, bufferSize, length);
  this.conf = conf;
  seek(start);
  this.end = in.getPos() + length;
  init(tempReader);
}


private void init(boolean tempReader) throws IOException {
  byte[] versionBlock = new byte[VERSION.length];
  in.readFully(versionBlock);

  if ((versionBlock[0] != VERSION[0]) ||
      (versionBlock[1] != VERSION[1]) ||
      (versionBlock[2] != VERSION[2]))
    throw new IOException(file + " not a SequenceFile");

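When a file in some other format is read, the call this.in = openFile(fs, file, bufferSize, length); in the Reader constructor first opens a TCP connection to the DataNode and then reads the beginning of the file. If the file is not in SequenceFile format, the init call at the end of the constructor throws an exception and construction of the Reader fails, but the Reader's in stream is never closed, so the connection and its file descriptor leak. This code should be changed so that the stream is closed when init fails.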
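One possible shape for such a change is sketched below. It is a self-contained illustration and not the actual Hadoop patch: the class SafeReaderSketch, the helper TrackingStream, the method streamClosedAfterOpen, and the three-byte VERSION constant are all hypothetical stand-ins, and an in-memory stream takes the place of the HDFS input stream. The only point it demonstrates is closing the stream when the header check in the constructor throws.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SafeReaderSketch {
    // Assumption: simplified stand-in for the SequenceFile magic bytes.
    static final byte[] VERSION = {'S', 'E', 'Q'};

    /** Hypothetical in-memory stream that records whether close() was called. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Simplified reader: opens a stream, then validates the header. */
    static class Reader implements Closeable {
        private final DataInputStream in;

        Reader(InputStream raw) throws IOException {
            this.in = new DataInputStream(raw);
            boolean succeeded = false;
            try {
                init();
                succeeded = true;
            } finally {
                if (!succeeded) {
                    // Construction failed: release the stream before the
                    // exception propagates, otherwise nobody can close it.
                    try {
                        in.close();
                    } catch (IOException ignored) {
                        // Keep the original exception as the one reported.
                    }
                }
            }
        }

        private void init() throws IOException {
            byte[] versionBlock = new byte[VERSION.length];
            in.readFully(versionBlock);
            for (int i = 0; i < VERSION.length; i++) {
                if (versionBlock[i] != VERSION[i]) {
                    throw new IOException("not a SequenceFile");
                }
            }
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    /** Tries to open a Reader on data and reports whether the stream ended up closed. */
    public static boolean streamClosedAfterOpen(byte[] data) {
        TrackingStream raw = new TrackingStream(data);
        try {
            new Reader(raw); // deliberately left open on success
        } catch (IOException e) {
            // Expected when the header does not match.
        }
        return raw.closed;
    }

    public static void main(String[] args) {
        System.out.println(streamClosedAfterOpen(new byte[] {'T', 'X', 'T', 0}));
        System.out.println(streamClosedAfterOpen(new byte[] {'S', 'E', 'Q', 6}));
    }
}
```

With a non-matching header the stream is closed before the exception leaves the constructor; with a matching header it stays open for the caller, who remains responsible for calling close().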


Hive Server Leaking File Descriptors
http://search-hadoop.com/m/epe4Y1920w32/Hive+Server+Leaking+File+Descriptors&subj=Hive+Server+Leaking+File+Descriptors+


log4j.logger.org.apache.hadoop.hdfs.StateChange=DEBUG,nnstate
log4j.additivity.org.apache.hadoop.hdfs.StateChange=false
log4j.appender.nnstate=org.apache.log4j.RollingFileAppender
log4j.appender.nnstate.MaxFileSize=500MB
log4j.appender.nnstate.MaxBackupIndex=30
log4j.appender.nnstate.BufferedIO=true
log4j.appender.nnstate.BufferSize=16384
log4j.appender.nnstate.File=${hadoop.log.dir}/stateChange.log
log4j.appender.nnstate.layout=org.apache.log4j.PatternLayout
log4j.appender.nnstate.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
Add the entries above to $HADOOP_CONF_DIR/log4j.properties on the NameNode, so that the cause is easier to track down later.
