1. Using the DiskVerticesInfo class
Generate the data: hama jar hama-examples-0.6.4.jar gen symmetric 100 20 /hamaInput 20
With the DiskVerticesInfo class, version 0.6.3 fails with an error:
Test: hama jar hama-examples-0.6.4.jar pagerank /hamaInput /hamaOutputPK6
Result:
14/06/26 21:32:37 INFO ipc.Server: Stopping IPC Server listener on 52257
14/06/26 21:32:37 DEBUG ipc.Server: Checking for old call responses.
14/06/26 21:32:37 INFO ipc.Server: Stopping IPC Server Responder
14/06/26 21:32:37 ERROR bsp.BSPTask: Shutting down ping service.
14/06/26 21:32:37 FATAL bsp.GroomServer: Error running child
java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 100016 vs. 20
at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:306)
at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:254)
at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:191)
at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1249)
After switching to version 0.6.4 the problem went away.
2. PartitioningRunner opens too many files at once and exhausts read/write resources
tmp/hama-parts/job_201406301636_0001/part-19/file-19 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
The datanode log shows:
java.io.FileNotFoundException: /hadoop-1.2.0/data/blocksBeingWritten/blk_10093574377987397_21088.meta (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.createBlockWriteStreams(FSDataset.java:1156)
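Cause: take a PageRank job with 20 tasks. Because PartitioningRunner partitions the input, every task has to create a separate file for each of the 20 partitions it receives and merge-sort them. In other words, HDFS has to handle 20x20 files being written at the same time (and the read/write pressure is even higher once the corresponding blocks are taken into account). That is what made HDFS fail in my test.
Even raising the Linux ulimit to 10240 did not help.
In the end the only thing that worked was to increase the file block size and cap the maximum number of tasks at 10.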
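The arithmetic behind the 20x20 figure can be written out as a small sketch. It assumes, as in the test above, that the number of partitions equals the number of tasks; the class and method names are made up for illustration and are not part of Hama.

```java
final class OpenFileEstimate {
    /** Files being written at once if each task writes one file per partition. */
    static long concurrentPartitionFiles(int tasks, int partitions) {
        return (long) tasks * partitions;
    }

    public static void main(String[] args) {
        // 20 tasks, 20 partitions each: the failing setup described above.
        System.out.println(concurrentPartitionFiles(20, 20)); // 400
        // 10 tasks, 10 partitions each: the setup that finally worked.
        System.out.println(concurrentPartitionFiles(10, 10)); // 100
    }
}
```

The count grows quadratically with the number of tasks, so halving the task limit cuts the number of simultaneously written files to a quarter.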
3. A data problem?
14/07/02 10:38:41 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
java.lang.IllegalArgumentException: A message has recieved with a destination ID: 100003880356367481247 that does not exist! (Vertex iterator is at100003935752877898316 ID) 100001890836869132179
at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:298)
at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:250)
at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:171)
at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
14/07/02 10:38:41 DEBUG bsp.Counters: Adding TASK_OUTPUT_RECORDS
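Cause: during the preliminary sort, the comparison ends up calling Text's comparator class, which does not match the original data type, so the comparison result differs from the real order.
How it works:
1. PartitioningRunner first calls the Sorter that ships with SequenceFile to sort each file internally.
SequenceFile calls MergeSort to sort a single file (quicksort?). It swaps only the index positions of the entries being compared, and the actual comparison is: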
class SeqFileComparator implements Comparator<IntWritable> {
  public int compare(IntWritable I, IntWritable J) {
    return comparator.compare(rawBuffer, keyOffsets[I.get()],
                              keyLengths[I.get()], rawBuffer,
                              keyOffsets[J.get()], keyLengths[J.get()]);
  }
}
// rawBuffer is the byte array produced from rawKeys, keyOffsets holds the start offset of each key, and I is the vertex's index in those arrays. A very clever approach!
// i.e. DataOutputBuffer rawKeys; XXX ; byte[] rawBuffer = rawKeys.getData();
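The index-swapping idea can be tried outside Hadoop with a minimal, self-contained sketch. It is not the SequenceFile.Sorter code: the class RawIndexSort, the helper pack and the byte-wise compareBytes are illustrative stand-ins, and only the names rawBuffer, keyOffsets and keyLengths are borrowed from the snippet above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RawIndexSort {
    /** Byte-wise lexicographic comparison of two ranges, treating bytes as unsigned. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2;
    }

    /** Packs all keys into one buffer and records each key's start offset and length. */
    static byte[] pack(String[] keys, int[] keyOffsets, int[] keyLengths) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < keys.length; i++) {
            byte[] bytes = keys[i].getBytes(StandardCharsets.UTF_8);
            keyOffsets[i] = out.size();
            keyLengths[i] = bytes.length;
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    /** Sorts record indices by key; the bytes in rawBuffer are never moved. */
    static Integer[] sortedIndices(byte[] rawBuffer, int[] keyOffsets, int[] keyLengths) {
        Integer[] order = new Integer[keyOffsets.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (i, j) -> compareBytes(
                rawBuffer, keyOffsets[i], keyLengths[i],
                rawBuffer, keyOffsets[j], keyLengths[j]));
        return order;
    }

    public static void main(String[] args) {
        String[] keys = {"pear", "apple", "fig"};
        int[] keyOffsets = new int[keys.length];
        int[] keyLengths = new int[keys.length];
        byte[] rawBuffer = pack(keys, keyOffsets, keyLengths);
        // Prints [1, 2, 0]: apple (index 1), fig (index 2), pear (index 0).
        System.out.println(Arrays.toString(sortedIndices(rawBuffer, keyOffsets, keyLengths)));
    }
}
```

Only the small index array is permuted, which is why the approach is cheap even when the serialized keys are large.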
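What actually gets called is the comparison method of Text, which is the vertex class in the test case. I modified the comparison method of Text in hadoop1.2-core to cope with Google's longer edge data. To keep the conversion between byte streams and strings straight, the following snippet helps in understanding what rawData contains: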
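byte[] key1 = new byte[40];             // scratch buffer; assumes the key is at most 40 bytes long
System.arraycopy(b1, s1, key1, 0, l1);  // copy l1 key bytes starting at offset s1 of b1
String k1 = new String(key1).trim();    // trim() drops the unused trailing zero bytes
BigDecimal bd1 = new BigDecimal(k1);    // parse the decimal digits as a number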
OK, this way the content of the byte array becomes visible!
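The same inspection can be done without the fixed 40-byte scratch array by decoding the slice directly. This is only a sketch: RawKeyDecode and decodeKey are hypothetical names, and it assumes that the offset and length already point at the key's text bytes (no length prefix) and that the key is a decimal number.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

final class RawKeyDecode {
    /** Decodes l bytes of b starting at offset s as a decimal number. */
    static BigDecimal decodeKey(byte[] b, int s, int l) {
        String text = new String(b, s, l, StandardCharsets.UTF_8).trim();
        return new BigDecimal(text);
    }

    public static void main(String[] args) {
        // A 21-digit vertex ID from the log above, surrounded by unrelated bytes.
        byte[] raw = "xx100003880356367481247yy".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeKey(raw, 2, 21)); // 100003880356367481247
    }
}
```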
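Debugging a Hama child process works much like in Hadoop. In Hadoop, though, the child JVM is often too short-lived for a remote breakpoint to attach in time, whereas with Hama it goes fairly smoothly.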
<property>
<name>bsp.child.java.opts</name>
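<value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,address=8792,server=y,suspend=y</value><!-- Note: the options must not be separated by extra whitespace (for example, exactly one space between -Xmx1024m and -Xdebug), otherwise an error is reported. -->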
<description>Java opts for the groom server child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable bsp.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>
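Note: testing eventually showed that even for numeric values that a Long cannot represent, such as those that need BigInteger, the Text type still compares them correctly, so the changes described above were basically pointless... Lesson learned.
Delete the lines that match a pattern from a file: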
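One general property worth keeping in mind here: byte-wise comparison of decimal digit strings agrees with numeric order when the strings have the same number of digits, and can disagree when the lengths differ. The sketch below checks this with IDs taken from the logs above. It illustrates byte-wise ordering in general and is not a trace of what Hama's comparator did in these runs; DigitOrderCheck and its methods are hypothetical names.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

final class DigitOrderCheck {
    /** Byte-wise lexicographic comparison, treating bytes as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True if byte-wise order and numeric order agree for the two digit strings. */
    static boolean sameOrder(String x, String y) {
        int lexical = Integer.signum(compareBytes(
                x.getBytes(StandardCharsets.US_ASCII),
                y.getBytes(StandardCharsets.US_ASCII)));
        int numeric = Integer.signum(new BigInteger(x).compareTo(new BigInteger(y)));
        return lexical == numeric;
    }

    public static void main(String[] args) {
        // Both IDs have 21 digits, so the two orders agree.
        System.out.println(sameOrder("100003880356367481247", "100003935752877898316")); // true
        // Different digit counts: byte-wise "100016" sorts before "20", numerically it is larger.
        System.out.println(sameOrder("100016", "20")); // false
    }
}
```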
sed -i '/Love/d' 1.txt