Hama Testing: Issue Log

1. Using the DiskVerticesInfo class

Generate data: hama jar hama-examples-0.6.4.jar gen symmetric 100 20 /hamaInput 20

With the DiskVerticesInfo class, version 0.6.3 reports an error:

Test: hama jar hama-examples-0.6.4.jar pagerank /hamaInput /hamaOutputPK6

Result:

14/06/26 21:32:37 INFO ipc.Server: Stopping IPC Server listener on 52257
14/06/26 21:32:37 DEBUG ipc.Server: Checking for old call responses.
14/06/26 21:32:37 INFO ipc.Server: Stopping IPC Server Responder
14/06/26 21:32:37 ERROR bsp.BSPTask: Shutting down ping service.
14/06/26 21:32:37 FATAL bsp.GroomServer: Error running child
java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 100016 vs. 20
at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:306)
at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:254)
at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:191)
at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1249)
java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 100016 vs. 20
at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:306)
at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:254)
at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:191)
at org.apache.hama.bsp.BSPTask.run(BSPTask.java:146)
at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1249)

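After switching to version 0.6.4 the problem went away.


2. PartitioningRunner runs out of file resources when reading and writing many files at once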

tmp/hama-parts/job_201406301636_0001/part-19/file-19 could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
The datanode log shows:
java.io.FileNotFoundException: /hadoop-1.2.0/data/blocksBeingWritten/blk_10093574377987397_21088.meta (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.createBlockWriteStreams(FSDataset.java:1156)


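Cause: take the PageRank job with 20 tasks as an example. Because of the partitioning done by PartitioningRunner, each task has to create a separate file for each of the 20 partitions it receives and then merge-sort them. HDFS therefore has to handle 20 x 20 files being written at the same time (counting the underlying blocks, the read/write pressure is even higher). This is what caused the HDFS error in my test.

Even raising the Linux ulimit to 10240 did not help.

In the end, the only thing that worked was increasing the file block size and capping the maximum number of tasks at 10.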
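
The arithmetic behind the cause and the workaround can be sketched as follows. This is only a rough estimate, assuming every task writes one file per partition and that the number of partitions equals the number of tasks; the class and method names are made up for illustration and are not part of Hama.

```java
// Rough estimate of how many partition files are being written at once.
// Assumption: each task writes one file per partition, and the number of
// partitions equals the number of tasks (as in the 20-task PageRank run).
final class PartitionFileEstimate {
    private PartitionFileEstimate() {}

    /** Files open for writing at once = tasks x partitions per task. */
    static long concurrentFiles(int tasks, int partitionsPerTask) {
        if (tasks < 0 || partitionsPerTask < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return (long) tasks * partitionsPerTask;
    }

    public static void main(String[] args) {
        System.out.println("20 tasks: " + concurrentFiles(20, 20) + " files"); // 400
        System.out.println("10 tasks: " + concurrentFiles(10, 10) + " files"); // 100
    }
}
```

Under these assumptions, halving the task count cuts the number of simultaneously written files to a quarter, which fits the observation that capping the tasks at 10 made the job run.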


3. A data problem?
14/07/02 10:38:41 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
java.lang.IllegalArgumentException: A message has recieved with a destination ID: 100003880356367481247 that does not exist! (Vertex iterator is at100003935752877898316 ID)                                                       100001890836869132179
at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:298)
at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:250)
at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:171)
at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
14/07/02 10:38:41 DEBUG bsp.Counters: Adding TASK_OUTPUT_RECORDS


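Cause: during the earlier sort phase, the comparison ultimately goes through the comparator of Text, which does not match the original data type, so the comparison result differs from the real ordering.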
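
To see why an ordering mismatch can surface as this kind of exception, here is a minimal sketch. It assumes the runner walks the sorted vertices and the sorted messages in lockstep, which is what the two error messages suggest. The class LockstepMergeSketch, the method deliver and the tiny IDs are hypothetical; this is not the GraphJobRunner code.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical lockstep walk over sorted vertex IDs and sorted message
// destination IDs. NOT Hama's GraphJobRunner, only an illustration.
final class LockstepMergeSketch {
    /** Returns the number of delivered messages; throws if a destination is not found. */
    static int deliver(List<String> vertexIds, List<String> messageIds,
                       Comparator<String> order) {
        int v = 0;
        int delivered = 0;
        for (String dest : messageIds) {
            // Advance the vertex cursor until it reaches the destination.
            while (v < vertexIds.size() && order.compare(vertexIds.get(v), dest) < 0) {
                v++;
            }
            // The cursor never moves backwards, so a message that sorts
            // before the current vertex can no longer be matched.
            if (v == vertexIds.size() || order.compare(vertexIds.get(v), dest) != 0) {
                throw new IllegalArgumentException("No vertex for message destination " + dest);
            }
            delivered++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        Comparator<String> numeric =
            (a, b) -> new BigInteger(a).compareTo(new BigInteger(b));
        List<String> vertices = Arrays.asList("9", "10", "100");   // numeric order
        List<String> sameOrder = Arrays.asList("9", "10", "100");  // numeric order
        List<String> otherOrder = Arrays.asList("10", "100", "9"); // string order
        System.out.println(deliver(vertices, sameOrder, numeric));
        try {
            deliver(vertices, otherOrder, numeric);
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The IDs here deliberately have different lengths so that numeric and string order disagree; they are not taken from the dataset.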

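How it works:

 1. PartitioningRunner first calls the Sorter that ships with SequenceFile to sort each file internally.

  SequenceFile calls MergeSort to sort each file (a merge sort, as the name suggests, rather than a quicksort), swapping only the array indices of the entries being compared. The comparison itself is: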

      class SeqFileComparator implements Comparator<IntWritable> {
        public int compare(IntWritable I, IntWritable J) {
          
          return comparator.compare(rawBuffer, keyOffsets[I.get()], 
                                    keyLengths[I.get()], rawBuffer, 
                                    keyOffsets[J.get()], keyLengths[J.get()]);
        }
      }

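   // rawBuffer is the byte array produced from rawKeys, keyOffsets holds the start offset of each key, and I is the index of the vertex in the array. A very clever approach!

 // i.e.  DataOutputBuffer rawKeys;  XXX ;  byte[]  rawBuffer = rawKeys.getData();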
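
The index-swapping idea can be sketched in plain Java. This is a minimal sketch: it uses Arrays.sort in place of Hadoop's MergeSort, and the class IndexSortSketch and its methods are illustrative names, not Hadoop API. Only the names rawBuffer, keyOffsets and keyLengths are borrowed from the snippet above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sorts record indices by comparing key bytes inside one shared buffer.
// The buffer itself is never rearranged; only the index array is.
final class IndexSortSketch {
    /** Lexicographic comparison of two unsigned byte ranges. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2;
    }

    /** Returns the record indices in sorted key order. */
    static Integer[] sortedIndices(byte[] rawBuffer, int[] keyOffsets, int[] keyLengths) {
        Integer[] pointers = new Integer[keyOffsets.length];
        for (int i = 0; i < pointers.length; i++) {
            pointers[i] = i;
        }
        Arrays.sort(pointers, (i, j) -> compareBytes(
            rawBuffer, keyOffsets[i], keyLengths[i],
            rawBuffer, keyOffsets[j], keyLengths[j]));
        return pointers;
    }

    public static void main(String[] args) {
        // Three two-byte keys stored back to back: "30", "10", "20".
        byte[] rawBuffer = "301020".getBytes(StandardCharsets.US_ASCII);
        int[] keyOffsets = {0, 2, 4};
        int[] keyLengths = {2, 2, 2};
        System.out.println(Arrays.toString(
            sortedIndices(rawBuffer, keyOffsets, keyLengths))); // [1, 2, 0]
    }
}
```

Because only small integers move during the sort, the cost of a swap does not depend on how long the keys are.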

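   The call actually ends up in the Text class, i.e. the comparison method of the vertex class used in the test case. I modified the comparison method of Text in hadoop1.2-core to handle the longer edge data from the Google dataset. To deal with the conversion between the byte stream and strings, the following snippet makes the contents of the raw data readable: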

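        // b1, s1, l1 are the buffer, start offset and length passed to the raw compare method
        byte[] key1 = new byte[40];             // assumes the key is at most 40 bytes long
        System.arraycopy(b1, s1, key1, 0, l1);
        String k1 = new String(key1).trim();    // trim() drops the unused zero bytes at the end
        BigDecimal bd1 = new BigDecimal(k1);

        OK, now the contents of the byte array can be inspected!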

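Debugging a Hama child process works much like in Hadoop. In Hadoop, however, the child JVM is often too short-lived to attach a remote debugger and hit a breakpoint, whereas with Hama it goes fairly smoothly.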

 <property>
    <name>bsp.child.java.opts</name>
    <value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,address=8792,server=y,suspend=y</value>
    <!-- Note: the options must be separated by exactly one space (for example between -Xmx1024m and -Xdebug); any extra whitespace causes an error. -->
    <description>Java opts for the groom server child processes.
    The following symbol, if present, will be interpolated: @taskid@ is replaced
    by current TaskID. Any other occurrences of '@' will go unchanged.
    For example, to enable verbose gc logging to a file named for the taskid in
    /tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
          -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
    The configuration variable bsp.child.ulimit can be used to control the
    maximum virtual memory of the child processes.
    </description>
  </property>


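Note: testing eventually showed that even for numeric values that do not fit in a Long, such as those that need BigInteger, the Text type still compares them correctly, so the changes described above were essentially unnecessary. Lesson learned.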
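
A likely explanation, which the notes themselves do not state: byte-wise comparison of decimal strings agrees with numeric comparison whenever the strings have the same number of digits, and the IDs in the log above are all 21 digits long. The check below is a minimal sketch of that observation, assuming Text ordering is lexicographic over the UTF-8 bytes. DigitOrderCheck is an illustrative name, not a Hadoop class.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Checks whether byte-wise order and numeric order agree for two decimal strings.
final class DigitOrderCheck {
    /** Lexicographic comparison of unsigned bytes. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True if both comparisons put x and y in the same relative order. */
    static boolean sameOrder(String x, String y) {
        int byteOrder = Integer.signum(compareBytes(
            x.getBytes(StandardCharsets.UTF_8), y.getBytes(StandardCharsets.UTF_8)));
        int numericOrder = Integer.signum(new BigInteger(x).compareTo(new BigInteger(y)));
        return byteOrder == numericOrder;
    }

    public static void main(String[] args) {
        // Two 21-digit IDs taken from the log above: the orders agree.
        System.out.println(sameOrder("100003880356367481247", "100003935752877898316")); // true
        // Different lengths: byte order puts "9" after "10".
        System.out.println(sameOrder("9", "10")); // false
    }
}
```

If the dataset contained IDs of different lengths, the two orders could diverge, so this only accounts for equal-length IDs.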

 Delete the lines that match a fixed pattern from a file:

 sed -i '/Love/d' 1.txt
