Lock contention when building a Lucene index with multiple threads

I recently migrated a multithreaded Lucene index builder from Lucene 2.9.1 to 3.4, and indexing time went up by half.

 

At first I suspected our own code, so I went through it from start to finish, but found nothing that would cause a performance drop.

 

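Next I checked the machine load. CPU load was somewhat lower than it had been under 2.9.1, and when I increased the number of threads the load still would not go up.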

 

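That pointed to resource contention between threads. I took a look with jstack and, sure enough, many threads were waiting on a lock, as the output below shows: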

  $ /opt/taobao/java/bin/jstack 15916|grep -A5 BLOCK

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:139)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:139)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:139)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:139)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:140)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:139)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - locked <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:551)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:554)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$SFTokenStream.<init>(LuceneDocMaker.java:516)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:554)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:551)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - waiting to lock <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:551)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - locked <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:551)

--

   java.lang.Thread.State: BLOCKED (on object monitor)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:77)

        - locked <0x0000000280000100> (a java.util.WeakHashMap)

        at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:68)

        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:280)

        at com.taobao.terminator.indexbuilder.doc.LuceneDocMaker$IntRangeTokenStream.<init>(LuceneDocMaker.java:551)
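For a quick tally without reading the dump by eye, the same information can be collected from inside the JVM. The following is only a sketch and is not part of our program: the class name BlockedThreadReport and its method are made up for illustration, while ThreadMXBean and ThreadInfo are standard JDK classes. It counts BLOCKED threads per monitor, so a single hot lock such as the WeakHashMap above stands out immediately.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

public class BlockedThreadReport {

    /** Counts BLOCKED threads per monitor, keyed by lock name (className@identityHash). */
    public static Map<String, Integer> countBlockedByLock() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        // false, false: we only need thread state and the lock being waited on.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            String lock = info.getLockName();
            if (lock == null) {
                continue;
            }
            Integer old = counts.get(lock);
            counts.put(lock, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one line per contended monitor in the current JVM.
        for (Map.Entry<String, Integer> e : countBlockedByLock().entrySet()) {
            System.out.println(e.getValue() + " thread(s) blocked on " + e.getKey());
        }
    }
}
```

Calling countBlockedByLock() periodically from a monitoring thread while the indexer runs gives roughly what the jstack and grep pipeline gives, without leaving the process.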

 

I looked at AttributeSource.java in Lucene. To speed up loading of Attribute implementation classes, the class caches the implementation class that belongs to each interface:

 

private static final WeakHashMap<Class<? extends Attribute>, WeakReference<Class<? extends AttributeImpl>>> attClassImplMap =
        new WeakHashMap<Class<? extends Attribute>, WeakReference<Class<? extends AttributeImpl>>>();

private static Class<? extends AttributeImpl> getClassForInterface(Class<? extends Attribute> attClass) {
        synchronized(attClassImplMap) {
          final WeakReference<Class<? extends AttributeImpl>> ref = attClassImplMap.get(attClass);
          Class<? extends AttributeImpl> clazz = (ref == null) ? null : ref.get();
          if (clazz == null) {
            try {
              // TODO: Remove when TermAttribute is removed!
              // This is a "sophisticated backwards compatibility hack"
              // (enforce new impl for this deprecated att):
              if (TermAttribute.class.equals(attClass)) {
                clazz = CharTermAttributeImpl.class;
              } else {
                clazz = Class.forName(attClass.getName() + "Impl", true, attClass.getClassLoader())
                  .asSubclass(AttributeImpl.class);
              }
              attClassImplMap.put(attClass,
                new WeakReference<Class<? extends AttributeImpl>>(clazz)
              );
            } catch (ClassNotFoundException e) {
              throw new IllegalArgumentException("Could not find implementing class for " + attClass.getName());
            }
          }
          return clazz;
        }
      }

 

This makes looking up the implementation class faster, but every lookup has to lock the static attClassImplMap, so with many threads the performance actually gets worse.

 

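To check this idea I looked at the 2.9.1 code and found that it also locks a static variable. I ran the full-build program on 2.9.1 and captured it with jstack: the lock waits are there too, but there are far fewer of them. The corresponding 2.9.1 code is: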

private static final IdentityHashMap/*<Class<? extends Attribute>,Class<? extends AttributeImpl>>*/ attClassImplMap = new IdentityHashMap();

private static Class getClassForInterface(Class attClass) {
        synchronized(attClassImplMap) {
          Class clazz = (Class) attClassImplMap.get(attClass);
          if (clazz == null) {
            try {
              attClassImplMap.put(attClass, clazz = Class.forName(attClass.getName() + "Impl"));
            } catch (ClassNotFoundException e) {
              throw new IllegalArgumentException("Could not find implementing class for " + attClass.getName());
            }
          }
          return clazz;
        }
      }

 

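Comparing the two: in 3.4 the map is a WeakHashMap, and both keys and values are held through weak references. A weakly referenced object can be reclaimed by the garbage collector once nothing else references it strongly. Our full-build program is often short of memory and GCs frequently, so my explanation is that the cache entries were being reclaimed regularly; each cache miss then meant loading the class again while holding the lock, so the lock was held for longer and performance was much worse than on 2.9.1.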

 

 

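Solution:

3.4 is the newer version, and the move to WeakHashMap is an improvement in itself: it keeps an unbounded number of Attribute implementation classes from piling up in the cache and consuming too much memory. In our full-build program, however, the set of Attribute implementation classes is fixed, so WeakHashMap is unnecessary and I switched the cache to an IdentityHashMap. In addition, to avoid locking the cache map altogether, I used a ThreadLocal so that each thread keeps its own cache map.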

 

I checked out the 3.4 branch from the Lucene repository and changed it as follows:

public static ThreadLocal<IdentityHashMap<Class<? extends Attribute>, Class<? extends AttributeImpl>>> tl =
        new ThreadLocal<IdentityHashMap<Class<? extends Attribute>, Class<? extends AttributeImpl>>>();

private Class<? extends AttributeImpl> getClassForInterface(Class<? extends Attribute> attClass) {
        //synchronized(attClassImplMap) {
        IdentityHashMap<Class<? extends Attribute>, Class<? extends AttributeImpl>>
            tlAttClassImplMap = AttributeSource.tl.get();
        if (tlAttClassImplMap == null) {
          // No per-thread map has been set: fall back to the shared static map
          // (attClassImplMap is now an IdentityHashMap, as described above).
          tlAttClassImplMap = attClassImplMap;
        }
        // Uncontended when the map is thread-local; still guards the shared fallback map.
        synchronized(tlAttClassImplMap) {
          Class<? extends AttributeImpl> clazz = tlAttClassImplMap.get(attClass);
          if (clazz == null) {
            try {
              // TODO: Remove when TermAttribute is removed!
              // This is a "sophisticated backwards compatibility hack"
              // (enforce new impl for this deprecated att):
              if (TermAttribute.class.equals(attClass)) {
                clazz = CharTermAttributeImpl.class;
              } else {
                clazz = Class.forName(attClass.getName() + "Impl", true, attClass.getClassLoader())
                  .asSubclass(AttributeImpl.class);
              }
              tlAttClassImplMap.put(attClass, clazz);
            } catch (ClassNotFoundException e) {
              throw new IllegalArgumentException("Could not find implementing class for " + attClass.getName());
            }
          }
          return clazz;
        }
      }

 

 

The ThreadLocal is set in the run method of the worker thread class LuceneDocMaker:

private void doRun() throws IOException, InterruptedException
    {
       AttributeSource.tl.set(new IdentityHashMap<Class<? extends Attribute>, Class<? extends AttributeImpl>>());
         ......

 

With one map per thread holding the cached implementation classes, the threads no longer contend for the lock.

AttributeSource.java has one more spot with the same problem: the static variable knownImplClasses is also locked. The fix is the same as above.
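The per-thread cache idea can also be shown on its own, outside Lucene. The sketch below is not the patched Lucene code: PerThreadImplCache, implFor, Demo and DemoImpl are hypothetical names used only for illustration. It differs from the change above in one respect: it overrides ThreadLocal.initialValue() so that each thread gets its map lazily, which means no explicit set() call in the thread's run method and no fallback to a shared map.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class PerThreadImplCache {

    // One private map per thread; initialValue() runs on the first get() in each thread.
    private static final ThreadLocal<Map<Class<?>, Class<?>>> CACHE =
        new ThreadLocal<Map<Class<?>, Class<?>>>() {
            @Override
            protected Map<Class<?>, Class<?>> initialValue() {
                return new IdentityHashMap<Class<?>, Class<?>>();
            }
        };

    /** Resolves the class named (interface name + "Impl"), caching the result per thread. */
    public static Class<?> implFor(Class<?> iface) {
        Map<Class<?>, Class<?>> cache = CACHE.get(); // thread-private, so no lock is needed
        Class<?> impl = cache.get(iface);
        if (impl == null) {
            try {
                impl = Class.forName(iface.getName() + "Impl", true, iface.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException(
                    "Could not find implementing class for " + iface.getName());
            }
            cache.put(iface, impl);
        }
        return impl;
    }

    // Hypothetical interface/implementation pair, present only to make the demo runnable.
    public interface Demo {}
    public static class DemoImpl implements Demo {}

    public static void main(String[] args) {
        System.out.println(implFor(Demo.class).getName());
    }
}
```

The trade-off is that every thread resolves each implementation class once for itself, which is cheap when the set of Attribute interfaces is small and fixed, as it is in our full-build program.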

 

While investigating I also found two more places in our own code that run into locks; both call static methods in the JDK that are synchronized:

 

} else if (fType.equals("D")) {
                     Date date = new SimpleDateFormat("yyyyMMdd").parse(svalue);
                     Calendar calendar = Calendar.getInstance();
                     calendar.setTime(date);
                     calendar.set(Calendar.HOUR_OF_DAY, 0);
                     calendar.set(Calendar.MINUTE, 0);
                     calendar.set(Calendar.SECOND, 0);
                     calendar.set(Calendar.MILLISECOND, 0);
}

 

java.lang.Thread.State: BLOCKED on java.lang.Class@2fa3ff86 owned by: some_thread
 at java.util.TimeZone.getDefaultInAppContext(TimeZone.java:723)
 at java.util.TimeZone.getDefaultRef(TimeZone.java:619)
 at java.util.Calendar.getInstance(Calendar.java:968)

 

 

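The fix: make the Calendar and the SimpleDateFormat member variables instead of creating new ones for every record. Neither class is thread-safe, so the instances must not be shared between threads.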
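A minimal sketch of that change, assuming each worker thread owns its own parser object. DayParser and parseDay are hypothetical names, not taken from our code base, and wrapping ParseException in an IllegalArgumentException is a choice made only to keep the example short.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;

public class DayParser {

    // Created once per DayParser instance rather than once per record.
    // Assumption: one DayParser per worker thread, since neither field is thread-safe.
    private final SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
    private final Calendar calendar = Calendar.getInstance();

    /** Parses a yyyyMMdd string and returns the epoch millis of midnight on that day. */
    public long parseDay(String value) {
        try {
            calendar.setTime(format.parse(value));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMdd date: " + value, e);
        }
        calendar.set(Calendar.HOUR_OF_DAY, 0);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        return calendar.getTimeInMillis();
    }

    public static void main(String[] args) {
        DayParser parser = new DayParser(); // in the indexer: one per thread
        System.out.println(parser.parseDay("20120615"));
    }
}
```

Calendar.getInstance() and the SimpleDateFormat constructor now run once per thread, so the TimeZone lock seen in the stack trace above is no longer taken for every record.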

 

 

 

  

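After these changes I ran the build again and the effect was obvious: it is now even a little faster than on 2.9.1 (2.9.1 has the same lock, it is just not held for long).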

 

 
