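Concurrency in JDK 5.0

About this tutorial

What is this tutorial about?

JDK 5.0 is a major step forward for the creation of highly scalable concurrent applications in the Java language. The JVM has been improved to allow classes to take advantage of hardware-level concurrency support, and a rich set of new concurrency building blocks has been provided to make it easier to develop concurrent applications.

This tutorial covers the new utility classes for concurrency provided by JDK 5.0 and demonstrates how these classes offer improved scalability compared to the existing concurrency primitives (synchronized, wait(), and notify()).

Should I take this tutorial?

While this tutorial is aimed at a wide range of levels, it is assumed that readers have a basic understanding of threads, concurrency, and the concurrency primitives provided by the Java language, particularly the semantics and correct use of synchronization.

Beginning readers may wish to first consult the "Introduction to Java Threads" tutorial (see Resources), or read the concurrency chapter of a general-purpose introductory text on the Java language.

Many of the classes in the java.util.concurrent package use generics, as java.util.concurrent has other strong dependencies on the JDK 5.0 JVM. Users not familiar with generics may wish to consult resources on the new generics facility in JDK 5.0. (If you are not familiar with generics, you may find it useful to simply ignore whatever is inside the angle brackets in class and method signatures on your first pass through this tutorial.)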

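What's new in JDK 5.0 for concurrency

The java.util.concurrent package contains a wealth of thread-safe, well-tested, high-performance concurrent building blocks. The goal for the creation of java.util.concurrent was, quite immodestly, to do for concurrency what the Collections framework did for data structures. By providing a set of reliable, high-performance concurrency building blocks, it lets developers improve the thread safety, scalability, performance, readability, and reliability of their concurrent classes.

If some of the class names look familiar, it is probably because many of the concepts in java.util.concurrent are derived from Doug Lea's util.concurrent library (see Resources).

The improvements for concurrency in JDK 5.0 can be divided into three groups:

  • JVM-level changes. Most modern processors have some hardware-level support for concurrency, usually in the form of a compare-and-swap (CAS) instruction. CAS is a low-level, fine-grained technique for allowing multiple threads to update a single memory location while being able to detect and recover from interference from other threads. It is the basis for many high-performance concurrent algorithms. Prior to JDK 5.0, the only primitive in the Java language for coordinating access between threads was synchronization, which is more heavyweight and coarse-grained. Exposing CAS makes it possible to develop highly scalable concurrent Java classes. These changes are intended primarily for use by JDK library classes, not by developers.
  • Low-level utility classes -- locking and atomic variables. Using CAS as a concurrency primitive, the ReentrantLock class provides the same locking and memory semantics as the synchronized primitive, while offering better control over locking (such as timed lock waits, lock polling, and interruptible lock waits) and better scalability (higher performance under contention). Most developers will not use the ReentrantLock class directly, but will instead use the high-level classes built atop it.
  • High-level utility classes. These are classes that implement the concurrency building blocks described in every computer science text -- semaphores, mutexes, latches, barriers, exchangers, thread pools, and thread-safe collection classes. Most developers will be able to use these classes to replace many, if not all, uses of synchronization, wait(), and notify() in their applications, likely to the benefit of performance, readability, and correctness.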
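To make the compare-and-swap idea concrete, here is a minimal sketch of a CAS retry loop built on the atomic variable class AtomicInteger. The CasCounter class and its demo() helper are hypothetical names chosen for illustration; only AtomicInteger and its get() and compareAndSet() methods come from the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Increments the counter with an explicit CAS retry loop.
    int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            // Succeeds only if no other thread changed the value since the read above.
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // Another thread interfered: re-read the value and try again.
        }
    }

    int get() {
        return value.get();
    }

    // Runs several threads that increment concurrently and returns the final count.
    static int demo(int nThreads, final int incrementsPerThread) {
        final CasCounter counter = new CasCounter();
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < incrementsPerThread; j++) {
                        counter.increment();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000));
    }
}
```

No thread ever blocks in increment(); a thread that loses the race simply retries. In practice AtomicInteger.incrementAndGet() does the same job, so the explicit loop is shown only to expose the technique.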

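Roadmap

This tutorial will focus primarily on the higher-level utility classes provided by the java.util.concurrent package -- thread-safe collections, thread pools, and synchronization utilities. These are classes that both novices and experts can use "out of the box."

In the first section, we'll review the basics of concurrency, although this review is no substitute for an understanding of threads and thread safety. Readers who are not familiar with threading at all should probably first consult an introduction to threads, such as the "Introduction to Java Threads" tutorial (see Resources).

The next several sections explore the high-level utility classes in java.util.concurrent -- thread-safe collections, thread pools, semaphores, and synchronizers.

The final sections cover the low-level concurrency building blocks in java.util.concurrent, and offer some performance measurements showing the improved scalability of the new java.util.concurrent classes.

Environmental requirements

The java.util.concurrent package is tightly tied to JDK 5.0; there is no backport to previous JVM versions. The code examples in this tutorial will not compile or run on JVMs prior to 5.0, and many of the code examples use generics, the enhanced for loop, or other new language features from JDK 5.0.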

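Concurrency basics

What are threads?

All nontrivial operating systems support the concept of processes -- independently running programs that are isolated from each other to some degree.

Threads are sometimes referred to as lightweight processes. Like processes, they are independent, concurrent paths of execution through a program, and each thread has its own program counter, call stack, and local variables. However, threads exist within a process, and they share memory, file handles, and per-process state with other threads within the same process.

Nearly every operating system today also supports threads, allowing multiple, independently schedulable threads of execution to coexist within a single process. Because threads within a process execute within the same address space, multiple threads can simultaneously access the same objects, and they allocate objects from the same heap. While this makes it easier for threads to share information with each other, it also means that you must take care to ensure that threads do not interfere with each other.

When used correctly, threads enable a variety of benefits, including better resource utilization, simplified development, higher throughput, more responsive user interfaces, and the ability to perform asynchronous processing.

The Java language includes primitives for coordinating the behavior of threads so that shared variables can be accessed and modified safely without violating design invariants or corrupting data structures.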

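What are threads good for?

Many reasons exist for using threads in Java programs, and nearly every Java application uses threads, whether the developer knows it or not. Many J2SE and J2EE facilities create threads, such as RMI, Servlets, Enterprise JavaBeans components, and the Swing GUI toolkit.

Reasons for using threads include:

  • More responsive user interfaces. Event-driven GUI toolkits, such as AWT or Swing, use a separate event thread to process GUI events. Event listeners, registered with GUI objects, are called from within the event thread. However, if an event listener were to perform a lengthy task (such as spell-checking a document), the UI would appear to freeze, because the event thread could not process other events until the lengthy task completed. By executing lengthy operations in a separate thread, the UI can remain responsive while lengthy background tasks are executing.
  • Exploiting multiple processors. Multiprocessor (MP) systems are becoming cheaper and more widespread every year. Because the basic unit of scheduling is usually the thread, a single-threaded application can run on only a single processor at a time, no matter how many processors are available. In a well-designed program, multiple threads can improve throughput and performance by better utilizing available computing resources.
  • Simplicity of modeling. Effective use of threads can make your programs simpler to write and maintain. By the judicious use of threads, individual classes can be insulated from the details of scheduling, interleaved operations, asynchronous IO and resource waits, and other complications. Instead, they can focus exclusively on the domain requirements, simplifying development and improving reliability.
  • Asynchronous or background processing. Server applications may serve many simultaneous remote clients. If an application goes to read from a socket, and there is no data available to read, the call to read() will block until data is available. In a single-threaded application, this means that not only will processing of the corresponding request stall, but processing of all requests will stall while that single thread is blocked. However, if each socket had its own IO thread, then one thread blocking would have no effect on the behavior of other concurrent requests.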

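Thread safety

Ensuring that classes are thread-safe is difficult but necessary if those classes are to be used in a multithreaded environment. One of the goals of the java.util.concurrent specification process was to provide a set of thread-safe, high-performance concurrent building blocks, so that developers are relieved of some of the burden of writing thread-safe classes.

Defining thread safety clearly is surprisingly hard, and most definitions seem downright circular. A quick Google search reveals the following examples of typical but unhelpful definitions (or, rather, descriptions) of thread-safe code:

  • ...can be called from multiple programming threads without unwanted interaction between the threads.
  • ...may be called by more than one thread at a time without requiring any other action on the caller's part.

With definitions like these, it's no wonder we're so confused by thread safety. These definitions are no better than saying "a class is thread-safe if it can be called safely from multiple threads." Which is, of course, what it means, but that doesn't help us tell a thread-safe class from an unsafe one. What do we mean by "safe"?

For a class to be thread-safe, it first must behave correctly in a single-threaded environment. If a class is correctly implemented, which is another way of saying that it conforms to its specification, no sequence of operations (reads or writes of public fields, and calls to public methods) on objects of that class should be able to put the object into an invalid state; observe the object to be in an invalid state; or violate any of the class's invariants, preconditions, or postconditions.

Furthermore, for a class to be thread-safe, it must continue to behave correctly, in the sense described above, when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and without any additional synchronization on the part of the calling code. The effect is that operations on a thread-safe object will appear to all threads to occur in a fixed, globally consistent order.

In the absence of some sort of explicit coordination between threads, such as locking, the runtime is free to interleave the execution of operations in multiple threads as it sees fit.

Until JDK 5.0, the primary mechanism for ensuring thread safety was the synchronized primitive. Threads that access shared variables (those that are reachable by more than one thread) must use synchronization to coordinate both read and write access to shared variables. The java.util.concurrent package offers some alternate concurrency primitives, as well as a set of thread-safe utility classes that require no additional synchronization.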

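Concurrency, reluctantly

Even if your program never explicitly creates a thread, threads may be created on your behalf by a variety of facilities or frameworks, requiring that classes called from these threads be thread-safe. This can place a significant design and implementation burden on developers, as developing thread-safe classes requires more care and analysis than developing non-thread-safe classes.

AWT and Swing
These GUI toolkits create a background thread, called the event thread, from which listeners registered with GUI components will be called. Therefore, the classes that implement these listeners must be thread-safe.

TimerTask
The TimerTask facility, introduced in JDK 1.3, allows you to execute a task at a later time or schedule tasks for periodic execution. TimerTask events execute in the Timer thread, which means that tasks executed as TimerTasks must be thread-safe.

Servlets and JavaServer Pages technology
Servlet containers create multiple threads, and may call a given servlet simultaneously for multiple requests in multiple threads. Servlet classes must therefore be thread-safe.

RMI
The remote method invocation (RMI) facility allows you to invoke operations running in other JVMs. The most common way to implement a remote object is by extending UnicastRemoteObject. When a UnicastRemoteObject is instantiated, it is registered with the RMI dispatcher, which may create one or more threads in which remote methods will be executed. Therefore, remote classes must be thread-safe.

As you can see, many situations occur in which classes may be called from other threads, even if your application never explicitly creates a thread. Fortunately, the classes in java.util.concurrent can greatly simplify the task of writing thread-safe classes.

Example -- a non-thread-safe servlet

The following servlet looks like a harmless guestbook servlet, which saves the name of every visitor. However, this servlet is not thread-safe, and servlets are supposed to be thread-safe. The problem is that it uses a HashSet to store the names of the visitors, and HashSet is not a thread-safe class.

When we say this servlet is not thread-safe, the downside is not limited to losing a guestbook entry. In the worst case, our guestbook data structure could be irretrievably corrupted.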

public class UnsafeGuestbookServlet extends HttpServlet {
    
    private Set visitorSet = new HashSet();

    protected void doGet(HttpServletRequest httpServletRequest, 
             HttpServletResponse httpServletResponse) throws ServletException, 
               IOException {
        String visitorName = httpServletRequest.getParameter("NAME");
        if (visitorName != null)
            visitorSet.add(visitorName);
    }
}

The class could be made thread-safe by changing the definition of visitorSet to

private Set visitorSet = Collections.synchronizedSet(new HashSet());

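Examples like this one show how built-in support for threading is a double-edged sword -- while it makes it easier to build multithreaded applications, it also requires developers to be more aware of concurrency issues, even when developing something as mundane as a guestbook servlet.

Thread-safe collections

Introduction

The Collections framework, introduced in JDK 1.2, is a highly flexible framework for representing collections of objects, using the basic interfaces List, Set, and Map. Several implementations of each are provided by the JDK (HashMap, Hashtable, TreeMap, WeakHashMap, HashSet, TreeSet, Vector, ArrayList, LinkedList, and so on). Some of these are already thread-safe (Hashtable and Vector), and the remainder can be rendered thread-safe by the synchronized wrapper factories (Collections.synchronizedMap(), synchronizedList(), and synchronizedSet()).

The java.util.concurrent package adds several new thread-safe collection classes (ConcurrentHashMap, CopyOnWriteArrayList, and CopyOnWriteArraySet). The purpose of these classes is to provide high-performance, highly scalable, thread-safe versions of the basic collection types.

The thread-safe collections in java.util still have some drawbacks. For example, it is generally necessary to hold the lock on a collection while iterating over it; otherwise you risk a ConcurrentModificationException. (This characteristic is sometimes called conditional thread safety; see Resources for more explanation.) Further, these classes often perform poorly if the collection is accessed frequently from multiple threads. The new collection classes in java.util.concurrent enable higher concurrency at the cost of some small changes in semantics.

JDK 5.0 also offers two new collection interfaces -- Queue and BlockingQueue. The Queue interface is similar to List, but permits insertion only at the tail and removal only from the head. By eliminating the random-access requirements of List, it becomes possible to create Queue implementations with better performance than the existing ArrayList and LinkedList implementations. Because many applications of List do not in fact need random access, Queue can often be substituted for List, with the result being better performance.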

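Weakly consistent iterators

The collection classes in the java.util package all return fail-fast iterators, which means that they assume a collection will not change its contents while a thread is iterating through them. If a fail-fast iterator detects that a modification has been made during iteration, it throws ConcurrentModificationException, which is an unchecked exception.

The requirement that a collection not change during iteration is often inconvenient for concurrent applications. Instead, it may be preferable to allow concurrent modification and ensure that iterators simply make a reasonable effort to provide a consistent view of the collection, as the iterators in the java.util.concurrent collection classes do.

The iterators returned by java.util.concurrent collections are called weakly consistent iterators. With these classes, if an element has been removed since iteration began, and has not yet been returned by the next() method, it will not be returned to the caller. If an element has been added since iteration began, it may or may not be returned to the caller. And no element will be returned twice in a single iteration, regardless of how the underlying collection has changed.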
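The difference between fail-fast and weakly consistent iterators can be observed even from a single thread. The sketch below, in which IteratorDemo and its method name are hypothetical, modifies a map while iterating over its key set and reports whether ConcurrentModificationException was thrown.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class, for illustration only.
class IteratorDemo {
    // Returns true if adding a key during iteration makes the iterator fail.
    static boolean failsWhenModifiedDuringIteration(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // The first call adds a new key, a structural modification;
                // later calls merely overwrite the existing value.
                map.put("z", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: "
            + failsWhenModifiedDuringIteration(new HashMap<String, Integer>()));
        System.out.println("ConcurrentHashMap fails: "
            + failsWhenModifiedDuringIteration(new ConcurrentHashMap<String, Integer>()));
    }
}
```

With HashMap the iterator detects the modification and throws; with ConcurrentHashMap the loop completes, and the newly added key may or may not be visited.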

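CopyOnWriteArrayList and CopyOnWriteArraySet

You can create a thread-safe array-backed List in two ways -- Vector, or wrapping an ArrayList with Collections.synchronizedList(). The java.util.concurrent package adds the awkwardly named CopyOnWriteArrayList. Why would we want a new thread-safe List class? Why isn't Vector sufficient?

The simple answer has to do with the interaction between iteration and concurrent modification. With Vector or with the synchronized List wrapper, the iterators returned are fail-fast, meaning that if any other thread modifies the List during iteration, iteration may fail.

A very common application for Vector is to store a list of listeners registered with a component. When a suitable event occurs, the component iterates through the list of listeners, calling each one. To prevent ConcurrentModificationException, the iterating thread must either copy the list or lock the list for the entire iteration -- both of which have a significant performance cost.

The CopyOnWriteArrayList class avoids this problem by creating a new copy of the backing array every time an element is added or removed, while iterations in progress keep working on the copy that was current when the iterator was created. While the copying has some cost as well, in many situations iterations greatly outnumber modifications, and in those cases copy-on-write offers better performance and concurrency than the alternatives.

If your application requires the semantics of Set instead of List, there is a Set version as well -- CopyOnWriteArraySet.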
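The listener-list use case might be sketched as follows. EventSource and its methods are hypothetical names; the point is only that a listener can unregister itself while the list is being iterated, because the iteration runs over the snapshot taken when the iterator was created.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical component that notifies registered listeners.
class EventSource {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<Runnable>();

    void addListener(Runnable listener) {
        listeners.add(listener);
    }

    void removeListener(Runnable listener) {
        listeners.remove(listener);
    }

    // Calls every listener in the current snapshot and returns how many were called.
    int fire() {
        int notified = 0;
        for (Runnable listener : listeners) {
            listener.run();
            notified++;
        }
        return notified;
    }

    // Fires two events; the first listener unregisters itself during the first one.
    static int[] demo() {
        final EventSource source = new EventSource();
        source.addListener(new Runnable() {
            public void run() {
                source.removeListener(this); // modifies the list mid-iteration
            }
        });
        source.addListener(new Runnable() {
            public void run() { }
        });
        int first = source.fire();   // iterates the snapshot holding both listeners
        int second = source.fire();  // iterates the new array holding one listener
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " then " + counts[1]);
    }
}
```

No locking or defensive copying is needed in fire(); the cost is paid in addListener() and removeListener(), which copy the backing array.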

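ConcurrentHashMap

Just as there are already thread-safe implementations of List, you can create a thread-safe hash-based Map in several ways -- Hashtable, or wrapping a HashMap with Collections.synchronizedMap(). JDK 5.0 adds the ConcurrentHashMap implementation, which offers the same basic thread-safe Map functionality, but with greatly improved concurrency.

The simple approach to synchronization taken by both Hashtable and synchronizedMap -- synchronizing each method on the Hashtable or the synchronized Map wrapper object -- has two principal deficiencies. It is an impediment to scalability, because only one thread can access the hash table at a time. At the same time, it is insufficient to provide true thread safety, in that many common compound operations still require additional synchronization. While simple operations such as get() and put() can complete safely without additional synchronization, several common sequences of operations, such as iteration or put-if-absent, still require external synchronization to avoid data races.

Hashtable and Collections.synchronizedMap achieve thread safety by synchronizing every method. This means that while one thread is executing one of the Map methods, no other thread can do so until the first thread has finished, regardless of what the other threads want to do with the Map.

By contrast, ConcurrentHashMap allows multiple reads to almost always execute concurrently, reads and writes to usually execute concurrently, and multiple simultaneous writes to often execute concurrently. The result is much higher concurrency when multiple threads need to access the same Map.

In most cases, ConcurrentHashMap is a drop-in replacement for Hashtable or Collections.synchronizedMap(new HashMap()). However, there is one significant difference -- synchronizing on a ConcurrentHashMap instance does not lock the map for exclusive use. In fact, there is no way to lock a ConcurrentHashMap for exclusive use, because it is designed to be accessed concurrently. To make up for the fact that the collection cannot be locked for exclusive use, additional (atomic) methods for common compound operations, such as put-if-absent, are provided. The iterators returned by ConcurrentHashMap are weakly consistent, meaning that they will not throw ConcurrentModificationException and will make "reasonable efforts" to reflect modifications to the Map that are made by other threads during iteration.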
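As a small sketch of the atomic put-if-absent operation, the hypothetical VisitorRegistry class below records the first visit time for each name. With a synchronized Map, the equivalent "check with containsKey(), then put()" sequence would need an external lock to be safe; here the single putIfAbsent() call does both steps atomically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical class, for illustration only.
class VisitorRegistry {
    private final ConcurrentMap<String, Long> firstVisit =
        new ConcurrentHashMap<String, Long>();

    // Returns true only for the call that recorded this visitor first.
    boolean register(String name, long timestamp) {
        // putIfAbsent() returns null when no mapping existed before the call.
        return firstVisit.putIfAbsent(name, Long.valueOf(timestamp)) == null;
    }

    Long firstVisitOf(String name) {
        return firstVisit.get(name);
    }

    public static void main(String[] args) {
        VisitorRegistry registry = new VisitorRegistry();
        System.out.println(registry.register("alice", 1L));
        System.out.println(registry.register("alice", 2L));
        System.out.println(registry.firstVisitOf("alice"));
    }
}
```

If several threads call register() with the same name at once, exactly one of them sees true, and the stored timestamp is the one supplied by that thread.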

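Queue

The original collections framework included three interfaces -- List, Map, and Set. List describes an ordered collection of elements, supporting full random access -- an element can be added, fetched, or removed from any position.

The LinkedList class is often used to store a list, or queue, of work elements -- tasks waiting to be executed. However, the List interface offers far more flexibility than is needed for this common application, which in general only inserts elements at the tail and removes elements from the head. The requirement to support the full List interface makes LinkedList less efficient for this task than it might otherwise be. The Queue interface is much simpler than List -- it includes only methods for inserting an element and for retrieving or removing the head -- and enables more efficient implementations than LinkedList.

The Queue interface also allows the implementation to determine the order in which elements are stored. The ConcurrentLinkedQueue class implements a first-in-first-out (FIFO) queue, whereas the PriorityQueue class implements a priority queue (also called a heap), which is useful for building schedulers that must execute tasks in order of priority or desired execution time.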

interface Queue<E> extends Collection<E> {
    boolean offer(E x);
    E poll();
    E remove() throws NoSuchElementException;
    E peek();
    E element() throws NoSuchElementException;
}

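The classes that implement Queue are:

  • LinkedList -- has been retrofitted to implement Queue
  • PriorityQueue -- a non-thread-safe priority queue (heap) implementation, returning elements according to natural order or a comparator
  • ConcurrentLinkedQueue -- a fast, thread-safe, non-blocking FIFO queue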
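The following sketch exercises the Queue methods with two of these implementations. QueueDemo and its method names are hypothetical. It illustrates that PriorityQueue returns elements in their natural order rather than insertion order, and that poll() and peek() signal an empty queue by returning null whereas remove() throws.

```java
import java.util.LinkedList;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical class, for illustration only.
class QueueDemo {
    // Inserts 3, 1, 2 and returns the elements in the order poll() yields them.
    static String drainPriorityQueue() {
        Queue<Integer> queue = new PriorityQueue<Integer>();
        queue.offer(3);
        queue.offer(1);
        queue.offer(2);
        StringBuilder order = new StringBuilder();
        Integer head;
        while ((head = queue.poll()) != null) { // poll() returns null when empty
            order.append(head);
        }
        return order.toString();
    }

    // Returns true if remove() on an empty queue throws NoSuchElementException.
    static boolean removeOnEmptyThrows() {
        Queue<String> queue = new LinkedList<String>();
        if (queue.peek() != null) {             // peek() returns null when empty
            return false;
        }
        try {
            queue.remove();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(drainPriorityQueue());
        System.out.println(removeOnEmptyThrows());
    }
}
```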

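BlockingQueue

Queues can be bounded or unbounded. Attempting to add an element to a bounded queue that is already full will fail, as will attempting to remove an element from an empty queue.

Sometimes, it is more desirable to cause a thread to block when a queue operation would otherwise fail. In addition to not requiring the calling class to deal with failure and retry, blocking also has the benefit of flow control -- when a consumer is removing elements from a queue more slowly than the producers are putting them on the queue, forcing the producers to block will throttle the producers. Contrast this with an unbounded, nonblocking queue -- if the imbalance between producer and consumer is long-lived, the system may run out of memory as the queue length grows without bound. A bounded, blocking queue allows you to limit the resources used by a given queue in a graceful and natural way.

The classes that implement BlockingQueue are:

  • LinkedBlockingQueue -- a bounded or unbounded FIFO blocking queue implemented like a linked list
  • PriorityBlockingQueue -- an unbounded blocking priority queue
  • ArrayBlockingQueue -- a bounded FIFO blocking queue backed by an array
  • SynchronousQueue -- not really a queue, but facilitates synchronous handoff between cooperating threads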
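A minimal producer/consumer sketch over a bounded blocking queue might look like this. The class name, the method names, and the DONE sentinel value are assumptions made for illustration. The queue holds at most two elements, so put() blocks the producer whenever it gets more than two elements ahead of the consumer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical class, for illustration only.
class ProducerConsumerDemo {
    private static final int DONE = -1; // sentinel telling the consumer to stop

    // Sends 1..count through a bounded queue and returns the sum the consumer saw.
    static int sumThroughQueue(final int count) {
        final BlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(2);
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 1; i <= count; i++) {
                        queue.put(i);        // blocks while the queue is full
                    }
                    queue.put(DONE);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        int sum = 0;
        try {
            while (true) {
                int item = queue.take();     // blocks while the queue is empty
                if (item == DONE) {
                    break;
                }
                sum += item;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    // Returns true if the non-blocking offer() is refused once the queue is full.
    static boolean offerFailsWhenFull() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<String>(1);
        return queue.offer("first") && !queue.offer("second");
    }

    public static void main(String[] args) {
        System.out.println(sumThroughQueue(100));
        System.out.println(offerFailsWhenFull());
    }
}
```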

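Task management

Thread creation

One of the most common applications for threads is to create one or more threads for the purpose of executing specific types of tasks. The Timer class creates a thread for executing TimerTask objects, and Swing creates a thread for processing UI events. In both of these cases, the tasks that execute in the separate thread are supposed to be short-lived -- these threads exist to service a potentially large number of short-lived tasks.

In each of these cases, these threads generally have a very simple structure: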

while (true) { 
  if (no tasks) 
    wait for a task; 
  execute the task;
}

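Threads are created by instantiating an object that derives from Thread and calling the Thread.start() method. You can create a thread in two ways -- by extending Thread and overriding the run() method, or by implementing the Runnable interface and using the Thread(Runnable) constructor: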

class WorkerThread extends Thread { 
  public void run() { /* do work */ }
}
Thread t = new WorkerThread();
t.start();

or:

Thread t = new Thread(new Runnable() { 
  public void run() { /* do work */ }
});
t.start();

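Reusing threads

Frameworks like the Swing GUI framework create a single thread for event tasks instead of spawning a new thread for each task, for several reasons. The first is that there is some overhead to creating threads, so creating a thread to execute a simple task would be a waste of resources. By reusing the event thread to process multiple events, the startup and teardown cost (which varies by platform) is amortized over many events.

Another reason that Swing uses a single background thread for events is to ensure that events will not interfere with each other, because the next event will not start being processed until the previous event is finished. This approach simplifies the writing of event handlers. With multiple threads, it would take more work to ensure that only one thread is executing thread-sensitive code at a time.

How not to manage tasks

Most server applications, such as Web servers, POP servers, database servers, or file servers, process requests on behalf of remote clients, which usually use a socket to connect to the server. For each request, there is generally a small amount of processing (go get this block of this file and send it back down the socket), but a potentially large (and unbounded) number of clients requesting service.

A simplistic model for building a server application would be to spawn a new thread for every request. The following code fragment implements a simple Web server, which accepts socket connections on port 80 and spawns a new thread to handle each request. Unfortunately, this code would not be a good way to implement a Web server, as it will fail under heavy load, taking down the entire server.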

class UnreliableWebServer { 
  public static void main(String[] args) throws IOException {
    ServerSocket socket = new ServerSocket(80);
    while (true) {
      final Socket connection = socket.accept();
      Runnable r = new Runnable() {
        public void run() {
          // handleRequest() is assumed to be defined elsewhere
          handleRequest(connection);
        }
      };
      // Don't do this!
      new Thread(r).start();
    }
  }
}

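The UnreliableWebServer class deals poorly with the situation where the server is overwhelmed by requests. Every time a request comes in, a new thread is created. Depending on your operating system and available memory, the number of threads you can create is limited. Unfortunately, you don't always know what that limit is -- you only find out when your application crashes with an OutOfMemoryError.

If you throw HTTP requests at this server fast enough, eventually one of the thread creations will fail, with the resulting Error taking down the entire application. And there's no reason to create a thousand threads when you can only service a few dozen of them effectively at a time -- such a use of resources will likely hurt performance anyway. Creating a thread uses a fair bit of memory -- there are two stacks (Java and C), plus per-thread data structures. And if you create too many threads, each of them will get very little CPU time anyway, with the result that you are using a lot of memory to service a large number of threads, each of which is running very slowly. This isn't a good use of computing resources.

Thread pools to the rescue

Creating a new thread for a task is not necessarily bad, but if the frequency of task creation is high and the mean task duration is low, we can see how spawning a new thread per task will create performance (and, if the load is unpredictable, stability) problems.

If it is not to create a new thread per task, a server application must have some means of limiting how many requests are being processed at one time. This means that it cannot simply call

new Thread(runnable).start()

every time it needs to start a new task.

The classic mechanism for managing a large group of small tasks is to combine a work queue with a thread pool. A work queue is simply a queue of tasks to be processed, and the Queue classes described earlier fit the bill exactly. A thread pool is a collection of threads that each feed off of the common work queue. When one of the worker threads completes the processing of a task, it goes back to the queue to see if there are more tasks to process. If there are, it dequeues the next task and starts processing it.

A thread pool offers a solution to both the problem of thread life-cycle overhead and the problem of resource thrashing. By reusing threads for multiple tasks, the thread-creation overhead is spread over many tasks. As a bonus, because the thread already exists when a request arrives, the delay introduced by thread creation is eliminated. Thus, the request can be serviced immediately, rendering the application more responsive. Furthermore, by properly tuning the number of threads in the thread pool, you can prevent resource thrashing by forcing any request in excess of a certain threshold to wait until a thread is available to process it; a waiting request consumes fewer resources than an additional thread would.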

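The Executor framework

The java.util.concurrent package contains a flexible thread pool implementation, but even more valuable, it contains an entire framework for managing the execution of tasks that implement Runnable. This framework is called the Executor framework.

The Executor interface is quite simple. It describes an object whose purpose is to run Runnables: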

public interface Executor { 
  void execute(Runnable command);
}

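Which thread the task runs in is not specified by the interface -- that depends on which implementation of Executor you are using. It could run in a background thread, like the Swing event thread, or in a pool of threads, or in the calling thread, or in a new thread, or even in another JVM! By submitting the task through the standardized Executor interface, the task submission is decoupled from the task execution policy. The Executor interface concerns itself solely with task submission -- it is the choice of Executor implementation that determines the execution policy. This makes it much easier to tune the execution policy (queue bounds, pool size, prioritization, and so on) at deployment time, with minimal code changes.

Most of the Executor implementations in java.util.concurrent also implement the ExecutorService interface, an extension of Executor that also manages the lifecycle of the execution service. This makes it easier for them to be managed and to provide services to an application whose lifetime may be longer than that of an individual Executor.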

public interface ExecutorService extends Executor {
  void shutdown();
  List<Runnable> shutdownNow();
  boolean isShutdown();
  boolean isTerminated();
  boolean awaitTermination(long timeout,
                           TimeUnit unit)
      throws InterruptedException;
  // other convenience methods for submitting tasks
}

Executors

The java.util.concurrent package contains several implementations of Executor, each of which implements a different execution policy. What is an execution policy? An execution policy defines when and in what thread a task will run, what level of resources (threads, memory, and so on) the execution service may consume, and what to do if the executor is overloaded.

Rather than being instantiated through constructors, executors are generally instantiated through factory methods. The Executors class contains static factory methods for constructing a number of different kinds of Executor implementations:

  • Executors.newCachedThreadPool() -- Creates a thread pool that is not limited in size, but which will reuse previously created threads when they are available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for 60 seconds are terminated and removed from the cache.
  • Executors.newFixedThreadPool(int n) -- Creates a thread pool that reuses a fixed set of threads operating off a shared unbounded queue. If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.
  • Executors.newSingleThreadExecutor() -- Creates an Executor that uses a single worker thread operating off an unbounded queue, much like the Swing event thread. Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time.
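As a rough sketch of how a pool obtained from one of these factory methods is used and then shut down, the hypothetical PoolLifecycleDemo class below submits a batch of trivial tasks, requests an orderly shutdown, and waits for the tasks to finish. The pool size of 4 and the 10-second timeout are arbitrary values chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only.
class PoolLifecycleDemo {
    // Runs nTasks trivial tasks on a fixed pool and returns how many completed.
    static int runTasks(int nTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        final AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < nTasks; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, but finish the queued ones
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // give up waiting and cancel what remains
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(50));
    }
}
```

Without the shutdown() call, the pool's worker threads would stay alive waiting for more work and could keep the JVM from exiting.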

A more reliable Web server -- using Executor

The code in "How not to manage tasks" earlier showed how not to write a reliable server application. Fortunately, fixing this example is quite easy -- replace the Thread.start() call with submitting a task to an Executor:

class ReliableWebServer { 
  // static so that it can be used from main()
  static final Executor pool =
    Executors.newFixedThreadPool(7);

  public static void main(String[] args) throws IOException {
    ServerSocket socket = new ServerSocket(80);
    while (true) {
      final Socket connection = socket.accept();
      Runnable r = new Runnable() {
        public void run() {
          handleRequest(connection);
        }
      };
      pool.execute(r);
    }
  }
}

Note that the only difference between this example and the previous example is the creation of the Executor and how tasks are submitted for execution.

Customizing ThreadPoolExecutor

The Executor instances returned by the newFixedThreadPool and newCachedThreadPool factory methods in Executors are instances of the class ThreadPoolExecutor, which is highly customizable.

The creation of a pool thread can be customized by using a version of the factory method or constructor that takes a ThreadFactory argument. A ThreadFactory is a factory object that constructs new threads to be used by an executor. Using a customized thread factory gives you the opportunity to create threads that have a useful thread name, are daemon threads, belong to a specific thread group, or have a specific priority.

The following is an example of a thread factory that creates daemon threads instead of user threads:

public class DaemonThreadFactory implements ThreadFactory {
    public Thread newThread(Runnable r) {
        Thread thread = new Thread(r);
        thread.setDaemon(true);
        return thread;
    }
}

Sometimes an Executor cannot execute a task, either because it has been shut down, or because the Executor uses a bounded queue for storing waiting tasks, and the queue is full. In that case, the executor's RejectedExecutionHandler is consulted to determine what to do with the task -- throw an exception (the default), discard the task, execute the task in the caller's thread, or discard the oldest task in the queue to make room for the new task. The rejected execution handler can be set by ThreadPoolExecutor.setRejectedExecutionHandler.

You can also extend ThreadPoolExecutor, and override the methods beforeExecute and afterExecute, to add instrumentation, add logging, add timing, reinitialize thread-local variables, or make other execution customizations.
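The sketch below combines two of these customizations; the class name BoundedPoolDemo and the chosen sizes are assumptions for illustration. It builds a ThreadPoolExecutor with one thread and a one-element bounded queue, installs the CallerRunsPolicy handler, and overrides afterExecute to count tasks that ran on a pool thread. The first task occupies the only worker, the second fills the queue, and the third is rejected and therefore runs in the submitting thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only.
class BoundedPoolDemo {
    // Returns { tasks run by the pool thread, tasks run by the caller }.
    static int[] run() {
        final AtomicInteger ranInPool = new AtomicInteger();
        final AtomicInteger ranInCaller = new AtomicInteger();
        final Thread caller = Thread.currentThread();
        final CountDownLatch release = new CountDownLatch(1);

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.CallerRunsPolicy()) {
            protected void afterExecute(Runnable r, Throwable t) {
                super.afterExecute(r, t);
                ranInPool.incrementAndGet(); // simple instrumentation hook
            }
        };

        Runnable blocker = new Runnable() {
            public void run() {
                try {
                    release.await(); // keeps the single worker busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        Runnable probe = new Runnable() {
            public void run() {
                if (Thread.currentThread() == caller) {
                    ranInCaller.incrementAndGet();
                }
            }
        };

        pool.execute(blocker); // taken by the worker thread
        pool.execute(probe);   // waits in the bounded queue
        pool.execute(probe);   // queue is full: rejected, so the caller runs it
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { ranInPool.get(), ranInCaller.get() };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("pool=" + counts[0] + " caller=" + counts[1]);
    }
}
```

Running rejected tasks in the caller is one way to apply back-pressure: while the submitting thread is busy executing a task, it cannot submit more.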

Special considerations

Using the Executor framework decouples task submission from execution policy, which in the general case is more desirable as it allows us to flexibly tune the execution policy without having to change the code in hundreds of places. However, several situations exist when the submission code implicitly assumes a certain execution policy, in which case it is important that the selected Executor implement a consistent execution policy.

One such case is when tasks wait synchronously for other tasks to complete. In that case, if the thread pool does not contain enough threads, it is possible for the pool to deadlock, if all currently executing tasks are waiting for another task, and that task cannot execute because the pool is full.

A similar case is when a group of threads must work together as a cooperating group. In that case, you will want to ensure that the thread pool is large enough to accommodate all the threads.

If your application makes certain assumptions about a specific executor, these should be documented near the definition and initialization of the Executor so that well-intentioned changes do not subvert the correct functioning of your application.

Tuning thread pools

A common question asked when creating Executor instances is "How big should the thread pool be?" The answer, of course, depends on your hardware (how many processors do you have?) and the type of tasks that are going to be executed (are they compute-bound or IO-bound?).

If thread pools are too small, the result may be incomplete resource utilization -- there may be idle processors while tasks are still on the work queue waiting to execute.

On the other hand, if the thread pool is too large, then there will be many active threads, and performance may suffer due to the memory utilization of the large number of threads and active tasks, or because there will be more context switches per task than with a smaller number of threads.

So what's the right size for a thread pool, assuming the goal is to keep the processors fully utilized? Amdahl's law gives us a good approximate formula, if we know how many processors our system has and the approximate ratio of compute time to wait time for the tasks.

Let WT represent the average wait time per task, and ST the average service time (computation time) per task. Then WT/ST is the ratio of the time a task spends waiting to the time it spends computing. For an N-processor system, we would want to have approximately N*(1+WT/ST) threads in the pool.

The good news is that you don't have to estimate WT/ST exactly. The range of "good" pool sizes is fairly large; you just want to avoid the extremes of "much too big" and "much too small."
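The sizing rule can be written down directly. In this sketch, PoolSizing and suggestedPoolSize are hypothetical names, and the wait and service times are made-up figures, not measurements.

```java
// Hypothetical helper, for illustration only.
class PoolSizing {
    // Applies the N*(1+WT/ST) rule of thumb and rounds to a whole thread count.
    static int suggestedPoolSize(int processors, double waitTime, double serviceTime) {
        return (int) Math.round(processors * (1 + waitTime / serviceTime));
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        // Assumed figures: each task waits 90 ms on IO and computes for 10 ms.
        System.out.println(suggestedPoolSize(processors, 90.0, 10.0));
    }
}
```

For example, on a 4-processor system where tasks wait 90 ms for every 10 ms of computation, the rule gives 4*(1+9) = 40 threads. For purely compute-bound tasks (WT = 0) it gives one thread per processor.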

The Future interface

The Future interface allows you to represent a task that may have completed, may be in the process of being executed, or may not yet have started execution. Through the Future interface, you can attempt to cancel a task that has not yet completed, inquire whether the task has completed or been cancelled, and fetch (or wait for) the task's result value.

The FutureTask class implements Future, and has constructors that allow you to wrap a Runnable or Callable (a result-bearing Runnable) with a Future interface. Because FutureTask also implements Runnable, you can then simply submit FutureTask to an Executor. Some submission methods (like ExecutorService.submit()) will return a Future interface in addition to submitting the task.

The Future.get() method retrieves the result of the task computation (or throws ExecutionException if the task completed with an exception). If the task has not yet completed, Future.get() will block until the task completes; if it has already completed, the result will be returned immediately.
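A minimal sketch of submitting a Callable and collecting its result through a Future follows. FutureDemo and sumInBackground are hypothetical names, and the summation merely stands in for a longer computation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class, for illustration only.
class FutureDemo {
    // Computes 1 + 2 + ... + n on a background thread and waits for the result.
    static int sumInBackground(final int n) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = pool.submit(new Callable<Integer>() {
                public Integer call() {
                    int sum = 0;
                    for (int i = 1; i <= n; i++) {
                        sum += i;
                    }
                    return sum;
                }
            });
            // The caller could do other work here before asking for the result.
            return future.get(); // blocks until the task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            // The task itself threw; the original exception is the cause.
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInBackground(100));
    }
}
```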

Building a cache with Future

This code example ties together several classes from java.util.concurrent, prominently showcasing the power of Future. It implements a cache, and uses Future to describe a cached value that may already be computed or that may be "under construction" in another thread.

It takes advantage of the atomic putIfAbsent() method in ConcurrentHashMap, ensuring that only one thread will try to compute the value for a given key. If another thread subsequently requests the value for that same key, it simply waits (with the help of Future.get()) for the first thread to complete. As a result, two threads will not try to compute the same value.

public class Cache<K, V> {
    ConcurrentMap<K, FutureTask<V>> map =
        new ConcurrentHashMap<K, FutureTask<V>>();
    Executor executor = Executors.newFixedThreadPool(8);

    public V get(final K key)
            throws InterruptedException, ExecutionException {
        FutureTask<V> f = map.get(key);
        if (f == null) {
            Callable<V> c = new Callable<V>() {
                public V call() {
                    // return value associated with key
                }
            };
            f = new FutureTask<V>(c);
            // Only the thread whose putIfAbsent() wins runs the task
            FutureTask<V> old = map.putIfAbsent(key, f);
            if (old == null)
                executor.execute(f);
            else
                f = old;
        }
        return f.get();
    }
}

CompletionService

CompletionService combines an execution service with a Queue-like interface, so that the processing of task results can be decoupled from task execution. The CompletionService interface includes submit() methods for submitting tasks for execution, and take()/poll() methods for asking for the next completed task.

CompletionService allows the application to be structured using the Producer/Consumer pattern, where producers create tasks and submit them, and consumers request the results of a completed task and then do something with those results. The CompletionService interface is implemented by the ExecutorCompletionService class, which uses an Executor to process the tasks and exports the submit/poll/take methods from CompletionService.

The following example uses an Executor and a CompletionService to start a number of "solver" tasks, and uses the result of the first one that produces a non-null result, and cancels the rest:

void solve(Executor e, Collection<Callable<Result>> solvers) 
      throws InterruptedException {
        CompletionService<Result> ecs = 
          new ExecutorCompletionService<Result>(e);
        int n = solvers.size();
        List<Future<Result>> futures = 
          new ArrayList<Future<Result>>(n);
        Result result = null;
        try {
            for (Callable<Result> s : solvers)
                futures.add(ecs.submit(s));
            for (int i = 0; i < n; ++i) {
                try {
                    Result r = ecs.take().get();
                    if (r != null) {
                        result = r;
                        break;
                    }
                } catch(ExecutionException ignore) {}
            }
        }
        finally {
            for (Future<Result> f : futures)
                f.cancel(true);
        }

        if (result != null)
            use(result);
    }

Synchronizer classes

Synchronizers

Another useful category of classes in java.util.concurrent is the synchronizers. This set of classes coordinates and controls the flow of execution for one or more threads.

The Semaphore, CyclicBarrier, CountDownLatch, and Exchanger classes are all examples of synchronizers. Each of these has methods that threads can call that may or may not block based on the state and rules of the particular synchronizer being used.

Semaphores

The Semaphore class implements a classic Dijkstra counting semaphore. A counting semaphore can be thought of as having a certain number of permits, which can be acquired and released. If there are permits left, the acquire() method will succeed, otherwise it will block until one becomes available (by another thread releasing the permit). A thread can acquire more than one permit at a time.

Counting semaphores can be used to restrict the number of threads that have concurrent access to a resource. This approach is useful for implementing resource pools or limiting the number of outgoing socket connections in a Web crawler.

Note that the semaphore does not keep track of which threads own how many permits; it is up to the application to ensure that when a thread releases a permit, that it either owns the permit or it is releasing it on behalf of another thread, and that the other thread realizes that its permit has been released.
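As a sketch of using a counting semaphore to cap concurrent access to a resource, the hypothetical ConnectionLimiter class below starts several threads that each acquire a permit, pretend to use the resource for a moment, and release the permit in a finally block. It records the highest number of threads that held a permit at the same time.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only.
class ConnectionLimiter {
    // Returns the largest number of threads seen inside the guarded section at once.
    static int maxObservedConcurrency(int permitCount, int nThreads) {
        final Semaphore permits = new Semaphore(permitCount);
        final AtomicInteger active = new AtomicInteger();
        final AtomicInteger maxActive = new AtomicInteger();
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        permits.acquire(); // blocks while no permit is available
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    try {
                        int now = active.incrementAndGet();
                        int seen = maxActive.get();
                        while (now > seen && !maxActive.compareAndSet(seen, now)) {
                            seen = maxActive.get();
                        }
                        Thread.sleep(10); // simulate using the limited resource
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        active.decrementAndGet();
                        permits.release(); // always hand the permit back
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxActive.get();
    }

    public static void main(String[] args) {
        System.out.println(maxObservedConcurrency(3, 8));
    }
}
```

The observed maximum depends on thread scheduling, but it can never exceed the number of permits.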

Mutex

A special case of counting semaphores is the mutex, or mutual-exclusion semaphore. A mutex is simply a counting semaphore with a single permit, meaning that only one thread can hold a permit at a given time (also called a binary semaphore). A mutex can be used to manage exclusive access to a shared resource.

While mutexes have a lot in common with locks, mutexes have one additional feature that locks generally do not have, and that is the ability for the mutex to be released by a different thread than the one holding the permit. This may be useful in deadlock recovery situations.

CyclicBarrier

The CyclicBarrier class is a synchronization aid that allows a set of threads to wait for the entire set of threads to reach a common barrier point. CyclicBarrier is constructed with an integer argument, which determines the number of threads in the group. When one thread arrives at the barrier (by calling CyclicBarrier.await()), it blocks until all threads have arrived at the barrier, at which point all the threads are then allowed to continue executing. This action is similar to what many families (try to) do at the mall -- family members go their separate ways, and everyone agrees to meet at the movie theater at 1:00. When you get to the movie theater and not everyone is there, you sit and wait for everyone else to arrive. Then everyone can leave together.

The barrier is called cyclic because it is reusable; once all the threads have met up at the barrier and been released, the barrier is reinitialized to its initial state.

You can also specify a timeout when waiting at the barrier; if by that time the rest of the threads have not arrived at the barrier, the barrier is considered broken: the thread whose wait timed out receives a TimeoutException, and all other threads that are waiting receive a BrokenBarrierException.

The code example below creates a CyclicBarrier and launches a set of threads that will each compute a portion of a problem, wait for all the other threads to finish, and then check to see if the solution has converged. If not, each worker thread will begin another iteration. This example uses a variant of CyclicBarrier that lets you register a Runnable that is executed whenever all the threads arrive at the barrier but before any of them are released.

class Solver { // Code sketch
  void solve(final Problem p, int nThreads) {
    final CyclicBarrier barrier = 
      new CyclicBarrier(nThreads,
        new Runnable() {
          public void run() { p.checkConvergence(); }
        });
    for (int i = 0; i < nThreads; ++i) {
      final int id = i;
      Runnable worker = new Runnable() {
        final Segment segment = p.createSegment(id);
        public void run() {
          try {
            while (!p.converged()) {
              segment.update();
              barrier.await();
            }
          }
          catch(Exception e) { return; }
        }
      };
      new Thread(worker).start();
    }
  }
}

CountDownLatch

The CountDownLatch class is similar to CyclicBarrier, in that its role is to coordinate a group of threads that have divided a problem among themselves. It is also constructed with an integer argument, indicating the initial value of the count, but, unlike CyclicBarrier, is not reusable.

Where CyclicBarrier acts as a gate to all the threads that reach the barrier, allowing them through only when all the threads have arrived at the barrier or the barrier is broken, CountDownLatch separates the arrival and waiting functionality. Any thread can decrement the current count by calling countDown(), which does not block, but merely decrements the count. The await() method behaves slightly differently than CyclicBarrier.await() -- any threads that call await() will block until the latch count gets down to zero, at which point all threads waiting will be released, and subsequent calls to await() will return immediately.

CountDownLatch is useful when a problem has been decomposed into a number of pieces, and each thread has been given a piece of the computation. When the worker threads finish, they decrement the count, and the coordination thread(s) can wait on the latch for the current batch of computations to finish before moving on to the next batch.

Conversely, a CountDownLatch with a count of 1 can be used as a "starting gate" to start a group of threads at once; the worker threads wait on the latch, and the coordinating thread decrements the count, which releases all the worker threads at once. The following example uses two CountDownLatches: one as a starting gate, and one that releases when all the worker threads are finished:

class Driver { // ...
   void main() throws InterruptedException {
     CountDownLatch startSignal = new CountDownLatch(1);
     CountDownLatch doneSignal = new CountDownLatch(N);

     for (int i = 0; i < N; ++i) // create and start threads
       new Thread(new Worker(startSignal, doneSignal)).start();

     doSomethingElse();            // don't let them run yet
     startSignal.countDown();      // let all threads proceed
     doSomethingElse();
     doneSignal.await();           // wait for all to finish
   }
 }

 class Worker implements Runnable {
   private final CountDownLatch startSignal;
   private final CountDownLatch doneSignal;
   Worker(CountDownLatch startSignal, CountDownLatch doneSignal) {
      this.startSignal = startSignal;
      this.doneSignal = doneSignal;
   }
   public void run() {
      try {
        startSignal.await();
        doWork();
        doneSignal.countDown();
      } catch (InterruptedException ex) {} // return;
   }
 }

Exchanger

The Exchanger class facilitates a two-way exchange between two cooperating threads; in this way, it is like a CyclicBarrier with a count of two, with the added feature that the two threads can "trade" some state when they both reach the barrier. (The Exchanger pattern is also sometimes called a rendezvous.)

A typical use for Exchanger would be where one thread is filling a buffer (by reading from a socket) and the other thread is emptying the buffer (by processing the commands received from the socket). When the two threads meet at the barrier, they swap buffers. The following code demonstrates this technique:

class FillAndEmpty {
   Exchanger<DataBuffer> exchanger = new Exchanger<DataBuffer>();
   DataBuffer initialEmptyBuffer = new DataBuffer();
   DataBuffer initialFullBuffer = new DataBuffer();

   class FillingLoop implements Runnable {
     public void run() {
       DataBuffer currentBuffer = initialEmptyBuffer;
       try {
         while (currentBuffer != null) {
           addToBuffer(currentBuffer);
           if (currentBuffer.full())
             currentBuffer = exchanger.exchange(currentBuffer);
         }
       } catch (InterruptedException ex) { ... handle ... }
     }
   }

   class EmptyingLoop implements Runnable {
     public void run() {
       DataBuffer currentBuffer = initialFullBuffer;
       try {
         while (currentBuffer != null) {
           takeFromBuffer(currentBuffer);
           if (currentBuffer.empty())
             currentBuffer = exchanger.exchange(currentBuffer);
         }
       } catch (InterruptedException ex) { ... handle ...}
     }
   }

   void start() {
     new Thread(new FillingLoop()).start();
     new Thread(new EmptyingLoop()).start();
   }
 }

Low-level facilities -- Lock and Atomic

The Java language has a built-in locking facility -- the synchronized keyword. When a thread acquires a monitor (built-in lock), other threads will block when trying to acquire the same lock, until the first thread releases it. Synchronization also ensures that the values of any variables modified by a thread while it holds a lock are visible to a thread that subsequently acquires the same lock, ensuring that if classes properly synchronize access to shared state, threads will not see "stale" values of variables that are the result of caching or compiler optimization.

While there is nothing wrong with synchronization, it has some limitations that can prove inconvenient in some advanced applications. The Lock interface is a generalization of the locking behavior of built-in monitor locks that allows for multiple lock implementations, while providing some features that are missing from built-in locks, such as timed waits, interruptible waits, lock polling, multiple condition-wait sets per lock, and non-block-structured locking.

interface Lock {
    void lock();
    void lockInterruptibly() throws InterruptedException;
    boolean tryLock();
    boolean tryLock(long time,
                    TimeUnit unit) throws InterruptedException;
    void unlock();
    // may throw UnsupportedOperationException if the
    // implementation does not support conditions
    Condition newCondition();
  }
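
As a minimal sketch of two of the features listed above, lock polling and timed waits, consider the class below. The class name TimedLockDemo and its methods are hypothetical and exist only for illustration; the Lock calls themselves (tryLock(), tryLock(long, TimeUnit), unlock()) come from the interface shown above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class; the names are illustrative only.
class TimedLockDemo {
    private final Lock lock = new ReentrantLock();
    private int value;

    // Timed wait: gives up if the lock cannot be acquired within the timeout.
    // Returns true only if the increment actually happened.
    boolean tryIncrement(long timeout, TimeUnit unit) throws InterruptedException {
        if (!lock.tryLock(timeout, unit)) {
            return false;            // lock stayed busy for the whole timeout
        }
        try {
            value++;
            return true;
        } finally {
            lock.unlock();           // always release in finally
        }
    }

    // Lock polling: returns immediately whether or not the lock is free.
    boolean incrementIfFree() {
        if (!lock.tryLock()) {
            return false;
        }
        try {
            value++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

Neither behavior is available with synchronized, where a thread that cannot acquire the monitor simply blocks until it can.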

ReentrantLock

ReentrantLock is an implementation of Lock with the same basic behavior and semantics as the implicit monitor lock accessed using synchronized methods and statements, but with extended capabilities.

As a bonus, the implementation of ReentrantLock is far more scalable under contention than the current implementation of synchronized. (It is likely that there will be improvements to the contended performance of synchronized in a future version of the JVM.) This means that when many threads are all contending for the same lock, the total throughput is generally going to be better with ReentrantLock than with synchronized. In other words, when many threads are attempting to access a shared resource protected by a ReentrantLock, the JVM will spend less time scheduling threads and more time executing them.

While it has many advantages, the ReentrantLock class has one major disadvantage compared to synchronization -- it is possible to forget to release the lock. It is recommended that the following structure be used when acquiring and releasing a ReentrantLock:

Lock lock = new ReentrantLock();
...
lock.lock(); 
try { 
  // perform operations protected by lock
}
catch(Exception ex) {
 // restore invariants
}
finally { 
  lock.unlock(); 
}

Because the risk of fumbling a lock (forgetting to release it) is so severe, it is recommended that you continue to use synchronized for basic locking unless you really need the additional flexibility or scalability of ReentrantLock. ReentrantLock is an advanced tool for advanced applications -- sometimes you need it, but sometimes the trusty old hammer does just fine.

Condition

Just as the Lock interface is a generalization of synchronization, the Condition interface is a generalization of the wait() and notify() methods in Object. One of the methods in Lock is newCondition() -- this asks the lock to return a new Condition object bound to this lock. The await(), signal(), and signalAll() methods are analogous to wait(), notify(), and notifyAll(), with the added flexibility that you can create more than one condition variable per Lock. This simplifies the implementation of some concurrent algorithms.
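
One algorithm that benefits from multiple condition variables per lock is a bounded buffer, in which producers wait for "not full" and consumers wait for "not empty." The following is a sketch along the lines of the well-known example in the Condition documentation; the class name BoundedBuffer and its field names are illustrative assumptions, not part of this tutorial.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical bounded buffer guarded by one lock and two conditions.
class BoundedBuffer<T> {
    private final Lock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();   // producers wait here
    private final Condition notEmpty = lock.newCondition();  // consumers wait here
    private final Object[] items;
    private int putIndex, takeIndex, count;

    BoundedBuffer(int capacity) {
        items = new Object[capacity];
    }

    void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length) {
                notFull.await();          // releases the lock while waiting
            }
            items[putIndex] = item;
            putIndex = (putIndex + 1) % items.length;
            count++;
            notEmpty.signal();            // wake one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    T take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) {
                notEmpty.await();
            }
            T item = (T) items[takeIndex];
            items[takeIndex] = null;      // drop the reference for GC
            takeIndex = (takeIndex + 1) % items.length;
            count--;
            notFull.signal();             // wake one waiting producer
            return item;
        } finally {
            lock.unlock();
        }
    }
}
```

With wait() and notify() there is only one wait set per monitor, so an equivalent class would typically need notifyAll() and would wake producers and consumers alike.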

ReadWriteLock

The locking discipline implemented by ReentrantLock is quite simple -- one thread at a time holds the lock, and other threads must wait for it to be available. Sometimes, when data structures are more commonly read than modified, it may be desirable to use a more complicated lock structure, called a read-write lock, which allows multiple concurrent readers but also allows for exclusive locking by a writer. This approach offers greater concurrency in the common case (read only) while still offering the safety of exclusive access when necessary. The ReadWriteLock interface and the ReentrantReadWriteLock class provide this capability -- a multiple-reader, single-writer locking discipline that can be used to protect shared mutable resources.
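
As a rough illustration of the multiple-reader, single-writer discipline, the sketch below guards a plain HashMap with a ReentrantReadWriteLock. The class name ReadMostlyCache is hypothetical; only readLock(), writeLock(), lock(), and unlock() are taken from the library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical read-mostly cache; illustrative only.
class ReadMostlyCache {
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final Map<String, String> map = new HashMap<String, String>();

    // Many threads may hold the read lock at the same time.
    String get(String key) {
        rwLock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it excludes readers and other writers.
    void put(String key, String value) {
        rwLock.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}
```

Whether this beats a single exclusive lock depends on the read-to-write ratio and on how long the lock is held, so it is worth measuring before adopting it.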

Atomic variables

Even though they will rarely be used directly by most users, the most significant new concurrent classes may well be the atomic variable classes ( AtomicInteger , AtomicLong , AtomicReference , and so on). These classes expose the low-level improvements to the JVM that enable highly scalable atomic read-modify-write operations. Most modern CPUs have primitives for atomic read-modify-write, such as compare-and-swap (CAS) or load-linked/store-conditional (LL/SC). The atomic variable classes are implemented with whatever is the fastest concurrency construct provided by the hardware.

Many concurrent algorithms are defined in terms of compare-and-swap operations on counters or data structures. By exposing a high-performance, highly scalable CAS operation (in the form of atomic variables), it becomes practical to implement high performance, wait-free, lock-free concurrent algorithms in the Java language.

Nearly all of the classes in java.util.concurrent are built on top of ReentrantLock, which itself is built on top of the atomic variable classes. So while they may only be used by a few concurrency experts, it is the atomic variable classes that provide much of the scalability improvement of the java.util.concurrent classes.

The primary use for atomic variables is to provide an efficient, fine-grained means of atomically updating "hot" fields -- fields that are frequently accessed and updated by multiple threads. In addition, they are a natural mechanism for counters or generating sequence numbers.
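
The two uses mentioned above can be sketched as follows. SequenceGenerator and AtomicMax are hypothetical class names chosen for this illustration; getAndIncrement(), get(), and compareAndSet() are the AtomicLong methods being demonstrated.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sequence-number generator: no lock is needed.
class SequenceGenerator {
    private final AtomicLong next = new AtomicLong(0);

    long nextId() {
        return next.getAndIncrement();   // atomic read-modify-write
    }
}

// Hypothetical "hot field" that tracks the largest value seen so far,
// updated with an explicit compare-and-set retry loop.
class AtomicMax {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void offer(long candidate) {
        while (true) {
            long current = max.get();
            if (candidate <= current) {
                return;                  // nothing to update
            }
            if (max.compareAndSet(current, candidate)) {
                return;                  // our update won
            }
            // Another thread changed the value in between; re-read and retry.
        }
    }

    long get() {
        return max.get();
    }
}
```

The retry loop in offer() is the general shape of a CAS-based update: read the current value, compute the new one, and publish it only if nothing changed in the meantime.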

Performance and scalability

Performance vs. scalability

While the overriding goal of the java.util.concurrent effort was to make it easier to write correct, thread-safe classes, a secondary goal was to improve scalability. Scalability is not exactly the same thing as performance -- in fact, sometimes scalability comes at the cost of performance.

Performance is a measure of "how fast can you execute this task." Scalability describes how an application's throughput behaves as its workload and available computing resources increase. A scalable program can handle a proportionally larger workload with more processors, memory, or I/O bandwidth. When we talk about scalability in the context of concurrency, we are asking how well a given class performs when many threads are accessing it simultaneously.

The low-level classes in java.util.concurrent -- ReentrantLock and the atomic variable classes -- are far more scalable than the built-in monitor (synchronization) locks. As a result, classes that use ReentrantLock or atomic variables for coordinating shared access to state will likely be more scalable as well.

Hashtable vs. ConcurrentHashMap

As an example of scalability, the ConcurrentHashMap implementation is designed to be far more scalable than its thread-safe uncle, Hashtable. Hashtable only allows a single thread to access the Map at a time; ConcurrentHashMap allows for multiple readers to execute concurrently, readers to execute concurrently with writers, and some writers to execute concurrently. As a result, if many threads are accessing a shared map frequently, overall throughput will be better with ConcurrentHashMap than with Hashtable.
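
Because ConcurrentHashMap does not lock the whole map, compound actions such as "increment the count for this key" cannot be made atomic by synchronizing on the map. One possible approach, sketched below with a hypothetical HitCounter class, is to use the atomic putIfAbsent() and replace() methods of the ConcurrentMap interface in a retry loop.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical per-key hit counter; illustrative only.
class HitCounter {
    private final ConcurrentMap<String, Integer> hits =
            new ConcurrentHashMap<String, Integer>();

    void record(String key) {
        while (true) {
            Integer old = hits.get(key);
            if (old == null) {
                // First hit: succeeds only if no other thread inserted first.
                if (hits.putIfAbsent(key, 1) == null) {
                    return;
                }
            } else if (hits.replace(key, old, old + 1)) {
                // Succeeds only if the value is still the one we read.
                return;
            }
            // Lost a race with another thread; re-read and retry.
        }
    }

    int count(String key) {
        Integer n = hits.get(key);
        return n == null ? 0 : n;
    }
}
```

Threads updating different keys rarely interfere with one another here, whereas with Hashtable every update would queue on the same lock.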

The table below gives a rough idea of the scalability differences between Hashtable and ConcurrentHashMap. In each run, N threads concurrently executed a tight loop where they retrieved random key values from either a Hashtable or a ConcurrentHashMap, with 60 percent of the failed retrievals performing a put() operation and 2 percent of the successful retrievals performing a remove() operation. Tests were performed on a dual-processor Xeon system running Linux. The data shows run time for 10,000,000 iterations, normalized to the 1-thread case for ConcurrentHashMap. You can see that the performance of ConcurrentHashMap remains scalable up to many threads, whereas the performance of Hashtable degrades almost immediately in the presence of lock contention.

The number of threads in this test may look small compared to typical server applications. However, because each thread is doing nothing but repeatedly hitting on the table, this simulates the contention of a much larger number of threads using the table in the context of doing some amount of real work.

Threads   ConcurrentHashMap   Hashtable
1         1.0                 1.51
2         1.44                17.09
4         1.83                29.9
8         4.06                54.06
16        7.5                 119.44
32        15.32               237.2

Lock vs. synchronized vs. Atomic

Another example of the scalability improvements possible with java.util.concurrent is evidenced by the following benchmark. This benchmark simulates rolling a die, using a linear congruential random number generator. Three implementations of the random number generator are available: one that uses synchronization to manage the state of the generator (a single variable), one that uses ReentrantLock, and one that uses AtomicLong. The graph below shows the relative throughput of the three versions with increasing numbers of threads, on an 8-way Ultrasparc3 system. (The graph probably understates the scalability of the atomic variable approach.)

Figure 1. Relative throughput using synchronization, Lock, and AtomicLong

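
The benchmark source is not reproduced in this tutorial, but the three generator variants it describes might look roughly like the sketch below. All class names are hypothetical, and the generator constants are those of java.util.Random, used here purely as an assumption because the benchmark's own constants are not given.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Shared generator arithmetic. Constants are borrowed from java.util.Random
// as an assumption; the original benchmark may have used different ones.
abstract class DieRoller {
    static long step(long seed) {
        return (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
    }

    static int toFace(long seed) {
        return (int) ((seed >>> 17) % 6) + 1;    // a face from 1 to 6
    }

    abstract int roll();
}

// Variant 1: generator state guarded by the built-in monitor lock.
class SynchronizedRoller extends DieRoller {
    private long seed;
    SynchronizedRoller(long seed) { this.seed = seed; }

    synchronized int roll() {
        seed = step(seed);
        return toFace(seed);
    }
}

// Variant 2: generator state guarded by a ReentrantLock.
class LockRoller extends DieRoller {
    private final Lock lock = new ReentrantLock();
    private long seed;
    LockRoller(long seed) { this.seed = seed; }

    int roll() {
        lock.lock();
        try {
            seed = step(seed);
            return toFace(seed);
        } finally {
            lock.unlock();
        }
    }
}

// Variant 3: generator state held in an AtomicLong, updated with a CAS loop.
class AtomicRoller extends DieRoller {
    private final AtomicLong seed;
    AtomicRoller(long seed) { this.seed = new AtomicLong(seed); }

    int roll() {
        while (true) {
            long current = seed.get();
            long next = step(current);
            if (seed.compareAndSet(current, next)) {
                return toFace(next);
            }
            // Another thread advanced the seed first; retry with the new value.
        }
    }
}
```

All three variants produce the same sequence for the same starting seed when called from one thread; they differ only in how concurrent callers are coordinated.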

Fair vs. unfair

One additional element of customization in many of the classes in java.util.concurrent is the question of "fairness." A fair lock, or fair semaphore, is one where threads are granted the lock or semaphore on a first-in, first-out (FIFO) basis. The constructors for ReentrantLock, Semaphore, and ReentrantReadWriteLock all can take arguments that determine whether the lock is fair, or whether it permits barging (threads to acquire the lock even if they have not been waiting the longest).
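
For illustration, the constructor arguments in question look like this. The holder class FairnessDemo and its field names are hypothetical; the constructors and the isFair() query methods are part of the library.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical holder class; it only shows the fairness constructor arguments.
class FairnessDemo {
    // Default constructor: a barging (unfair) lock.
    static final ReentrantLock bargingLock = new ReentrantLock();

    // Passing true requests FIFO granting among waiting threads.
    static final ReentrantLock fairLock = new ReentrantLock(true);

    // A fair semaphore with 3 permits.
    static final Semaphore fairSemaphore = new Semaphore(3, true);

    // A fair read-write lock.
    static final ReentrantReadWriteLock fairReadWriteLock =
            new ReentrantReadWriteLock(true);
}
```

Each of these classes reports the chosen policy through isFair(), which can be handy when a lock is passed in from elsewhere.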

While the idea of barging locks may seem ridiculous and, well, unfair, barging locks are in fact quite common, and usually preferable. The built-in locks accessed with synchronization are not fair locks (and there is no way to make them fair). Instead, they provide weaker liveness guarantees that require that all threads will eventually acquire the lock.

The reason most applications choose (and should choose) barging locks over fair locks is performance. In most cases, exact fairness is not a requirement for program correctness, and the cost of fairness is quite high indeed. The graph below adds a fourth dataset to the graph from the previous section, where access to the PRNG state is managed by a fair lock. Note the large difference in throughput between barging locks and fair locks.

Figure 2. Relative throughput using synchronization, Lock, fair Lock, and AtomicLong


Summary

Conclusion

The java.util.concurrent package contains a wealth of useful building blocks for improving the performance, scalability, thread-safety, and maintainability of concurrent classes. With them, you should be able to eliminate most uses of synchronization, wait/notify, and Thread.start() in your code, replacing them with higher-level, standardized, high-performance concurrency utilities.


Translated from: https://www.ibm.com/developerworks/java/tutorials/j-concur/j-concur.html
