SEDA
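The concurrency model used by Cassandra's operations. SEDA (staged event-driven architecture) decomposes an application into stages separated by event queues, and introduces dynamic resource controllers so that the application can keep adjusting itself to changing load. It is event-driven: when a request arrives, an event is constructed and placed on a stage's request queue; the stage takes the event from its queue and processes it; when processing finishes, it constructs event_next and places it on the request queue of stage_next. Stages are connected only through queues, each stage is managed on its own, and the stages are decoupled from one another. Events are handled asynchronously.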
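The event flow described above can be sketched in a few lines of Java. This is a minimal illustration and not Cassandra code: the class SedaSketch, the stage names parse and execute, and the POISON shutdown marker are all assumptions made for the example, and the dynamic resource controllers are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical names throughout; a sketch of stages joined by queues.
class SedaSketch {
    // Sentinel event that tells a stage to stop and to pass the signal downstream.
    static final String POISON = "__stop__";

    // A stage owns an input queue and one worker thread that drains it.
    static Thread startStage(String name,
                             BlockingQueue<String> in,
                             BlockingQueue<String> out,
                             Function<String, String> handler) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String event = in.take();           // wait for the next event
                    if (event.equals(POISON)) {
                        out.put(POISON);                // propagate shutdown
                        return;
                    }
                    out.put(handler.apply(event));      // build event_next, enqueue it on stage_next
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        worker.start();
        return worker;
    }

    static List<String> runPipeline(List<String> requests) {
        BlockingQueue<String> parseQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> executeQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> responseQueue = new LinkedBlockingQueue<>();

        Thread parse = startStage("parse", parseQueue, executeQueue, String::trim);
        Thread execute = startStage("execute", executeQueue, responseQueue, String::toUpperCase);
        try {
            for (String request : requests) {
                parseQueue.put(request);
            }
            parseQueue.put(POISON);
            parse.join();
            execute.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<String> responses = new ArrayList<>();
        responseQueue.drainTo(responses);
        responses.remove(POISON);                       // drop the shutdown marker
        return responses;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(Arrays.asList("  get k1 ", "put k2")));
    }
}
```

Each stage only knows its own input and output queue, which is what lets stages be tuned or replaced independently.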
Threaded server design: Each incoming request is dispatched to a separate thread, which performs the entire processing for the request and returns a result to the client. Edges represent control flow between components. Note that other I/O operations, such as disk access, are not shown here, but are incorporated within each thread's request processing.
cassandra.concurrent
The package that holds Cassandra's SEDA-based concurrency model implementation. It is the foundation of the whole model.
StageManager
The StageManager class mainly maintains a map of stages.
EnumMap<Stage, LocalAwareExecutorService> stages = new EnumMap<>(Stage.class);
static {
    stages.put(Stage.TRACING, tracingExecutor());
    ......
}
A separate ExecutorService is created for each Stage enum value; callers look up the ExecutorService for a stage and run the corresponding tasks on it.
Example:
ListenableFutureTask task = ListenableFutureTask.create(runnable, null);
StageManager.getStage(Stage.GOSSIP).execute(task);
StageManager.getStage(Stage.TRACING).execute(new WrappedRunnable(){});
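The enum-keyed registry idea can be tried out in isolation with plain JDK executors. The sketch below is an assumption-laden stand-in, not StageManager itself: the three stage names, the pool sizes, and the helper methods namedPool and runOn are made up for the example.

```java
import java.util.EnumMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical registry: one executor per stage, looked up by enum value.
class StageRegistrySketch {
    enum Stage { READ, MUTATION, GOSSIP }   // illustrative subset only

    private static final EnumMap<Stage, ExecutorService> stages = new EnumMap<>(Stage.class);

    static {
        stages.put(Stage.READ, namedPool(Stage.READ, 4));
        stages.put(Stage.MUTATION, namedPool(Stage.MUTATION, 4));
        stages.put(Stage.GOSSIP, namedPool(Stage.GOSSIP, 1));
    }

    // Daemon threads named after the stage, so the sketch never blocks JVM exit.
    private static ExecutorService namedPool(Stage stage, int threads) {
        return Executors.newFixedThreadPool(threads, runnable -> {
            Thread thread = new Thread(runnable, stage.name() + "-worker");
            thread.setDaemon(true);
            return thread;
        });
    }

    static ExecutorService getStage(Stage stage) {
        return stages.get(stage);
    }

    // Runs a task on the given stage and reports which thread executed it.
    static String runOn(Stage stage) {
        Callable<String> task = () -> Thread.currentThread().getName();
        try {
            return getStage(stage).submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runOn(Stage.GOSSIP));
        System.out.println(runOn(Stage.READ));
    }
}
```

Because every stage has its own executor, a slow stage fills only its own queue instead of starving the others.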
LocalAwareExecutorService
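The abstract interface for all of Cassandra's thread pools. It extends the ExecutorService interface, which provides the basic task model, and adds two methods of its own: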
// we need a way to inject a TraceState directly into the Executor context without going through
// the global Tracing sessions; see CASSANDRA-5668
public void execute(Runnable command, ExecutorLocals locals);
// permits executing in the context of the submitting thread
public void maybeExecuteImmediately(Runnable command);
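ExecutorLocals is the class Cassandra uses to carry tracing state. Calling execute(Runnable command, ExecutorLocals locals) hands that state to the executing thread, which is how a request can be traced across threads.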
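The general technique of carrying per-request state from the submitting thread to the worker thread can be sketched as follows. This is an illustration of the idea only: it uses a plain ThreadLocal named TRACE_ID and a wrapper method withLocals, both hypothetical, and does not reproduce how ExecutorLocals is implemented.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of propagating thread-local context into an executor.
class LocalsSketch {
    // Stands in for per-request tracing state.
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Captures the submitter's value now and installs it around the task later.
    static Runnable withLocals(Runnable task) {
        final String captured = TRACE_ID.get();
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                task.run();
            } finally {
                TRACE_ID.set(previous);     // leave the worker thread as we found it
            }
        };
    }

    // Returns the trace id that the worker thread observed while running the task.
    static String traceSeenByWorker(String traceId) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            TRACE_ID.set(traceId);
            executor.submit(withLocals(() -> seen.set(TRACE_ID.get()))).get();
            return seen.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            TRACE_ID.remove();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(traceSeenByWorker("trace-42"));
    }
}
```

Without the wrapper the worker thread would see no value at all, because a ThreadLocal is not shared between threads.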
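Only two implementations are commonly used:
- SEPExecutor: SEPExecutor holds a reference to SharedExecutorPool. SharedExecutorPool is not written as a singleton class, but Cassandra uses it through a single static instance.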
public static final SharedExecutorPool SHARED = new SharedExecutorPool("SharedPool");
The newExecutor method creates a SEPExecutor, passes the pool itself into it, and adds it to the pool's list of SEPExecutors. This means that all SEPExecutors share one SharedExecutorPool.
public LocalAwareExecutorService newExecutor(int maxConcurrency, int maxQueuedTasks, String jmxPath, String name)
{
    SEPExecutor executor = new SEPExecutor(this, maxConcurrency, maxQueuedTasks, jmxPath, name);
    executors.add(executor);
    return executor;
}
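When execute is called on a SEPExecutor, it wraps the Runnable in a FutureTask and puts it on its queue. These tasks are ultimately run by SEPWorkers. The first time execute is called, or the first time it is called after the workers have stopped, a SEPWorker in the Work.SPINNING state is created through SharedExecutorPool. A SEPWorker is a self-looping worker; its nested class Work manages the worker's state and holds the assigned SEPExecutor, so that the SEPWorker can find its SEPExecutor. Because a Runnable passed to execute is picked up by a worker that starts in Work.SPINNING, the task is not run right away: while spinning, the SEPWorker assigns itself a SEPExecutor and switches state. The maybeExecuteImmediately method mentioned above, by contrast, can run the task immediately in the submitting thread.
// The four Work states: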
static final Work STOP_SIGNALLED = new Work();
static final Work STOPPED = new Work();
static final Work SPINNING = new Work();
static final Work WORKING = new Work();
Work work = schedule(Work.SPINNING);
new SEPWorker(workerId.incrementAndGet(), work, this);
Once created, a SEPWorker starts its Thread, and the Thread runs a loop that spins while looking for work in the pool.
SEPWorker(Long workerId, Work initialState, SharedExecutorPool pool)
{
    this.pool = pool;
    this.workerId = workerId;
    thread = new FastThreadLocalThread(this, pool.poolName + "-Worker-" + workerId);
    thread.setDaemon(true);
    set(initialState);
    thread.start();
}
public void run()
{
    SEPExecutor assigned = null;
    Runnable task = null;
    try
    {
        while (true)
        {
            if (isSpinning() && !selfAssign())
            {
                doWaitSpin();
                continue;
            }
            ...
        }
    }
}
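SEPExecutor, SEPWorker, and SharedExecutorPool together form a thread pool system. What sets it apart from other thread pools is that all workers share one pool: an idle worker is free to look for an executor that has work to do, and each executor holds the Runnables waiting to be run.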
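The division of labour described above, where executors hold tasks and a shared set of workers looks for them, can be illustrated with a much simplified sketch. Everything in it is hypothetical (SharedPoolSketch, QueueExecutor, tryTake, a fixed number of workers, a short park instead of real spinning), and it leaves out the Work states and the dynamic starting and stopping of workers.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch: executors own queues, workers belong to the shared pool.
class SharedPoolSketch {
    // An executor holds queued tasks and a concurrency limit; it owns no threads.
    static final class QueueExecutor {
        final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
        final AtomicInteger running = new AtomicInteger();
        final int maxConcurrency;

        QueueExecutor(int maxConcurrency) { this.maxConcurrency = maxConcurrency; }

        void execute(Runnable task) { tasks.add(task); }

        // Takes a permit and a task, or returns null if nothing may run right now.
        Runnable tryTake() {
            while (true) {
                int current = running.get();
                if (current >= maxConcurrency) return null;
                if (running.compareAndSet(current, current + 1)) break;
            }
            Runnable task = tasks.poll();
            if (task == null) running.decrementAndGet();    // nothing queued, give the permit back
            return task;
        }
    }

    private final List<QueueExecutor> executors = new CopyOnWriteArrayList<>();
    private volatile boolean stopped;

    QueueExecutor newExecutor(int maxConcurrency) {
        QueueExecutor executor = new QueueExecutor(maxConcurrency);
        executors.add(executor);
        return executor;
    }

    void startWorkers(int count) {
        for (int i = 1; i <= count; i++) {
            Thread thread = new Thread(this::workerLoop, "SketchPool-Worker-" + i);
            thread.setDaemon(true);
            thread.start();
        }
    }

    // Each worker scans every executor for runnable work, then backs off briefly.
    private void workerLoop() {
        while (!stopped) {
            boolean ranSomething = false;
            for (QueueExecutor executor : executors) {
                Runnable task = executor.tryTake();
                if (task == null) continue;
                try {
                    task.run();
                } finally {
                    executor.running.decrementAndGet();
                }
                ranSomething = true;
            }
            if (!ranSomething) LockSupport.parkNanos(100_000);  // crude stand-in for spinning
        }
    }

    void stop() { stopped = true; }

    // Submits five tasks to each of two executors and returns how many completed.
    static int demo() {
        SharedPoolSketch pool = new SharedPoolSketch();
        QueueExecutor reads = pool.newExecutor(2);
        QueueExecutor writes = pool.newExecutor(1);
        AtomicInteger done = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(10);
        for (int i = 0; i < 5; i++) {
            reads.execute(() -> { done.incrementAndGet(); latch.countDown(); });
            writes.execute(() -> { done.incrementAndGet(); latch.countDown(); });
        }
        pool.startWorkers(2);
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.stop();
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that two worker threads serve both executors here, while each executor still enforces its own concurrency limit.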
- JMXEnabledThreadPoolExecutor:
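JMXEnabledThreadPoolExecutor is somewhat simpler. It inherits from ThreadPoolExecutor (through DebuggableThreadPoolExecutor); most of its behaviour is the same as the parent class, with a few extensions. Examples are the Exception handling added in DebuggableThreadPoolExecutor, and the ThreadPoolMetrics and MBeanWrapper that are set up when the executor is constructed.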
public static final RejectedExecutionHandler blockingExecutionHandler = new RejectedExecutionHandler()
{...... };
public DebuggableThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory)
{
    super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory);
    allowCoreThreadTimeOut(true);
    // block task submissions until queue has room.
    // this is fighting TPE's design a bit because TPE rejects if queue.offer reports a full queue.
    // we'll just override this with a handler that retries until it gets in. ugly, but effective.
    // (there is an extensive analysis of the options here at
    // http://today.java.net/pub/a/today/2008/10/23/creating-a-notifying-blocking-thread-pool-executor.html)
    this.setRejectedExecutionHandler(blockingExecutionHandler);
}
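The body of blockingExecutionHandler is elided above. As an illustration of the "retry until it gets in" idea from the comment, a handler along these lines would block the submitting thread until the queue has room. This is a sketch under assumptions, not the Cassandra handler: the names BlockingSubmitSketch and BLOCK_UNTIL_ROOM, the 100 ms retry interval, and the demo pool sizes are all made up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a rejection handler that waits for queue space.
class BlockingSubmitSketch {
    static final RejectedExecutionHandler BLOCK_UNTIL_ROOM = (task, executor) -> {
        try {
            while (true) {
                if (executor.isShutdown()) {
                    throw new RejectedExecutionException("executor has shut down");
                }
                // Retry with a timeout so a shutdown is noticed while waiting.
                if (executor.getQueue().offer(task, 100, TimeUnit.MILLISECONDS)) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while waiting for queue space", e);
        }
    };

    // One worker and a queue of size one: most submissions hit the handler.
    static int runTasks(int taskCount) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1), BLOCK_UNTIL_ROOM);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(2);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                completed.incrementAndGet();
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(20));
    }
}
```

With the default AbortPolicy the same loop would throw RejectedExecutionException as soon as the queue filled up; here the submitter is slowed down instead.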
public JMXEnabledThreadPoolExecutor(int corePoolSize,
                                    int maxPoolSize,
                                    long keepAliveTime,
                                    TimeUnit unit,
                                    BlockingQueue<Runnable> workQueue,
                                    NamedThreadFactory threadFactory,
                                    String jmxPath)
{
    super(corePoolSize, maxPoolSize, keepAliveTime, unit, workQueue, threadFactory);
    super.prestartAllCoreThreads();
    metrics = new ThreadPoolMetrics(this, jmxPath, threadFactory.id);
    mbeanName = "org.apache.cassandra." + jmxPath + ":type=" + threadFactory.id;
    MBeanWrapper.instance.registerMBean(this, mbeanName);
}
ThreadPoolMetrics: monitors the metrics of this thread pool, built on MetricRegistry from metrics-core.jar.
MBeanWrapper: MBeanWrapper has two implementations, NoOpMBeanWrapper and PlatformMBeanWrapper. NoOpMBeanWrapper does nothing. PlatformMBeanWrapper uses JMX's MBeanServer.
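For readers unfamiliar with JMX, registering an object with the platform MBeanServer looks roughly like the sketch below. It uses only the standard javax.management API; the PoolStats bean, its ActiveCount attribute, and the object name org.example.sketch:type=ReadStage are assumptions for the example and are unrelated to the names Cassandra registers.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch of exposing a value through the platform MBeanServer.
class JmxSketch {
    // Standard MBean convention: the interface is public and named <Class>MBean.
    public interface PoolStatsMBean {
        int getActiveCount();
    }

    public static class PoolStats implements PoolStatsMBean {
        private final int activeCount;

        public PoolStats(int activeCount) { this.activeCount = activeCount; }

        @Override
        public int getActiveCount() { return activeCount; }
    }

    // Registers the bean, reads the attribute back through JMX, then unregisters it.
    static int registerAndRead(int activeCount) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.example.sketch:type=ReadStage");
            server.registerMBean(new PoolStats(activeCount), name);
            try {
                // The getter getActiveCount() is exposed as the attribute "ActiveCount".
                return (Integer) server.getAttribute(name, "ActiveCount");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(registerAndRead(3));
    }
}
```

Once registered, the attribute is visible to any JMX client attached to the JVM, for example jconsole.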
Cassandra startup
Cassandra is started from the main function of the CassandraDaemon class:
public static void main(String[] args)
{
    instance.activate();
}
At startup, Cassandra needs to do the following:
- Initialize the configuration
public void applyConfig()
{
    DatabaseDescriptor.daemonInitialization();
}
DatabaseDescriptor is Cassandra's configuration management class; a node's configuration properties are stored in the Config class.
public static void daemonInitialization() throws ConfigurationException
{
    daemonInitialization(DatabaseDescriptor::loadConfig); // loadConfig loads the Config into DatabaseDescriptor.
}
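Passing DatabaseDescriptor::loadConfig as a method reference means the initializer receives a supplier of the configuration instead of the configuration itself. A minimal sketch of that pattern follows; the ConfigSketch class, its two Config fields, the default values, and the rule that repeated calls are ignored are all assumptions for illustration, not a description of DatabaseDescriptor.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a static configuration holder initialized from a supplier.
class ConfigSketch {
    static final class Config {
        final String clusterName;
        final int nativePort;

        Config(String clusterName, int nativePort) {
            this.clusterName = clusterName;
            this.nativePort = nativePort;
        }
    }

    private static Config conf;
    private static boolean initialized;

    // Convenience overload that uses the default loader.
    static void daemonInitialization() {
        daemonInitialization(ConfigSketch::loadConfig);
    }

    // Assumption for this sketch: only the first call loads the configuration.
    static synchronized void daemonInitialization(Supplier<Config> loader) {
        if (initialized) return;
        conf = loader.get();
        initialized = true;
    }

    // Stand-in for reading a configuration file.
    static Config loadConfig() {
        return new Config("Test Cluster", 9042);
    }

    static synchronized String clusterName() { return conf.clusterName; }

    static synchronized int nativePort() { return conf.nativePort; }

    public static void main(String[] args) {
        daemonInitialization();
        System.out.println(clusterName() + " on port " + nativePort());
    }
}
```

Taking a supplier makes it easy to substitute a different configuration source, for example in tests, without changing the initialization code.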
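Cassandra read and write operations
We can communicate with Cassandra through the thrift package; CassandraServer is the entry point for read and write commands.
1. Read
Related topics:
- SEDA model: the core idea behind Cassandra
- ThreadPoolExecutor: multithreading management
- Metric: monitoring of thread calls
- MBeanServer:
- Netty: Cassandra's CQL3 protocol communication is built on the Netty framework
References: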
SEDA: An Architecture for Well-Conditioned, Scalable Internet Services
Understanding Cassandra Code Base