Previously I studied the Floodlight link discovery module:
http://blog.csdn.net/crystonesc/article/details/71157887
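Today I start on the Floodlight topology management module. Whenever the network changes (for example, a switch is added or a switch port changes), this module automatically recomputes the network topology and builds the corresponding topology structure. Its input data comes from the link discovery module, which learns links through the LLDP and BDDP protocols.
Here is the Mininet topology used in this experiment:
Topology description: one controller, four OpenFlow (OF) switches, and one non-OF switch, configured as follows:
The DPIDs of S1/S2/S4/S5 are, respectively:
00:00:00:00:00:00:00:01,00:00:00:00:00:00:00:02,00:00:00:00:00:00:00:04,00:00:00:00:00:00:00:05
Let's go straight to the code and explain concepts as we meet them. First, the startUp function of TopologyManager: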
@Override
public void startUp(FloodlightModuleContext context) {
    clearCurrentTopology();
    // Initialize role to floodlight provider role.
    this.role = floodlightProviderService.getRole();
    ScheduledExecutorService ses = threadPoolService.getScheduledExecutor();
    newInstanceTask = new SingletonTask(ses, new UpdateTopologyWorker());
    if (role != HARole.STANDBY) {
        newInstanceTask.reschedule(TOPOLOGY_COMPUTE_INTERVAL_MS, TimeUnit.MILLISECONDS);
    }
    linkDiscoveryService.addListener(this);
    floodlightProviderService.addOFMessageListener(OFType.PACKET_IN, this);
    floodlightProviderService.addHAListener(this.haListener);
    addRestletRoutable();
}
As shown, once the module starts it schedules a task that runs UpdateTopologyWorker (unless the controller is in the STANDBY role). Let's follow that task:
protected class UpdateTopologyWorker implements Runnable {
    @Override
    public void run() {
        try {
            if (ldUpdates.peek() != null) { /* must check here, otherwise will run every interval */
                updateTopology("link-discovery-updates", false);
            }
            handleMiscellaneousPeriodicEvents();
        }
        catch (Exception e) {
            log.error("Error in topology instance task thread", e);
        } finally {
            if (floodlightProviderService.getRole() != HARole.STANDBY) {
                newInstanceTask.reschedule(TOPOLOGY_COMPUTE_INTERVAL_MS, TimeUnit.MILLISECONDS);
            }
        }
    }
}
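The check-then-recompute pattern of UpdateTopologyWorker can be illustrated with a small sketch that uses only the JDK. This is not Floodlight code: the class name TopologyWorkerSketch, the String event type, and the 500 ms interval are assumptions made for illustration, and a plain ScheduledExecutorService stands in for Floodlight's SingletonTask.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the periodic topology worker; not Floodlight code.
final class TopologyWorkerSketch implements Runnable {
    static final long INTERVAL_MS = 500; // assumed value, for illustration only

    final BlockingQueue<String> updates = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService ses;
    volatile boolean standby = false;   // plays the part of HARole.STANDBY
    volatile int recomputeCount = 0;    // only ever written by the worker thread

    TopologyWorkerSketch(ScheduledExecutorService ses) {
        this.ses = ses;
    }

    // One pass of the worker: recompute only when events are pending.
    boolean runOnce() {
        if (updates.peek() == null) {
            return false; // nothing changed, skip the expensive recomputation
        }
        updates.clear();  // consume the pending events as one batch
        recomputeCount++;
        return true;
    }

    @Override
    public void run() {
        try {
            runOnce();
        } finally {
            // Re-arm the task from inside the task, mirroring the finally block above.
            if (!standby && !ses.isShutdown()) {
                ses.schedule(this, INTERVAL_MS, TimeUnit.MILLISECONDS);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        TopologyWorkerSketch worker = new TopologyWorkerSketch(ses);
        worker.updates.add("link-up s1-s2");
        ses.schedule(worker, INTERVAL_MS, TimeUnit.MILLISECONDS);
        Thread.sleep(3 * INTERVAL_MS);
        ses.shutdownNow();
        System.out.println("recomputations: " + worker.recomputeCount);
    }
}
```

Because the task re-arms itself in the finally block, a failure in one pass does not stop later passes, and the peek check keeps an idle network from triggering a recomputation on every interval.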
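UpdateTopologyWorker checks the ldUpdates queue for pending topology-change events. If any exist, it recomputes the topology by calling updateTopology. Following the code further, whenever the topology changes Floodlight creates an instance of the TopologyInstance class and calls its compute method to perform the computation: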
protected boolean createNewInstance(String reason, boolean forced) {
    Set<NodePortTuple> blockedPorts = new HashSet<NodePortTuple>();
    if (!linksUpdated && !forced) {
        return false;
    }
    Map<NodePortTuple, Set<Link>> openflowLinks;
    openflowLinks =
            new HashMap<NodePortTuple, Set<Link>>();
    Set<NodePortTuple> nptList = switchPortLinks.keySet();
    if (nptList != null) {
        for (NodePortTuple npt : nptList) {
            Set<Link> linkSet = switchPortLinks.get(npt);
            if (linkSet == null) continue;
            openflowLinks.put(npt, new HashSet<Link>(linkSet));
        }
    }
    // Identify all broadcast domain ports.
    // Mark any port that has inconsistent set of links
    // as broadcast domain ports as well.
    Set<NodePortTuple> broadcastDomainPorts =
            identifyBroadcastDomainPorts();
    // Remove all links incident on broadcast domain ports.
    for (NodePortTuple npt : broadcastDomainPorts) {
        if (switchPortLinks.get(npt) == null) continue;
        for (Link link : switchPortLinks.get(npt)) {
            removeLinkFromStructure(openflowLinks, link);
        }
    }
    // Remove all tunnel links.
    for (NodePortTuple npt : tunnelPorts) {
        if (switchPortLinks.get(npt) == null) continue;
        for (Link link : switchPortLinks.get(npt)) {
            removeLinkFromStructure(openflowLinks, link);
        }
    }
    // switchPorts contains only ports that are part of links. Calculation of broadcast ports needs set of all ports.
    Map<DatapathId, Set<OFPort>> allPorts = new HashMap<DatapathId, Set<OFPort>>();
    for (DatapathId sw : switchPorts.keySet()) {
        allPorts.put(sw, this.getPorts(sw));
    }
    TopologyInstance nt = new TopologyInstance(switchPorts,
            blockedPorts,
            openflowLinks,
            broadcastDomainPorts,
            tunnelPorts,
            switchPortLinks,
            allPorts,
            interClusterLinks);
    nt.compute();
    currentInstance = nt;
    return true;
}
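The filtering that createNewInstance applies to openflowLinks can be sketched in isolation. The version below is a simplification and an assumption on my part: ports and links are plain strings instead of NodePortTuple and Link, and the class LinkFilterSketch and its filter method are hypothetical names, not part of Floodlight.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification: ports and links are strings, not Floodlight types.
final class LinkFilterSketch {
    // Returns a deep copy of portLinks without any link attached to an excluded port.
    static Map<String, Set<String>> filter(Map<String, Set<String>> portLinks,
                                           Set<String> excludedPorts) {
        // Copy first so the full link table stays untouched,
        // as openflowLinks is a copy of switchPortLinks above.
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : portLinks.entrySet()) {
            result.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        for (String port : excludedPorts) {
            Set<String> links = portLinks.get(port);
            if (links == null) continue;
            for (String link : links) {
                // A link is listed under both of its endpoints, so drop it everywhere.
                for (Set<String> perPort : result.values()) {
                    perPort.remove(link);
                }
            }
        }
        // Ports left without any link no longer belong in the filtered view.
        result.values().removeIf(Set::isEmpty);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> portLinks = new HashMap<>();
        portLinks.computeIfAbsent("s1:1", k -> new HashSet<>()).add("s1:1->s2:1");
        portLinks.computeIfAbsent("s2:1", k -> new HashSet<>()).add("s1:1->s2:1");
        portLinks.computeIfAbsent("s2:2", k -> new HashSet<>()).add("s2:2->s4:1");
        portLinks.computeIfAbsent("s4:1", k -> new HashSet<>()).add("s2:2->s4:1");
        Set<String> broadcastDomainPorts = new HashSet<>();
        broadcastDomainPorts.add("s2:2");
        System.out.println(filter(portLinks, broadcastDomainPorts).keySet());
    }
}
```

Calling filter once with the broadcast domain ports and once more with the tunnel ports corresponds to the two removal loops in createNewInstance.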
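createNewInstance first calls identifyBroadcastDomainPorts(), which picks out the ports whose links pass through non-OF devices. Looking back at the link discovery module, Floodlight marks the ports on such non-OF links as broadcast domain ports. The code then removes every link attached to a broadcast domain port or a tunnel port from openflowLinks, creates a TopologyInstance, and calls compute to perform the topology computation. Now comes the key part. Let's see how the topology computation is done: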
protected void compute() {
    /*
     * Step 1: Compute clusters ignoring ports with > 2 links and
     * blocked links.
     */
    identifyClusters();
    /*
     * Step 2: Associate non-blocked links within clusters to the cluster
     * in which they reside. The remaining links are inter-cluster links.
     */
    identifyIntraClusterLinks();
    /*
     * Step 3: Compute the archipelagos. (Def: group of connected clusters)
     * Each archipelago will have its own broadcast tree, chosen by running
     * dijkstra's algorithm from the archipelago ID switch (lowest switch
     * DPID). We need a broadcast tree per archipelago since each
     * archipelago is by definition isolated from all other archipelagos.
     */
    identifyArchipelagos();
    /*
     * Step 4: Use Yen's algorithm to permute through each node combination
     * within each archipelago and compute multiple paths. The shortest
     * path located (i.e. first run of dijkstra's algorithm) will be used
     * as the broadcast tree for the archipelago.
     */
    computeOrderedPaths();
    /*
     * Step 5: Determine the broadcast ports for each archipelago. These are
     * the ports that reside on the broadcast tree computed and saved when
     * performing path-finding. These are saved into multiple data structures
     * to aid in quick lookup per archipelago, per-switch, and topology-global.
     */
    computeBroadcastPortsPerArchipelago();
    /*
     * Step 6: Optionally, print topology to log for added verbosity or when debugging.
     */
    printTopology();
}
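As the code shows, the topology computation has six steps:
Step 1: compute the clusters.
Step 2: identify the links inside each cluster; the remaining links are inter-cluster links.
Step 3: compute the archipelagos.
Step 4: compute the k shortest paths between each pair of nodes.
Step 5: determine the broadcast ports of each archipelago, that is, the ports on its broadcast tree.
Step 6: print the topology information.
The important steps are explained in detail below:
1. Step 1: compute the clusters
A cluster is a concept introduced by Floodlight. It is somewhat like the clusters we usually encounter; as I understand it, a cluster is a group of interconnected OF switches. In the topology above, s1 and s2 form one cluster, and s4 and s5 form another. Because the non-OF switch s3 sits between s2 and s4, s2 and s4 are not in the same cluster.
So how does Floodlight compute the clusters? It uses Tarjan's algorithm, which finds the strongly connected components of a directed graph. Each OF switch is modeled as a node of the graph and each link as an edge. Note that links in Floodlight are directed: a connection between two OF switches is represented by two symmetric Link objects, one in each direction.
The rough idea of Tarjan's algorithm: start a depth-first traversal from some node, recording the order in which nodes are visited and keeping them on a stack. If the traversal reaches a node that was already visited and is still on the stack, a cycle exists, so the nodes on that cycle belong to the same strongly connected component. When the traversal backs out of the first-visited node of a component, the component is popped off the stack and labeled, and the depth-first traversal continues. For an introduction to the algorithm, see: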
http://www.cnblogs.com/uncle-lu/p/5876729.html
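To make the idea concrete, here is a minimal sketch of Tarjan's algorithm in plain Java. It is not Floodlight's identifyClusters implementation: the class TarjanSketch, the use of switch names as strings, and the adjacency-list input are all assumptions for illustration. The graph in main models the experiment topology with two directed links per connection, and leaves out s3 because links through the non-OF switch are not ordinary OF links.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: switches are strings, links are directed adjacency lists.
final class TarjanSketch {
    private final Map<String, List<String>> adj;
    private final Map<String, Integer> index = new HashMap<>(); // visit order
    private final Map<String, Integer> low = new HashMap<>();   // lowest index reachable
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<Set<String>> clusters = new ArrayList<>();
    private int counter = 0;

    private TarjanSketch(Map<String, List<String>> adj) {
        this.adj = adj;
    }

    // Returns the strongly connected components of the directed graph.
    static List<Set<String>> clusters(Map<String, List<String>> adj) {
        TarjanSketch t = new TarjanSketch(adj);
        for (String node : adj.keySet()) {
            if (!t.index.containsKey(node)) t.dfs(node);
        }
        return t.clusters;
    }

    private void dfs(String v) {
        index.put(v, counter);
        low.put(v, counter);
        counter++;
        stack.push(v);
        onStack.add(v);
        for (String w : adj.getOrDefault(v, Collections.emptyList())) {
            if (!index.containsKey(w)) {
                dfs(w); // tree edge: recurse, then inherit the child's low value
                low.put(v, Math.min(low.get(v), low.get(w)));
            } else if (onStack.contains(w)) {
                // back edge to a node still on the stack: v and w share a component
                low.put(v, Math.min(low.get(v), index.get(w)));
            }
        }
        if (low.get(v).equals(index.get(v))) {
            // v is the first-visited node of its component: pop the whole component
            Set<String> component = new TreeSet<>();
            String w;
            do {
                w = stack.pop();
                onStack.remove(w);
                component.add(w);
            } while (!w.equals(v));
            clusters.add(component);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = new TreeMap<>();
        adj.put("s1", Arrays.asList("s2"));
        adj.put("s2", Arrays.asList("s1"));
        adj.put("s4", Arrays.asList("s5"));
        adj.put("s5", Arrays.asList("s4"));
        System.out.println(clusters(adj));
    }
}
```

A one-way link alone does not merge two switches into one component, which is why the symmetric pair of Link objects matters for cluster membership.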
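2. Steps 2 and 3 sort the links into intra-cluster and inter-cluster links and identify the archipelagos. According to the comment in compute(), an archipelago is a group of connected clusters, and each archipelago is isolated from all the others.
3. Step 4: compute the k shortest paths between each pair of nodes
This step uses Yen's algorithm to compute the K shortest paths between nodes. According to the code, K defaults to 3.
Dijkstra's algorithm is the well-known way to find a single shortest path, and Yen's algorithm builds on it. The rough idea: first use Dijkstra's algorithm to find the shortest path from one node to another. Then, for each node along that path, remove the link that the already-found paths take out of that node, find a detour from there to the destination, and splice it onto the unchanged front part of the path. The cheapest of these candidates is the next-shortest path. For the details of the algorithm, see: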
https://en.wikipedia.org/wiki/Yen%27s_algorithm
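The following is a compact sketch of Yen's algorithm on top of Dijkstra, written for illustration only. It is not the code behind computeOrderedPaths: the class YenSketch, the nested-map graph representation with integer link costs, and the "u>v" string encoding of removed edges are all assumptions of this sketch.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Illustrative sketch only; names and the string-based graph are assumptions.
final class YenSketch {
    // Dijkstra that ignores the given edges ("u>v") and nodes; null if dst is unreachable.
    static List<String> shortest(Map<String, Map<String, Integer>> g, String src, String dst,
                                 Set<String> cutEdges, Set<String> cutNodes) {
        Map<String, Integer> dist = new HashMap<>();
        Map<String, String> prev = new HashMap<>();
        Comparator<Map.Entry<String, Integer>> byDist = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Integer>> pq = new PriorityQueue<>(byDist);
        dist.put(src, 0);
        pq.add(new AbstractMap.SimpleEntry<>(src, 0));
        while (!pq.isEmpty()) {
            Map.Entry<String, Integer> top = pq.poll();
            String u = top.getKey();
            if (top.getValue() > dist.get(u)) continue; // stale queue entry
            Map<String, Integer> out = g.getOrDefault(u, Collections.emptyMap());
            for (Map.Entry<String, Integer> e : out.entrySet()) {
                String v = e.getKey();
                if (cutNodes.contains(v) || cutEdges.contains(u + ">" + v)) continue;
                int nd = top.getValue() + e.getValue();
                if (nd < dist.getOrDefault(v, Integer.MAX_VALUE)) {
                    dist.put(v, nd);
                    prev.put(v, u);
                    pq.add(new AbstractMap.SimpleEntry<>(v, nd));
                }
            }
        }
        if (!dist.containsKey(dst)) return null;
        List<String> path = new ArrayList<>();
        for (String at = dst; at != null; at = prev.get(at)) path.add(at);
        Collections.reverse(path);
        return path;
    }

    // Sum of the link costs along a path.
    static int cost(Map<String, Map<String, Integer>> g, List<String> p) {
        int c = 0;
        for (int i = 0; i + 1 < p.size(); i++) c += g.get(p.get(i)).get(p.get(i + 1));
        return c;
    }

    // Up to k loop-free paths from src to dst, cheapest first.
    static List<List<String>> kShortest(Map<String, Map<String, Integer>> g,
                                        String src, String dst, int k) {
        List<List<String>> found = new ArrayList<>();
        List<List<String>> candidates = new ArrayList<>();
        Set<String> none = Collections.emptySet();
        List<String> first = shortest(g, src, dst, none, none);
        if (first == null) return found;
        found.add(first);
        while (found.size() < k) {
            List<String> last = found.get(found.size() - 1);
            for (int i = 0; i + 1 < last.size(); i++) {
                List<String> root = last.subList(0, i + 1); // kept prefix, ends at spur node
                Set<String> cutEdges = new HashSet<>();
                for (List<String> p : found) {
                    // Forbid the edge that known paths with the same prefix take next.
                    if (p.size() > i + 1 && p.subList(0, i + 1).equals(root)) {
                        cutEdges.add(p.get(i) + ">" + p.get(i + 1));
                    }
                }
                // Prefix nodes before the spur node may not be revisited (no loops).
                Set<String> cutNodes = new HashSet<>(root.subList(0, i));
                List<String> spur = shortest(g, last.get(i), dst, cutEdges, cutNodes);
                if (spur == null) continue;
                List<String> total = new ArrayList<>(root.subList(0, i));
                total.addAll(spur);
                if (!found.contains(total) && !candidates.contains(total)) candidates.add(total);
            }
            if (candidates.isEmpty()) break;
            candidates.sort(Comparator.comparingInt(p -> cost(g, p)));
            found.add(candidates.remove(0));
        }
        return found;
    }
}
```

Calling YenSketch.kShortest(graph, "s1", "s5", 3) on a graph built from directed links would return at most three paths, and the first one is always the plain Dijkstra result.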
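Once these steps are complete, the topology management module has the full network topology information, which can be used for UI display, route selection, packet broadcasting, and so on.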