TRANSPORT LAYER
7.5
network dimensioning
The question of how much capacity to provide at network links in a given topology to achieve a given level of performance is often known as bandwidth provisioning. The even more complicated problem of how to design a network topology (where to place routers, how to interconnect routers with links, and what capacity to assign to links) to achieve a given level of end-to-end performance is a network design problem often referred to as network dimensioning.
Making the best of best-effort service.
Application-level mechanisms and infrastructure can be used successfully in a well-dimensioned network where packet loss and excessive end-to-end delay rarely occur. When demand increases are forecast, ISPs deploy additional bandwidth and switching capacity to continue to ensure satisfactory delay and packet-loss performance.
multiple classes of service
Differentiated service.
one type of traffic might be given strict priority over another class of traffic when both types of traffic are queued at a router.
Motivation
Insight 1: Packet marking allows a router to distinguish among packets belonging to different classes of traffic.
Insight 2: It is desirable to provide a degree of traffic isolation among classes so that one class is not adversely affected
by another class of traffic that misbehaves.
Insight 3: While providing isolation among classes or flows, it is desirable to use resources (for example, link
bandwidth and buffers) as efficiently as possible.
scheduling
First-In-First-Out (FIFO)
Packets arriving at the link output queue wait for transmission if the link is currently busy transmitting another packet. If there is not sufficient buffering space to hold the arriving packet, the queue’s packet-discarding policy then determines whether the packet will be dropped (lost) or whether other packets will be removed from the queue to make space for the arriving packet.
Priority Queuing
Packets arriving at the output link are classified into priority classes at the output queue. A packet's priority class may depend on an explicit marking that it carries in its packet header, its source or destination IP address, its destination port number, or other criteria. Each priority class typically has its own queue. When choosing a packet to transmit, the priority queuing discipline will transmit a packet from the highest priority class that has a nonempty queue. The choice among packets in the same priority class is typically made in a FIFO manner.
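A minimal sketch of this discipline in Python, assuming lower numbers mean higher priority (the packet names and classes below are made up); an arrival counter preserves FIFO order within a class:

```python
import heapq

# Priority queuing sketch: each packet carries a priority class
# (lower number = higher priority); the arrival counter keeps
# FIFO order among packets of the same class.
pq, arrival = [], 0
for prio, pkt in [(2, "bulk1"), (1, "voice1"), (2, "bulk2"), (1, "voice2")]:
    heapq.heappush(pq, (prio, arrival, pkt))
    arrival += 1

while pq:                          # always serve the highest-priority nonempty class
    print(heapq.heappop(pq)[2])    # voice1, voice2, bulk1, bulk2
```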
Round Robin and Weighted Fair Queuing (WFQ)
Under the round robin queuing discipline, packets are sorted into classes as with priority queuing. However, rather than there being a strict priority of service among classes, a round robin scheduler alternates service among the classes. A so-called work-conserving queuing discipline will never allow the link to remain idle whenever there are packets (of any class) queued for transmission. A work-conserving round robin discipline that looks for a packet of a given class but finds none will immediately check the next class in the round robin sequence.
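A sketch of a work-conserving round robin scheduler over per-class FIFO queues (the queue contents are illustrative):

```python
from collections import deque

def round_robin_schedule(class_queues):
    # Work-conserving round robin: cycle through the per-class queues,
    # serving one packet from each nonempty queue; the link never idles
    # while any class still has packets queued.
    order = []
    while any(class_queues):
        for q in class_queues:
            if q:                          # empty class: move straight to the next one
                order.append(q.popleft())
    return order

# Three traffic classes with queued packets (made-up contents):
queues = [deque(["A1", "A2"]), deque([]), deque(["C1", "C2", "C3"])]
print(round_robin_schedule(queues))        # ['A1', 'C1', 'A2', 'C2', 'C3']
```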
Policing: The Leaky Bucket
R3.c (8 Points) Describe the leaky bucket mechanism. Why is this type of mechanism needed in priority queueing?
Leaky bucket mechanism: a traffic-policing mechanism implemented at the edge of the network to control the characteristics of the injected traffic. The leaky bucket forces the traffic of a stream to stay within rate limits; it controls the injection rate (average rate, peak rate, and burst size) into the network.
A leaky bucket can hold up to B tokens. Tokens are generated at rate R. If the bucket is already full, a newly generated token is ignored and the bucket remains full with B tokens.
Each packet entering the network must hold one token; otherwise it has to wait for an available token. The token is removed from the bucket when the packet enters. Because the bucket holds at most B tokens, the maximum burst size is B, and the token-generation rate R bounds the maximum long-term average rate. The maximum number of packets that can enter the network over any interval of length T is RT + B.
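A simplified policer sketch based on these definitions (one token admits one packet; packets arriving without a token are dropped here rather than queued, so this models policing rather than shaping; all numbers are made up):

```python
def leaky_bucket_admit(arrivals, R, B):
    # arrivals maps integer time steps to packet counts; tokens are
    # generated at rate R per step and capped at the bucket size B.
    tokens = B                              # bucket starts full
    admitted = {}
    for t in sorted(arrivals):
        tokens = min(B, tokens + R)         # extra tokens beyond B are ignored
        n = min(arrivals[t], tokens)        # each admitted packet consumes one token
        tokens -= n
        admitted[t] = n                     # the rest are dropped in this sketch
    return admitted

# A 10-packet burst at t=0 with R=2, B=5: only B=5 packets pass at once.
print(leaky_bucket_admit({0: 10, 1: 0, 2: 0}, R=2, B=5))   # {0: 5, 1: 0, 2: 0}
```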
Priority queuing: packets are classified according to explicit marking, and each priority class has its own queue. With this mechanism, the router always transmits a packet from the highest priority class first; all other traffic is handled only when the higher-priority queues are empty. A leaky bucket mechanism is needed in priority queuing to prevent abuse of the prioritized class at the expense of nonprioritized traffic: without policing, prioritized traffic could eat up all the bandwidth and leave nothing for the other classes.
Per-connection QoS
A hard guarantee means the application will receive its requested quality of service (QoS) with certainty.
A soft guarantee means the application will receive its requested quality of service with high probability.
Insight 4: If sufficient resources will not always be available, and QoS is to be guaranteed, a call admission process is needed in which flows declare their QoS requirements and are then either admitted to the network (at the required QoS) or blocked from the network (if the required QoS cannot be provided by the network).
3.1 What the transport layer provides
A transport-layer protocol provides for logical communication between application processes running on different hosts.
3.1.1 interaction between transport and network layers
A transport-layer protocol provides logical communication between application processes running on different hosts,
whereas a network-layer protocol provides logical communication between hosts. Transport-layer protocols live in the end systems.
Within an end system, a transport protocol moves messages from application processes to the network edge (that is, the network layer) and vice versa, but it doesn’t have any say about how the messages are moved within the network core. The services that a transport protocol can provide are often constrained by the service model of the underlying network-layer protocol.
3.1.2 Overview of the Transport Layer in the Internet
UDP (User Datagram Protocol), which provides an unreliable, connectionless service to the invoking application
TCP (Transmission Control Protocol), which provides a reliable, connection-oriented service to the invoking application.
The IP service model is a best-effort delivery service. This means that IP makes its “best effort” to deliver segments between communicating hosts, but it makes no guarantees. For these reasons, IP is said to be an unreliable service. Each host has an IP address.
UDP and TCP: extending host-to-host delivery to process-to-process delivery is called transport-layer multiplexing and demultiplexing.
UDP and TCP also provide integrity checking by including error-detection fields in their segments’ headers.
TCP provides flow control, sequence numbers, acknowledgments, and timers, and ensures that data is delivered from sending process to receiving process, correctly and in order. TCP thus converts IP’s unreliable service between end systems into a reliable data transport service between processes. TCP also provides congestion control. TCP congestion control prevents any one TCP connection from swamping the links and routers between communicating hosts with an excessive amount of traffic. TCP strives to give each connection traversing a congested link an equal share of the link bandwidth. This is done by regulating the rate at which the sending sides of TCP connections can send traffic into the network.
3.2 (fall 13)
This job of delivering the data in a transport-layer segment to the correct socket is called demultiplexing.
The job of gathering data chunks at the source host from different sockets, encapsulating each data chunk with header information (that will later be used in demultiplexing) to create segments, and passing the segments to the network layer is called multiplexing.
multiplexing requires
(1) that sockets have unique identifiers,
(2) that each segment have special fields that indicate the socket to which the segment is to be delivered. These special fields are the source port number field and the destination port number field.
Connectionless Multiplexing and Demultiplexing -> UDP
A UDP socket is fully identified by a two-tuple consisting of a destination IP address and a destination port number.
Connection-Oriented Multiplexing and Demultiplexing
A TCP socket is identified by a four-tuple: (source IP address, source port number, destination IP address, destination port number). When a TCP segment arrives from the network at a host, the host uses all four values to direct (demultiplex) the segment to the appropriate socket.
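The two-tuple/four-tuple difference can be sketched as lookup tables (addresses, ports, and socket names below are made up):

```python
# UDP: the socket is located by (dest IP, dest port) alone, so segments
# from different senders are demultiplexed to the same socket.
udp_sockets = {("10.0.0.5", 53): "udp_socket_A"}

# TCP: the full four-tuple identifies the socket, so two connections to
# the same server port land in different sockets.
tcp_sockets = {
    ("192.168.1.2", 7532, "10.0.0.5", 80): "tcp_socket_1",
    ("192.168.1.9", 26145, "10.0.0.5", 80): "tcp_socket_2",
}

def demux_udp(dst_ip, dst_port):
    return udp_sockets.get((dst_ip, dst_port))

def demux_tcp(src_ip, src_port, dst_ip, dst_port):
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))

print(demux_udp("10.0.0.5", 53))                        # udp_socket_A
print(demux_tcp("192.168.1.9", 26145, "10.0.0.5", 80))  # tcp_socket_2
```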
3.3 13 mid
UDP services
the transport layer has to provide a multiplexing/demultiplexing service in order to pass data between the network layer and the correct application-level process. Aside from the multiplexing/demultiplexing function and some light error checking, it adds nothing to IP.
UDP takes messages from the application process, attaches source and destination port number fields for the multiplexing/demultiplexing service, adds two other small fields, and passes the resulting segment to the network layer. The network layer encapsulates the transport-layer segment into an IP datagram and then makes a best-effort attempt to deliver the segment to the receiving host. If the segment arrives at the receiving host, UDP uses the destination port number to deliver the segment’s data to the correct application process. Note that with UDP there is no handshaking between sending and receiving transport-layer entities before sending a segment. For this reason, UDP is said to be connectionless.
pros
fast
No connection establishment. UDP does not introduce any delay to establish a connection.
No connection state. UDP does not maintain connection state and does not track any of these parameters.
Small packet header overhead. The TCP segment has 20 bytes of header overhead in every segment, whereas UDP has only 8 bytes of overhead.
cons
no congestion control.
High loss rates: the lack of congestion control in UDP can result in high loss rates between a UDP sender and receiver, and in the crowding out of TCP sessions.
3.4
principles
Retransmission. A packet that is received in error at the receiver will be retransmitted by the sender.
Receiver feedback. Since the sender and receiver are typically executing on different end systems, possibly separated by thousands of miles, the only way for the sender to learn of the receiver’s view of the world is for the receiver to provide explicit feedback to the sender. The positive (ACK) and negative (NAK) acknowledgment replies in the message-dictation scenario are examples of such feedback
Error detection. First, a mechanism is needed to allow the receiver to detect when bit errors have occurred. UDP uses the Internet checksum field for exactly this purpose. These techniques allow the receiver to detect and possibly correct packet bit errors.
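For reference, the Internet checksum is the 16-bit one's complement of the one's-complement sum of the data's 16-bit words; a sketch of that arithmetic (per RFC 1071):

```python
def internet_checksum(data: bytes) -> int:
    # One's-complement sum of 16-bit words, with end-around carry,
    # then the one's complement of the result (RFC 1071).
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold carry back in
    return ~total & 0xFFFF

seg = b"\x12\x34\x56\x78"
ck = internet_checksum(seg)
# The receiver recomputes over data plus checksum; 0 means no error detected.
print(internet_checksum(seg + ck.to_bytes(2, "big")) == 0)   # True
```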
The sender will not send a new piece of data until it is sure that the receiver has correctly received the current packet. Because of this behavior, such protocols are known as stop-and-wait protocols.
add a new field to the data packet and have the sender number its data packets by putting a sequence number into this field. The receiver then need only check this sequence number to determine whether or not the received packet is a retransmission.
Go-Back-N (GBN)
The sender may transmit multiple packets without waiting for an acknowledgment, but is constrained to have no more than some maximum allowable number, N, of unacknowledged packets in the pipeline. The range of permissible sequence numbers for transmitted but not yet acknowledged packets can be viewed as a window of size N over the range of sequence numbers. As the protocol operates, this window slides forward over the sequence number space. For this reason, N is often referred to as the window size.
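A sketch of the GBN sender's three event handlers (no real timers or network I/O; with N = 1 this degenerates to stop-and-wait):

```python
class GBNSender:
    def __init__(self, N):
        self.N = N            # window size
        self.base = 0         # oldest unacknowledged sequence number
        self.nextseq = 0      # next sequence number to assign

    def send(self, data):
        if self.nextseq < self.base + self.N:   # room in the window?
            print(f"send pkt {self.nextseq}: {data}")
            self.nextseq += 1
            return True
        return False                             # window full: refuse data

    def on_ack(self, acknum):
        self.base = acknum + 1                   # cumulative ACK slides the window

    def on_timeout(self):
        for seq in range(self.base, self.nextseq):
            print(f"resend pkt {seq}")           # go back N: resend everything unACKed

s = GBNSender(N=4)
for ch in "abcde":
    s.send(ch)     # the fifth call is refused: the window of 4 is full
s.on_ack(1)        # cumulative ACK covers packets 0 and 1
s.on_timeout()     # resends packets 2 and 3
```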
Selective Repeat (SR)
selective-repeat protocols avoid unnecessary retransmissions by having the sender retransmit only those packets that it suspects were received in error (that is, were lost or corrupted) at the receiver. This individual, as needed, retransmission will require that the receiver individually acknowledge correctly received packets. A window size of N will again be used to limit the number of outstanding, unacknowledged packets in the pipeline. The SR receiver will acknowledge a correctly received packet whether or not it is in order. Out-of-order packets are buffered until any missing packets (that is, packets with lower sequence numbers) are received, at which point a batch of packets can be delivered in order to the upper layer.
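A sketch of the SR receiver's buffering (sequence numbers are unbounded here for simplicity; a real implementation wraps them and enforces the receive window):

```python
def sr_receiver():
    expected, buffer, delivered = 0, {}, []
    def receive(seq, data):
        nonlocal expected
        buffer[seq] = data                # ACK and buffer even if out of order
        while expected in buffer:         # deliver any in-order run upward
            delivered.append(buffer.pop(expected))
            expected += 1
        return delivered
    return receive

rx = sr_receiver()
rx(1, "b"); rx(2, "c")    # out of order: buffered, nothing delivered yet
print(rx(0, "a"))         # gap filled -> ['a', 'b', 'c'] delivered in order
```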
3.5
timeout and RTT
The timeout must be larger than the round-trip time (RTT), that is, the time from when a segment is sent until it is acknowledged; otherwise, unnecessary retransmissions would be sent.
SampleRTT for a segment is the amount of time between when the segment is sent (that is, passed to IP) and when an acknowledgment for the segment is received.
timeout interval >= EstimatedRTT, or unnecessary retransmissions would be sent.
But the timeout interval should not be too much larger than EstimatedRTT; otherwise, when a segment is lost, TCP would not quickly retransmit the segment, leading to large data transfer delays.
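TCP derives the timeout from exponentially weighted moving averages of SampleRTT and of its deviation; a sketch using the standard parameters (alpha = 0.125, beta = 0.25, TimeoutInterval = EstimatedRTT + 4 * DevRTT, per RFC 6298):

```python
def make_rtt_estimator(alpha=0.125, beta=0.25):
    est, dev = None, 0.0
    def update(sample_rtt):
        nonlocal est, dev
        if est is None:                  # first sample initializes the estimate
            est, dev = sample_rtt, sample_rtt / 2
        else:
            dev = (1 - beta) * dev + beta * abs(sample_rtt - est)
            est = (1 - alpha) * est + alpha * sample_rtt
        return est + 4 * dev             # TimeoutInterval: above, but tracking, the RTT
    return update

timeout = make_rtt_estimator()
for s in (0.100, 0.120, 0.300, 0.110):   # SampleRTTs in seconds (made up)
    print(round(timeout(s), 3))
```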
Fast Retransmit: on receiving 3 duplicate ACKs, the sender retransmits the missing segment before its timer expires.
Flow control: slow down the source rate to match the destination rate. TCP provides an input buffer; if it fills up, it overflows, so the sending end must be controlled.
A second form of sender control is congestion control: routers between source and destination may experience congestion and loss, so when loss occurs the sources causing congestion slow down their sending rates.
Flow control: the sender maintains a variable called the receive window. Informally, the receive window is used to give the sender an idea of how much free buffer space is available at the receiver. Because TCP is full-duplex, the sender at each side of the connection maintains a distinct receive window.
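The advertised window follows directly from the receiver's buffer bookkeeping; a sketch of the formula rwnd = RcvBuffer - (LastByteRcvd - LastByteRead):

```python
def receive_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    # Free space the receiver can still absorb; the sender must keep
    # LastByteSent - LastByteAcked <= rwnd to avoid overflowing it.
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

# 4096-byte buffer, 3000 bytes received, application has read 1000:
print(receive_window(4096, 3000, 1000))   # 2096 bytes advertised
```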
3.6 congestion control principles and approaches
End-to-end congestion control
TCP segment loss (as indicated by a timeout or a triple duplicate ACK) is taken as an indication of network congestion, and TCP decreases its window size accordingly.
Network-assisted congestion control.
network-layer components (that is, routers) provide explicit feedback to the sender regarding the congestion state in the network. This feedback may be as simple as a single bit indicating congestion at a link.
3.7 13 final
The TCP approach is to have each sender limit the rate at which it sends traffic into its connection as a function of perceived network congestion.
Loss event: a timeout or the receipt of three duplicate ACKs
A lost segment implies congestion, and hence, the TCP sender’s rate should be decreased when a segment is lost.
Slow Start: in the slow-start state, the value of cwnd begins at 1 MSS and increases by 1 MSS every time a transmitted
segment is first acknowledged.
These segments are then acknowledged, with the sender increasing the congestion window by 1 MSS for each of
the acknowledged segments, giving a congestion window of 4 MSS, and so on. This process results in a doubling of the
sending rate every RTT. Thus, the TCP send rate starts slow but grows exponentially during the slow start phase.
Congestion Avoidance
On entry to the congestion-avoidance state, the value of cwnd is approximately half its value when congestion was
last encountered—congestion could be just around the corner! Thus, rather than doubling the value of cwnd every RTT,
TCP adopts a more conservative approach and increases the value of cwnd by just a single MSS every RTT
Fast Recovery
In fast recovery, the value of cwnd is increased by 1 MSS for every duplicate ACK received for the missing
segment that caused TCP to enter the fast-recovery state. Eventually, when an ACK arrives for the missing segment,
TCP enters the congestion-avoidance state after deflating cwnd. If a timeout event occurs, fast recovery transitions to
the slow-start state after performing the same actions as in slow start and congestion avoidance: The value of cwnd is
set to 1 MSS, and the value of ssthresh is set to half the value of cwnd when the loss event occurred.
TCP Tahoe unconditionally cut its congestion window to 1 MSS and entered the slow-start phase after either a
timeout-indicated or triple-duplicate-ACK-indicated loss event.
The newer version of TCP, TCP Reno, incorporated fast recovery.
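The Tahoe/Reno behavior above can be condensed into a small state machine; the sketch below is a simplification (cwnd in MSS units; the per-duplicate-ACK window inflation during fast recovery is omitted):

```python
def reno_on_event(state, cwnd, ssthresh, event, mss=1.0):
    # One transition of a simplified TCP Reno congestion FSM.
    # Events: "new_ack", "dup_ack_x3", "timeout".
    if event == "timeout":                          # severe signal: restart slow start
        return "slow_start", mss, max(cwnd / 2, 2 * mss)
    if event == "dup_ack_x3":                       # fast retransmit -> fast recovery
        ssthresh = max(cwnd / 2, 2 * mss)
        return "fast_recovery", ssthresh + 3 * mss, ssthresh
    if state == "slow_start":                       # event == "new_ack" from here on
        cwnd += mss                                 # +1 MSS per ACK: doubles each RTT
        if cwnd >= ssthresh:
            state = "congestion_avoidance"
    elif state == "congestion_avoidance":
        cwnd += mss * mss / cwnd                    # ~ +1 MSS per RTT
    else:                                           # ACK for the missing segment arrives:
        state, cwnd = "congestion_avoidance", ssthresh   # deflate cwnd
    return state, cwnd, ssthresh

state, cwnd, ssthresh = "slow_start", 1.0, 8.0
for _ in range(4):
    state, cwnd, ssthresh = reno_on_event(state, cwnd, ssthresh, "new_ack")
print(state, cwnd)   # still in slow start (cwnd 5.0 < ssthresh 8.0)
```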
NETWORK LAYER
2.5
Services Provided by DNS
To identify a host, we need a directory service that translates hostnames to IP addresses. This is the main task of the
Internet’s domain name system (DNS). The DNS is (1) a distributed database implemented in a hierarchy of DNS
servers, and (2) an application-layer protocol that allows hosts to query the distributed database.
The primary service of DNS is to translate user-supplied hostnames to IP addresses. DNS also provides some other important services:
Host aliasing. A host with a complicated hostname can have one or more alias names. Alias hostnames, when present,
are typically more mnemonic than canonical hostnames. DNS can be invoked by an application to obtain the canonical
hostname for a supplied alias hostname as well as the IP address of the host.
Mail server aliasing. It is highly desirable that e-mail addresses be mnemonic. DNS can be invoked by a mail
application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host.
Load distribution. DNS is also used to perform load distribution among replicated servers. Busy sites are replicated
over multiple servers, with each server running on a different end system and each having a different IP address. For
replicated Web servers, a set of IP addresses is thus associated with one canonical hostname. The DNS database
contains this set of IP addresses. When clients make a DNS query for a name mapped to a set of addresses, the server
responds with the entire set of IP addresses, but rotates the ordering of the addresses within each reply.
Overview
Suppose that some application running in a user’s host needs to translate a hostname to an IP address. The
application will invoke the client side of DNS, specifying the hostname that needs to be translated. DNS in the user’s
host then takes over, sending a query message into the network. All DNS query and reply messages are sent within
UDP datagrams to port 53. After a delay, ranging from milliseconds to seconds, DNS in the user’s host receives a DNS
reply message that provides the desired mapping. This mapping is then passed to the invoking application.
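From an application's point of view this entire exchange hides behind one resolver call; a sketch using Python's standard library (needs network access; example.com is just a placeholder host):

```python
import socket

# getaddrinfo invokes the client side of DNS via the OS resolver and
# returns one entry per address record in the reply.
for family, _, _, _, sockaddr in socket.getaddrinfo(
        "example.com", 80, proto=socket.IPPROTO_TCP):
    print(sockaddr[0])          # an IP address for the hostname
```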
Hierarchical Database
In order to deal with the issue of scale, the DNS uses a large number of servers, organized in a hierarchical fashion and
distributed around the world. No single DNS server has all of the mappings for all of the hosts in the Internet. Instead,
the mappings are distributed across the DNS servers. To a first approximation, there are three classes of DNS
servers—root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers.
5.4.1
MAC Addresses
it is not hosts and routers that have link-layer addresses but rather their adapters (that is, network interfaces) that
have link-layer addresses. A link-layer address is variously called a LAN address, a physical address, or a MAC
address. For most LANs, the MAC address is 6 bytes long.
4.2 transport layer vs network layer services
transport layer can offer applications connectionless service or connection-oriented service between two processes.
For example, the Internet’s transport layer provides each application a choice between two services: UDP, a
connectionless service; or TCP, a connection-oriented service.
a network layer can provide connectionless service or connection service between two hosts. Network-layer
connection and connectionless services in many ways parallel transport-layer connection-oriented and connectionless
services. For example, a network-layer connection service begins with handshaking between the source and destination
hosts; and a network-layer connectionless service does not have any handshaking preliminaries.
there are crucial differences:
• In the network layer, these services are host-to-host services provided by the network layer for the transport layer. In
the transport layer these services are process-to- process services provided by the transport layer for the application
layer.
• In all major computer network architectures to date, the network layer provides either a host-to-host connectionless
service or a host-to-host connection service, but not both. Computer networks that provide only a connection service at
the network layer are called virtual-circuit (VC) networks; computer networks that provide only a connectionless
service at the network layer are called datagram networks.
• The implementations of connection-oriented service in the transport layer and the connection service in the network
layer are fundamentally different. The transport-layer connection-oriented service is implemented at the edge of the
network in the end systems; the network-layer connection service is implemented in the routers in the network core as
well as in the end systems.
datagram network
In a datagram network, each time an end system wants to send a packet, it stamps the packet with the address of
the destination end system and then pops the packet into the network.
As a packet is transmitted from source to destination, it passes through a series of routers. Each of these routers
uses the packet’s destination address to forward the packet. Specifically, each router has a forwarding table that maps
destination addresses to link interfaces; when a packet arrives at the router, the router uses the packet’s destination
address to look up the appropriate output link interface in the forwarding table. The router then forwards the packet to that output link interface.
With this style of forwarding table, the router matches a prefix of the packet’s destination address with the entries
in the table; if there’s a match, the router forwards the packet to a link associated with the match. When there are
multiple matches, the router uses the longest prefix matching rule; that is, it finds the longest matching entry in the
table and forwards the packet to the link interface associated with the longest prefix match.
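A sketch of longest prefix matching over a toy forwarding table (prefixes and interface numbers are made up):

```python
import ipaddress

forwarding_table = {          # CIDR prefix -> output link interface
    "11.0.0.0/8": 0,
    "11.1.0.0/16": 1,
    "11.1.2.0/24": 2,
}

def lookup(dest_ip):
    addr = ipaddress.ip_address(dest_ip)
    matches = [(ipaddress.ip_network(p).prefixlen, iface)
               for p, iface in forwarding_table.items()
               if addr in ipaddress.ip_network(p)]
    return max(matches)[1] if matches else None   # longest matching prefix wins

print(lookup("11.1.2.5"))   # 2: all three entries match; the /24 is longest
print(lookup("11.9.9.9"))   # 0: only the /8 matches
```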
4.3 13 final
4.4 ICMP(Internet Control Message Protocol)
ICMP is used by hosts and routers to communicate network-layer information to each other. The most typical use
of ICMP is for error reporting. ICMP is often considered part of IP but architecturally it lies just above IP, as ICMP
messages are carried inside IP datagrams. That is, ICMP messages are carried as IP payload. Similarly, when a host
receives an IP datagram with ICMP specified as the upper-layer protocol, it demultiplexes the datagram’s contents to
ICMP. ICMP messages have a type and a code field, and contain the header and the first 8 bytes of the IP datagram that
caused the ICMP message to be generated in the first place. Another interesting ICMP message is the source quench
message.
4.5
A graph is used to formulate routing problems. Recall that a graph G = (N,E) is a set N of nodes and a collection
E of edges, where each edge is a pair of nodes from N. In the context of network-layer routing, the nodes in the graph
represent routers—the points at which packet-forwarding decisions are made—and the edges connecting these nodes
represent the physical links between these routers. An edge's cost may reflect the physical length of the corresponding
link. For any edge (x,y) in E, we denote c(x,y) as the cost of the edge between nodes x and y. If the pair (x,y) does not
belong to E, we set c(x,y) = ∞. We consider only undirected graphs.
One or more of these paths is a least-cost path. The least-cost path problem is therefore clear: Find a path between the
source and destination that has least cost. Note that if all edges in the graph have the same cost, the least-cost path is
also the shortest path (that is, the path with the smallest number of links between the source and the destination).
A global routing algorithm computes the least-cost path between a source and destination using complete, global
knowledge about the network. That is, the algorithm takes the connectivity between all nodes and all link costs as
inputs. This then requires that the algorithm somehow obtain this information before actually performing the
calculation. The calculation itself can be run at one site (a centralized global routing algorithm) or replicated at multiple
sites. The key distinguishing feature here, however, is that a global algorithm has complete information about
connectivity and link costs. In practice, algorithms with global state information are often referred to as link-state (LS)
algorithms, since the algorithm must be aware of the cost of each link in the network.
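A minimal Dijkstra sketch over the graph abstraction defined above (the example topology is made up):

```python
import heapq

def dijkstra(graph, source):
    # graph maps node -> {neighbor: link cost}; returns least costs
    # from source to every reachable node.
    dist = {source: 0}
    frontier = [(0, source)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd              # relax edge (u, v)
                heapq.heappush(frontier, (nd, v))
    return dist

g = {"u": {"v": 2, "w": 5}, "v": {"u": 2, "w": 1}, "w": {"u": 5, "v": 1}}
print(dijkstra(g, "u"))   # {'u': 0, 'v': 2, 'w': 3}
```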
In a decentralized routing algorithm, the calculation of the least-cost path is carried out in an iterative, distributed
manner. No node has complete information about the costs of all network links. Instead, each node begins with only the
knowledge of the costs of its own directly attached links. Then, through an iterative process of calculation and
exchange of information with its neighboring nodes (that is, nodes that are at the other end of links to which it itself is
attached), a node gradually calculates the least-cost path to a destination or set of destinations. The decentralized
routing algorithm is called a distance-vector (DV) algorithm, because each node maintains a vector of estimates of the
costs (distances) to all other nodes in the network.
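Each DV iteration applies the Bellman-Ford equation d_x(y) = min over neighbors v of { c(x,v) + d_v(y) }; a sketch of one update at node x (link costs and advertised vectors are made up):

```python
def dv_update(costs, neighbor_vectors):
    # costs: neighbor -> link cost c(x, v)
    # neighbor_vectors: neighbor -> its advertised distance vector d_v
    dests = {y for vec in neighbor_vectors.values() for y in vec}
    return {y: min(costs[v] + neighbor_vectors[v].get(y, float("inf"))
                   for v in costs)
            for y in dests}

# x has links to v (cost 2) and w (cost 5); v and w advertise:
print(dv_update({"v": 2, "w": 5},
                {"v": {"x": 2, "v": 0, "w": 1},
                 "w": {"x": 5, "v": 1, "w": 0}}))
# x: 4, v: 2, w: 3 (x's own entry would be pinned to 0 in practice)
```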
Link State protocol, Distance Vector, comparison LS vs DV 13 final
4.6
RIP Routing Information Protocol (RIP)
RIP is a distance-vector protocol that operates in a manner very close to the idealized DV protocol. The version of
RIP specified in RFC 1058 uses hop count as a cost metric; that is, each link has a cost of 1. In RIP (and also in OSPF),
costs are actually from source router to a destination subnet. RIP uses the term hop, which is the number of subnets
traversed along the shortest path from source router to destination subnet, including the destination subnet.
The maximum cost of a path is limited to 15, thus limiting the use of RIP to autonomous systems that are fewer
than 15 hops in diameter. Recall that in DV protocols, neighboring routers exchange distance vectors with each other.
The distance vector for any one router is the current estimate of the shortest path distances from that router to the
subnets in the AS. In RIP, routing updates are exchanged between neighbors approximately every 30 seconds using a
RIP response message. The response message sent by a router or host contains a list of up to 25 destination subnets
within the AS, as well as the sender’s distance to each of those subnets. Response messages are also known as RIP
advertisements.
Each router maintains a RIP table known as a routing table. A router's routing table includes both the router's
distance vector and the router's forwarding table.
OSPF Open Shortest Path First (OSPF).
OSPF is a link-state protocol that uses flooding of link-state information and a Dijkstra least-cost path algorithm.
With OSPF, a router constructs a complete topological map (that is, a graph) of the entire autonomous system. The
router then locally runs Dijkstra’s shortest-path algorithm to determine a shortest-path tree to all subnets, with itself as
the root node. Individual
link costs are configured by the network administrator. The administrator might choose to set all link costs to 1, thus
achieving minimum-hop routing, or might choose to set the link weights to be inversely proportional to link capacity in
order to discourage traffic from using low-bandwidth links. OSPF does not mandate a policy for how link weights are
set (that is the job of the network administrator), but instead provides the mechanisms (protocol) for determining
least-cost path routing for the given set of link weights.
With OSPF, a router broadcasts routing information to all other routers in the autonomous system, not just to its
neighboring routers. A router broadcasts link-state information whenever there is a change in a link's state (for example,
a change in cost or a change in up/down status). It also broadcasts a link's state periodically (at least once every 30
minutes), even if the link's state has not changed.
Some of the advances embodied in OSPF include the following:
Security. Exchanges between OSPF routers (for example, link-state updates) can be authenticated.
Multiple same-cost paths. When multiple paths to a destination have the same cost, OSPF allows multiple paths to be used.
Integrated support for unicast and multicast routing. Multicast OSPF (MOSPF) provides simple extensions to OSPF to
provide for multicast routing. MOSPF uses the existing OSPF link database and adds a new type of link-state
advertisement to the existing OSPF link-state broadcast mechanism.
Support for hierarchy within a single routing domain. Perhaps the most significant advance in OSPF is the ability to
structure an autonomous system hierarchically.
BGP Border Gateway Protocol
BGP provides each AS a means to
1. Obtain subnet reachability information from neighboring ASs.
2. Propagate the reachability information to all routers internal to the AS.
3. Determine “good” routes to subnets based on the reachability information and on AS policy.
BGP Basics
In BGP, pairs of routers exchange routing information over semipermanent TCP connections using port 179. There
are also semipermanent BGP TCP connections between routers within an AS. For each TCP connection, the two routers
at the end of the connection are called BGP peers, and the TCP connection along with all the BGP messages sent over
the connection is called a BGP session. Furthermore, a BGP session that spans two ASs is called an external BGP
(eBGP) session, and a BGP session between routers in the same AS is called an internal BGP (iBGP) session.
BGP allows each AS to learn which destinations are reachable via its neighboring ASs. In BGP, destinations are
not hosts but instead are CIDRized prefixes, with each prefix representing a subnet or a collection of subnets.
BGP Route Selection
BGP uses eBGP and iBGP to distribute routes to all the routers within ASs. From this distribution, a router may learn
about more than one route to any one prefix, in which case the router must select one of the possible routes. The input
into this route selection process is the set of all routes that have been learned and accepted by the router. If there are two
or more routes to the same prefix, then BGP sequentially invokes the following elimination rules until one route
remains:
• Routes are assigned a local preference value as one of their attributes. The local preference of a route could have
been set by the router or could have been learned from another router in the same AS. This is a policy decision that is left
up to the AS's network administrator. The routes with the highest local preference values are selected.
• From the remaining routes (all with the same local preference value), the route with the shortest AS-PATH is selected.
If this rule were the only rule for route selection, then BGP would be using a DV algorithm for path determination,
where the distance metric uses the number of AS hops rather than the number of router hops.
• From the remaining routes (all with the same local preference value and the same AS-PATH length), the route with
the closest NEXT-HOP router is selected. Here, closest means the router for which the cost of the least-cost path,
determined by the intra-AS algorithm, is the smallest. This process is called hot-potato routing.
• If more than one route still remains, the router uses BGP identifiers to select the route.
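The elimination rules can be sketched as sequential filtering (the attribute names and values below are illustrative, not BGP wire-format fields):

```python
def bgp_select(routes):
    # Sequentially eliminate candidate routes for one prefix.
    best = max(r["local_pref"] for r in routes)         # 1. highest local preference
    routes = [r for r in routes if r["local_pref"] == best]
    short = min(len(r["as_path"]) for r in routes)      # 2. shortest AS-PATH
    routes = [r for r in routes if len(r["as_path"]) == short]
    near = min(r["igp_cost"] for r in routes)           # 3. hot potato: closest NEXT-HOP
    routes = [r for r in routes if r["igp_cost"] == near]
    return min(routes, key=lambda r: r["bgp_id"])       # 4. BGP identifier tie-break

r1 = {"local_pref": 100, "as_path": ["AS2", "AS7"], "igp_cost": 3, "bgp_id": 1}
r2 = {"local_pref": 100, "as_path": ["AS5", "AS7"], "igp_cost": 1, "bgp_id": 2}
print(bgp_select([r1, r2]) is r2)   # True: tie on pref and path length; r2 is closer
```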
Routing Policy
All traffic entering a stub network must be destined for that network, and all traffic leaving a stub network must
have originated in that network.
There are currently no official standards that govern how backbone ISPs route among themselves. However, a rule
of thumb followed by commercial ISPs is that any traffic flowing across an ISP’s backbone network must have either a
source or a destination (or both) in a network that is a customer of that ISP; otherwise the traffic would be getting a free
ride on the ISP’s network. Individual peering agreements are typically negotiated between pairs of ISPs and
are often confidential;
4.7
In broadcast routing, the network layer provides a service of delivering a packet sent from a source node to all other nodes in the network; multicast routing enables a single source node to send a copy of a packet to a subset of the other network nodes.
Given N destination nodes, the source node simply makes N copies of the packet, addresses each copy to a different destination, and then transmits the N copies to the N destinations using unicast routing. This N-way-unicast
approach to broadcasting is simple — no new network-layer routing protocol, packet-duplication, or forwarding
functionality is needed. There are, however, several drawbacks to this approach. The first drawback is its inefficiency.
If the source node is connected to the rest of the network via a single link, then N separate copies of the (same) packet
will traverse this single link. It would clearly be more efficient to send only a single copy of a packet over this first hop
and then have the node at the other end of the first hop make and forward any additional needed copies. That is, it
would be more efficient for the network nodes themselves (rather than just the source node) to create duplicate copies
of a packet.
uncontrolled and controlled flooding, spanning tree 13 final
multicast routing
a multicast packet is addressed using address indirection. That is, a single identifier is used for the group of receivers, and a copy of the packet that is addressed to the group using this single identifier is
delivered to all of the multicast receivers associated with that group. In the Internet, the single identifier that represents a group of receivers is a class D multicast IP address. The group of receivers associated with a class D address is referred to as a multicast group.
The goal of multicast routing, then, is to find a tree of links that connects all of the routers that have attached hosts belonging to the multicast group. Multicast packets will then be routed along this tree from the sender to all of the hosts belonging to the multicast tree.
two approaches have been adopted for determining the multicast routing tree, both of which we have already studied in the context of broadcast routing, and so we will only mention them in passing here. The two
approaches differ according to whether a single group-shared tree is used to distribute the traffic for all senders in the group, or whether a source-specific routing tree is constructed for each individual sender.
Multicast routing using a group-shared tree. As in the case of spanning-tree broadcast, multicast routing over a group-shared tree is based on building a tree that includes all edge routers with attached hosts belonging to the multicast group.
In practice, a center-based approach is used to construct the multicast routing tree, with edge routers with attached hosts belonging to the multicast group sending (via unicast) join messages addressed to the center node.
As in the broadcast case, a join message is forwarded using unicast routing toward the center until it either arrives at a router that already belongs to the multicast tree or arrives at the center. All routers along the path that the join message follows will then forward received multicast packets to the edge router that initiated the multicast join.
Multicast routing using a source-based tree. While group-shared tree multicast routing constructs a single, shared
routing tree to route packets from all senders, the second approach constructs a multicast routing tree for each source in the multicast group. In practice, an RPF algorithm (with source node x) is used to construct a multicast forwarding tree for multicast datagrams originating at source x. The RPF broadcast algorithm we studied earlier requires a bit of tweaking for use in multicast. The solution to the problem of receiving unwanted multicast packets under RPF is known as pruning. A multicast router that receives multicast packets and has no attached hosts joined to that group will send a prune message to its upstream router. If a router receives prune messages from each of its downstream routers, then it can forward a prune message upstream.
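A sketch of the RPF check combined with per-source prune state (link numbers and the reverse-path table are made up):

```python
def rpf_forward(src, arrival_link, rev_path_link, out_links, pruned):
    # Forward a multicast packet from source `src` only if it arrived on
    # this router's least-cost link back to src (the RPF check), and only
    # onto links whose downstream routers have not sent a prune for src.
    if arrival_link != rev_path_link.get(src):
        return []                                  # fails RPF check: drop the copy
    return [l for l in out_links
            if l != arrival_link and l not in pruned.get(src, set())]

# Links [1, 2, 3]; reverse path to source "x" is link 1; the router on
# link 3 has pruned itself from x's tree:
print(rpf_forward("x", 1, {"x": 1}, [1, 2, 3], {"x": {3}}))   # [2]
```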