Quality of Service (QoS) has been a part of the IP protocol since RFC 791 was released in 1981. However, it has not been extensively used until recently. The main reason for using QoS in an IP network is to protect sensitive traffic in congested links. In many cases, the best solution to the problem of congested links is simply to upgrade these links. All you can do with a QoS system is affect which packets are forwarded and which ones are dropped when congestion is encountered. This is only effective when the congestion is intermittent. If a link is just consistently over-utilized, QoS will at best offer a temporary stopgap measure until the link is upgraded or the network is redesigned.
There are several different traffic flow characteristics that you can set out to control with a QoS system. Some applications require a certain minimum bandwidth to operate; others require low latency. Jitter, which is the difference in latency between consecutive packets, has to be carefully constrained for many real-time applications, voice and video in particular. Some applications do not tolerate dropped packets well. Others contain time-sensitive information that is better dropped than delayed.
There are essentially three steps to any traffic prioritization scheme. First, you have to know what your traffic patterns look like. This means you need to understand what traffic is mission critical, what can wait, and which traffic flows are sensitive to jitter or latency or have minimum throughput requirements. Once you know this, the second step is to provide a way to identify the different types of traffic. Usually, in IP QoS you will use this information to tag the Type of Service (TOS) byte in the IP header. This byte contains a six-bit field called the Differentiated Services Code Point (DSCP) in newer literature, and is separated into a three-bit IP Precedence field and a TOS field (either three or four bits) in older literature. These fields are used for the same purpose, although there are differences in their precise meanings. We discuss these fields in more detail in Appendix B.
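The two readings of the TOS byte can be sketched in a few lines of Python. This is only an illustration: the function name `decode_tos` is hypothetical, and the TOS field is assumed to be the four-bit variant that sits directly below the three Precedence bits.

```python
def decode_tos(tos: int) -> dict:
    """Split an 8-bit TOS byte into its newer and older interpretations."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS byte must be in the range 0-255")
    return {
        "dscp": tos >> 2,               # top six bits (newer interpretation)
        "precedence": tos >> 5,         # top three bits (older interpretation)
        "tos_bits": (tos >> 1) & 0x0F,  # next four bits (assumed four-bit TOS field)
    }

print(decode_tos(0xB8))  # {'dscp': 46, 'precedence': 5, 'tos_bits': 12}
```

The same byte yields DSCP 46 under the newer reading and IP Precedence 5 under the older one, which is why the two schemes can coexist on one network as long as the values are chosen with both readings in mind.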
The third step is to configure the network devices to use this information to affect how the traffic is actually forwarded through the network. This is the step in which you actually have the most freedom, because you can decide precisely what you want to do with different traffic types. However, there are two main philosophies here: TOS-based routing and DSCP Per-Hop Behavior.
TOS-based routing basically means that the router selects different paths based on the contents of the TOS field in the IP header. However, the precise TOS behavior is left up to the network engineer, so the TOS values could affect other things such as queuing behavior. DSCP, on the other hand, generally looks at the same set of bits and uses them to decide how to handle the queuing when the links are congested. TOS-based routing is the older technique, and DSCP is newer.
You can easily implement TOS-based routing to select different network paths using Cisco's Policy Based Routing (PBR). For example, some networks use this technique on Frame Relay networks to funnel high-priority traffic into a different PVC than lower priority traffic. And many standard IP protocols, such as FTP and Telnet, have well-defined default TOS settings.
Most engineers prefer the DSCP approach because it is easier to implement and troubleshoot. If high-priority application packets take a different path than low-priority PING packets, as is possible in the TOS approach, it can be extremely confusing to manage the network. DSCP is also usually less demanding of the router's CPU and memory resources, and more consistent with the capabilities of modern routing protocols.
Note, however, that any time you stop a packet to examine it in more detail, you introduce latency and potentially increase the CPU load on the router. And the more fields you examine or change, the greater the impact. For this reason, we want to stress that the best network designs handle traffic prioritization by marking the packets as early as possible. Then other routers in the network only need to look at the DSCP field to handle the packet correctly. In general, you want to keep this marking function at the edges of the network where the traffic load is lowest, rather than in the core, where the routers are too busy forwarding packets to examine and classify them.
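The division of labor between edge and core can be illustrated with a small Python sketch. Everything in it is hypothetical: the policy table, the port numbers, the queue names, and the dictionary standing in for a packet are invented for the example and do not correspond to any router feature.

```python
# Hypothetical edge policy: (protocol, destination port) -> DSCP value.
EDGE_POLICY = {("udp", 16384): 46, ("tcp", 23): 16}

def mark_at_edge(packet: dict) -> dict:
    """Classify once, on several header fields, and record the result in DSCP."""
    key = (packet["proto"], packet["dst_port"])
    return {**packet, "dscp": EDGE_POLICY.get(key, 0)}

def core_queue(packet: dict) -> str:
    """A core router looks only at the DSCP value already in the header."""
    return "priority" if packet["dscp"] == 46 else "default"

voice = mark_at_edge({"proto": "udp", "dst_port": 16384})
bulk = mark_at_edge({"proto": "tcp", "dst_port": 20})
print(core_queue(voice), core_queue(bulk))  # priority default
```

The expensive multi-field lookup happens once, in `mark_at_edge`; every later hop makes its decision from a single field.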
We discuss the IP Precedence, TOS, and DSCP classification schemes in more detail in Appendix B.
Queuing Algorithms
The simplest type of queue transmits packets in the same order that it receives them. This is called a First In First Out (FIFO) queue. Although it might seem to treat all traffic streams equally, it actually tends to favor resource-hungry, ill-behaved applications.
The problem is that if a single application sends a burst that fills a FIFO queue, the router will wind up transmitting most of the queued packets, but will have to drop incoming packets from other applications. If these other applications adapt to the decrease in available bandwidth by sending at a slower rate, then the ill-behaved application will greedily take up the slack and could gradually choke off all of the other applications.
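A minimal Python sketch of this effect follows. It assumes a deliberately simplified model in which a whole burst arrives before the router transmits anything, and the flow names and queue depth are invented for the example.

```python
from collections import deque

def fifo_tail_drop(arrivals, capacity):
    """Enqueue packets in arrival order; drop any that find the queue full."""
    queue, dropped = deque(), []
    for packet in arrivals:
        if len(queue) < capacity:
            queue.append(packet)
        else:
            dropped.append(packet)
    return list(queue), dropped

# A burst of six packets from a greedy flow arrives just ahead of two
# packets from a well-behaved flow; the queue holds only four packets.
arrivals = [("greedy", i) for i in range(6)] + [("polite", i) for i in range(2)]
queued, dropped = fifo_tail_drop(arrivals, capacity=4)
print(queued)   # only greedy packets made it into the queue
print(dropped)  # both polite packets were dropped
```

The queue has no notion of who sent what, so the flow that arrives first and sends most takes every slot.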
Because FIFO queuing allows some data flows to take more than their share of the available bandwidth, it is called unfair. Fair Queuing (FQ) and Weighted Fair Queuing (WFQ) are two of the simpler algorithms that have been developed to deal with this problem. Both of these algorithms sort incoming packets into a series of flows.
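The idea of sorting packets into flows can be sketched as follows. This is a simplified packet-by-packet round-robin, not Cisco's implementation: real fair queuing schedulers also account for packet sizes, and the `weights` argument here is only a crude stand-in for the weighting in WFQ.

```python
from collections import deque

def fair_schedule(arrivals, weights=None):
    """Sort packets into per-flow queues, then serve the flows round-robin.

    weights maps a flow name to the number of packets that flow may send
    per round (default 1).
    """
    weights = weights or {}
    flows = {}
    for flow, payload in arrivals:
        flows.setdefault(flow, deque()).append((flow, payload))
    order = []
    while any(flows.values()):          # loop until every flow queue is empty
        for flow, queue in flows.items():
            for _ in range(weights.get(flow, 1)):
                if queue:
                    order.append(queue.popleft())
    return order

burst = [("greedy", i) for i in range(4)] + [("polite", i) for i in range(2)]
print(fair_schedule(burst))
# [('greedy', 0), ('polite', 0), ('greedy', 1), ('polite', 1), ('greedy', 2), ('greedy', 3)]
```

Even though the polite flow's packets arrived last, they are interleaved with the greedy flow's packets rather than waiting behind the whole burst.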
We discuss Cisco's implementations of different queuing algorithms in Appendix B.
When talking about queuing, it is easy to get wrapped up in relative priorities of data streams. However, it is just as important to think about how your packets should be dropped when there is congestion. Cisco routers also allow you to implement a congestion avoidance system called Random Early Detection (RED), which has a weighted variant, Weighted Random Early Detection (WRED). These algorithms allow the router to start dropping packets before there is a serious congestion problem. This forces well-behaved TCP applications to back off and send their data more slowly, thereby avoiding congestion problems before they start. RED and WRED are also discussed in Appendix B.
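The early-drop idea can be sketched with the drop-probability curve commonly used to describe RED. The parameter names (`min_th`, `max_th`, `max_p`) and the numbers in the hypothetical `WRED_PROFILES` table are assumptions made for illustration, not Cisco defaults.

```python
def red_drop_probability(avg_depth, min_th, max_th, max_p):
    """Probability of dropping an arriving packet, given the average queue depth."""
    if avg_depth < min_th:
        return 0.0                      # queue is short: never drop
    if avg_depth >= max_th:
        return 1.0                      # queue is effectively full: always drop
    # In between, the probability rises linearly from 0 to max_p.
    return max_p * (avg_depth - min_th) / (max_th - min_th)

# Hypothetical WRED profiles: IP Precedence -> (min_th, max_th, max_p).
# Higher-precedence traffic starts being dropped later.
WRED_PROFILES = {0: (20, 40, 0.1), 5: (35, 40, 0.1)}

def wred_drop_probability(avg_depth, precedence):
    return red_drop_probability(avg_depth, *WRED_PROFILES[precedence])

print(wred_drop_probability(30, 0) > 0, wred_drop_probability(30, 5) > 0)  # True False
```

At an average depth of 30 packets, the low-precedence profile has already begun dropping while the high-precedence profile has not, which is the sense in which WRED is "weighted".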
Fast Switching and CEF
One of the most important performance limitations on a router depends on how the packets are processed internally. The worst case is where the router's CPU has to examine every packet to decide how to forward it. Packets that are handled in the CPU like this are said to use Process Switching. It is never possible to completely eliminate process switching in a router because the router has to react to some types of packets, particularly those containing network control information. And, as we will discuss in a moment, process switching is often used to bootstrap other more efficient methods.
For many years, Cisco has included more efficient methods for packet processing in routers. These often involve off-loading the routing decisions to special logic circuits, frequently associated with interface hardware. The actual details of how these circuits work are often not of much interest to the network engineer. The most important thing is to ensure that as many packets as possible use these more efficient methods.
Fast Switching is one of Cisco's earlier mechanisms for off-loading routing from the CPU. In Fast Switching, the router uses process switching to forward the first packet to a particular destination. The CPU looks up the appropriate forwarding information in the routing table, and then sends the packet accordingly. Then, when the router sees subsequent packets for the same destination, it is able to use the same forwarding information. Fast Switching records this forwarding information in an internal cache, and uses it to bypass the laborious route lookup process for all but the first packet in a flow. It works best when there is a relatively long stream of packets to the same destination. And, of course, it is necessary to periodically verify that the same forwarding information is still valid. So Fast Switching requires the router to process switch some packets just to check that the cached path is still the best path.
To allow for reliable load balancing, the Fast Switching cache includes only /32 addresses. This means that there is no network or subnet level summarization in this cache. Whenever the Fast Switching algorithm receives a packet for a destination that is not in its cache, or that it can't handle because of a special filtering feature that isn't supported by Fast Switching, it must punt. This means that the router passes the packet to a more general routing algorithm, usually process switching.
Fast Switching only works with active traffic flows. A new flow will have a destination that is not in the Fast Switching cache. Similarly, low-bandwidth applications that only send one packet at a time, with relatively long periods between packets, will not benefit from Fast Switching. In both of these cases, the router must punt and process-switch the packet. Another more serious example happens in busy Internet routers. These devices have to deal with so many flows that they are unable to cache them all.
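The cache-and-punt behavior described above can be modeled in a few lines of Python. This is a toy model, not Cisco's implementation: the class and method names are hypothetical, cache aging and revalidation are omitted, and the slow path is a plain longest-prefix search.

```python
import ipaddress

class FastSwitchSketch:
    def __init__(self, routes):
        # routes: list of (prefix string, next hop) standing in for the routing table
        self.routes = [(ipaddress.ip_network(p), nh) for p, nh in routes]
        self.cache = {}   # host (/32) address -> next hop; no subnet summarization
        self.punts = 0    # packets handed to the slow path

    def _process_switch(self, dst):
        """Slow path: longest-prefix match over the full routing table."""
        self.punts += 1
        matches = [(net, nh) for net, nh in self.routes if dst in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]

    def forward(self, dst_str):
        dst = ipaddress.ip_address(dst_str)
        if dst in self.cache:                  # fast path: cache hit
            return self.cache[dst]
        next_hop = self._process_switch(dst)   # cache miss: punt
        if next_hop is not None:
            self.cache[dst] = next_hop
        return next_hop

router = FastSwitchSketch([("10.0.0.0/8", "A"), ("10.1.0.0/16", "B")])
for _ in range(3):
    router.forward("10.1.2.3")   # first packet punts, the next two hit the cache
router.forward("10.1.2.4")       # same subnet, new host: punts again
print(router.punts)              # 2
```

Because the cache is keyed on individual host addresses, a neighboring host on the same subnet gets no benefit from the existing entry, which is why many short flows to many destinations defeat this scheme.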
Largely because of this last problem, Cisco developed a more sophisticated system called Cisco Express Forwarding (CEF) that improves on several of the shortcomings of Fast Switching. The main improvement is that instead of just caching active destinations, CEF caches the entire routing table. This increases the amount of memory required, but the routing information is stored in an efficient hashed structure.
The router keeps the cached table synchronized with the main routing table that is acquired through a dynamic routing protocol, such as OSPF or BGP. This means that CEF only needs to punt a packet when it requires features that don't work with CEF. For example, some Policy Based Routing rules do not work with CEF. So when you use them, CEF must still punt and process switch these packets.
In addition to caching the entire routing table, CEF maintains a table of information about all available next-hop devices. This allows the router to build the appropriate Layer 2 framing information for packets that need to be forwarded, without having to consult the system ARP table.
Because CEF rarely needs to punt a packet, even if it is the first packet of a new flow, it is able to operate much more efficiently than Fast Switching. And because it caches the entire routing table, it is even able to do packet-by-packet round-robin load sharing between equal cost paths. CEF shows its greatest advantage over Fast Switching in situations when there are many flows, each relatively short in duration. Another key advantage is that CEF has native support for QoS, while Fast Switching does not.
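For contrast with a destination cache, here is a rough Python sketch of a forwarding table that is built from the whole routing table up front and rotates between equal-cost next hops packet by packet. The names are hypothetical, the linear scan stands in for whatever lookup structure the router really uses, and the next-hop framing table is not modeled.

```python
import ipaddress
from itertools import cycle

class CefSketch:
    def __init__(self, routes):
        # routes: prefix string -> list of equal-cost next hops.
        # The whole table is loaded before any packet arrives,
        # sorted so that the longest prefixes are tried first.
        self.fib = sorted(
            ((ipaddress.ip_network(p), cycle(hops)) for p, hops in routes.items()),
            key=lambda entry: entry[0].prefixlen,
            reverse=True,
        )

    def forward(self, dst_str):
        dst = ipaddress.ip_address(dst_str)
        for net, hops in self.fib:
            if dst in net:
                return next(hops)   # round-robin across equal-cost paths
        return None                 # no route

cef = CefSketch({"10.0.0.0/8": ["A"], "10.1.0.0/16": ["B", "C"]})
print([cef.forward("10.1.2.3") for _ in range(3)])  # ['B', 'C', 'B']
```

The first packet to a new destination is handled exactly like every later one, so there is no per-flow warm-up and nothing to punt on a simple miss.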
Distributed CEF (dCEF) is available on routers that support Versatile Interface Processor (VIP) cards, such as the 7500 series. This allows each VIP card to run CEF individually to further improve scalability.