Graceful Restart

In steady-state operation, OSPF can react to changes in the routing domain and reconverge
quickly. This is one of OSPF’s strengths as an IGP. However, what happens when something goes
really wrong is just as important as how things work under relatively stable conditions.
One of those “really wrong” things is a router that needs to restart its OSPF software process.
The graceful restart process documented in RFC 3623 exists to prevent the routing problems,
including loops, that can occur when an OSPF router suddenly goes away while its OSPF software
is restarting. Cisco implemented its own version of graceful restart in Cisco IOS prior to
RFC 3623; as a result, Cisco IOS supports both versions.
Graceful restart is also known as nonstop forwarding (NSF) in RFC 3623 because of the way it
works. Graceful restart takes advantage of the fact that modern router architectures use separate
routing and forwarding planes. It is possible to continue forwarding without loops while the
routing process restarts, assuming the following conditions are true:
■ The router whose OSPF process is restarting must notify its neighbors that the restart is going
to take place by sending a “grace LSA.”
■ The LSA database remains stable during the restart.
■ All of the neighbors support, and are configured for, graceful restart.
■ The restart takes place within a specific “grace period.”
■ During restart, the neighboring fully adjacent routers must operate in “helper mode.”
In Cisco IOS, CEF handles forwarding during graceful restart while OSPF rebuilds the RIB tables,
provided that the preceding conditions are met. Support for both Cisco and IETF NSF is enabled by
default in Cisco IOS, beginning with version 12.4(6)T. Disabling this support requires a separate
routing process command for each NSF version: nsf [cisco | ietf] helper disable.
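The conditions listed above can be read as a single all-or-nothing check. The following Python sketch models that reading only; the RestartAttempt class, its field names, and the timer values are hypothetical illustrations, not structures from RFC 3623 or Cisco IOS.

```python
from dataclasses import dataclass


@dataclass
class RestartAttempt:
    """Hypothetical record of one restart; field names are illustrative only."""
    grace_lsa_sent: bool            # restarting router announced the restart
    lsdb_stable: bool               # no LSA database change during the restart
    neighbors_support_gr: bool      # all neighbors support and are configured for it
    neighbors_in_helper_mode: bool  # fully adjacent neighbors act as helpers
    restart_seconds: int            # time the OSPF process took to come back
    grace_period_seconds: int       # grace period in effect (arbitrary value here)


def forwarding_can_continue(attempt: RestartAttempt) -> bool:
    """Return True only when every condition from the list holds."""
    return (
        attempt.grace_lsa_sent
        and attempt.lsdb_stable
        and attempt.neighbors_support_gr
        and attempt.neighbors_in_helper_mode
        and attempt.restart_seconds <= attempt.grace_period_seconds
    )


# Example values are made up for illustration.
clean = RestartAttempt(True, True, True, True, 40, 120)
too_slow = RestartAttempt(True, True, True, True, 200, 120)
print(forwarding_can_continue(clean))     # True
print(forwarding_can_continue(too_slow))  # False
```

A single failed condition, such as a restart that outlasts the grace period, is enough to make the check fail in this model.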
OSPF Path Choices That Do Not Use Cost
Under most circumstances, when an OSPF router runs the SPF algorithm and finds more than one
possible route to reach a particular subnet, the router chooses the route with the least cost.
However, OSPF does consider a few conditions other than cost when making this best-path
decision. This short section explains the remaining factors that impact which route, or path, is
considered best by the SPF algorithm.
Choosing the Best Type of Path
As mentioned earlier, some routes are considered to be intra-area routes, some are interarea routes,
and some are one of two types of external routes (E1 and E2). It is possible for a router to find
multiple routes to reach a given subnet where the type of route (intra-area, interarea, E1, or E2) is
different. In these cases, RFC 2328 specifies that the router should ignore the costs and instead
choose the best route based on the following order of preference:
1. Intra-area routes
2. Interarea routes
3. E1 routes
4. E2 routes
For example, if a router using OSPF finds one intra-area route for subnet 1 and one interarea route
to reach that same subnet, the router ignores the costs and simply chooses the intra-area route.
Similarly, if a router finds one interarea route, one E1 route, and one E2 route to reach the same
subnet, that router chooses the interarea route, again regardless of the cost for each route.
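The order of preference above can be sketched as a two-part sort key: route type first, cost second. The Route class, the TYPE_RANK table, and the sample routes below are hypothetical names and values chosen for illustration. Using cost as the tie-breaker among routes of the same type is a simplification made for this sketch.

```python
from dataclasses import dataclass

# Lower rank is preferred; the order follows the list above.
TYPE_RANK = {"intra-area": 0, "interarea": 1, "E1": 2, "E2": 3}


@dataclass(frozen=True)
class Route:
    """Hypothetical candidate route to one subnet."""
    route_type: str
    cost: int
    next_hop: str


def best_route(candidates):
    """Pick by route type first; cost only breaks ties within a type."""
    return min(candidates, key=lambda r: (TYPE_RANK[r.route_type], r.cost))


# Illustrative candidates: the interarea route has the highest cost.
candidates = [
    Route("interarea", 50, "R3"),
    Route("E1", 5, "R4"),
    Route("E2", 1, "R5"),
]
print(best_route(candidates).route_type)  # interarea
```

The interarea route wins even though both external routes have a lower cost, which matches the second example in the text.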
Best-Path Side Effects of ABR Loop Prevention
The other item that affects OSPF best-path selection relates to some OSPF loop-avoidance
features. Inside an area, OSPF uses Link State logic, but between areas OSPF acts much like a
Distance Vector (DV) protocol in some respects. For example, the advertisement of a Type 3 LSA
from one area to another hides the topology in the original area from the second area, just listing
a destination subnet, metric (cost), and the ABR through which the subnet can be reached—all DV
concepts.
OSPF does not use all the traditional DV loop avoidance features, but it does use some of the same
underlying concepts, including Split Horizon. In OSPF’s case, it applies Split Horizon for several
types of LSAs so that an LSA is not advertised into one nonbackbone area and then advertised
back into the backbone area. Figure 9-10 shows an example in which ABR1 and ABR2 both
advertise Type 3 LSAs into area 1, but then they both choose to not forward a Type 3 LSA back
into area 0.
Figure 9-10 Split Horizon per Area with OSPF
(Figure labels: ABR1, ABR2, ABR3, R1, R2; Subnet 1; Area 0, Area 1, Area 2; link costs of 1, with one link at cost 100; arrows for Type 3 LSAs.)
The figure shows the propagation of some, but not all, of the LSAs for subnet 1. ABR3
generates a T3 LSA for subnet 1 and floods that LSA within area 0. ABR1 floods a T3 LSA for
subnet 1 into area 1; however, when ABR2 gets that T3 LSA from ABR1, ABR2 does not flood a
T3 LSA back into area 0. (To reduce clutter, the figure does not include arrowed lines for the
opposite direction, in which ABR2 floods a T3 LSA into area 1, and then ABR1 chooses not to
flood a T3 LSA back into area 0.)
More generally, an ABR can learn about summary LSAs from other ABRs, inside the
nonbackbone area, but the ABR will not then advertise another LSA back into area 0 for that
subnet.
Although interesting, none of these facts impacts OSPF path selection. The second part of ABR
loop prevention is the part that impacts path selection, as follows:
ABRs ignore LSAs created by other ABRs, when learned through a nonbackbone area,
when calculating least-cost paths. This prevents an ABR from choosing a path that goes
into one nonbackbone area and then back into area 0 through some other ABR.
For example, without this rule, in the internetwork of Figure 9-11, router ABR2 would calculate
a cost 3 path to subnet 1: from ABR2 to ABR1 inside area 1 and then from ABR1 to ABR3 in area
0. ABR2 would also calculate a cost 101 path to subnet 1, going from ABR2 through area 0 to
ABR3. Clearly, the first of these two paths, with cost 3, is the least-cost path. However, ABRs use
this additional loop-prevention rule, meaning that ABR2 ignores the T3 LSA advertised by ABR1
for subnet 1. This behavior prevents ABR2 from choosing the path through ABR1, so in actual
practice, ABR2 would find only one possible path to subnet 1: the path directly from ABR2 to ABR3.
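The effect of the rule on ABR2 can be sketched by comparing the two T3 LSAs it sees for subnet 1, with and without the filter. The SummaryLsa class and best_summary function are hypothetical names for this illustration. The costs come from the example: ABR1 advertises cost 2 and is 1 away inside area 1, while ABR3 is assumed to advertise cost 1 and is 100 away in area 0, which yields the cost 3 and cost 101 paths described above.

```python
from dataclasses import dataclass

BACKBONE = 0


@dataclass(frozen=True)
class SummaryLsa:
    """Hypothetical view of one T3 LSA as seen by the calculating router."""
    advertising_abr: str
    learned_in_area: int
    advertised_cost: int  # cost listed in the LSA
    cost_to_abr: int      # calculating router's cost to reach that ABR

    @property
    def total_cost(self) -> int:
        return self.advertised_cost + self.cost_to_abr


def best_summary(lsas, is_abr, apply_rule=True):
    """Least-cost LSA; an ABR skips LSAs learned through a nonbackbone area."""
    usable = [
        lsa for lsa in lsas
        if not (apply_rule and is_abr and lsa.learned_in_area != BACKBONE)
    ]
    return min(usable, key=lambda lsa: lsa.total_cost)


# ABR2's two options for subnet 1, using the costs from the example.
abr2_view = [
    SummaryLsa("ABR1", learned_in_area=1, advertised_cost=2, cost_to_abr=1),
    SummaryLsa("ABR3", learned_in_area=0, advertised_cost=1, cost_to_abr=100),
]
print(best_summary(abr2_view, is_abr=True, apply_rule=False).total_cost)  # 3
print(best_summary(abr2_view, is_abr=True).total_cost)                    # 101
```

With the rule applied, only the LSA learned in area 0 survives the filter, so ABR2 is left with the cost 101 path.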
Figure 9-11 Effect of ABR2 Ignoring Path to Subnet 1 Through Area 1
(Figure labels: ABR1, ABR2, ABR3, R1, R2; Subnet 1; Area 0, Area 1, Area 2; link costs of 1, with one link at cost 100; a cost 3 path and a cost 101 path.)
It is important to notice that the link between ABR1 and ABR2 is squarely inside nonbackbone
area 1. If this link were in area 0, ABR2 would choose the lower-cost route ABR2-ABR1-ABR3
as its best route to reach ABR3.
This loop-prevention rule has some even more interesting side effects for internal routers. Again
in Figure 9-10, consider the routes calculated by internal router R2 to reach subnet 1. R2 learns a
T3 LSA for subnet 1 from ABR1, with the cost listed as 2. To calculate the total cost for using
ABR1 to reach subnet 1, R2 adds its cost to reach ABR1 (cost 2), totaling cost 4. Likewise, R2
learns a T3 LSA for subnet 1 from ABR2, with cost 101. R2 calculates its cost to reach ABR2 (cost
1) and adds that to 101 to arrive at cost 102 for this alternative route. As a result, R2 picks the route
through ABR1 as the best route.
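R2's arithmetic can be written out as a short worked example: for each ABR, add R2's own cost to reach the ABR to the cost listed in that ABR's T3 LSA. The function and variable names are hypothetical; the numbers are the ones given in the paragraph above.

```python
def total_cost(cost_to_abr: int, lsa_cost: int) -> int:
    """Cost to the subnet through one ABR: cost to the ABR plus the LSA's cost."""
    return cost_to_abr + lsa_cost


# R2's two options for subnet 1, using the costs from the text.
r2_options = {
    "ABR1": total_cost(cost_to_abr=2, lsa_cost=2),    # 4
    "ABR2": total_cost(cost_to_abr=1, lsa_cost=101),  # 102
}
best_abr = min(r2_options, key=r2_options.get)
print(r2_options)  # {'ABR1': 4, 'ABR2': 102}
print(best_abr)    # ABR1
```

As an internal router, R2 is not subject to the ABR-only rule, so it considers both LSAs and simply takes the lower total.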
However, the story gets even more interesting with the topology in Figure 9-10. R2’s next-hop
router for the R2-ABR2-ABR1-ABR3 path is ABR2, so R2 forwards packets destined to
subnet 1 to ABR2. However, as noted just a few paragraphs ago, ABR2’s route to reach subnet
1 points directly to ABR3. As a result, packets sent by R2, destined to subnet 1, actually take the
path R2-ABR2-ABR3. As you can see, these decisions can result in arguably suboptimal
routes, and even asymmetric routes, as would be the case in this particular example.
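Because each router forwards according to its own best route, the path a packet actually takes can be traced hop by hop. The sketch below does that with a hypothetical next-hop table for subnet 1, built from the routing decisions described in this section. Treating ABR3 as the last hop before subnet 1 is an assumption of the sketch.

```python
# Hypothetical per-router next hop toward subnet 1, from the decisions above.
# None marks the router treated here as the final hop before the subnet.
NEXT_HOP = {
    "R2": "ABR2",    # R2's best route runs through ABR2 toward ABR1
    "ABR2": "ABR3",  # ABR2 ignores ABR1's T3 LSA, so it points to ABR3
    "ABR1": "ABR3",
    "ABR3": None,
}


def trace(start, next_hop, max_hops=16):
    """Follow each router's own next hop until the final router is reached."""
    path = [start]
    while next_hop[path[-1]] is not None:
        if len(path) > max_hops:
            raise RuntimeError("possible forwarding loop")
        path.append(next_hop[path[-1]])
    return path


expected_by_r2 = ["R2", "ABR2", "ABR1", "ABR3"]  # path R2 calculated at cost 4
actual = trace("R2", NEXT_HOP)
print(actual)                    # ['R2', 'ABR2', 'ABR3']
print(actual == expected_by_r2)  # False
```

The trace diverges at ABR2, which is the point where the path R2 calculated and the path packets really follow stop matching.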