The investigation gave you three leads about the source of the problem. Now it’s a matter of
checking out each possibility and determining which one is most likely the source of the issue.
The majority of the possibilities point directly at the host machine, so start there. The first
two causes are host configuration issues. Now, assume that you’ve checked the TCP/IP configuration
on the host and everything is configured properly. You can eliminate the host machine
as the culprit.
You then move on to the remaining possible cause, which is an access list on the router. While
looking at the configuration on the router, you see that an access list is applied to the Ethernet
interface directly connected to the host segment. After reviewing the syntax of the access list, you
determine that it is the cause of the failure.
Great—you’ve found the problem. Now what? Once you find the problem, you must decide
what is needed to fix it. In this case, it is an access-list problem, so there are some special considerations
about how to restore functionality. You must be careful in your actions here, because that
access list may contain other entries that provide security or other network administrative functionality.
You can’t just remove the list—you could cause new problems as you fix the original one.
The best thing to do in this situation is to make a copy of the access list in a text editor,
and then make changes that are specific to your problem. When editing the access list, change
its number. After all of the changes are made in your text editor, ensure that you have a current
backup of the configuration on the router in case you need to restore the original configuration.
Then paste the modified access list back into the router. Finally, go to the interface
and apply the new access list. By following this procedure, the access list is never removed
from the interface.
Obviously, you have now changed the access-list number that is applied to the interface, so
any documentation that refers to the original number will need to be updated. If the access list
that was causing the problem was applied only to Ethernet 0, you can now safely remove
the old list, update this list with the corrections to address your problem, and put it back on the
router. Then reapply this list to Ethernet 0. As was the case before, the access list is never
removed from the interface.
When you are going through the troubleshooting methodology, it is important that you
don’t fix one problem and cause another. Before implementing any changes, think it through or
discuss it with coworkers to pick it apart, and make sure that your solution will fix the problem
without doing anything to create adverse side effects.
Another good practice when implementing changes is to change only one thing at a time, if
possible. If multiple changes must be made, it is best to make the changes in small sets. This way
it is easier to keep track of what was done, what worked, and what didn’t. Observing the effects
of a change becomes much more effective if only a single change is made at a time. There is nothing
worse than troubleshooting your self-induced errors in addition to the original difficulties!
To summarize, follow these practices and guidelines to making changes:
Make one change or a set of related changes at a time, and then observe the results.
Make non-impacting changes—this means trying not to cause other problems while implementing
the changes. The more transparent the change, the better.
Do not create security holes when changing access lists, TACACS+, RADIUS, or other
security-oriented configurations.
Most importantly, make sure you can revert to the original configuration if unforeseen problems
occur as a result of the change. Always have a backup or copy of the configuration.
In the preceding paragraphs, there were references to observing the results of the changes.
Observing results consists of using the exact same methods and commands that were used to
obtain information to gather symptoms—to see whether the changes you implemented had the
results you want. By making a change and then testing its effectiveness, you move toward the
correct solution.
It may take one or more changes to fix the problem, but you should observe each change separately
to monitor progress and to make sure that the alteration doesn’t create any adverse
effects. After the first change is made, you should be able to gather enough information to learn
whether or not the modification was effective, even if it doesn’t entirely solve the problem.
If the changes made have corrected the problem, move on and document the modifications
that were made to the network. If the changes did not work, you need to go back and either
gather more information or try another one of the potential issues that you identified while
isolating the problem.
Iterations—repetitions of certain steps within the troubleshooting model—are simply ways
of whittling away at a larger problem. By implementing changes and monitoring the results, you
can move toward solving the overall problem.
Iterations of the troubleshooting process allow you to focus with more and more detail on
the possible causes of the failure. The result of focusing on the problem is your ability to identify
more specific possibilities for the failure.
The iteration process has its own set of steps: While working through the process, you might
get more ideas of possible sources of the trouble. Write them down; if the current changes do
not work, you have notes about some other options. If you feel that you have exhausted all of
the possible causes, you should probably go back and gather more information. You will probably
find additional clues.
This is also the time to undo any changes that had adverse effects or that did not fix the problem.
Make sure to document what was done, so it will be easier to undo the any configuration
modifications.
Document the Changes
The network problem has been officially resolved after you’ve implemented a change, observed
that the symptoms have disappeared, and can successfully execute the tests that were used to aid
in gathering information about the problem. In this example, the way to verify that the problem
is solved is for Host A to try to ftp to Host Z. If this test is successful, then the problem is resolved.
In the previous sections, we have emphasized that documentation is an integral part of troubleshooting.
When you keep track of the alterations that were made, the routers, switches, or hosts that
were changed, and when the changes occurred, you have valuable information for future reference.
There is always the possibility that something you changed might have affected something else and
you didn’t notice it. If this happens, you will have documentation to refer to, so you can undo the
changes. Or if a similar problem occurs in the future, you can refer to these documents to resolve the
new problem, based on what was done the last time. Later chapters in this book will give you more
information about documentation and establishing baseline information.