TROUBLESHOOTING A NETWORK
Troubleshooting
Troubleshooting is perhaps the most difficult task that computer professionals face.
Added to the need to get to the bottom of a problem afflicting the network is the pressure
to do so as quickly as possible. Computers never seem to fail at a convenient time.
Failures occur in the middle of a job or when there are deadlines, and pressures to fix the
problem immediately are intense.
After a problem has been diagnosed, locating resources and following the procedures
required to correct the problem are straightforward. But before that diagnosis occurs, it
is essential to isolate the true cause of the problem from irrelevant factors.
Troubleshooting is more of an art form than an exact science. However, to be efficient
and effective as a troubleshooter, you must approach the problem in an organized and
methodical manner. Remember that you are looking for the cause, not its symptoms;
frequently, the problem as originally reported is only a symptom and not the true cause.
As a troubleshooter you need to learn to quickly and confidently eliminate as many
alternative causes as possible. This will allow you to focus on the things that might be the
cause of the problem. To do this, you must take a systematic approach.
The process of troubleshooting a computer network problem can be divided into five
steps.
Step 1: Defining the Problem
The first phase is the most critical, yet most often ignored. Without a complete
understanding of the entire problem, you can spend a great deal of time working on the
symptoms, without getting to the cause. The only tools required for this phase are a pad
of paper, a pen (or pencil), and good listening skills.
Listening to the client or network user is your best source of information. Remember
that while you might know how the network functions and be able to find the technical
cause of the failure, those operating the network on a daily basis were there before and
after the problem started and probably recall the events that led up to the failure. By
drawing on their experience with the problem, you can get a head start on narrowing
down the possible causes. To help identify the problem, list the sequence of events, as
they occurred, before the failure. You might want to create a form with the questions
below (and others specific to the situation) to help organize your notes.
Some general questions to ask might include:
When did you first notice the problem or error?
Has the computer recently been moved?
Have there been any recent software or hardware changes?
Has anything happened to the workstation? Was it dropped or was something
dropped on it? Were coffee or soda spilled on the keyboard?
When exactly does the problem or error occur? During the startup process? After
lunch? Only on Monday mornings? After sending an e-mail message?
Can you reproduce the problem or error? If so, how do you reproduce the
problem?
What does the problem or error look like?
Describe any changes in the computer (such as noises, screen changes, disk
activity lights).
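The form suggested above could be kept on paper, but it can also be sketched as a small data structure. The following is a minimal sketch only: the names InterviewForm, QUESTIONS, record, and unanswered are hypothetical and chosen for illustration, and the question wording is condensed from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Questions condensed from the list above; add others specific to the situation.
QUESTIONS: List[str] = [
    "When did you first notice the problem or error?",
    "Has the computer recently been moved?",
    "Have there been any recent software or hardware changes?",
    "Has anything happened to the workstation?",
    "When exactly does the problem or error occur?",
    "Can you reproduce the problem or error? If so, how?",
    "What does the problem or error look like?",
    "Describe any changes in the computer.",
]


@dataclass
class InterviewForm:
    """Notes from one user interview (hypothetical layout, not a standard)."""
    user: str
    answers: Dict[str, str] = field(default_factory=dict)

    def record(self, question: str, answer: str) -> None:
        """Store the user's answer to one of the known questions."""
        if question not in QUESTIONS:
            raise ValueError(f"Unknown question: {question}")
        self.answers[question] = answer.strip()

    def unanswered(self) -> List[str]:
        """Return the questions that have no answer yet, in list order."""
        return [q for q in QUESTIONS if not self.answers.get(q)]
```

Calling unanswered() before ending the interview shows which questions still need to be asked.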
Users—even those with little or no technical background—can be helpful in collecting
information if they are questioned effectively. Ask users what the network is doing or not
doing that makes them think it's not functioning correctly. User observations that can be
clues to the underlying cause of a network problem include the following:
"The network is really slow."
"I cannot connect to the server."
"I was connected to the server but I lost the connection."
"One of my applications will not run."
"I cannot print."
As you continue to ask questions, you can begin to narrow your focus, as the following
list illustrates:
Are all users affected or only one?
If only one user has a problem, the user's workstation is probably the cause.
Are the symptoms constant or intermittent?
Intermittent symptoms are often a sign of failing hardware.
Did the problem exist before an operating system upgrade?
Any change in operating system software can cause new problems.
Does the problem appear with all applications or only one?
If only one application causes problems, focus on the application.
Is this problem similar to a previous problem?
If a similar problem occurred in the past, there might be a documented solution.
Are there new users on the network?
Increased traffic can cause logon and processing delays.
Is there new equipment on the network?
Check to verify that new network equipment has been correctly configured.
Was a new application installed before the problem occurred?
Installation and training issues can cause application problems.
Has any of the equipment been moved recently?
The moved equipment might not be connected to the network.
Which manufacturers' products are involved?
Some vendors offer telephone, online, or onsite support.
Is there a history of incompatibility among certain vendors and certain
components such as cards, hubs, disk drives, software, or network operating
software?
There might be a documented solution on the vendor's Web site.
Has anyone else attempted to solve this problem?
Check for documented repairs and ask coworkers about attempted repairs.
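The question-and-hint pairs above amount to a simple lookup: each observation points to a place to look first. A minimal sketch of that idea follows. The short keys (such as "single_user") and the function name suggest are hypothetical labels introduced for illustration; the hint text is paraphrased from the list above, and only part of the list is shown.

```python
from typing import Dict, List

# Hints paraphrased from the list above; the keys are hypothetical labels.
HINTS: Dict[str, str] = {
    "single_user": "The user's workstation is probably the cause.",
    "intermittent": "Intermittent symptoms can indicate failing hardware.",
    "os_upgrade": "A change in operating system software can cause new problems.",
    "single_application": "Focus on the application.",
    "seen_before": "There might be a documented solution.",
    "new_users": "Increased traffic can cause logon and processing delays.",
    "new_equipment": "Verify that the new equipment is correctly configured.",
    "new_application": "Installation and training issues can cause problems.",
    "equipment_moved": "The moved equipment might not be connected to the network.",
}


def suggest(observations: List[str]) -> List[str]:
    """Return the hints matching the observations, in the order given.

    Observations with no matching hint are skipped rather than guessed at.
    """
    return [HINTS[o] for o in observations if o in HINTS]
```

For example, suggest(["single_user", "equipment_moved"]) returns the two hints for a single affected workstation that was recently moved. The hints narrow the search; they do not replace observing the failure.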
Step 2: Isolating the Cause
The next step is to isolate the problem. Begin by eliminating the most obvious problems
and work toward the more complex and obscure. Your purpose is to narrow your search
down to one or two general categories.
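Working from the obvious toward the obscure can be sketched as an ordered list of checks that stops at the first one that fails. This is an illustration under stated assumptions, not a prescribed procedure: the function name first_failing and the example check names are hypothetical, and each check here is a placeholder that you would replace with a real test.

```python
from typing import Callable, List, Optional, Tuple

# A check is a description plus a function that returns True when it passes.
Check = Tuple[str, Callable[[], bool]]


def first_failing(checks: List[Check]) -> Optional[str]:
    """Run the checks in order, simplest first.

    Returns the description of the first check that fails,
    or None if every check passes.
    """
    for description, passes in checks:
        if not passes():
            return description
    return None


# Placeholder checks, ordered from most obvious to more obscure.
example_checks: List[Check] = [
    ("power is on", lambda: True),
    ("network cable is connected", lambda: False),
    ("network card is configured", lambda: True),
]
```

With the placeholder values above, first_failing(example_checks) stops at the cable check and never reaches the configuration check, which is the point of ordering the checks by how easy they are to eliminate.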
Be sure to observe the failure yourself. If possible, have someone demonstrate the failure
to you. If it is an operator-induced problem, it is important to observe how it is created, as
well as the results.
The most difficult problems to isolate are those that are intermittent and never
seem to occur when you are present. The only way to resolve these is to re-create the set
of circumstances that causes the failure. Sometimes, eliminating causes that are not the
problem is the best you can do. This process takes time and patience. The user also needs
to keep detailed records of what is being done before and when the failure occurs. It can
help to tell the user to refrain from doing anything with the computer when the problem
recurs, except to call you. That way, the "evidence" won't be disturbed.
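The user's records become more useful once they are tallied, because a pattern such as "only on Monday mornings" stands out. The sketch below assumes the user has written down the date and time of each failure; the function name failure_pattern is hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def failure_pattern(times: Iterable[datetime]) -> List[Tuple[Tuple[str, int], int]]:
    """Count failures per (weekday, hour of day), most frequent first."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter((WEEKDAYS[t.weekday()], t.hour) for t in times)
    return counts.most_common()
```

If the most frequent entry accounts for most of the recorded failures, look for whatever else happens on the network at that time.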