|Contents||Bulletin||Scripting in shell and Perl||Network troubleshooting||History||Humor|
|Softpanorama Networking Links||Recommended Books||Recommended Links||TCP/Protocol layers||OSI Protocol Layers||Network Troubleshooting Tools||Troubleshooting Solaris Network Problems|
|NTP||Troubleshooting NTP on Solaris||Trobleshooting NTP on Red Hat Linux||Solaris Inetd Services||Solaris Multipathing||Linux Multipath||Troubleshooting TCP/IP Communication Issues||TCP Performance Tuning|
|Classic Net tools||Ftp||Telnet||ssh||HTTP Protocol||Apache troubleshooting||Network Security|
|Linux network configuration||Solaris network configuration||Ethernet||ARP||ICMP||Routing||NAT||Firewalls|
|DHCP||NIS||Troubleshooting NFS Problems||DNS Troubleshooting||Postfix Troubleshooting||Postfix Connection Refused Problem||Samba||LDAP|
|RPC||ICMP Tools||Nmap||ntop||ngrep||rsync||Network IDS||Intrusion Detection|
Network is now the most important parts of enterprise communication infrastructure -- the nerve system of modern enterprise. Network outages often lead to paralysis of organization as people are unable to find alternative ways of performing of their activities. As a network administrator, your primary concern is maintaining connectivity of all devices (a process often called fault management). It is also important continually evaluate your network's performance: serious networking problems sometimes begin as performance problems. Paying proper attention to performance can help you address issues before they become serious.
Network troubleshooting means recognizing and diagnosing correctly problems in production network working under considerable stress and pressure.
Note: the structure on this note was borrowed form old Solaris 8 sysadmin course but the content is different and original...
Like in any investigation you need to avoid jumping to conclusion and calmly collect all relevant facts. You can use famous "How to solve it " approach. Among more network specific issues
In general, there is no one correct way to determine the root cause of a networking problem. Like any troubleshooting of complex systems this is more art then science and the success depends both on your IQ and the level of experience with the environment.
|In general, there is no one correct way to determine the root cause of a networking problem. Like any troubleshooting of complex systems this is more art then science and the success depends both on your IQ and the level of experience with the environment.|
However, most networking problems are repetitive and there are a several heuristics that you can follow:
Start searching by quoting a part or the entire error message exactly but without server specific information and searching for it. Be sure to put the search string in quotes. If it's a common problem, there's a good chance you will get some hits. Anything that looks like a FAQ is a good start; mailing list archives can also been a good source of information. Just be sure to check the archive indexes for other messages in the discussion.
If you are using a commercial product a good idea is to search vendor knowledgebase using your
vendor account. Sometimes they have useful documents for assisting in particular case in their knowledgebase.
Following is a list of some common problems that occur:
When troubleshooting networks, some people prefer to think in layers, similar to the TCP/IP Model while others prefer to think in terms of functionality. Using the TCP/IP Model layered approach, you could start at either the Physical or Application layer. Start at either end of the model and test, draw conclusions, move to the next layer and so on.
A user complains that an application is not functioning. Assuming the application has everything that it needs, such as disk space, name servers, and the like, determine if the Application layer is functional by using another system.
Application layer programs often have diagnostic capabilities and may report that a remote system is not available. Use the tcpdump or Wireshark command to determine if the application program is receiving and sending the expected data.
These two layers can be bundled together for the purposes of troubleshooting. Determine if the systems
can communicate with each other. Look for ICMP messages that can provide clues as to where the problem
lies. Could this be a router or switching problem? Are the protocols (TFTP, BOOTP) being routed? Are
you attempting to use protocols that cannot be routed? Are the hostnames being translated to the correct
IP addresses? Are the correct netmask and broadcast addresses being used? Tests between the client and
server can include
using ping, traceroute, arp, and snoop.
Use snoop to determine if the network interface is actually functioning. Use the arp command to determine if the arp cache has the expected Ethernet or MAC address. Fourth generation hubs and some switches can be configured to block certain MAC addresses.
When troubleshooting connectivity problems here are some useful questions:
Check that the link status LED is lit. Test the connection with your laptop or with a known working cable. The link LED will be lit even if the transmit line is damaged. Verify that a mdi-x connection or crossover cable is being used if connecting hub to hub. See Faulty Cable for details
Here are pages devoted to troubleshooting sorted in reverse order for frequency of visits:
July 15, 2013 | InfoWorld
Born or made, network admins share certain defining characteristics. Here are but nine
A few years ago, I wrote a somewhat tongue-in-cheek piece detailing nine traits of the veteran Unix admin. It enjoyed quite a reception and sparked all kinds of debate across the Internet, with people discussing each trait point by point and sparking skirmishes between rival factions. Since then, I've thought about giving network admins the same treatment, but never got around to it. It seems that this is the week. Here are a few of the many traits of the veteran network admin.
Veteran network admin trait No. 1: We already know it's down
Few things are more annoying than having your phone blow up with automated alert messages from your monitoring systems, scrambling to dig into the issue, only to be continually bombarded with humans texting/talking/emailing/calling with the same "Is x down?" question, or even worse, "The network's down!" If the outage is significant, we already know about it, and we are trying to work on it as fast as we possibly can. Continued attempts to deliver elderly information will only impede that effort.
Veteran network admin trait No. 2: If we don't know it's down, it's probably not down
Conversely, if we get a message claiming, "The network's down!" yet we have not been notified by any monitoring system, then the problem is almost certainly the complaining user in question. To users, if there is any resource that cannot be contacted, whether that resource is internal to the network, on the Internet, or perhaps orbiting the earth, that means the network is "down." This apparently includes 404 errors from shady websites, mistyped URLs, or the lack of any sort of network connection on the user's laptop. Nothing is more grating to a network admin than someone claiming the network is "down." No, it isn't -- reboot your laptop.
Veteran network admin trait No. 3: We will ping and test several times before digging into the problem
If we begin looking into a problem, especially across a WAN or long-haul link with several providers in the middle, we will reserve judgment for the first several minutes. This is because these connections are subject to the vagaries of their path, and connectivity problems can come and go like ghosts in the night. A fiber WAN link that was stable a minute ago but is now exhibiting 30 percent packet loss will more than likely fix itself in short order. Only after a probationary period will we start digging deeper into the issue.
Veteran network admin trait No. 4: Believe it or not, we've tried turning it off and back on again
Many times, we will "fix" a problem by turning an interface off and back on again. In fact, this may be the first thing we try when troubleshooting an issue. Whether it's a problem with auto-negotiation, a sketchy cable, or sunspots, you'd be surprised at how often dropping and restarting an interface will restore proper operation. While networking may be a science, it's not without its white whale.
Veteran network admin trait No. 5: During an outage, we're not just staring at the screen -- we're following a path in our heads
If you come across a network admin working on a problem, looking at a routing table with a 1,000-yard stare, he's not experiencing performance anxiety or somehow "lost." He's running through dozens of possible scenarios in his head, following the routing and switching paths, and calculating possible problem scenarios. The uninitiated can't quite grok the way a network admin's mind works, but a parallel might be to imagine an intangible maze, then try to solve it. Somewhere in that maze lies a dead end that shouldn't be there. We're looking for that -- then we can take steps to open up the path again.
Veteran network admin trait No. 6: We calculate subnet masks and CIDR as easily as breathing
I think it's safe to say that perhaps outside of the common Class C netmask, the overwhelming majority of humans do not understand subnetting, or CIDR. For network admins, this is as intrinsic and involuntary to our brain as breathing. It's not just knowing that a /28 is a netmask of 255.255.255.240, and that it describes 16 addresses, 14 usable, or where subnet boundaries lie. It's the ability to collapse large numbers of smaller subnets into larger descriptors in order to reduce routing table sizes, ACL applications, and a wide variety of other internal networking tasks. When we see contiguous networks sliced into smaller chunks in an ACL, yet with identical rules applied, we get agita. A /19 is a much better idea than a collection of /24s. The same applies to wildmasks.
Veteran network admin trait No. 7: We do not tolerate bugs; they are of the devil
On occasion, conventional troubleshooting or building new networks run into an unexplainable blocking issue. After poring over configurations, sketching out connections, routes, and forwarding tables, and running debugs, one is brought no closer to solving the problem. This is the unholy area of networking inhabited by the software bug. Network admins think of switching and routing software bugs as personal attacks, and they will usually excoriate a vendor when one is discovered. This is because before the determination is made that the problem is due to a bug, nothing makes sense whatsoever. It completely violates years of experience and knowledge, throws waste to logic, and causes immense amounts of stress and turmoil. You might think of it as if you spontaneously transmogrified into a difference species. Everything you've ever known suddenly does not apply, yet here you are.
Veteran network admin trait No. 8: We can read live packet streams and write highly complex filters in our sleep
Few items in networking and computing are even remotely similar to how they're portrayed on TV and in movies. The terminal window with text rapidly scrolling past, however, is one of them. In the sys admin world, that's usually a log file tail. In the networking world, it's usually a packet dump. Depending on what we're trying to do, we may quickly call up a packet capture on a live circuit and watch packets fly by in real time. It's not gibberish. We're looking for telltale signs, and we'll winnow down our capture using filters until we've found what we're looking for. Speed is usually of the essence in these cases, so we're well versed in BPF filtering syntax. Also, we tend to notice really strange things like IP headers that "just don't look right" and other oddities that may as well be alphabet soup to most people. Some people speak Klingon. We speak IP.
Veteran network admin trait No. 9: We take big risks all the time
Network admins tend to work on many remote devices. Unlike server admins who can pull up a console if the server is otherwise inaccessible via the network, we usually have no such luxury. This means that changes we make to certain devices carry with them the ever-present threat of the loss of connection. Basically, we're always a missed keystroke or two away from making a problem worse, or causing a big problem where none previously existed.
While making a change to a remote switch and applying that change to port 34 instead of port 35, say, we could inadvertently cause a complete lack of connectivity to the remote site, taking who knows what offline until the device is power-cycled or we -- or worse, someone else -- drive over to fix it manually. Ideally, this shouldn't happen in the middle of the night, but given the usual timing of network maintenance windows, it often does. Oh, and we sometimes have to make changes to remote devices knowing that if anything goes slightly wrong, we will lose connectivity -- such as when changing Internet providers at a remote site and reconfiguring the firewall with a script because we have no other persistent connection to it. We live with this understanding all day, every day.
I hope that this insight into the extremely logical, yet consistently dangerous world of the network admin has shed some light on how we work and how we think. I don't expect it to curtail the repeated claims of the network being down, but maybe it's a start. In fact, if you're reading this and you are not a network admin, perhaps you should find the closest one and buy him or her a cup of coffee. They could probably use it.
This story, "Nine traits of the veteran network admin," was originally published at InfoWorld.com. Read more of Paul Venezia's The Deep End blog at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.
[ Also on InfoWorld: Nine traits of the veteran Unix admin | How to become a certified IT ninja | Get expert networking how-to advice from InfoWorld's Networking Deep Dive PDF special report. | Subscribe to InfoWorld's Data Center newsletter to stay on top of the latest developments. ]
Softpanorama hot topic of the month
Perl Wiki as System Administrator Tool : Requests for non-existing web pages :
General: DNS Ports Usage : MX Records Checking for for Users of Web Hosting : DNS Security
Cloud providers as intelligence collection hubs : Big Uncle is Watching You : Strategies of Defending Microsoft Windows against Malware : Fighting Spyware : Softpanorama Malware Defense Strategy : eMail Security : Spam : TCP Wrappers
- Sendmail : Minitutotial : Sendmail on RHEL : Configuring Solaris sendmail : Sendmail performance tuning : Sendmail Security : Sendmail Log Formats : Sendmail file permissions : Dual-instance sendmail
- Postfix : Postfix Troubleshooting : Postfix Connection Refused Problem
- Spam : Email Etiquette : Email Overload
TCP Performance Tuning : Trunking / Bonding Multiple Network Interfaces : Linux Multipath : Changing hostname : How to change IP address in RHEL : Linux Network Troubleshooting : Linux NFS : Linux Routing : Linux networking tips
The following Softpanorama links are sorted in the reverse order of the total monthly number of hits:
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusivly for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.
ABUSE: IPs or network segments from which we detect a stream of probes might be blocked for no less then 90 days. Multiple types of probes increase this period.
Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers : Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism : The Iron Law of Oligarchy : Libertarian Philosophy
War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda : SE quotes : Language Design and Programming Quotes : Random IT-related quotes : Somerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose Bierce : Bernard Shaw : Mark Twain Quotes
Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 : Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law
Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds : Larry Wall : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOS : Programming Languages History : PL/1 : Simula 67 : C : History of GCC development : Scripting Languages : Perl history : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history
The Peter Principle : Parkinson Law : 1984 : The Mythical Man-Month : How to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Haterís Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite
Most popular humor pages:
Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor
The Last but not Least
Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.
Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...
|You can use PayPal to make a contribution, supporting development of this site and speed up access. In case softpanorama.org is down you can use the at softpanorama.info|
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: October 20, 2015