Softpanorama

May the source be with you, but remember the KISS principle ;-)

The Art of Debugging

Softpanorama Debugging Recommendations


All through my life, I've always used the programming language that blended  best with the debugging system and operating system that I'm using. If I had a better debugger for language X, and if X went well with the operating system, I would be using that.

Donald Knuth

Any bug not detected in the design phase will cost ten times more time to detect at the coding phase and an additional ten times more at the debugging phase.

A Second Look at the Cathedral and the Bazaar

I never make stupid mistakes. Only very, very clever ones.

"Dr Who"

This document contains a review of debugging as a separate discipline of computer science that each and every programmer should study (and better study well ;-). The "Old News" and "Recommended Links" sections below hopefully provide additional useful links for students learning debugging. It is partially based on my lectures on the subject at FDU...



Introduction

Debugging is the cornerstone of the programming art. While the verb "to debug" means to remove errors, in reality debugging is more about watching the step-by-step execution of a program and examining it against some specification of how the program should behave. A programmer who has no access to a high-quality debugger is like a shortsighted person: he can function, but he can very easily miss dangerous things in the environment.

Debugging involves not only dealing with the programming language and operating environment; it also involves dealing with people and trying to anticipate as well as prevent problems. Program debugging is very similar to the investigation of a crime scene. Some "obvious" leads can be, and often are, false. Finding relevant information is not easy; it can take a tremendous amount of time and requires maintaining well-organized documentation about the problem. Some highly suspicious suspects without an alibi are actually innocent. You need a plan and the ability to see the big picture so as not to be led off track. In complex cases, you need clear analytical thinking, mastery of the language in question, and experience to get to the root cause.

DeMillo and Mathur analyzed the errors of TeX reported by Knuth. Their analysis clearly reveals that simple errors represent a significant fraction, though not the majority, of the errors in complex production programs. It also reveals that such errors remain hidden for a long time before testing or users expose them. That suggests that tools that can detect simple errors, such as lint, cross-reference tables, and pretty-printers, have a very high return on investment.


Each programming language is more than just a programming language. It is actually an environment, of which the language is the most important part, but only a part. The other part consists of a set of tools, which can be called the language environment: compilers, interpreters, debuggers, editor support for syntax highlighting, pretty-printers, and so on. One tool that has an enormous influence on the "total quality" of the language environment is the debugger. Donald Knuth once noted that he chose a language not for its qualities as such but for how well it was integrated with the operating system in question and how good its debugger was. This is real wisdom from a great master. I would say that the quality of the debugger is as important as the quality of the language.

Classification of bugs

In my paper A Second Look at the Cathedral and the Bazaar, I used the following simplified classification of bugs:

  1. [Added later] Misspellings and "misstatements": statements that look correct in a particular language but do not express the author's intent, as in the classic if (i=1)... instead of if (i==1)... in C, or the use of "==" instead of "eq" when comparing strings in Perl ( if ($test == 'OK') {...} ). There is a strong tendency to make languages "strict", so that all variables require declaration, in order to prevent misspelled variable names. But in reality a compiler can do a very good job of detecting those errors, and the PL/1 debugging compiler went even further and tried to correct them.

  2. Coding bugs. Coding bugs are the result of not understanding a programming language construct thoroughly. These are the most common type of bug and are relatively easy to debug. It is common for beginner programmers to blame these bugs on the compiler. So, the next time you suspect there's something wrong with the compiler, think again.
  3. Logical bugs (10 times as difficult to debug). A typical example is an array index that counts past the end of the array, thus accessing an out-of-bounds memory location, or the dereferencing of a pointer that is NULL or was never initialized. What is a segmentation fault? If your program exits (crashes) with the message "segmentation fault", it means that a pointer in your program is accessing (dereferencing) a memory location outside of its permitted boundaries (or memory segment).
  4. Architectural bugs (impossible to debug; require a major rewrite of the code). Architectural bugs are a result of flaws in the underlying operating system; Microsoft Windows is a good example of this. A lack of operating system security at the architectural level permits the proliferation of malware. Limitations in the design of programming languages also contribute to architectural bugs; e.g. the lack of memory bounds checking in the C language results in buffer overflows and memory corruption.

Here is a useful quote from Wikipedia about the debugging process:

Normally the first step in debugging is to attempt to reproduce the problem. This can be a non-trivial task, for example as with parallel processes or some unusual software bugs. Also, specific user environment and usage history can make it difficult to reproduce the problem.

After the bug is reproduced, the input of the program may need to be simplified to make it easier to debug. For example, a bug in a compiler can make it crash when parsing some large source file. However, after simplification of the test case, only few lines from the original source file can be sufficient to reproduce the same crash. Such simplification can be made manually, using a divide-and-conquer approach. The programmer will try to remove some parts of original test case and check if the problem still exists. When debugging the problem in a GUI, the programmer can try to skip some user interaction from the original problem description and check if remaining actions are sufficient for bugs to appear.

After the test case is sufficiently simplified, a programmer can use a debugger tool to examine program states (values of variables, plus the call stack) and track down the origin of the problem(s). Alternatively, tracing can be used. In simple cases, tracing is just a few print statements, which output the values of variables at certain points of program execution.
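The manual divide-and-conquer simplification described in the quote can also be automated. Below is a minimal sketch in Python, not a definitive implementation: it greedily tries to drop chunks of the input and keeps any reduction that still triggers the failure. The names shrink_input and still_fails are hypothetical; still_fails stands for whatever reruns your program on a candidate input and reports whether the bug still appears. This is a simplified relative of delta debugging, not the full algorithm.

```python
from typing import Callable, List, Sequence


def shrink_input(lines: Sequence[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily remove chunks of `lines` while the failure persists."""
    current = list(lines)
    chunk = max(len(current) // 2, 1)      # start with big chunks, then halve
    while True:
        removed_any = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if candidate and still_fails(candidate):
                current = candidate        # chunk was irrelevant: drop it
                removed_any = True
            else:
                i += chunk                 # chunk is needed: keep it, move on
        if chunk > 1:
            chunk //= 2
        elif not removed_any:
            return current                 # no single line can be removed
```

In practice still_fails would write the candidate to a temporary file, run the failing program on it, and check the exit status or the error message.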

Typical recurrent bugs connected with you as a personality and the language used

People tend to repeat the same errors again and again. Some of them are connected with the language used, some are idiosyncratic. One general law is that such errors are repetitive.

So keeping a log of your errors and using it as a checklist helps to avoid much trouble. For example, after many years of using Perl I still periodically make simple but disastrous errors by using "==" instead of "eq" when comparing strings, especially if I have been using C++ or Java just before. I trained myself to grep my program for all occurrences of "==" before starting debugging, to see if I made a blunder.


If you do not keep a log, you can spend a lot of time banging your head against the same wall. Each time you find a non-trivial error in your program, write down the program fragment, the error messages, and the solution. Next time it might save you a lot of time.
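A checklist item such as the "==" versus "eq" blunder in Perl can be turned into a small script that you run before each debugging session. The sketch below, written in Python, is only a crude heuristic under stated assumptions: it flags lines of Perl source where a numeric comparison operator sits next to a quoted string, and it strips comments naively at the first "#". The function name find_suspect_comparisons is hypothetical.

```python
import re
from typing import List, Tuple

# A numeric comparison operator adjacent to a quoted string literal,
# e.g.  $test == 'OK'   or   "OK" != $test
SUSPECT = re.compile(r"""(==|!=)\s*['"]|['"]\s*(==|!=)""")


def find_suspect_comparisons(source: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that look like string compares with ==."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]       # naive: drop everything after '#'
        if SUSPECT.search(code):
            hits.append((lineno, line.strip()))
    return hits
```

The heuristic will miss comparisons between two variables that hold strings, so it complements rather than replaces the error log.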

In addition:

Create and write program log

Large, complex programs can benefit from their own logging capabilities, which can either use OS facilities (via the logger interface) or write to their own file. Carefully constructed logs help to pinpoint an error much faster. For example, if the program processes multiple files, you will know at which file the error occurred, and this is very important information. Logs can be controlled by two general variables, similar to those used by IBM in OS/360: msglevel (verbosity) and a minimal error level (which suppresses warnings and minor errors).

It is good practice to give each error message a unique prefix and a number, in "mainframe" style. While this looks somewhat too formal, it greatly simplifies finding where in the code a message was produced. Numbering and assigning prefixes (based on the procedure or subsystem that generates the error) can be done automatically by post-processing the program text.

Here is a simple example of such a routine written in Perl:

#
# Record message in log
# Generate message based on its severity code and location
# Param 0 -- severity 
#      ("ATSEWID" - ABEND, TERMINAL, Severe, Error, Warning, Info, Debug)
# Param 1 -- message code (can be simply line number of the source) 
# Param 2 -- message text
sub logme 
{
   if ($_[0] eq '') {
     print SYSLOG "\n\n";
     return;
   }
   my $ercode=$_[0];
   my $er_severity=index("ATSEWID",$ercode);
   
   if ($er_severity==-1) { $er_severity=length("ATSEWID")-2; $erprefix='';}
   else { 
     $ercounter[$er_severity]++;
     $erprefix="\[$ercode\] ";
   }

#----------------- Error history -------------------------

   my $msgcode=$_[1]; # message code; a separate name keeps the severity letter in $ercode intact
   my $message=$_[2]; 
   if ( $er_severity > 1 ) {
      if  ($top_error > $er_severity) {
          $top_error=$er_severity;
          $top_message = $message;
      }   
   }      
#--------- Message printing and logging --------------

   return if ($er_severity==index("ATSEWID",'D') && $debug==0); # do not print debug messages unless $debug is set
   if ($er_severity <= $msglevel1) { 
      print SYSLOG $message;
   }
   if ($er_severity <= $msglevel2) {
      print $message;
   }      
   if ($ercode eq 'S') {
     # special_mes($support_addr,"[ANTISPAM] $message");

   }
}

Design of a controllable debugging trace to STDERR should be a part of the program design process

Design of the debugging trace consists of adding printf (C) or cerr or cout (C++) statements at critical points of each program subroutine. Those statements should be controlled by two debugging variables.

Bit strings can represent various set operations: AND finds A ∩ B, while OR finds A ∪ B.

Those statements help you understand the control flow and display critical data values during execution of the program.

You can also use assertions if the language has such a construct. There are some circumstances where printf debugging is appropriate. Generally, the debugging trace should be directed to stderr. Unlike stdout, stderr is unbuffered by default, so with stderr you are much less likely to miss the last information before a crash.
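A controllable trace of this kind can be sketched in a few lines. The example below is in Python rather than C and rests on an assumption: the text does not say which two debugging variables are meant, so the sketch uses a verbosity level (DEBUG_LEVEL) and a bit mask of subsystems to trace (DEBUG_AREAS). All names here, including the AREA_ flags and trace, are hypothetical.

```python
import sys

# Hypothetical subsystem flags, one bit each
AREA_PARSER = 0b01
AREA_IO = 0b10

# The two controlling variables (an assumption, see the text above)
DEBUG_LEVEL = 2                        # 0 = silent; larger = more detail
DEBUG_AREAS = AREA_PARSER | AREA_IO    # OR builds the set of traced areas


def trace(level: int, area: int, message: str) -> bool:
    """Write `message` to stderr if enabled; return True if it was written."""
    if level <= DEBUG_LEVEL and (area & DEBUG_AREAS):   # AND tests membership
        # flush=True so the line is not lost if the program crashes next
        print(f"[trace L{level}] {message}", file=sys.stderr, flush=True)
        return True
    return False
```

A call such as trace(1, AREA_PARSER, "entering parse()") at the top of a subroutine costs almost nothing when tracing is switched off, and both variables can be set from the command line or the environment.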

Use sanity checks

Sanity checks verify that the value of a variable (typically a parameter) satisfies some implicit assumptions. A more formal incarnation of the same idea (compromised by its close association with program verification snake oil salesmen) is called assertions.

Sanity checks can be used inside the program, where you can have checks like range(min, max), isfile, isdir, etc. In a debugger, such "sanity checks" can be part of a watch. Regular expressions are especially useful here. But utilizing this capability requires knowledge of your debugger beyond the basics. Still, this is an area where additional time spent learning the capabilities of your debugger can pay great dividends.

Programming that checks the parameters passed to the program and its subroutines is called defensive programming. For example, all parameters received from external programs should be checked for validity and applicable ranges. In these days of systematic use of buffer overflows and sophisticated malware, it is naive to assume that nobody will ever try to pass a 10K string where you expect 80 bytes at most, or that a parameter that you received and passed, unchecked, to the shell as part of a generated command will never contain malicious commands designed to exploit this opportunity. For example, it can contain ";" (which gives an attacker the opportunity to put additional commands into the string), "&&", etc.

Used without excessive zeal, this is a good, valuable programming practice. Of course, any sound idea can be spoiled by driving it to extremes and making 90% of your code just checks of assertions ;-). That is something like an assertions-for-the-sake-of-assertions religious sect, which probably can find a lot of adherents in some academic departments. So moderation and logic in the application of this idea are highly recommended.
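The parameter checks discussed in this section might look roughly like the following Python sketch. It is illustrative only: the 80-byte limit comes from the example in the text, while the allow-list of characters and the names check_param and in_range are assumptions chosen for the illustration.

```python
import re

MAX_PARAM_LEN = 80                               # limit from the 80-byte example
SAFE_PARAM = re.compile(r"[A-Za-z0-9_./-]+")     # assumed allow-list of characters


def check_param(value: str, max_len: int = MAX_PARAM_LEN) -> str:
    """Return `value` unchanged if it passes the checks, else raise ValueError."""
    size = len(value.encode("utf-8"))
    if size > max_len:
        raise ValueError(f"parameter too long: {size} bytes > {max_len}")
    if not SAFE_PARAM.fullmatch(value):          # rejects ';', '&', spaces, empty
        raise ValueError(f"unexpected characters in parameter: {value!r}")
    return value


def in_range(value: int, low: int, high: int) -> int:
    """Range sanity check in the spirit of range(min, max)."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside [{low}, {high}]")
    return value
```

An allow-list is used instead of a list of forbidden characters because it is easy to forget one dangerous character but hard to allow one by accident. Where possible, passing arguments to an external program as a list (without going through the shell) removes the injection risk altogether.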

Know your debugger

 

I think the hard thing about all these tools is that it takes a fair amount of effort to become proficient.

-- Bill Joy

A good debugger, which is now part of many mainstream programming languages, is a complex system, and you need to study it to get the most out of it. Unfortunately, many programmers do not pay enough attention to studying the debugger, relying instead on their mastery of the language. This is a deeply flawed approach that costs a programmer many additional hours of frantic search for a problem that, with better mastery of the debugger, might take a couple of minutes to find. Here is a relevant quote from my Perl debugging page:

The complexity of the debugger is such that few people learn it beyond some very superficial level. I am one of those people :-). Most professionals that I know work with the debugger while consulting a personal, customized cheat sheet. Creating such a reference card, for example using perl-debugger-refcard-a4 by Andrew Ford or this page as a template, is a must. For more inspiration, see the cheat-sheets section of Recommended Links.


Among unique capabilities of Perl debugger that any programmer should know and know well:

Shortcuts and automation of the debugging process

One can greatly benefit from learning Expect. This is a great tool for automating any debugger! Clipboard managers like ArsClip and terminal emulators like Teraterm also provide facilities that can cut the number of keystrokes during a debugging session by half or more.

The simplest and most universal tool for automating debugging is a programmable keyboard. For example, most programmable keyboards, such as the Sidewinder Pro, allow you to create macros and to repeat them in a loop controlled by a special button. That is an alternative to hitting n or Enter again and again ;-)

The first, simplest, and most important technique here is assigning macros to programmable keys (most such keyboards have 6 or 12 programmable keys and three banks, which allow three different sets of macros to be used).

To factor out your typical commands, you need to keep a log of your debugging process (Teraterm can do this automatically) and then assign appropriate shortcuts to those commands that can be reused.

A set of those shortcuts is as important as a good set of test cases, and you need to put effort into creating one, because in three or four months your skills in debugging a particular program, skills achieved with such effort and costing you so many hours, will disappear. A set of Teraterm macros and ArsClip shortcuts can revive some of them.

There are also automated testing toolkits such as DejaGNU, but they are beyond the scope of this lecture.
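The idea of replaying a saved list of debugger commands can be illustrated without Expect. The sketch below is a stand-in, not an Expect script: it uses Python's standard subprocess module to start the Python debugger pdb on a program and feed it a prepared command list through standard input, returning the transcript. The function name run_pdb_script is hypothetical, and the approach assumes a debugger that reads its commands from stdin.

```python
import subprocess
import sys
from typing import Sequence


def run_pdb_script(program_path: str,
                   commands: Sequence[str],
                   timeout: float = 30.0) -> str:
    """Run `program_path` under pdb, feed it `commands`, return pdb's output."""
    result = subprocess.run(
        [sys.executable, "-m", "pdb", program_path],
        input="\n".join(commands) + "\n",   # one debugger command per line
        capture_output=True,
        text=True,
        timeout=timeout,                    # do not hang on a stuck session
    )
    return result.stdout
```

A call such as run_pdb_script("demo.py", ["break 4", "continue", "p total", "quit"]) replays the same session every time, which is exactly what a saved macro set is meant to achieve. Unlike Expect, this sketch cannot react to the debugger's output in the middle of a session.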

Binge debugging

Debugging complex problems in large programs is not for suckers. Unless you can concentrate on the problem for, say, at least three hours straight, you might never be able to find it. Finding complex bugs requires sitting in front of the screen for hours and hours. It's like an air traffic controller's job, where it takes an hour to "catch the picture" of the air traffic and you can lose it in a minute if somebody interrupts you. The ability to work long, long hours, often 12 hours a day or more, and persistence, bulldog-type persistence, are a must. As Calvin Coolidge noted:

 "Nothing in the world can replace persistence. Talent will not; nothing more common than the unsuccessful man with talent. Genius will not; unrewarded genius is almost a proverb. Education will not; the world is full of educated derelicts. Persistence and Determination are omnipotent."

Similarly Bill Joy observed (Salon):

"Most people are bad programmers," says Joy. "The honest truth is that having a lot of people staring at the code does not find the really nasty bugs. The really nasty bugs are found by a couple of really smart people who just kill themselves. Most people looking at the code won't see anything ...

Even if you are a talented programmer, long debugging sessions are necessary for achieving the highest possible level of concentration and absorbing as many details as you can in order to pinpoint a complex, elusive bug. So while binge debugging is definitely harmful to your health, no question about it, it is also a necessity. You just can't achieve anything working for an hour straight with 15-minute breaks. And it's no secret that not only debugging but programming in general is an activity pretty harmful to health. Long hours are a necessity for success. Bill Joy once noted:

We don't manage our time as well as we manage our space. There's an overhead of starting and an overhead of stopping a project because you kind of lose your momentum. And you've got to bracket and put aside all the things you're already doing. So you need reasonably large blocks of uninterrupted time if you're going to be successful at doing some of these things. That's why hackers tend to stay up late. If you stay up late and you have another hour of work to do, you can just stay up another hour later without running into a wall and having to stop. Whereas it might take three or four hours if you start over, you might finish if you just work that extra hour. If you're a morning person, the day always intrudes a fixed amount of time in the future. So it's much less efficient. Which is why I think computer people tend to be night people - because a machine doesn't get sleepy.

The question is only what level of harm is unavoidable and what is just the result of your inability to organize your activity better. There is no ready-made answer to this problem. I think that after three hours your concentration starts to drop and it is time to take a break. Your mileage may vary.

Also, much depends on the prehistory. When you have been trying to find a bug for a week or more, you tend to engage more and more in binge sessions. I think debugging sessions longer than three hours are counterproductive, as you can't maintain the highest level of concentration on the problem longer than that.

But again, the emotions when you can't find an elusive bug for days or even weeks are overwhelming, and while I certainly recommend taking breaks periodically in such a situation, I did not follow that advice when I experienced it myself. When you can't find a bug for a week, your psychological condition is often pretty bad, binge debugging or not.

What I can say is that sometimes the problem becomes clearer after you sleep for an hour or two. Something happens in your brain while you are sleeping: it continues to process and organize the relevant information. This might be individual, but I strongly recommend sleeping for an hour or two after, say, a three-hour debugging session, if you can.

Varying the presentation medium also influences the success of the debugging process. Using a projector to project the program or the debugging session onto a board where you can use a dry marker sometimes helps. Just writing "f*ck" on the board produces some relief of tension when nothing helps ;-).

Pretty-printing the program and working with the program listing away from the computer sometimes helps.

George Polya "how to solve it" approach

George Polya (1887-1985) was a famous mathematician who obtained important results in probability, analysis, number theory, geometry, combinatorics and mathematical physics. His book How to Solve It was probably the most significant contribution to heuristics since Descartes' Discourse on Method.

In it he proposed structuring the problem-solving process into four stages (see How To Solve It Step-by-step Plan ).

  1. Understanding the Problem
  2. Devising a Plan
  3. Carrying out the Plan
  4. Looking Back

For each stage George Polya supplied a series of questions that help to solve the problem. Some of them are:

For the complete list of questions in the form of a step-by-step plan, see How To Solve It Step-by-step Plan.

The examples in the book are drawn mostly from elementary math, but the method is quite general and is perfectly applicable to the debugging of computer programs. It is especially valuable for non-reproducible bugs, bugs that can't be readily reproduced from one test run to another. I read it when I was 15 years old, and it was the only mathematical book that I really liked. I thought that it contained amazing insights into problem-solving skills. That perception was wrong, as the book is useful only for those who can do without it, but my intensive study of the book later paid great dividends in debugging computer programs. Or at least I tend to think so now. In any case, I strongly recommend trying it.

David Burns Ten Techniques

A good review of important debugging techniques was provided by David Burns in his article  The Mental Game of Debugging :

March 2000

For software developers, debugging is a fact of life. And for contract programmers, hunting bugs exclusively may be the name of the game. Here are 10 suggestions on hunting and fixing them.

It was my first task on a new project, and I had been assigned a blatantly visible bug. The product, a data display client, was able to scale its fonts to the size of the display area. Under certain circumstances on some machines, however, the font would grow rapidly and uncontrollably until a single letter filled the entire screen.

The original developers were literally out of the country, having planned overseas trips based on the scheduled ship date, so I was on my own. The code that calculated the font size based on the display area's dimensions was the obvious place to look. Finding no errors there, I assumed (correctly, it later turned out) that a floating-point rounding error was the ultimate cause of the bug. I wasted a day rewriting the code to eliminate that possibility, and then suffered the ignominy of watching my fix fail on the project manager's computer.

In fact, the problem lived in an entirely different section of the code. The font-size calculation used a value called PixelsPerInch. This was not, as I had assumed, a value established on initialization and then held constant. Instead, the value was recalculated each time the display area was resized, but not in the WM_SIZE processing code I had rewritten. Instead, the so-called "constant" was recalculated in the handler for WM_NCCALCSIZE. This logic interacted circularly with the font-size calculation such that a floating point rounding error could iteratively magnify itself, resulting ultimately in the visible bug. A few extra minutes spent learning the entire window-size logic of this application would have saved me hours.

Ten Techniques

If you're a software developer, debugging is a part of your life. And if you work as a contract programmer, as I often do, you may be brought into a project to hunt bugs exclusively. I can't tell you what the bugs will look like. They could be caused by a design error, a mistake in logic or the mishandling of some resource. The documentation and specifications will probably be inadequate or absent, and all you'll have to go on will be a brief and possibly erroneous description of the problem. In fact, you won't know the cause of the bug, or how to fix it, until you're finished.

To make matters worse, there aren't many books on debugging, and no single method or tool will fix them all. Debugging, like solving a puzzle or playing a game, requires alertness, intelligence and invention. Through experience, I've discovered 10 essential techniques to hunting and fixing bugs.

  1. Boast. Just as Olympic athletes use visualization to enhance their prowess, you can talk out loud about the details of a positive performance and thus become accustomed to the ideas, thoughts and actions associated with success. By boasting in the company of others, you also establish a social expectation of success.

    Should you have the chance, boast about how you will fix a particularly challenging bug. One manager apologized for presenting me with a difficult problem, but I merely grinned, "Thorny bugs are part of my natural habitat. I'll take care of it for you."

    Boast (or, if you prefer, visualize) to yourself as well. When you face a new problem, imagine how you will feel as you find the cause and obliterate it with airtight code.

    Sometimes it's more useful to put your boasting in writing. If you fix an especially nasty bug, write an explanatory memo to edify the other programmers and management. If you're fixing a number of simple bugs, get your productivity noticed by tracking and reporting the numbers in periodic progress reports. --[ I would actually add that you need to keep your own journal and revisit periodically the last year activities -- NNB]
     

  2. Relax and enjoy your work. Bug-fixing is a learning process. If you approach it with dragging feet, apprehension or stress, your mind won't be open to what you need to learn. Just in case that point isn't clear enough on its own, here are two corollaries:
    • Don't complain about the code you have to work on,
    • and don't complain about the tools you have to use.

    Complaining about code is like complaining about a mountain. It reflects more on the character of the complainer than it does on the terrain. Be the master of the mountain and confine your comments to positive suggestions about how to improve the code. As for the tools, if they are hard to use, learn how to use them better. If they don't do what needs to be done, see point six.
     

  3. Know your habitat. A developer writing new code might toil months or even years on a small part of a large project, learning and using only a fraction of the technologies and techniques involved. While a developer can be successful with deep knowledge of a narrow specialty, a bug hunter needs a broad knowledge of the environment he works in. How do you obtain that knowledge? As you go along. Whenever you encounter something that is unfamiliar, find out about it. Become familiar with the on-line documentation of the system you're working in. Today's operating systems and application programming interfaces have hundreds or thousands of entry points. You don't need to know them all, but you should be able to find and understand any of them quickly.

    In addition, make sure the online help and documentation for your programming environment is fully installed, so you don't have to track down the CD-ROM. Read any worthwhile newsgroups in your area of expertise—one good one for C++ programmers is comp.lang.c++.moderated—and cultivate unofficial as well as official sources of information. For every Microsoft Systems Journal you subscribe to, you should also read an independent publication.
     

  4. Don't be afraid to start at the beginning. If you're working on an unfamiliar project, and the bug is even slightly obscure, don't hesitate to gain an understanding of the entire environment before you begin. Start with the initialization or instantiation code, the class declarations, or the constructors. Review enough to understand the project's overall architecture. Trace through the code as it starts; examine how subsidiary modules or objects are attached, initialized and destroyed. You'll be surprised how often this review gives you an insight into the nature of your bug, but even if it doesn't, it's not time wasted. Such a review teaches you about the style of the code you're working in and allows you to learn about any unfamiliar technologies or special techniques that are used in the project.

  5. Know what you don't know. Prior assumed knowledge obstructs learning. Before you begin work on any bug, consciously empty your mind of preconceptions about its cause. This can be surprisingly hard in the face of managers and other programmers who are sure they know what the problem is. Don't ignore them, but do realize that if they really knew where the bug was, they wouldn't need you.

    Start by enumerating exactly what you do and don't know about the code and the bug. In one case, I was asked to work on a bug in some of the most obscure and tangled Win16 code I've ever encountered. The software was said to freeze or hang. Every programmer on the team was sure that the bug must be caused by a logic error, and several had spent a fair amount of time searching for such an error. First, I discarded the notion that the bug was caused by a logic error. I then discarded the idea that the program froze or hung. I didn't know whether either were true.

    A quick test established that the software did not hang. It merely failed to display any user-accessible controls. After two days of painstaking analysis, testing and message tracing, I had the answer: the Windows message queue was running out of slots, causing a call to PostMessage() to fail. As many Windows programmers before him had done, the code's author had assumed that PostMessage() always succeeded. In retrospect, the solution looks ridiculously simple, but I had to know what I didn't know.
     

  6. Use tools that will allow you to see what you need to see. Many programmers use whatever tools are present and accept the level of information they provide. Instead of remaining passive, decide what it is you need to see to analyze the bug. It might be network messages, DLL call parameters or hardware interrupts. You might need high-resolution timing or a log of object copies. The default debugger might be fine, or you might need a lower-level tool to debug a driver. If rapidly interacting components are involved, a debugger might just interfere, so you might use trace output or log files to see what is needed. Or, you might need to write the tool yourself.

    Be knowledgeable about the tools that are already available. An hour spent learning debugger options or how to configure an analyzer's output will pay off many times over if it saves you from having to write your own tool or helps you fix a bug you might not have found otherwise.

    Make sure that you understand the kinds of information that are available. Does the programming language support debug tracing, assertions or exceptions? Do you have access to medium and high-resolution timing information? Can you breakpoint on a change in the value of a variable, the 28th execution of a loop or a certain level of memory usage? What effect do these options have on your running program?

    Though the right tool or method is important, your primary focus should still be on the problem. Should you find yourself in the enviable situation of being surrounded by an excess of enticing tool options, remember that despite vendor promises, visual debuggers, run-time error detectors, API tracers and code analyzers can't entirely automate the process. Indeed, you may find yourself being run by them and looking at what they bring to your attention rather than at what you need to see. Producing software—designing the interfaces, writing the code, or hunting down and fixing bugs—is intellectual labor. A tool can help you see the problem, but the solution is up to you.
     

  7. Be patient. Bug hunting takes time, even though it usually occurs under tremendous deadline pressure. The urge to change some code and "see if it works" can be overwhelming, but don't give in. Don't change any code until you have seen for yourself what causes the problem, and never try to fix bugs on a schedule. You can't know the solution until you know the cause, and you can't know the cause until you learn your way to it. There is no way to know how long that will take. Some managers don't feel in control unless they can put a date on a schedule and then carp at someone when that date gets missed. Managing the expectations of such a manager is difficult, but it helps if you are tactful and understanding of the pressures she is under.
     
  8. Keep your eyes open for evidence of problems. Every programming language and environment has one or more common programming errors; look actively for these. In C++, it might be uninitialized member variables or improper releasing of resources. In Visual Basic, it might be improper use of DIM, or incorrect ON ERROR processing. If you're alert, you'll detect unreported bugs just waiting to manifest themselves under the right conditions.

    Good eyes can also reveal valuable insights that aren't directly related to bugs. Once, while tracing through code in search of a bug in Lotus Agenda, an early personal information manager, I noticed that one of the parsing routines was implemented inefficiently. I mentioned this to the manager, but he was uninterested. The parsing was part of idle-loop background processing, and its bad performance would never be noticed by users. Two weeks later, the parsing had to be moved to the foreground for unrelated reasons, and users complained loudly. Since I already knew where the problem was, I was able to quickly rewrite that code, improving performance by a factor of twenty. The complaints stopped, and the manager's gratitude was palpable.
     

  9. Know what is right. When bad specifications meet inept programming, disaster is near. A Silicon Valley start-up once hired me to fix an install program after the uninstall had wiped out a tester's hard drive. The specification for the uninstall had insisted that all files associated with the application should be deleted. Reasoning that the easiest way to accomplish this was to remove the application's entire directory tree, the junior programmer assigned this task had written code that obtained the current directory name (C:\ApplicationName\Binary\), parsed backward to eliminate the \Binary\, then destroyed whatever was left (which should have been C:\ApplicationName). Unfortunately, our hero's code sometimes parsed back one step too far, resulting in an unconditional delete of the contents of C:\.

    It took some time to convince my client that the primary fault was an error in specification. A properly written uninstall will not delete anything that was not created by the installer. User files created by the application or database files updated by the application should never be deleted automatically, but should be left for the user to delete manually.

    Problems like this are not uncommon. Sooner or later, you will find that while you know the cause of a bug and the site of the required change, the ambiguous, erroneous or absent specification leaves you in doubt as to how the software would behave if it functioned correctly. It then becomes your job to determine what the specification should be. You might ask the software's architects, interview the client or consult your own judgment. Whatever you do, be sure that the behavior you settle on is described exactly. Write the specification down, either as an e-mail to the client, a comment in the code or an addendum to the documentation. This may be more important than fixing the bug itself, because when the specification isn't clear, different developers, working at different times or without close communication, will implement different semantics. The bugs this can introduce can be extremely hard to fix.
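The uninstall rule stated above (delete nothing that the installer did not create) can be sketched as a manifest lookup. This is an illustration under assumptions, not the start-up's actual code: the manifest type, the function installer_created and the file names used in the usage note are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical install manifest: the exact paths the installer created. */
typedef struct {
    const char *const *paths;
    size_t             count;
} manifest;

/* True only if `path` matches a manifest entry exactly.
 * Anything else (user data, parent directories, the drive root)
 * is never eligible for automatic deletion. */
bool installer_created(const manifest *m, const char *path)
{
    for (size_t i = 0; i < m->count; i++) {
        if (strcmp(m->paths[i], path) == 0)
            return true;
    }
    return false;
}
```

With a manifest holding only C:\ApplicationName\Binary\app.exe, a request to delete that file is accepted, while C:\ApplicationName\Data\user.db and C:\ are both rejected. An uninstaller built this way cannot wipe a drive because of a path-parsing mistake, since it never derives what to delete from directory names.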
     

  10. Fix the bug. Too often, I've had to fix a bug a second or third time, after someone else's fix failed to cover a predictable alternate case or came apart due to other changes in the project. Usually, the second or third repair happens because the fixer tried to patch existing code instead of rewriting it. It may seem safer to leave existing designs, declarations and code paths in place by trying to make them work with a tweak here and a state variable there, but sometimes replacing the old code with a better design is the only solution.

    I once inherited some code that had been the focus of numerous bug reports. Its task was to check expressions before passing them off to be processed by a server. Over the course of a series of fixes by different programmers, this code had become a welter of unreadable "if" statements.

    I only had one known bug to fix, but I could see that there would be more, and that any changes in the specification would be nearly impossible to implement without introducing more errors.

    Instead of tweaking the code, I wrote a simple recursive-descent parser to replace it. This took a day and a half, instead of the hour or two it would have taken to fix the one known bug, but it had its intended effect. New bug reports based on this part of the program virtually ceased. When the call came to add support for new expression types, they were easily integrated into the existing framework. Certainly, you shouldn't replace code that only needs a minor adjustment, but when a bug highlights a design flaw in the original code, you should think seriously about improving and replacing the original design.

    There is nothing sacred about the original object model, the interfaces created by the original coders, or the patterns or algorithms they selected.

    If you do choose to change an interface, replace a major portion of the original code or make some other profound change, be sure to have a good reason for doing so. Check with other stakeholders to make sure your changes will support their interests, and thoroughly document your changes.
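For readers who have not seen one, a recursive-descent checker of the kind mentioned in item 10 can be very small. The sketch below is not the author's parser; the grammar (integers, the four arithmetic operators and parentheses) and all function names are assumptions made for illustration. Each grammar rule becomes one function, which is why new expression types are easy to add later.

```c
#include <ctype.h>
#include <stdbool.h>

/* Assumed grammar (not from the original project):
 *   expr   := term   { ('+' | '-') term }
 *   term   := factor { ('*' | '/') factor }
 *   factor := NUMBER | '(' expr ')'
 */

static const char *skip_spaces(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

static bool parse_expr(const char **pp);

/* factor := NUMBER | '(' expr ')' */
static bool parse_factor(const char **pp)
{
    const char *p = skip_spaces(*pp);
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        *pp = p;
        return true;
    }
    if (*p == '(') {
        p++;
        if (!parse_expr(&p))
            return false;
        p = skip_spaces(p);
        if (*p != ')')
            return false;        /* unbalanced parenthesis */
        *pp = p + 1;
        return true;
    }
    return false;                /* neither a number nor '(' */
}

/* term := factor { ('*' | '/') factor } */
static bool parse_term(const char **pp)
{
    if (!parse_factor(pp))
        return false;
    for (;;) {
        const char *p = skip_spaces(*pp);
        if (*p != '*' && *p != '/')
            return true;
        p++;
        if (!parse_factor(&p))
            return false;        /* operator without right operand */
        *pp = p;
    }
}

/* expr := term { ('+' | '-') term } */
static bool parse_expr(const char **pp)
{
    if (!parse_term(pp))
        return false;
    for (;;) {
        const char *p = skip_spaces(*pp);
        if (*p != '+' && *p != '-')
            return true;
        p++;
        if (!parse_term(&p))
            return false;
        *pp = p;
    }
}

/* True if the whole string is exactly one well-formed expression. */
bool expression_is_valid(const char *text)
{
    const char *p = text;
    if (!parse_expr(&p))
        return false;
    return *skip_spaces(p) == '\0';   /* reject trailing garbage */
}
```

Under these assumptions expression_is_valid accepts "1 + 2 * (3 - 4)" and rejects "1 + ", "(1 + 2" and "1 2". Compared with a pile of special-case "if" statements, every accepted form is traceable to one line of the grammar comment.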

Writing new software is a sublime act of creation. By the application of scientific knowledge, creative genius and an understanding of human nature, we construct from nothing a powerful tool, a spellbinding entertainment or a vital piece of infrastructure. While this may be a deeply satisfying occupation, hunting bugs can be just as absorbing, rewarding and fulfilling. So don't feel intimidated or bored when it is your fortune to fix a bug.

Smile—you're a hunter in search of some of the most elusive quarry there is.

Conclusions

Like programming itself, debugging is an art, so advice about debugging is useful mainly to those who could do well without it. But for that group it is a chance to dramatically improve their efficiency, because learning the techniques used by others (who are often older and have spent many more years digging through this sh*t) is the most cost-effective way of achieving real mastery.

Dr. Nikolai Bezroukov



Old News ;-)

[Oct 17, 2014] Don Knuth and the Art of Computer Programming The Interview

You have an essay about developing TeX where you talk about going over to a pure, destructive QA personality and trying your hardest to break code. Do you think most developers are good at that?

DK: You are right that it takes a certain kind of mentality to create test cases that will detect subtle bugs. For example, I know that I'm not sneaky enough to ever become an expert on cryptography or computer security.

On the other hand I have been reasonably successful at designing torture tests for software that I've written, mostly by
(1) imagining myself as the enemy of the system, rather than as its friend;
(2) thinking of inputs that are legal but bizarre and unlikely ever to be useful;
(3) embedding an incredibly complicated construction into another that's even less scrutable.

Some parts of my test programs for TeX and METAFONT required many hours of thought before I could convince myself that the program did the right thing. But in the process I discovered bugs that I'm pretty sure wouldn't have been found by any other method that I've ever heard about.

Even better results will presumably be obtained if several different people independently create the torture tests. I can imagine that test creation would be a satisfying career.

I guess I do tend to use systems in unexpected ways, and I get some satisfaction when this reveals a flaw. For example, I remember having fun while visiting my sister and playing with a `shoot-em-up' game that my nephew showed me. Since I'm kind of a pacifist, I tried shooting bullets at the wall, instead of trying to kill the hidden attackers. Sure enough, I could spell my name on that wall, by making bullet holes in an appropriate pattern. But later when I came back to that same position, after wandering around further in the game, my name was gone! I think that was a bug in the software.

Long ago when I was a teenager, probably in 1952 or 1953, I saw my first "computer game", a demonstration of tic-tac-toe at a museum in Chicago where visitors were challenged to beat a machine.

The exhibit was designed by AT&T, and I think it was controlled by relay switches rather than by a real general-purpose computer.

I knew that I could never win if I made normal responses; so I intentionally made the stupidest possible moves. The machine got to a position where it had two ways to beat me by completing three in a row. I decided not to block either possibility. In response, the machine made both of its winning moves ... and this was clearly in violation of the rules. So I had won a moral victory.

Thus, it appears that an encouragement to "think outside the box" helps for designing test cases, just as it does in many other situations.

[Oct 18, 2013] Tom Clancy, Best-Selling Master of Military Thrillers, Dies at 66

Fully applicable to programming...
NYTimes.com

“I tell them you learn to write the same way you learn to play golf,” he once said. “You do it, and keep doing it until you get it right. A lot of people think something mystical happens to you, that maybe the muse kisses you on the ear. But writing isn’t divinely inspired — it’s hard work.”

[Sep 22, 2012] Debugging techniques

If you consider using printf debugging, please check out the use of assertions (see the section called Assertions: defensive programming) and of a debugger (see the section called The debugger); these are often much more effective and time-saving.

There are some circumstances where printf debugging is appropriate. If you want to use it, here are some tips:

Here is a nice way to do it. File debug.h:

#ifndef DEBUG_H
#define DEBUG_H
#include <stdarg.h>

#if defined(NDEBUG) && defined(__GNUC__)
/* gcc's cpp has extensions; it allows for macros with a variable number of
   arguments. We use this extension here to preprocess pmesg away. */
#define pmesg(level, format, args...) ((void)0)
#else
void pmesg(int level, char *format, ...);
/* print a message, if it is considered significant enough.
      Adapted from [K&R2], p. 174 */
#endif

#endif /* DEBUG_H */
        

File debug.c:

#include "debug.h"
#include <stdio.h>

extern int msglevel; /* the higher, the more messages... */

#if defined(NDEBUG) && defined(__GNUC__)
/* Nothing. pmesg has been "defined away" in debug.h already. */
#else
void pmesg(int level, char* format, ...) {
#ifdef NDEBUG
	/* Empty body, so a good compiler will optimise calls
	   to pmesg away */
#else
        va_list args;

        if (level>msglevel)
                return;

        va_start(args, format);
        vfprintf(stderr, format, args);
        va_end(args);
#endif /* NDEBUG */
}
#endif /* NDEBUG && __GNUC__ */
        

Here, msglevel is a global variable that you have to define yourself; it controls how much debugging output is produced. You can then use pmesg(100, "Foo is %ld\n", foo) to print the value of a long variable foo whenever msglevel is set to 100 or more.

Note that you can remove all this debugging code from your executable by adding -DNDEBUG to the preprocessor flags: for GCC, the preprocessor will remove it, and for other compilers pmesg will have an empty body, so that calls to it can be optimised away by the compiler. This trick was taken from assert.h; see the next section.
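The GCC-specific "args..." syntax used above can also be written as a standard C99 variadic macro, which removes the need for the __GNUC__ test. The following single-file sketch is an adaptation under that assumption, not the quoted author's code. Two details are additions of this sketch: msglevel is defined in the same file, and pmesg returns the number of characters written, purely so that its behaviour can be checked.

```c
#include <stdarg.h>
#include <stdio.h>

int msglevel = 0;   /* the higher, the more messages are printed */

#ifdef NDEBUG
/* Standard C99 variadic macro: every call disappears at preprocessing
   time, so calls must be used as statements, not as values. */
#define pmesg(level, ...) ((void)0)
#else
/* Print to stderr if `level` does not exceed msglevel.
   Returns the number of characters written, or 0 if suppressed
   (the return value is an addition of this sketch). */
int pmesg(int level, const char *format, ...)
{
    va_list args;
    int written;

    if (level > msglevel)
        return 0;

    va_start(args, format);
    written = vfprintf(stderr, format, args);
    va_end(args);
    return written;
}
#endif
```

With msglevel set to 100, the call pmesg(100, "Foo is %ld\n", 42L) prints the message, while pmesg(101, "hidden\n") prints nothing. Compiling with -DNDEBUG removes both calls on any C99 compiler.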

[Aug 9, 2009] brian's Guide to Solving Any Perl Problem

Some Perl-related advice is questionable (I really hate "script", especially useless for small scripts). But generic points are good, albeit not new...
My Philosophy of Debugging

I believe in three things:

It is not personal
Forget about code ownership. You may think yourself an artist, but even the old Masters produced a lot of crap. Everybody's code is crap, which means my code is crap and your code is crap. Learn to love that. When you have a problem, your first thought should be "Something is wrong with my crappy code". That means you do not get to blame perl. It is not personal.
Forget about how you do things. If the way you did things worked, you would not be reading this. That is not a bad thing. It is just time to evolve. We have all been there.
Personal responsibility
If you have a problem with your script it is just that: your problem. You should do as much to solve it by yourself as you can. Remember, everyone else has their own scripts, which means they have their own problems. Do your homework and give it your best shot before you bother someone else with your problems. If you honestly try everything in this guide and still cannot solve the problem, you have given it your best shot and it is time to bother someone else.
Change how you do things
Fix things so you do not have the same problem again. The problem is probably how you code, not what you code. Change the way you do things to make your life easier. Do not make Perl adapt to you because it will not. Adapt to Perl. It is just a language, not a way of life.

... ... ...

Look at the code before the line number in the error message!
Perl gives you warning messages when it gets worried and not before. By the time perl gets worried the problem has already occurred and the line number perl is on is actually after the problem. Look at the couple of expressions before the line number in the warning.
Have you checked Google?
If you have a problem, somebody else has probably had that problem. See if one of those other people posted something to the usenet group comp.lang.perl.misc by searching Google Groups (http://groups.google.com). The difference between people who ask questions on usenet and those who answer them is the ability to use Google Groups effectively.
Did you talk to the bear?
Explain your problem aloud. Actually say the words.
For a couple of years I had the pleasure of working with a really good programmer who could solve almost anything. When I got really stuck I would walk over to his desk and start to explain my problem. Usually I didn't make it past the third sentence without saying "Never mind, I got it". He almost never missed either.
Since you will probably need to do this so much, I recommend some sort of plush toy to act as your Perl therapist so you do not annoy your colleagues. I have a small bear that sits on my desk and I explain problems to him. My wife does not even pay attention when I talk to myself anymore.
Does the problem look different on paper?
You have been staring at the computer screen, so maybe a different medium will let you look at things in a new way. Try looking at a print-out of your program.
Have you watched The Daily Show with Jon Stewart?
Seriously. Perhaps you do not like Jon Stewart, so choose something else. Take a break. Stop thinking about the problem for a bit and let your mind relax. Come back to the problem later and the fix may become immediately apparent.
Have you packed your ego?
If you still have not made it this far, the problem may be psychological. You might be emotionally attached to a certain part of the code, so you do not change it. You might also think that everyone else is wrong but you. When you do that, you do not seriously consider the most likely source of bugs: yourself. Do not ignore anything. Verify everything.

How to Report Bugs Effectively by Simon Tatham

Anybody who has written software for public use will probably have received at least one bad bug report. Reports that say nothing ("It doesn't work!"); reports that make no sense; reports that don't give enough information; reports that give wrong information. Reports of problems that turn out to be user error; reports of problems that turn out to be the fault of somebody else's program; reports of problems that turn out to be network failures.

There's a reason why technical support is seen as a horrible job to be in, and that reason is bad bug reports. However, not all bug reports are unpleasant: I maintain free software, when I'm not earning my living, and sometimes I receive wonderfully clear, helpful, informative bug reports.

In this essay I'll try to state clearly what makes a good bug report. Ideally I would like everybody in the world to read this essay before reporting any bugs to anybody. Certainly I would like everybody who reports bugs to me to have read it.

In a nutshell, the aim of a bug report is to enable the programmer to see the program failing in front of them. You can either show them in person, or give them careful and detailed instructions on how to make it fail. If they can make it fail, they will try to gather extra information until they know the cause. If they can't make it fail, they will have to ask you to gather that information for them.

In bug reports, try to make very clear what are actual facts ("I was at the computer and this happened") and what are speculations ("I think the problem might be this"). Leave out speculations if you want to, but don't leave out facts.

When you report a bug, you are doing so because you want the bug fixed. There is no point in swearing at the programmer or being deliberately unhelpful: it may be their fault and your problem, and you might be right to be angry with them, but the bug will get fixed faster if you help them by supplying all the information they need. Remember also that if the program is free, then the author is providing it out of kindness, so if too many people are rude to them then they may stop feeling kind.

"It doesn't work."

Give the programmer some credit for basic intelligence: if the program really didn't work at all, they would probably have noticed. Since they haven't noticed, it must be working for them. Therefore, either you are doing something differently from them, or your environment is different from theirs. They need information; providing this information is the purpose of a bug report. More information is almost always better than less.

Many programs, particularly free ones, publish their list of known bugs. If you can find a list of known bugs, it's worth reading it to see if the bug you've just found is already known or not. If it's already known, it probably isn't worth reporting again, but if you think you have more information than the report in the bug list, you might want to contact the programmer anyway. They might be able to fix the bug more easily if you can give them information they didn't already have.

This essay is full of guidelines. None of them is an absolute rule. Particular programmers have particular ways they like bugs to be reported. If the program comes with its own set of bug-reporting guidelines, read them. If the guidelines that come with the program contradict the guidelines in this essay, follow the ones that come with the program!

If you are not reporting a bug but just asking for help using the program, you should state where you have already looked for the answer to your question. ("I looked in chapter 4 and section 5.2 but couldn't find anything that told me if this is possible.") This will let the programmer know where people will expect to find the answer, so they can make the documentation easier to use.

... ... ...

Summary

[Oct 7, 2006] Insecurity in Open Source

So how come open-source software isn't saving the day?

BETTER ISN'T BEST. The answer is that it can—just not yet. The open-source development community must first graduate from Lake Wobegon University, where all of their software is just above average. They need to take lessons from the all-stars in closed-source software development. Likely, this means more thorough and rigorous end-to-end quality testing. Open source developers have a lot of pride in the quality of their individual contributions to code, but many proprietary organizations understand how to (and are required to) vouch for the high quality of the software system as a whole.

The irony is that our research shows that on average, open-source software is of higher quality than proprietary software. Indeed, open-source projects tend to clump together in the higher-quality range. Proprietary software applications scatter across the quality continuum, but the best ones tend to be considerably better than open source, and customers don't choose software based on industry averages.

The best of closed-source software is found in so-called mission-critical applications: things like jet engines, nuclear power plants, telephone systems, medical devices. These are things that simply can't fail, or people may die.

TESTING, TESTING. It's apparent from our research that the masters of proprietary software development know a few things—or simply utilize strict development practices—that could benefit the oftentimes brilliant (but perhaps less disciplined) open-source community.

This matters, since software is increasingly pervasive in business and government. And more and more of it is open source. As the lines of code pile up, applications become more complicated, more difficult to troubleshoot, and unfortunately, more likely to crash.

We see two trends in proprietary development that promise to make open-source software better, too. First, companies are paying a lot more attention to the lifecycle of their software. That means better end-to-end quality measurement that helps ensure the entire software system delivers on its promise. Second, there's a trend toward automated testing. As software gets more complicated, testing has to be done by machines as well as people. It's just too much for humans. The 2002 NIST study estimated that $22 billion of the cost of buggy software would vanish with better testing.

GOING MAINSTREAM? Because of the higher average quality of open-source software revealed by our research, we strongly believe it can cross the chasm into mainstream use. It offers too many advantages for both developers and consumers.

But in order for open-source software to become more prevalent in mission-critical applications, the open-source community must put more emphasis on industry best practices. We challenge this community to take a closer look at how the best proprietary software gets built and learn from that. Software quality and security are the most important factors in the choices that developers and companies make— not open-source vs. proprietary.

---

I scanned through the article; it didn't seem to mention how they tested the top proprietary software. I can well understand that there are a lot of bugs in open source code since it is written by humans. But humans also write the proprietary code. How did they test it?

---

They tested it by using a program that systematically scans code for common errors.

I don't know if the closed source statistics are online somewhere, but these are the open source statistics.
http://scan.coverity.com/ [coverity.com]

---

You are assuming that "a whole lot of people" actually check the code and submit patches for FOSS projects... My guess is that most testing, even of FOSS software, is done with the compiled program, not by reading the source code.

by teg (97890) on Saturday October 07, @02:00PM (#16349367)
(http://www.pvv.ntnu.no/~teg/)

Open source software is tested by a whole lot of people all over the world, and everyone is free to take the code and test it. Proprietary software, on the other hand, is tested by far fewer individuals.

That sounds rather idealistic... The coverage on OSS varies a lot. Most is not tested much, and the testing is not systematic and analyzed, but ad hoc. And if a bug is found, many just shrug and think of it as buggy software, but don't do more about it. There is a world of difference between the high standards of large projects like the linux kernel, apache and eclipse vs. the thousands of small projects found on freshmeat or just googling for something to scratch your itch.

  • XXX

    I figured I should chip in, since I'm a Dev that works in QA.

    Somebody please explain to me exactly what kind of software bug can be found by automatic scanning that isn't found by standard debugging and compile-time checks. If a computer can ascertain exactly what the programmer intended to do, why do we need programmers?

    Well first of all, you have to assume that all programmers even do the "standard debugging and compile-time checks". Even then, those checks are often hardly comprehensive. You can build some scanners that will catch rudimentary bugs that SHOULD have been caught, but were not. For example, assign things to null, test boundary conditions, etc. These are all things that should be part of a standard unit test that's delivered along with the code.

    Also, things like memory leaks can be difficult to pinpoint. That's where tools like BoundsChecker [compuware.com] are nifty.

    Considering that most software bugs are logic bugs (off by one, etc) that can't be directly seen in the code without actually, you know, RUNNING the program, I find it difficult to believe that AI has come to the point where it can guess the coder's intentions and infer the purpose of an application.

    No one is saying that a program or "AI" (as you call it) can find all bugs. But it can certainly be used to find some rather simple ones. Overall, though, I agree that you can't depend on running programs to catch bugs. You've got to have a solid QA department, which will use all the right tools to get the job done and try to maintain quality to the highest degree.

    I'm not saying that OSS can't do this at all. I certainly think it can be done, but it does take more process and structure. Not having worked on any OSS projects myself, I imagine the largest, most important code-bases have a lot of this in place to drive quality. But your smaller or even mid-size OSS projects may be lacking in that department (much in the same way smaller and mid-size dev houses lack a decent QA department).

    xxx:

    Somebody please explain to me exactly what kind of software bug can be found by automatic scanning that isn't found by standard debugging and compile-time checks. If a computer can ascertain exactly what the programmer intended to do, why do we need programmers?

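    BigDecimal one = new BigDecimal(1);   // java.math.BigDecimal
    BigDecimal two = new BigDecimal(2);

    one.add(two);   // result discarded: BigDecimal is immutable, add() returns a new object

    System.out.println(one);

    Guess what's printed? Similar errors are made if you use methods on java.lang.String like replace(target, replacement).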

    The simple answer to this is that they can't.

    That's a very uninformed opinion! Tools like http://pmd.sourceforge.net/ [sourceforge.net] have a database of over 100 "bug patterns" to check your code against. That does not mean all findings are true positives, but they definitely are bad coding practice and may end in a bug later if the code gets changed.

    There are lots of similar tools; check IBM's alphaWorks and developerWorks, for example.

    angel'o'sphere

  • [Dec 11, 2003] Myths Open Source Developers Tell Ourselves by chromatic

    ONLamp.com

    One persistent mis-feature of open source development is thoughtless mimicry, copying the behaviors of other projects without considering if they work or if there are better options under the current circumstances. At best, these practices are conventional wisdom, things that everybody believes even if nobody really remembers why. At worst, they're lies we tell ourselves.

    Perhaps "lies" is too strong a word. "Myths" is better; these ideas may not be true, but we don't intend to deceive ourselves. We may not even be dogmatic about them, either. Ask any experienced open source developer if his users really want to track the latest CVS sources. Chances are, he doesn't really believe that.

    In practice, though, what we do is more important than what we say. Here's the problem. Many developers act as if these myths are true. Maybe it's time to reconsider our ideas about open source development. Are they true today? Were they ever true? Can we do better?

    Some of these myths also apply to proprietary software development. Both proprietary and open models have much room to improve in reliability, accessibility of the codebase, and maturity of the development process. Other myths are specific to open source development, though most stem from treating code as the primary artifact of development (not binaries), not from any relative immaturity in its participants or practices.

    Not every open source developer believes every one of these ideas, either. Many experienced coders already have good discipline and well-reasoned habits. The rest of us should learn from their example, understanding when and why certain practices work and don't work.

    Publishing your Code Will Attract Many Skilled and Frequent Contributors

    Myth: Publicly releasing open source code will attract flurries of patches and new contributors.

    Reality: You'll be lucky to hear from people merely using your code, much less those interested in modifying it.

    While user (and developer) feedback is an advantage of open source software, it's not required by most licenses, nor is it guaranteed by any social or technical means. When was the last time you reported a bug? When was the last time you tried to fix a bug? When was the last time you produced a patch? When was the last time you told a developer how her work solved your problem?

    Some projects grow large and attract many developers. Many more projects have only a few developers. Most of the code in a given project comes from one or a few developers. That's not bad — most projects don't need to be huge to be successful — but it's worth keeping in mind.

    The problem may be the definition of success. If your goal is to become famous, open source development probably isn't for you. If your goal is to become influential, open source development probably isn't for you. Those may both happen, but it's far more important to write and to share good software. Success is also hard to judge by other traditional means. It's difficult to count the number of people using your code, for example.

    It's far more important to write and to share good software. Be content to produce a useful program of sufficiently high quality. Be happy to get a couple of patches now and then. Be proud if one or two developers join your project. There's your success.

    This isn't a myth because it never happens. It's a myth because it doesn't happen as often as we'd like.

    Feature Freezes Help Stability

    Myth: Stopping new development for weeks or months to fix bugs is the best way to produce stable, polished software.

    Reality: Stopping new development for a while to find and fix unknown bugs is fine. That's only a part of writing good software.

    The best way to write good software is not to write bugs in the first place. Several techniques can help, including test-driven development, code reviews, and working in small steps. All three ideas address the concept of managing technical debt: entropy increases, so take care of small problems before they grow large.

    Think of your project as your home. If you put things back when you're done with them, take out the trash every few days, and generally keep things in order, it's easy to tidy up before having friends over. If you rush around in the hours before your party, you'll end up stuffing things in drawers and closets. That may work in the short term, but eventually you'll need something you stashed away. Good luck.

    By avoiding bugs where possible, keeping the project clean and working as well as possible, and fixing things as you go, you'll make it easier for users to test your project. They'll probably find smaller bugs, as the big ones just won't be there. If you're lucky (the kind of luck attracted by clear thinking and hard work), you'll pick up ideas for avoiding those bugs next time.

    Another goal of feature freezes is to solicit feedback from a wider range of users, especially those who use the code in their own projects. This is a good practice. At best, only a portion of the intended users will participate. The only way to get feedback from your entire audience is to release your code so that it reaches as many of them as possible.

    Many of the users you most want to test your code before an official release won't. The phrase "stable release" has special magic that "alpha," "beta," and "prerelease" lack. The best way to get user feedback is to release your code in a stable form.

    Make it easy to keep your software clean, stable, and releasable. Make it a priority to fix bugs as you find them. Seek feedback during development, but don't lose momentum for weeks on end as you try to convince everyone to switch gears from writing new code to finding and closing old bugs.

    This isn't a myth because it's bad advice. It's only a myth because there's better advice.

    The Best Way to Learn a Project is to Fix its Bugs and Read its Code

    Myth: New developers interested in the project will best learn the project by fixing bugs and reading the source code.

    Reality: Reading code is difficult. Fixing bugs is difficult and probably something you don't want to do anyway. While giving someone unglamorous work is a good way to test his dedication, it relies on unstructured learning by osmosis.

    Learning a new project can be difficult. Given a huge archive of source code, where do you start? Do you just pick a corner and start reading? Do you fire up the debugger and step through? Do you search for strings seen in the user interface?

    While there's no substitute for reading the code, using the code as your only guide to the project is like mapping the California coast one pebble at a time. Sure, you'll get a sense of all the details, but how will you tell one pebble from the next? It's possible to understand a project by working your way up from the details, but it's easier to understand how the individual pieces fit together if you've already seen them from ten thousand feet.

    Writing any project over a few hundred lines of code means creating a vocabulary. Usually this is expressed through function and variable names. (Think of "interrupts," "pages," and "faults" in a kernel, for example.) Sometimes it takes the form of a larger metaphor. (Python's Twisted framework uses a sandwich metaphor.)

    Your project needs an overview. This should describe your goals and offer enough of a roadmap so people know where development is headed. You may not be able to predict volunteer contributions (or even if you'll receive any), but you should have a rough idea of the features you've implemented, the features you want to implement, and the problems you've encountered along the way.

    If you're writing automated tests as you go along (and you certainly should be), these tests can help make sense of the code. Customer tests, named appropriately, can provide conceptual links from working code to required features.
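
    As a rough illustration of tests whose names read like a feature list, here is a minimal sketch in C. The function trim_trailing and every test name are hypothetical, invented for this example; the only point is that each test name states a requirement a newcomer can read before opening the implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature under test: strip trailing whitespace in place. */
static void trim_trailing(char *s)
{
    size_t n = strlen(s);
    while (n > 0 && (s[n - 1] == ' ' || s[n - 1] == '\t' || s[n - 1] == '\n'))
        s[--n] = '\0';
}

/* Each test name states one required behavior. */
static void test_trailing_whitespace_is_removed(void)
{
    char s[] = "hello  \t\n";
    trim_trailing(s);
    assert(strcmp(s, "hello") == 0);
}

static void test_leading_whitespace_is_preserved(void)
{
    char s[] = "  hi ";
    trim_trailing(s);
    assert(strcmp(s, "  hi") == 0);
}

static void test_empty_string_stays_empty(void)
{
    char s[] = "";
    trim_trailing(s);
    assert(s[0] == '\0');
}

/* Runs every test and returns how many were executed. */
static int run_all_tests(void)
{
    test_trailing_whitespace_is_removed();
    test_leading_whitespace_is_preserved();
    test_empty_string_stays_empty();
    return 3;
}
```

    Calling run_all_tests() from main aborts on the first failed assertion, so a clean exit means all three stated behaviors hold.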

    Keep your overview and your tests up-to-date, though. Outdated documentation can be better than no documentation, but misleading documentation is, at best, annoying and unpleasant.

    This isn't a myth because reading the code and fixing bugs won't help people understand the project. It's a myth because the code is only an artifact of the project.

    Packaging Doesn't Matter

    Myth: Installation and configuration aren't as important as making the source available.

    Reality: If it takes too much work just to get the software working, many people will silently quit.

    Potential users become actual users through several steps. They hear about the project. Next, they find and download the software. Then they must brave the installation process. The easier it is to install your software, the sooner people can play with it. Conversely, the more difficult the installation, the more people will give up, often without giving you any feedback.

    Granted, you may find people who struggle through the installation, report bugs, and even send in patches, but they're relatively few in number. (I once wrote an installation guide for a piece of open source software and then took a job working on the code several months later. Sometimes it's worth persisting.)

    Difficulties often arise in two areas: managing dependencies and creating the initial configuration. For a good example of installation and customization, see Brian Ingerson's Kwiki. The amount of time he put into making installation easier has paid off by saving many users hours of customization. Those savings, in turn, have increased the number of people willing to continue using his code. It's so easy to use, why not set up a Kwiki for every good idea that comes along?

    It's OK to expect that mostly programmers will use development tools and libraries. It's also OK to assume that people should skim the details in the README and INSTALL files before trying to build the code. If you can't easily build, test, and install your code on another machine, though, you have no business releasing it to other people.

    It's not always possible, nor advisable, to avoid dependencies. Complex web applications likely require a database and a web server with special configuration (mod_perl, mod_php, mod_python, or a Java stack). Meta-distributions can help. Apache Toolbox can take out much of the pain of Apache configuration. Perl bundles can make it easier to install several CPAN modules. OS packages (RPMs, debs, ebuilds, ports, and packages) can help.

    It takes time to make these bundles and you might not have the hardware, software, or time to write and test them on all possible combinations. That's understandable; source code is the real compatibility layer on the free Unix platforms anyway.

    At a minimum, however, you should make your dependencies clear. Your configuration process should detect as many dependencies as possible without user input. It's OK to require more customization for more advanced features. However, users should be able to build and to install your software without having to dig through a manual or suck down the latest and greatest code from CVS for a dozen other projects.

    This isn't a myth because people really believe software should be difficult to install. It's a myth because many projects don't make it easier to install.

    It's Better to Start from Scratch

    Myth: Bad or unappealing code or projects should be thrown away completely.

    Reality: Solving the same simple problems again and again wastes time that could be applied to solving new, larger problems.

    Writing maintainable code is important. Perhaps it's the most important practice of software development. It's secondary, though, to solving a problem. While you should strive for clean, well-tested, and well-designed code that's reasonably easy to modify as you add features, it's even more important that your code actually works.

    Throwing away working code is usually a mistake. This applies to functions and libraries as well as entire programs. Sometimes it seems as if most of the effort in writing open source software goes to creating simple text editors, weblogs, and IRC clients that will never attract more than a handful of users.

    Many codebases are hard to read. It's hard to justify throwing away the things the code does well, though. Software isn't physical — it's relatively easy to change, even at the design level. It's not a building, where deciding to build four stories instead of two means digging up the entire foundation and starting over. Chances are, you've already solved several problems that you'd need to rediscover, reconsider, re-code, and re-debug if you threw everything away.

    Every new line of code you write has potential bugs. You will spend time debugging them. Though discipline (such as test-driven development, continual code review, and working in small steps) mitigates the effects, it doesn't compare in effectiveness to working on already-debugged, already-tested, and already-reviewed code.

    Too much effort is spent rewriting the simple things and not enough effort is spent reusing existing code. That doesn't mean you have to put up with bad (or simply different) ideas in the existing code. Clean them up as you go along. It's usually faster to refine code into something great than to wait for it to spring fully formed and perfect from your mind.

    This isn't a myth because rewriting bad code is wrong. It's a myth because it can be much easier to reuse and to refactor code than to replace it wholesale.

    Programs Suck; Frameworks Rule!

    Myth: It's better to provide a framework for lots of people to solve lots of problems than to solve only one problem well.

    Reality: It's really hard to write a good framework unless you're already using it to solve at least one real problem.

    Which is better, writing a library for one specific project or writing a library that lots of projects can use?

    Software developers have long pursued abstraction and reuse. These twin goals have driven the adoption of structured programming, object orientation, and modern aspects and traits, though not exactly to roaring successes. Whether proprietary code, patent encumbrances, or not-invented-here stubbornness, there may be more people producing "reusable" code than actually reusing code.

    Part of the problem is that it's more glamorous (in the delusive sense of the word) to solve a huge problem. Compare "Wouldn't it be nice if people had a fast, cross-platform engine that could handle any kind of 3D game, from shooter to multiplayer RPG to adventure?" to "Wouldn't it be nice to have a simple but fun open source shooter?"

    Big ambitions, while laudable, have at least two drawbacks. First, big goals make for big projects — projects that need more resources than you may have. Can you draw in enough people to spend dozens of man-years on a project, especially as that project only makes it possible to spend more time making the actual game? Can you keep the whole project in your head?

    Second, it's exceedingly difficult to know what is useful and good in a framework unless you're actually using it. Is one particular function call awkward? Does it take more setup work than you need? Have you optimized for the wrong ideas?

    Curiously, some of the most portable and flexible open source projects today started out deliberately small. The Linux kernel originally ran only on x86 processors. It's now impressively portable, from embedded processors to mainframes and super-computer clusters. The architecture-dependent portions of the code tend to be small. Code reuse in the kernel grew out of refining the design over time.

    Solve your real problem first. Generalize after you have working code. Repeat. This kind of reuse is opportunistic.

    This isn't a myth because frameworks are bad. This is a myth because it's amazingly difficult to know what every project of a type will need until you have at least one working project of that type.

    I'll Do it Right *This* Time

    Myth: Even though your previous code was buggy, undocumented, hard to maintain, or slow, your next attempt will be perfect.

    Reality: If you weren't disciplined then, why would you be disciplined now?

    Widespread Internet connectivity and adoption of Free and Open programming languages and tools make it easy to distribute code. On one hand, this lowers the barriers for people to contribute to open source software. On the other hand, the ease of distribution makes finding errors less crucial. This article has been copyedited, but not to the same degree as a print book; it's very easy to make corrections on the Web.

    It's very easy to put out code that works, though it's buggy, undocumented, slow, or hard to maintain. Of course, imperfect code that solves a problem is much better than perfect code that doesn't exist. It's OK (and even commendable) to release code with limitations, as long as you're honest about its limitations — though you should remove the ones that don't make sense.

    The problem is putting out bad code knowingly, expecting that you'll fix it later. You probably won't. Don't keep bad code around. Fix it or throw it away.

    This may seem to contradict the idea of not rewriting code from scratch. In conjunction, though, both ideas summarize to the rule of "Know what's worth keeping." It's OK to write quick and dirty code to figure out a problem. Just don't distribute it. Clean it up first.

    Develop good coding habits. Training yourself to write clean, sensible, and well-tested code takes time. Practice on all code you write. Getting out of the habit is, unfortunately, very easy.

    If you find yourself needing to rewrite code before you publish it, take notes on what you improve. If a maintainer rejects a patch over cleanliness issues, ask the project for suggestions to improve your next attempt. (If you're the maintainer, set some guidelines and spend some time coaching people along as an investment. If it doesn't immediately pay off to your project, it may help along other projects.) The opportunity for code review is a prime benefit of participating in open source development. Take advantage of it.

    This isn't a myth because it's impossible to improve your coding habits. This is a myth because too few developers actually have concrete, sensible plans to improve.

    Warnings Are OK

    Myth: Warnings are just warnings. They're not errors and no one really cares about them.

    Reality: Warnings can hide real problems, especially if you get used to them.

    It's difficult to design a truly generic language, compiler, or library partially because it's impossible to imagine all of its potential uses. The same rule applies to reporting warnings. While you can detect some dangerous or nonsensical conditions, it's possible that users who really know what they are doing should be able to bypass those warnings. In effect, it's sometimes very useful to be able to say, "I realize this is a nasty hack, but I'm willing to put up with the consequences in this one situation."

    Other times, what you consider a warnable or exceptional condition may not be worth mentioning in another context. Of course, the developer using the tool could just ignore the warnings, especially if they're nonfatal and are easily shunted off elsewhere (even if it is /dev/null). This is a problem.

    When the "low oil pressure" or "low battery" light comes on in a car, the proper response is to make sure that everything is running well. It's possible that the light or a sensor is malfunctioning, but ignoring the real problem — whether bad light or oil leak — may exacerbate further problems. If you assume that the light has malfunctioned but never replace it, how will you know if you're really out of oil?

    Similarly, an error log filled with trivial, fixable warnings may hide serious problems. Any well-designed tool generates warnings for a good reason: you're doing something suspicious.

    When possible, purge all warnings from your code. If you expect a warning to occur — and if you have a good reason for it — disable it in the narrowest possible scope. If it's generated by something the user does and if the user is privy to the warning, make it clear how to avoid that condition.
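
    One way to disable a warning in the narrowest possible scope, sketched here for GCC and Clang only, is to wrap a single function in diagnostic pragmas. The callback type visit_fn and the functions count_positive and visit_all are hypothetical names made up for this illustration, and the example assumes the unused-parameter warning (typically enabled via -Wextra) is the one being silenced; other compilers use different mechanisms.

```c
#include <stddef.h>

/* Hypothetical callback type: the caller fixes the signature, so an
   implementation may have a good reason to ignore `context`. */
typedef int (*visit_fn)(int value, void *context);

/* Suppress -Wunused-parameter for this one function only.
   `push` saves the current diagnostic state and `pop` restores it,
   so the warning stays active for the rest of the file. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
static int count_positive(int value, void *context)
{
    return value > 0 ? 1 : 0;   /* `context` is deliberately unused */
}
#pragma GCC diagnostic pop

/* Applies `fn` to every element and sums the results. */
static int visit_all(const int *values, size_t n, visit_fn fn, void *context)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += fn(values[i], context);
    return total;
}
```

    Keeping the pragma pair tight around one function means a genuinely forgotten parameter elsewhere in the file still produces a warning.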

    Running a program that spews crazy font configuration questions and null widget access messages to the console is noisy and incredibly useless to anyone who'd rather run your software than fix your mess. Besides that, it's much easier to dig through error logs that only track real bugs and failures. Anything that makes it easier to find and fix bugs is nice.

    This isn't a myth because people really ignore warnings. It's a myth because too few people take the effort to clean them up.

    End Users Love Tracking CVS

    Myth: Users don't mind upgrading to the latest version from CVS for a bugfix or a long-awaited feature.

    Reality: If it's difficult for you to provide important bugfixes for previous releases, your CVS tree probably isn't very stable.

    It's tricky to stay abreast of a project's latest development sources. Not only do you have to keep track of the latest check-ins, you may have to guess when things are likely to spend more time working than crashing and build binaries right then. You can waste a lot of time watching compiles fail. That's not much fun for a developer. It's even less exciting for someone who just wants to use the software.

    Building software from CVS also likely means bypassing your distribution's usual package manager. That can get tangled very quickly. Try to keep required libraries up-to-date for only two applications you compiled on your own for a while. You'll gain a new appreciation for people who make and test packages.

    There are two main solutions to this trouble.

    This isn't a myth because developers believe that development moves too fast for snapshots. It's a myth because developers aren't putting out smaller, more stable, more frequent releases.

    Common Sense Conclusions

    Again, these aren't myths because they're never true. There are good reasons to have a feature freeze. There are good reasons to invite new developers to get to know a project by looking through small or simple bug reports. Sometimes, it does make sense to write a framework. They're just not always true.

    It's always worth examining why you do what you do. What prevents you from releasing a new stable version every month or two? Can you solve that problem? Solve it. Would building up a good test suite help you cut your bug rates? Build it. Can you refactor a scary piece of code into something saner in a series of small steps? Refactor it.

    Making your source code available to the world doesn't make all of the problems of software development go away. You still need discipline, intelligence, and sometimes, creative solutions to weird problems. Fortunately, open source developers have more options. Not only can we work with smart people from all over the world, we have the opportunity to watch other projects solve problems well (and, occasionally, poorly).

    Learn from their examples, not just their code.

    chromatic is the technical editor of the O'Reilly Network and the co-author of Perl Testing: A Developer's Notebook.

    General Programming Concepts: Writing and Debugging Programs ... -- a good book with a large debugging section. It is AIX-oriented but applicable to any Unix.

    list.unix-haters

    From: FG - Usenet Repost 1999-01-28 (blackhart@mindspring.com)
    Subject: Programming errors
    Newsgroups: list.unix-haters, misc.test
    Date: 1999/01/31
    Date: Thu, 28 Jan 1999 18:35:19 -0800
    From: FG
    Subject: Programming errors
    
    I am reminded again of how shaky the software world is.

    Someone has been making a major effort to clean up the code in the FreeBSD tree. In two days he has reported three instances of the following common C error:

    if (x = y)
    instead of
    if (x == y)

    This is in running code, in an OS whose developers consider stability to be one of its major advantages over other offerings.

    He also reported some missing breaks in a switch statement---many of us remember what THAT error did not too long ago. [RISKS-9.61 to 71. Trojan horse switches in midstream? PGN]
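
    For readers who have not met these two errors, the sketch below reproduces both in isolation. The function names are hypothetical and the code is not taken from the FreeBSD tree; it only shows that both mistakes are valid C that quietly does the wrong thing. GCC and Clang report the assignment-in-condition case when -Wall is enabled.

```c
#include <stdbool.h>

/* Buggy: `=` assigns y to x, so the test is really "is y nonzero?". */
static bool same_buggy(int x, int y)
{
    if (x = y)
        return true;
    return false;
}

/* Fixed: `==` compares the two values. */
static bool same_fixed(int x, int y)
{
    if (x == y)
        return true;
    return false;
}

/* Buggy: case 1 has no break, so control runs on into case 2. */
static int weight_buggy(int kind)
{
    int w = 0;
    switch (kind) {
    case 1:
        w = 10;
        /* break is missing here */
    case 2:
        w = 20;
        break;
    default:
        w = -1;
        break;
    }
    return w;
}

/* Fixed: every case ends with its own break. */
static int weight_fixed(int kind)
{
    int w = 0;
    switch (kind) {
    case 1:
        w = 10;
        break;
    case 2:
        w = 20;
        break;
    default:
        w = -1;
        break;
    }
    return w;
}
```

    With these definitions, same_buggy(1, 2) returns true although the arguments differ, same_buggy(0, 0) returns false although they are equal, and weight_buggy(1) returns 20 instead of 10.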

    [Jan 24, 2004] Editorials - Providing Good Feedback for Bug Reporters by Tracey Clark

    January 24, 2004 | freshmeat.net

    Providing Good Feedback for Bug Reporters

    A comment on a bug I submitted recently spurred me to provide some feedback from an application user's perspective on bug reports. There are ways of responding to a bug report that encourage the types of responses that are helpful to developers, and there are ways of responding that only produce anger and frustration, without getting anything fixed. My hope is to encourage good communication between bug reporters and developers to enable better, quicker bugfixes.


    Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.


    My Background

    I have been reporting bugs for a few years, on various pieces of software. I've had both positive and negative experiences with the responses I've received from developers. I come from a professional background of using computers, supporting computers, change control, and basic programming, so I've seen problem reporting from a few angles.

    To begin, I'll outline responses I have received or seen that were unhelpful. Following that will be responses I have seen that were helpful. Finally, I'll outline some general guidelines which I feel promote good communication between developers and reporters.

    Unhelpful Responses
    It's not our product, it's some other software causing the problem (with no debugging or troubleshooting).

    Trying to shift blame with no investigation does nobody and nothing any good. It only makes the developers look like they don't want to take responsibility for a possible bug or don't want to do any troubleshooting. Some reported bugs do wind up really being a problem with other software. If properly investigated, you've got the data to back the position that other software caused the bug. Posting the exact reason it's a problem with other software to the report is great feedback for the bug reporters, who can then report it to the correct place. They also know you cared enough about their bug to help them get it to the right place, making them more likely to report future bugs.

    I can't reproduce it/works for me (with no other qualifications).

    That doesn't help anyone, and it sounds like another "I don't wanna do work" response. The fact that person B cannot reproduce a bug does not negate the fact that person A can always reproduce it. Helpful information to add to a comment like this would be: Did you use the same version on the same OS, following exactly the same steps (using menus as opposed to an icon), etc.? The programmers may have done things exactly the same way, but maybe they didn't. The bug reporters don't know unless they are told. Also helpful are comments about what you're trying to do to reproduce the problem, such as "I'm continuing to try this on another machine" or "I'm working with a few people to see if we can reproduce this bug". One failed attempt to reproduce the bug with no further work isn't working the bug, it's akin to lip service. Tell the reporters instead what other troubleshooting information they need to give you to track the bug down.

    The two responses above are unfortunately so common in the IT world, and so well-known by users (who aren't fooled), that a company made an IT-style Magic 8-Ball. The answers include:

    &!*#*&... We've said a million times not to report x because blah blah...

    (And other not-so-subtle "You're stupid for having contacted us" or abusive types of responses).

    A polite pointer to the FAQ or appropriate documentation is more appropriate. Telling people they shouldn't have written about one bug will discourage them from writing about any bug.

    No response (when assigned).

    Hello? Is anyone watching this thing? Even "Bug in queue, will get to this" would be helpful. (I like the note in Mozilla's Bugzilla which says "This means the bug is awaiting triage..."). Getting no response may lead reporters to feel that their time is being wasted.

    Helpful Responses

    It's always appropriate to thank someone for taking the time to report a perceived problem, even if it turns out not to be a problem with the software they thought was the problem. Even a badly-written bug report can be helpful with a bit of work (in most cases). The reporter can always be given a link to a "How to report bugs well" FAQ or your project's bug guidelines. Politeness is good. Happy bug reporters help us troubleshoot our software. They are a valuable resource to the Open Source community. Pissing them off means that you don't get as much testing from them. A simple "thank you" and a kind attitude go a long way to make someone feel respected and happy.

    I can't reproduce the problem, but here's something you can do to provide me with more information....

    Letting people know what else they can do to clarify a bug or provide more troubleshooting information lets them know that their response is not wasted, that the developer cares enough about the problem to want more information and actually wants to get the bug fixed.

    Bug confirmed, working on this...

    You may not yet know the cause of the problem, and don't have time to write much. A simple note to say you've duplicated the issue and you're looking into it lets the users know their bug report has been received and something is being done. This improves their confidence that their reports mean something.

    Suggestions

    Better communication and more questions rather than snarky answers make for happier bug reporters and better bug troubleshooting. Here are a few other, general suggestions that go toward improving communication:

    Have a written, easily findable bug submission procedure for your project.

    People who report bugs may not know what information is helpful for the developers. Your bug submission guidelines will tell them what that information is. You can't guarantee that everyone will read it or follow it. You can guarantee that you will get less of what you need in terms of helpful bug reports if you don't have one.

    Try to speak in terms the bug reporter can understand.

    Perhaps the meaning of "just run lsmod" is obvious to you, but it may not be obvious to the bug reporters. This doesn't make the reporters bad, nor does it justify getting angry at them. No one is born knowing all computer commands. Not everyone has the same level of expertise in the same areas as a developer. Communicating with the reporters in ways they understand means you will get more helpful answers from them. Sometimes, they can run a command they didn't know before if only you tell them how to do it or point them at the appropriate FAQ.

    Don't respond in anger, respond when you are calm.

    People are responding to your request to submit bug reports, and they deserve to be treated with a basic amount of respect. Responding in anger will increase the likelihood that you will say something inappropriate or something that you may regret. Wait until you have a cool head to respond to comments.

    Respond with common courtesy, don't put people down.

    Again, if you're asking people to submit bugs, putting them down when they do is counter-productive. It will only make you and your project look bad. Treating people respectfully increases their willingness to help you get your bugs solved and your programs running more stably. This shouldn't have to be said, but there it is.

    There are many things we can do to improve communication in bug reporting. This article provides a few which can be used in most situations. By following practices of courtesy and good feedback, we can improve the rate at which bugs get fixed and encourage the volunteers waiting to help us in that process. The Open Source community as a whole can benefit.


    This article is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/1.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.


    Author's bio:

    Tracey Clark can be reached at grrliegeek@elenari.net.

    Troubleshooting Tips

    december.com

    Intro to Unix

    The purpose of this page is to give you some ideas to cope with a variety of problems that will occur when using Unix. To use Unix effectively, you need to not just know the syntax and features of the operating system, but you must develop an attitude of coping with problems and understanding a bit of the worldview that might help you use Unix better. Some of this advice may seem odd or humorous, but it all has the point of conveying some kinds of coping mechanisms that go beyond just knowing the syntax of the commands.

    1. Remain calm.
      Panic will cloud your ability to think through the problem and to come up with solutions. Slow down and calm down. Many times I see that problems arise when students try to type too quickly. Become familiar with your keyboard and how it works; type slowly if you have to. Remember that in Unix, you must type things exactly.
    2. Know when to quit.
      As you type a Unix command at the shell prompt, you will sometimes type the wrong symbol. You might try the backspace or delete keys. These may or may not work, because the backspace and delete keys are not always mapped to their functions correctly. This is particularly true when using telnet. If you get some funny or strange characters, cancel the command line. Do this by holding down the Control key (usually at the lower left-hand corner of a keyboard) and pressing the C key. That cancels the command. If you don't do this, you may create some strange file or have some strange effect. (Strange characters = possible strange effects.) Be particularly vigilant when you are creating a file name.
    3. Seek clarity.
      If your terminal acts "funny" or your vi editor session is "strange," try this:
      $ set term=vt100
      $ clear

      The first command tells the system to treat your terminal as an old standard type (vt100); this often clears up strange control characters or output that does not scroll up the screen. (The set term= form is C shell syntax; in a Bourne-type shell such as sh or bash, the equivalent is TERM=vt100; export TERM.) The clear command then adds some refreshment: it clears the screen of previous commands and output, which can help you focus on a new task or a new way of approaching a problem.

    4. Don't flirt with deadlines.
      In the real world, real deadlines mean either money lost or money gained. In the real world, if you miss a deadline, you may lose your company money and lose your job. If you meet a deadline, you may help your company succeed. If you start your work close to the deadline, you may run into trouble that you might not be able to resolve in time.
    5. Understand that Unix was written for humans and by humans and thus is a language, full of ambiguity, inconsistencies, history, and culture.

      Computers don't need an operating system--you do. Electronic computers just care about high and low voltages. Operating systems like Unix were written by humans for humans to make use of the resources of the computer. Humans are full of all kinds of interesting ideas and these are reflected in the Unix operating system and its flavors. Not all the sequences of commands and options are entirely consistent. There are inconsistencies in syntax from command to command. There are cultural artifacts throughout the operating system that came out of the "geek" culture that gave rise to it-- who else would come up with the "finger" command?
    6. Understand that Unix and computers are a mystery to be lived, not a problem to be solved.
      There will be errors and mistakes, and funny things will happen. You could spend the rest of your life analyzing exactly what gave rise to a particular error--or you could:
      • Take a deep breath and try it again
      • Log off and log back in
      • Try a different computer
      • Get a night's sleep and try tomorrow (but not if your deadline is tonight)

      The combination of hardware, networks, operating systems, and software gives rise to a high level of complexity. When problems arise, I have found it is best to try some troubleshooting techniques, then try something else, rather than trying to figure out exactly what went wrong (however, I do read error messages and look for any clues on the screen). Due to the nature of Unix systems, by the time you have found out what went wrong, the system (its hardware or software) will have been changed or "upgraded" (introducing new problems). This is particularly true, I have found, in academic computing systems.

    7. Speak in a soothing, kind voice to the computer.
      This is more for you than for the computer. If you get frustrated and type haphazardly at the keyboard, you are not going to get work done. If you don't read the screen and any error messages, you are going to keep repeating the same mistakes over and over. Thanking the computer for what it does helps you focus on the give-and-take interaction you have with the computer. The soothing voice calms you down also.
    8. Realize that there are many paths to getting things done.
      There are usually many ways to accomplish a task. Don't get too dogmatic about finding the one and only one way to do it, nor get too over-involved in figuring out every way to do something. Most Unix users develop a particular set of habits for using the commands that match their way of thinking. Don't be too concerned if you see others use a different style of Unix interaction--however, you might learn some new things from them. Don't be too insistent on continually pointing out, "you could also do that by..." (particularly if you are standing over their shoulder). You could spend many weeks pointing out alternative ways to do a task. Find a way that works and matches your thinking and then get your work done.
    9. Make use of your environment.
      Think like a castaway on a desert isle who has only what is at his or her disposal to survive. If you are on an Internet-connected computer, you have access to a range of online help resources, some of which are available right from the command line (although don't think too hard about the "man" pages--I don't think there are many humans alive who understand these fully). If you are on a computer which has a window system (such as Microsoft Windows or X Window System), bring up several windows at the same time:
      • A window that shows a Unix command prompt where you can type in commands (this would be the telnet client; in an X Window System environment, it would be an xterm)
      • A Web browser showing task information about what you are supposed to do (this would be the assignment description)
      • Another Web browser that shows reference information
      • Perhaps another Web browser that shows more or alternate reference information

      Keep all of these windows up while working until you get your task done. Don't be afraid to bring up another Web browser, and keep the Web browser showing task information or reference information up. I've seen many students go through the trouble of bringing up a Web browser, following links to the assignment description, looking up a requirement for the assignment, then closing that Web browser, only to have to repeat the entire process when they need to look at the assignment description or reference information again. Don't be afraid to have as many windows open at the same time showing different information to get your work done.

    10. Unix gets work done.
      You may think that a command-line operating system like Unix is somehow outdated. After all, a telnet session opened to a Unix shell is pretty plain visually. It's gotta be outdated, right? I have found that this is not the case. When you need to accomplish a large amount of precise work, there is nothing like pipes, filters, shell programming, and computer programs to accomplish tasks. These can be implemented very efficiently.

    Compiler and tools tricks

    Textbooks are full of good advice:

    Use other aids as well. Explaining your code to someone else (even a teddy bear) is wonderfully effective. Use a debugger to get a stack trace. Use some of the commercial tools that check for memory leaks, array bounds violations, suspect code and the like. Step through your program when it has become clear that you have the wrong picture of how the code works.
    --- Brian W. Kernighan, Rob Pike, The Practice of Programming, 1999 (Chapter 5: Debugging)

    Enable every optional warning; view the warnings as a risk-free, high-return investment in your program. Don't ask, "Should I enable this warning?" Instead ask, "Why shouldn't I enable it?" Turn on every warning unless you have an excellent reason not to.
    --- Steve Maguire, Writing Solid Code, 1993

    Sounds familiar? But with which option? This page tries to answer that kind of question.

    Constructive feedback is welcome.


    The Linux kernel is no place for 'self-expressive fancy' by Morton

    Computerworld

    Q: Any advice for budding developers?

    a) Fix bugs. I spent the first 18 months of my involvement with the kernel just working bugs with people on the mailing list. As a consequence I learned a good deal about a large amount of the kernel. It is a great way to pick things up, and you're doing useful things at the same time.

    b) Switch off your ego. Don't be rude to people. Learn to give in. Learn to change your ways and perceptions to match those of the project which you are working in.

    Sometimes it is difficult, and sometimes you end up believing that opportunities have been lost. But in the long run, such sacrifices in the interest of the larger project are for the best.

    m4 Macro Processor Overview

    General Programming Concepts Writing and Debugging Programs

    This chapter provides information about the m4 macro processor, which is a front-end processor for any programming language being used in the operating system environment.

    The m4 macro processor is useful in many ways. At the beginning of a program, you can define a symbolic name or symbolic constant as a particular string of characters. You can then use the m4 program to replace unquoted occurrences of the symbolic name with the corresponding string. Besides replacing one string of text with another, the m4 macro processor provides the following features:

    The m4 macro processor processes strings of letters and digits called tokens. The m4 program reads each alphanumeric token and determines if it is the name of a macro. The program then replaces the name of the macro with its defining text, and pushes the resulting string back onto the input to be rescanned. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.

    The m4 program provides built-in macros such as define. You can also create new macros. Built-in and user-defined macros work the same way.
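    The scan, replace and rescan cycle described above can be sketched in a few lines of Python. This is only a toy model, not m4 itself: it assumes macros without arguments, ignores quoting and built-ins such as define, and the names expand, macros and max_expansions are made up for the illustration.

```python
import re

# A name is a letter or underscore followed by letters, digits or underscores.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand(text, macros, max_expansions=1000):
    """Expand macro names found in text, rescanning every replacement."""
    output = []
    pending = text              # input that still has to be scanned
    expansions = 0
    while pending:
        match = NAME.match(pending)
        if match is None:
            # Not the start of a name: copy one character through unchanged.
            output.append(pending[0])
            pending = pending[1:]
            continue
        name, rest = match.group(0), pending[match.end():]
        if name in macros:
            expansions += 1
            if expansions > max_expansions:
                raise RuntimeError("expansion limit reached (recursive macro?)")
            # Push the defining text back onto the input so it is rescanned.
            pending = macros[name] + rest
        else:
            output.append(name)
            pending = rest
    return "".join(output)


if __name__ == "__main__":
    macros = {"GREETING": "hello NAME", "NAME": "world"}
    print(expand("GREETING!", macros))   # prints: hello world!
```

    Because the defining text goes back onto the input, NAME is expanded even though it only appears inside the definition of GREETING. The expansion limit is a guard added for the sketch, so that a macro defined in terms of itself stops with an error instead of looping forever.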

    Debugging in OSS Always Faster

    Slashdot

    The "many eyes" myth (Score:5, Insightful)
    by Anonymous Brave Guy (457657) on Friday June 20, @08:11PM (#6258980)

    A more likely explanation is the 'many eyes' that can review the code.

    Many eyes can. How many actually do? Unless you're talking about the really big projects, probably very few indeed -- one, I suspect, in many cases.

    It's not fair to cite mainstream developments like Linux or Mozilla as the way all open source is any more than it's fair to cite Microsoft's history on things like security and reliability as the way all closed source is.

    To quote Gene Spafford... (Score:5, Insightful)
    by DesScorp (410532) on Friday June 20, @09:35PM (#6259343)

    " Not always. A more likely explanation is the 'many eyes' that can review the code."

    I went to a speech by Gene Spafford here a few years ago, when the subject of Linux code quality versus other systems (especially MS) came up. Someone mentioned Eric Raymond's "Thousand Eyeballs" theory, that more people looking at the code ensured better quality.

    Spaff responded "that does no good if those thousand eyeballs are looking at things like networking your toaster instead of quality and security".

    I don't think this point is emphasized enough. It's not enough that lots of people are looking at the code. You need lots of people with training, experience, and an eye for problems to look at the code. He pointed out that one of the biggest problems in development is that while people can learn C from a book, and even get good at it, they don't learn proper software engineering techniques, philosophies, and debugging skills that way.

    In short, simply being open source and having lots of developers isn't a solution in itself.

    Re:Possible explanation? (Score:4, Interesting)
    by jafac (1449) on Friday June 20, @07:10PM (#6258556)

    I work in an environment where we do Peer Reviews, and I've worked, in the past, in an environment where "if it compiles, ship it" - and I'll say that even if the Peers occasionally miss problems in the Review - the coder who has to present it to the Peers has a TOTALLY different attitude.

    I see code that's very carefully analyzed first, thoroughly commented, thoughtfully indented, module, class, and variable names, though generally longer, they make sense. People go out of their way to be elegant.

    I think that Peer Review is probably MORE important to overall quality of the end product, than developer experience. That's just my opinion, but after living the chaos that was a non-peer reviewed environment for 10 years, the attitudes, etc. there's really a huge difference.

    It's even better if the Review team reserves the right, by convention, to give the presenter a wedgie if they don't like their code.

    Mediocre Propaganda at Best, A Joke at Worst (Score:5, Insightful)
    by Gabe Garza (535203) on Friday June 20, @06:10PM (#6258191)

    I know this is Slashdot and the party line is "OSS Rules!," but this seems pushing it even for this audience.

    This was a paper written for a class on statistics. It was not a rigorous study. Their findings are based on a lot of assumptions. They have a very small sample set--they only test their model on Linux, fetchmail, and Mozilla. Many people, including myself, consider these the cream of the crop so far as OSS goes.

    Before you praise it, I urge you to actually read the paper. Don't be intimidated by it--FUD is FUD, even if it's mixed with a heavy dose of Greek letters and charts.

    Absolutely false!!! (Score:2)
    by si_brain (316476) on Friday June 20, @06:11PM (#6258194)

    Because gdb is so slow with a really big project.
    ( ...Symbol loading... )
    Trust me on this, I'm in this situation every day.

    I have a c++ (unmanaged) project to get done. (Score:3, Informative)
    by Sludge (1234) <sludge@t h r e e w ave.com> on Friday June 20, @11:07PM (#6259715)

    I develop with Cygwin and Emacs, but I compile and debug with Visual Studio 7.0. I believe that the Visual Studio debugger is unparalleled (lacking just Emacs integration! :) ), and there is nothing that can beat precompiled headers, a fast compiler (in the first place) and the visual debugging integration of Visual Studio.

    I then boot to Linux and port my code. I've been writing portable code for half a decade, so I know what I'm doing, more or less. But, I can get more work done with Visual Studio, faster.

    In case it makes a difference to your perception, I write end user apps, sometimes with heavy graphics requirements and GUI frontends.

    Due to the nature of my work, I can't rely on masses to test everything before I ship.

    Exploring processes with Truss: Part 1 By Sandra Henry-Stocker

    The ps command can tell you quite a few things about each process running on your system. These include the process owner, memory use, accumulated time, the process status (e.g., waiting on resources) and many other things as well. But one thing that ps cannot tell you is what a process is doing - what files it is using, what ports it has opened, what libraries it is using and what system calls it is making. If you can't look at source code to determine how a program works, you can tell a lot about it by using a procedure called "tracing". When you trace a process (e.g., truss date), you get verbose commentary on the process' actions. For example, you will see a line like this each time the program opens a file:

    open("/usr/lib/libc.so.1", O_RDONLY) = 4

    The text on the left side of the equals sign clearly indicates what is happening. The program is trying to open the file /usr/lib/libc.so.1 and it's trying to open it in read-only mode (as you would expect, given that this is a system library). The right side is not nearly as self-evident. We have just the number 4. Open is not a Unix command, of course, but a system call. That means that you can only use it within a program. Due to the nature of Unix, however, system calls are documented in man pages just like ls and pwd.

    To determine what this number represents, you can skip down in this column or you can read the man page. If you elect to read the man page, you will undoubtedly read a line that tells you that the open() function returns a file descriptor for the named file. In other words, the number, 4 in our example, is the number of the file descriptor referred to in this open call. If the process that you are tracing opens a number of files, you will see a sequence of open calls. With other activity removed, the list might look something like this:

    open("/dev/zero", O_RDONLY) = 3

    open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

    open("/usr/lib/libc.so.1", O_RDONLY) = 4

    open("/usr/lib/libdl.so.1", O_RDONLY) = 4

    open64("./../", O_RDONLY|O_NDELAY) = 3

    open64("./../../", O_RDONLY|O_NDELAY) = 3

    open("/etc/mnttab", O_RDONLY) = 4

    Notice that the first file handle is 3 and that file handles 3 and 4 are used repeatedly. The initial file handle is always 3. This indicates that it is the first file handle following those that are the same for every process that you will run - 0, 1 and 2. These represent standard in, standard out and standard error.

    The file handles shown in the example truss output above are repeated only because the associated files are subsequently closed. When a file is closed, the file handle that was used to access it can be used again.
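    The reuse of descriptor numbers is easy to observe from a program. The following is a minimal sketch, assuming a POSIX system where open() returns the lowest-numbered descriptor that is free; the helper name open_close_open is made up for the illustration.

```python
import os


def open_close_open(path):
    """Open path, close it, open it again; return both descriptor numbers."""
    first = os.open(path, os.O_RDONLY)
    os.close(first)                       # the number is free again
    second = os.open(path, os.O_RDONLY)   # lowest free number is handed out
    os.close(second)
    return first, second


if __name__ == "__main__":
    first, second = open_close_open(os.devnull)
    print(first, second)
```

    In a single-threaded program both numbers should be the same, which is the pattern of repeated 3s and 4s seen in the truss output.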

    The close calls include only the file handle, since the location of the file is known. A close call would, therefore, be something like close(3). One of the lines shown above displays a different response - Err#2 ENOENT. This "error" (the word is put in quotes because this does not necessarily indicate that the process is defective in any way) indicates that the file the open call is attempting to open does not exist. Read "ENOENT" as "No such file".
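    The same error can be provoked and inspected from a program. This is a small sketch rather than anything truss does internally; the function name try_open and the path used in the demo are hypothetical.

```python
import errno
import os


def try_open(path):
    """Return ('ok', fd) on success or ('error', errno_name) on failure."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # errno.errorcode maps the number (2 for ENOENT) to its symbolic name.
        return "error", errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok", fd


if __name__ == "__main__":
    # Hypothetical path that is assumed not to exist on the test machine.
    print(try_open("/no-such-dir-for-this-demo/ld.config"))
    print(errno.ENOENT)
```

    A program that probes for an optional configuration file, as the traced command does with /var/ld/ld.config, would simply treat the ENOENT result as "not configured" and carry on.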

    Some open calls place multiple restrictions on the way that a file is opened. The open64 calls in the example output above, for example, specify both O_RDONLY and O_NDELAY. Again, reading the man page will help you to understand what each of these specifications means and will present you with a list of other options as well.

    As you might expect, open is only one of many system calls that you will see when you run the truss command. Next week we will look at some additional system calls and determine what they are doing.

    Exploring processes with Truss: part 2 By Sandra Henry-Stocker

    While truss and its cousins on non-Solaris systems (e.g., strace on Linux and ktrace on many BSD systems) provide a lot of data on what a running process is doing, this information is only useful if you know what it means. Last week, we looked at the open call and the file handles that are returned by the call to open(). This week, we look at some other system calls and analyze what these system calls are doing. You've probably noticed that the nomenclature for system functions is to follow the name of the call with a set of empty parentheses, for example, open(). You will see this nomenclature in use whenever system calls are discussed.

    The fstat() and fstat64() calls obtain information about open files - "fstat" refers to "file status". As you might expect, this information is retrieved from the files' inodes, including whether or not you are allowed to read the files' contents. If you trace the ls command (i.e., truss ls), for example, your trace will start with lines that resemble these:

    1 execve("/usr/bin/ls", 0x08047BCC, 0x08047BD4) argc = 1

    2 open("/dev/zero", O_RDONLY) = 3

    3 mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xDFBFA000

    4 xstat(2, "/usr/bin/ls", 0x08047934) = 0

    5 open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

    6 sysconfig(_CONFIG_PAGESIZE) = 4096

    7 open("/usr/lib/libc.so.1", O_RDONLY) = 4

    8 fxstat(2, 4, 0x08047310) = 0

    ...

    28 lstat64(".", 0x080478B4) = 0

    29 open64(".", O_RDONLY|O_NDELAY) = 3

    30 fcntl(3, F_SETFD, 0x00000001) = 0

    31 fstat64(3, 0x0804787C) = 0

    32 brk(0x08057208) = 0

    33 brk(0x08059208) = 0

    34 getdents64(3, 0x08056F40, 1048) = 424

    35 getdents64(3, 0x08056F40, 1048) = 0

    36 close(3) = 0

    In line 31, we see a call to fstat64, but what file is it checking? The man page for fstat() and your intuition are probably both telling you that this fstat call is obtaining information on the file opened two lines before - "." or the current directory - and that it is referring to this file by its file handle (3) returned by the open64() call in line 29. Keep in mind that a directory is simply a file, though a different variety of file, so the same system calls are used as would be used to check a text file.
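    The open-then-fstat pattern can be reproduced directly. The sketch below assumes a POSIX system, where a directory can be opened read-only like any other file; describe_fd is a made-up helper name.

```python
import os
import stat


def describe_fd(path="."):
    """Open path read-only and classify it via fstat() on the descriptor."""
    fd = os.open(path, os.O_RDONLY)
    try:
        info = os.fstat(fd)       # status comes from the inode behind the fd
    finally:
        os.close(fd)
    if stat.S_ISDIR(info.st_mode):
        kind = "directory"
    elif stat.S_ISREG(info.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return kind, info.st_ino


if __name__ == "__main__":
    print(describe_fd("."))
```

    Note that fstat() is given only the descriptor number, never the name, which is why the traced call shows just the 3.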

    You will probably also notice that the first file opened in the trace is /dev/zero (see line 2). Most Unix sysadmins will immediately know that /dev/zero is a special kind of file - primarily because it is stored in /dev. And, if moved to look more closely at the file, they will confirm that the file that /dev/zero points to (it is itself a symbolic link) is a special character file. What /dev/zero provides to system programmers, and to sysadmins if they care to use it, is an endless stream of zeroes. This is more useful than might first appear.

    To see how /dev/zero works, you can create a 10M-byte file full of zeroes with a command like this:

    /bin/dd < /dev/zero > zerofile bs=1024 seek=10240 count=1

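    This command works well because it creates the needed file with only a few read and write operations; in other words, it is very efficient. The seek=10240 operand skips over the first 10240 output blocks of 1024 bytes, and count=1 then writes a single block, so the file ends up just over 10M bytes long.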

    You can verify that the file is zero-filled with od.

    # od -x zerofile

    0000000 0000 0000 0000 0000 0000 0000 0000 0000

    *

    50002000

    Each string of four zeros (0000) represents two bytes of data. The * on the second line of output indicates that all of the remaining lines are identical to the first.
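    The same file can be built and checked from a script. This is a rough Python equivalent of the dd and od steps, not a replacement for them; the function names and the block size constant are chosen for the illustration.

```python
import os
import tempfile

BLOCK = 1024


def make_zero_file(path, seek_blocks=10240):
    """Mimic dd bs=1024 seek=N count=1: skip N blocks, write one zero block."""
    with open(path, "wb") as out:
        out.seek(seek_blocks * BLOCK)    # skip ahead without writing anything
        out.write(bytes(BLOCK))          # one block of zero bytes
    return os.path.getsize(path)


def is_all_zero(path):
    """Read the file back in chunks and check that every byte is zero."""
    with open(path, "rb") as src:
        while chunk := src.read(64 * BLOCK):
            if chunk.count(0) != len(chunk):
                return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "zerofile")
        size = make_zero_file(path)
        print(size, oct(size), is_all_zero(path))
```

    The size comes out as 10241 blocks of 1024 bytes, that is 10486784 bytes, which in octal is the 50002000 that od prints as the final offset.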

    Looking back at the truss output above, we cannot help but notice that the first line of the truss output includes the name of the command that we are tracing. The execve() system call executes a process. The first argument to execve() is the name of the file from which the new process image is to be loaded. The mmap() call which follows maps the process image into memory. In other words, it directly incorporates file data into the process address space. The getdents64() calls on lines 34 and 35 are extracting information from the directory file - "dents" refers to "directory entries".
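    To get a feel for what mapping file data into the address space means, here is a minimal sketch using Python's mmap module. It maps an ordinary file rather than a program image, and read_via_mmap and the sample file name are hypothetical.

```python
import mmap
import os
import tempfile


def read_via_mmap(path):
    """Map a file read-only and return its contents as bytes."""
    with open(path, "rb") as f:
        # Length 0 means "map the whole file"; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view[:])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.bin")
        with open(path, "wb") as f:
            f.write(b"mapped bytes")
        print(read_via_mmap(path))
```

    Once the mapping exists, the file's bytes are reached by ordinary indexing rather than by read() calls, which is why a trace shows one mmap() instead of a series of reads.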

    The sequence of steps that we see at the beginning of the truss output - executing the entered command, opening /dev/zero, mapping memory and so on - looks the same whether you are tracing ls, pwd, date or restarting Apache. In fact, the first dozen or so lines in your truss output will be nearly identical regardless of the command you are running. You should, however, expect to see some differences between different Unix systems and different versions of Solaris.

    Viewing the output of truss, you can get a solid sense of how the operating system works. The same insights are available if you are tracing your own applications or troubleshooting third party executables.

    -------------------

    Sandra Henry-Stocker

    softpixel project trove MemCheckDeluxe

    About: MemCheck Deluxe is a memory usage tracker and leak finder. It allows developers to find memory leaks quickly, as well as providing some memory usage information.

    Changes: More documentation and comments and an example application were added. A bug where the last allocation would be omitted from the memory stats was fixed. free was made a lot faster, but this might break when using the pointer in free() or realloc() outside of mcd's scope.

    darkstat -- a generic tool, but still useful for debugging network programs.

    darkstat is a network traffic analyzer. It's basically a packet sniffer which runs as a background process on a cable/DSL router and gathers all sorts of useless but interesting statistics.

    I have a cable router at home that runs Linux and I like having some statistics about the data that's going through it.

    I'm a big fan of ntop and I've been using it for a long time. darkstat is an effort to create a smaller (in terms of memory footprint) and stabler ntop.

    http://www.ntop.org/

    darkstat works for me under Linux and FreeBSD. I've seen it compile on OpenBSD (thanks Galahad!) and Solaris (thanks trnepal!) machines through shell accounts I was provided. Daniel Bogan got it working on Mac OS X.

    darkstat-2.2.tar.gz - sources (185KB)
    ChangeLog
    Debian package homepage (search for latest)
    FreeBSD port

    Valgrind, an open-source memory debugger for x86-linux

    Programmers! Make your software Valgrind-clean. Test it with Valgrind and fix all problems Valgrind reports. This will give you some assurance that your code is free of a broad class of memory management errors. You may well find undiscovered bugs, and your code will probably be more stable as a result. It's good for your code, good for you and especially it's good for the people who use your code.

    Don't delay -- Valgrind today.

    Ok, that was a bad pun. But the rest is serious.


    Valgrind is a GPL'd tool to help you find memory-management problems in your programs. When a program is run under Valgrind's supervision, all reads and writes of memory are checked, and calls to malloc/new/free/delete are intercepted. As a result, Valgrind can detect problems such as:

    Valgrind tracks each byte of memory in the original program with nine status bits, one of which tracks addressability of that byte, while the other eight track the validity of the byte. As a result, it can detect the use of single uninitialised bits, and does not report spurious errors on bitfield operations.
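    The nine-bits-per-byte idea can be illustrated with a toy model. The class below is only a sketch of the bookkeeping, not Valgrind's implementation: the ShadowMemory name and its methods are invented here, and it complains as soon as undefined bits are read, which is a simplification.

```python
class ShadowMemory:
    """Toy model: one A (addressable) bit and eight V (valid) bits per byte."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.addressable = [False] * size   # the A bit of each byte
        self.valid = bytearray(size)        # the V bits: 1 = bit is defined

    def allocate(self, start, length):
        for addr in range(start, start + length):
            self.addressable[addr] = True
            self.valid[addr] = 0x00         # addressable but still undefined

    def release(self, start, length):
        for addr in range(start, start + length):
            self.addressable[addr] = False

    def write(self, addr, value, mask=0xFF):
        """Store the bits selected by mask and mark just those bits valid."""
        self._check(addr)
        self.data[addr] = (self.data[addr] & ~mask & 0xFF) | (value & mask)
        self.valid[addr] |= mask

    def read(self, addr, mask=0xFF):
        """Return the selected bits; complain if any of them is undefined."""
        self._check(addr)
        if (self.valid[addr] & mask) != mask:
            raise ValueError(f"use of uninitialised bits at address {addr}")
        return self.data[addr] & mask

    def _check(self, addr):
        if not self.addressable[addr]:
            raise MemoryError(f"invalid access at address {addr}")


if __name__ == "__main__":
    mem = ShadowMemory(16)
    mem.allocate(0, 4)
    mem.write(0, 0b1, mask=0b1)       # initialise a single bit of a bitfield
    print(mem.read(0, mask=0b1))      # reading that one bit is fine
```

    Tracking validity per bit is what lets the single-bit read succeed while a read of the whole byte would still be flagged.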

    You can use it to debug more or less any dynamically-linked ELF x86 executable, without modification, recompilation, or anything. If you want, Valgrind can start GDB and attach it to your program at the point(s) where errors are detected, so that you can poke around and figure out what was going on at the time.

    Valgrind works well enough to debug many large x86-linux applications. To give you some idea of the scale of programs it can run: most of KDE3, Gnome stuff, Mozilla, OpenOffice, MySQL, Opera, KOffice-1.2beta1, emacs-21.2, xemacs-21.5(--pdump), Netscape-4.78, Gcc, AbiWord, etc, etc. KDE3 was extensively valgrinded in the two months prior to the KDE 3.0 release. Valgrind is first and foremost a debugging tool for large, complex programs. It isn't a toy or a research vehicle.

    Valgrind contains built-in support for doing very detailed cache profiling. Since we already intercept every read and write, and have all the debugging information for your program close to hand, it's not too hard to also do a detailed simulation of your CPU's L1-D, L1-I and unified L2 caches. The supplied vg_annotate program will show your source code with these counts attached to every line. The cache arrangement of your CPU is auto-detected using the CPUID instruction. If that doesn't work or you want to override it, you can specify custom cache parameters on the command line. The manual contains full documentation of this new feature.

    The cache profiling aspects are the excellent hacking of Nick Nethercote. Please do try it out -- you just need to put cachegrind in front of your program invocations, rather than valgrind. We hope it will be a useful performance-analysis tool.

    If you plan to use the cache profiler, you might want to look at Josef Weidendorfer's amazing kcachegrind visualisation tool and call-graph extension. This makes it a lot easier to make sense of the overwhelming mass of numbers that cachegrind generates.

    Full documentation for version 1.0.0 is supplied, up to date as of 27 July 2002. The overly-curious may want to read detailed tech docs about how Valgrind is implemented.

    Also supplied are some hints and tips for KDE folks trying to get started with it.

    To use: you need an x86 machine running Linux kernel 2.2.X or 2.4.X and glibc 2.1.X or 2.2.X. That covers the vast majority of installed systems at present. I develop it on RedHat 7.2, and recent snapshots are also tested on RedHat 6.2. This means the chances are good that valgrind will build and work without problems on any non-ancient Linux installation.

    The 1.0.3 release is known to build and run on Red Hats 6.2, 7.2, 7.3, Limbo (7.3.92 beta), and SuSE 8.1beta6, although use on the last two is not recommended, because Valgrind currently emits spurious undefined-value errors on code generated by gcc's >= 3.1, and these last two are gcc-3.2 based, in effect.

    There are some limitations, the most significant of which are:

    Section 4 of the manual contains a complete list of limitations. Reading the manual is generally a good idea. I realise it's wildly optimistic of me to expect anyone to actually do so, but still ... one can but hope ... :-)

    If you have problems with Valgrind, don't suffer in silence. Mail me. It's under active development, and I'll do what I can to help you get started.

    Julian Seward, jseward@acm.org

    Memory Leaks

    Slashdot

    G3ck0G33k writes: "Is there any free software version/clone of Rational's programs PureCoverage and/or Purify? I have worked with both of them on fairly large projects (>150,000 lines of code) and they were great to work with.

    When the first runs of Purify found nearly fifty instances of minor memory leaks, I was deeply frustrated/impressed. A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget. Of course, the more kinds of leaks it may detect, the better. GeckoGeek" We had a similar question last year but there's no harm in seeing what the current answers are.

    Three Questions About Each Bug You Find

    This kind of procedure can be helpful, as it helps to see the "bigger picture". Actually this is a derivative of Polya's "How to Solve It", and reading the original would be a much better deal.
    1. Is this mistake somewhere else also? Look for other places in the code where the same pattern applies. Vary the pattern systematically to look for similar bugs.
    2. What next bug is hidden behind this one? Once you figure out how to fix the bug, you need to imagine what will happen once you fix it. The statement after the failing one may have a bug in it too, but the program never got that far before: or some other code may be entered for the first time as a result of your fix. Take a look at these untried statements and look for bugs in them.
    3. What should I do to prevent bugs like this? Ask how you can change your ways to make this kind of bug impossible by definition. By changing methods or tools, it's often possible to completely eliminate a whole class of failures instead of shooting the bugs down one by one.

    VISUAL DEBUGGING WITH DDD by Andreas Zeller

    If a debugger is a tool that lets you "see" what's going on in a program, then DDD is the tool that lets you see the most. Additional resources include vdebug.txt (listings).

    How to avoid Memory Leakage

    Errors and complex systems
    Any developer writing server applications, or other programs running continuously for long periods of time, knows how frustrating it can be with processes slowly allocating more and more memory until some other program finally crashes 3 days later after running out of memory. Memory leakage is probably one of the most difficult programming problems to solve, because you cannot easily search thousands of lines of code for a complex logical error that might cause a problem when some unlikely event occurs. If your application interacts with other programs as well, you might not even have any source code to search. If you are really unfortunate, a small error in your application could activate a buggy piece of code in some other program, and eventually that other program might crash the entire system after allocating all available memory. How are you supposed to find out what really happened?

    Debugging and Testing
    Writing computer programs is not a simple matter. Computers can execute millions of instructions in one second, but they still don't know what to do if you tell them to "draw a circle". Fortunately you can easily make any computer draw a circle by combining a number of simpler instructions. By using even more instructions you can make the computer do even more impressive things, like drawing a house or decompressing a live video stream from Mars. The only problem is, the more instructions you add, the more likely you are to make an error. With modern programs consisting of thousands or millions of lines of code, errors are pretty much unavoidable. This wouldn't be a problem at all if there were only syntactic errors, since these are easily detected by the compiler, but there are also logical errors (bugs) that will pass right through your compiler without a single warning. Many errors are not even serious enough to cause any problems on their own, but when you combine a few of these errors, you get to spend hours reading your own code, trying to figure out what it really does.

    So how can we find the errors without spending too many hours on every line of code? The answer is pretty obvious. We run the program! If there are any bugs, they will probably show up after running the program for a while. Although this is a very efficient strategy, there are some errors that don't affect the user-interface or any other observable part of the program directly. These errors are much harder to find, since it may take hours or even days before they have any effects on the user-interface. To find these errors without spending hours or days on each test-update cycle, we need to keep track of some more abstract properties of the program, like memory and cpu-usage. By monitoring these properties for some time, you will be able to find trends and predict problems long before you actually get an error message.
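    Watching memory for a trend can be automated. Here is a minimal sketch using Python's standard tracemalloc module; the leaky handle_request function, the sample_memory and looks_like_leak helpers and the 100,000 byte threshold are all assumptions made for the illustration.

```python
import tracemalloc

_cache = []   # hypothetical leak: grows on every request, never trimmed


def handle_request():
    _cache.append(bytearray(10_000))


def sample_memory(work, rounds=5, calls_per_round=100):
    """Run work repeatedly and record traced memory after each round."""
    tracemalloc.start()
    try:
        samples = []
        for _ in range(rounds):
            for _ in range(calls_per_round):
                work()
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
    finally:
        tracemalloc.stop()
    return samples


def looks_like_leak(samples, min_growth=100_000):
    """Flag a steady upward trend: every round adds at least min_growth bytes."""
    return all(later - earlier >= min_growth
               for earlier, later in zip(samples, samples[1:]))


if __name__ == "__main__":
    samples = sample_memory(handle_request)
    print(samples, looks_like_leak(samples))
```

    The threshold matters: the bookkeeping of the sampler itself adds a few bytes per round, so a check for any growth at all would flag perfectly healthy code.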

    Automating Repetitive Tasks in Visual Studio

    By Paul Kimmel - Published 7/30/2001
    Visual Studio.Net has added an expanded extensibility model and macros to the unified Visual Studio IDE. The macro tools allow you to quickly record repetitive tasks, enhancing productivity.

    Troubleshooters.Com

    Open source programmers stink at error handling

    Oct 25, 2001 | LinuxWorld

    Thanks to my very talented readers I've been able to start almost every recent column with a reader's PHP tip. I'm tempted to make it a regular feature, but with my luck the tips would stop rolling in the moment I made it official. So I want you to be aware that this week's tip is not part of any regular practice. It is purely coincidental that PHP tips appear in column after column. Now that I've jinx-proofed the column, I'll share the tip.

    Reader Michael Anderson wrote in with an alternative to using arrays to pass database information to PHP functions. As you may recall from the column Even more stupid PHP tricks, you can retrieve the results of a query into an array and pass that array to a function this way:

    <?PHP
    $result = mysql_query("select name, address from customer where cid=1");

    $CUST = mysql_fetch_array($result);
    do_something($CUST);

    function do_something($CUST) {
        echo $CUST["name"];
        echo $CUST["address"];
    }
    ?>

    Michael pointed out that you can also retrieve the data as an object and reference the fields as the object's properties. Here's the above example rewritten to use objects:

    <?PHP
    $result = mysql_query("select name, address from customer where cid=1");

    $CUST = mysql_fetch_object($result);
    do_something($CUST);

    function do_something($CUST) {
        echo $CUST->name;
        echo $CUST->address;
    }
    ?>

    I can't help but agree with Michael that this is a preferable way to handle the data, but only because it feels more natural to me to point to an object property than to reference an element of an array using the string name or address. It's purely a personal preference, probably stemming from habits I learned using C++.
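
    The same choice between key access and property access exists outside PHP. As a rough analogue only, the sketch below uses Python's standard sqlite3 module in place of MySQL; the table contents and the helper names make_db, fetch_as_mapping and fetch_as_object are made up for illustration. Both helpers return None when no row matches, so the caller can check the result before using it.

```python
import sqlite3
from collections import namedtuple

QUERY = "SELECT name, address FROM customer WHERE cid = ?"


def make_db():
    """Create an in-memory database holding one made-up customer row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (cid INTEGER, name TEXT, address TEXT)")
    conn.execute("INSERT INTO customer VALUES (1, 'Ann', '1 Main St')")
    return conn


def fetch_as_mapping(conn, cid):
    """Fields addressed by string key, like the array version."""
    conn.row_factory = sqlite3.Row      # rows support row["column"]
    return conn.execute(QUERY, (cid,)).fetchone()


def fetch_as_object(conn, cid):
    """Fields addressed as attributes, like the object version."""
    conn.row_factory = None             # back to plain tuples
    cur = conn.execute(QUERY, (cid,))
    raw = cur.fetchone()
    if raw is None:
        return None
    # Build a lightweight record type from the column names of the result.
    Customer = namedtuple("Customer", [col[0] for col in cur.description])
    return Customer(*raw)


if __name__ == "__main__":
    db = make_db()
    as_map = fetch_as_mapping(db, 1)
    as_obj = fetch_as_object(db, 1)
    print(as_map["name"], as_map["address"])
    print(as_obj.name, as_obj.address)
```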

    OCD programmers unite

    Nothing could be a better segue into the topic I had planned for this week. I'm thinking about starting a group called OLUG, the Obsessive Linux User Group. Although I know enough about psychology to know I don't meet the qualifications of a person with full-fledged OCD (Obsessive-Compulsive Disorder), I confess that I went back and rewrote my PHP code to use objects instead of arrays even though there was no technical justification for doing so.

    Certain things bring out the OCD in me. Warning messages, for example. It doesn't matter if my programs seem to work perfectly. If a compiler issues warnings when I compile my code, I feel compelled to fix the code to get rid of the warnings even if I know the code works fine. Likewise, if my program generates warnings or error messages at run time, I feel driven to look for the reasons and get rid of them.

    Now I don't want you to get the wrong impression. My PHP and C++ code stand as testimony to the fact that my programming practices don't even come within light years of perfection. But just because I do not live up to the standards I am about to demand isn't going to stop me from demanding them. It's my right as a columnist. Those who can, do. Those who can't, write columns.

    I'll be blunt. Open source programmers need to stop being so darned lazy about error handling. That obviously doesn't include all open source programmers. You know who you are.

    Problem resolution

    MVS Systems Programming Chapter 18

    18.1 How to approach problems

    18.1.1 Chapter Overview

    This chapter discusses one of the most interesting and difficult parts of the systems programmer's job, and sometimes one of the most frustrating - dealing with problems. In most installations, the systems programmer is the problem solver of last resort - the expert to whom all others turn when a problem becomes too difficult or obscure for them to solve themselves. The mystique and reputation of systems programmers rests above all on their ability to deal with these situations - so cultivate this skill!

    Effective problem resolution depends on a number of factors:

    This chapter cannot tell you how to solve every individual problem you might encounter - every problem is different (if only in the way it is presented by the person experiencing it) and even the well-defined problems occupy many volumes of messages and diagnosis manuals. But it will attempt to cover:

    The joys and perils of open-source life

    AnchorDesk UK Commentary Box

    As described above, LTT has progressed at a phenomenal rate, in a very short time -- AND with lots of outside help. It has been said that LTT has already surpassed many available tracing tools. This is confirmed by the large number of Fortune 500 companies that currently use LTT to develop Linux based applications. In fact, many say they are planning to use Linux in their systems precisely because such a tracing tool exists for Linux.

    Since last April, RTAI tracing has been added to LTT thanks to funding by Lineo. This is a big step forward, as it enables real-time Linux developers to debug their systems using LTT. Lately, PowerPC support came in from several directions. I had been working on a port with funding from Lineo when I received a patch from Bao Gang Liu of Agilent China, followed closely by code from Andy Lowe of MontaVista. Just today, I received word from Maxwell Patten of Nortel Networks that he too has been working on a PowerPC port of LTT. And it isn't over yet... !

    In the future, LTT will aim to be available for additional architectures that Linux is running on. Also, there are requests for LTT to be run on additional open-source operating systems, such as BSD and HURD. Along with these system-level issues, there are numerous planned tool enhancements and flexibility additions.

    Feel free to contribute to this project, either by providing feedback, expertise, or source code. Most important of all, try to keep an eye on the big picture. Try to see how each piece fits with the others, and how they all interact together. After all, that's what LTT is all about.

    Linux Trace Toolkit supports real-time Linux
    · Event trace tool doesn't require instrumented libraries
    · First Linux Trace Toolkit for PowerPC

    [Jul 29, 2000] Slashdot Are Buffer Overflow Sploits Intel's Fault -- interesting discussion about problems with C

    [Jul 5, 1999] Bounds Checking for C



    Recommended Links


    DBX

    OS390 UNIX - dbx

    Unix Documents

    On Line Help  -- dbx tutorial

    dbx debugger

    S50-1004_unix_debuggers_dbx_xdbx (11995)

    The UNIX course / UNIX programming in C / The C-compiler / Program debugging

    ITCWeb u029.dbx


    Data display and Data Prettyprinting

    DDD - The Data Display Debugger

    xDuel -- a graphical extension to the Duel debugging language that lets you print data structures under gdb (the GNU debugger).

    From Advanced programming in Perl: Data structures pretty-printing

    2.5 Pretty-Printing

    In building complicated data structures, it is always nice to have a pretty-printer handy for debugging. There are at least two options for pretty-printing data structures. The first is the Perl debugger itself. It uses a function called dumpValue in a file called dumpvar.pl, which can be found in the standard library directory. We can help ourselves to it, with the caveat that it is an unadvertised function and could change someday. To pretty-print this structure, for example:

      @sample = (11.233,{3 => 4, "hello" => [6,7]});

    we write the following:

    require 'dumpvar.pl';
    dumpValue(\@sample); # always pass by reference

    This prints something like this:

    0  11.233
    1  HASH(0xb75dc0)
       3 => 4
       'hello' => ARRAY(0xc70858)
          0  6
          1  7

    We will cover the require statement in Chapter 6, Modules. Meanwhile, just think of it as a fancy #include (which doesn't load the file if it is already loaded).

    The Data::Dumper module available from CPAN is another viable alternative for pretty-printing. Chapter 10, Persistence, covers this module in some detail, so we will not say any more about it here. Both modules detect circular references and handle subroutine and glob references.

    It is fun and instructive to write a pretty-printer ourselves. Example 2.5 illustrates a simple effort, which accounts for circular references but doesn't follow typeglobs or subroutine references. This example is used as follows:

    pretty_print(@sample); # Doesn't need a reference

    This prints

    11.233
    { # HASH(0xb78b00)
    :  3 => 4
    :  hello =>
    :  :  [ # ARRAY(0xc70858)
    :  :  :  6
    :  :  :  7
    :  :  ]
    }

    The following code contains specialized procedures (print_array, print_hash, or print_scalar) that know how to print specific data types. print_ref, charged with the task of pretty-printing a reference, simply dispatches control to one of the above procedures depending upon the type of argument given to it. In turn, these procedures may call print_ref recursively if the data types that they handle contain one or more references.

    Whenever a reference is encountered, it is also checked with a hash %already_seen to determine whether the reference has been printed before. This prevents the routine from going into an infinite loop in the presence of circular references. All functions manipulate the global variable $level and call print_indented, which appropriately indents and prints the string given to it.

    Example 2.5: Pretty-Printing

    $level = -1; # Level of indentation
    
    sub pretty_print {
        my $var;
        foreach $var (@_) {
            if (ref ($var)) {
                print_ref($var);
            } else {
                print_scalar($var);
            }
        }
    }
    
    sub print_scalar {
        ++$level;
        my $var = shift;
        print_indented ($var);
        --$level;
    }
    
    sub print_ref {
        my $r = shift;
        if (exists ($already_seen{$r})) {
            print_indented ("$r (Seen earlier)");
            return;
        } else {
            $already_seen{$r}=1;
        }
        my $ref_type = ref($r);
        if ($ref_type eq "ARRAY") {
            print_array($r);
        } elsif ($ref_type eq "SCALAR") {
            print "Ref -> $r";
            print_scalar($$r);
        } elsif ($ref_type eq "HASH") {
            print_hash($r);
        } elsif ($ref_type eq "REF") {
            ++$level;
            print_indented("Ref -> ($r)");
            print_ref($$r);
            --$level;
        } else {
            print_indented ("$ref_type (not supported)");
        }
    }
    
    sub print_array {
        my ($r_array) = @_;
        ++$level;
        print_indented ("[ # $r_array");
        foreach my $var (@$r_array) {
            if (ref ($var)) {
                print_ref($var);
            } else {
                print_scalar($var);
            }
        }
        print_indented ("]");
        --$level;
    }
    
    sub print_hash {
        my($r_hash) = @_;
        my($key, $val);
        ++$level; 
        print_indented ("{ # $r_hash");
        while (($key, $val) = each %$r_hash) {
            $val = '""' unless defined($val) && $val ne '';  # show empty/undef as "", but keep 0
            ++$level;
            if (ref ($val)) {
                print_indented ("$key => ");
                print_ref($val);
            } else {
                print_indented ("$key => $val");
            }
            --$level;
        }
        print_indented ("}");
        --$level;
    }
    
    sub print_indented {
        my $spaces = ":  " x $level;
        print "${spaces}$_[0]\n";
    }

    print_ref simply prints its argument (a reference) and returns if it has already seen this reference. If you were to read the output produced by this code, you would find it hard to imagine which reference points to which structure. As an exercise, you might try producing a better pretty-printer, which identifies appropriate structures by easily identifiable numeric labels like this:

    :  hello =>
    :  :  [          # 10
    :  :  :  6
    :  :  :  7
    :  :  ]
    :  foobar => array-ref # 10
    }

    The number 10 is an automatically generated label, which is more easily identifiable than something like ARRAY(0xc70858).
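
    The exercise above can be sketched as follows. This is one possible approach, written in Python rather than Perl, and not the book's solution: the function name pretty_lines is hypothetical, only lists and dicts are treated as containers, and labels are simply handed out in order of first visit, starting at 1. A container that has been seen before (shared or circular) is printed as a short "array-ref" or "hash-ref" line carrying its label instead of being followed again.

```python
def pretty_lines(value, level=0, labels=None, out=None):
    """Return indented lines for nested lists/dicts, labelling each container."""
    labels = {} if labels is None else labels   # id(container) -> numeric label
    out = [] if out is None else out
    pad = ":  " * level

    if not isinstance(value, (list, dict)):
        out.append(f"{pad}{value!r}")
        return out

    kind = "array" if isinstance(value, list) else "hash"
    if id(value) in labels:
        # Already printed: emit the label only, which also breaks cycles.
        out.append(f"{pad}{kind}-ref # {labels[id(value)]}")
        return out

    labels[id(value)] = len(labels) + 1
    opener, closer = ("[", "]") if kind == "array" else ("{", "}")
    out.append(f"{pad}{opener} # {labels[id(value)]}")
    if kind == "array":
        for item in value:
            pretty_lines(item, level + 1, labels, out)
    else:
        for key, item in value.items():
            if isinstance(item, (list, dict)):
                out.append(f"{pad}:  {key} =>")
                pretty_lines(item, level + 2, labels, out)
            else:
                out.append(f"{pad}:  {key} => {item!r}")
    out.append(f"{pad}{closer}")
    return out


if __name__ == "__main__":
    shared = [6, 7]
    sample = {"hello": shared, "foobar": shared}   # same list referenced twice
    print("\n".join(pretty_lines(sample)))
```

    Keying the label table on object identity plays the same role as %already_seen in the Perl version; the difference is that the table stores a small number instead of a flag, so a repeated reference can say which structure it points to.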


    Program instrumentation

    General Programming Concepts Writing and Debugging Programs - m4 Macro Processor Overview

    This chapter provides information about the m4 macro processor, which is a front-end processor for any programming language being used in the operating system environment.

    The m4 macro processor is useful in many ways. At the beginning of a program, you can define a symbolic name or symbolic constant as a particular string of characters. You can then use the m4 program to replace unquoted occurrences of the symbolic name with the corresponding string. Besides replacing one string of text with another, the m4 macro processor provides the following features:

    The m4 macro processor processes strings of letters and digits called tokens. The m4 program reads each alphanumeric token and determines if it is the name of a macro. The program then replaces the name of the macro with its defining text, and pushes the resulting string back onto the input to be rescanned. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.

    The m4 program provides built-in macros such as define. You can also create new macros. Built-in and user-defined macros work the same way.
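
    The read, replace and rescan cycle described above can be illustrated with a few lines of Python. This is a deliberately reduced sketch and not an m4 implementation: the function name expand and the macro table are hypothetical, macros take no arguments, quoting is not handled, and a step limit (max_steps, also an invention of this sketch) stops a macro that expands to itself instead of looping forever.

```python
import re

# A token is either a name (letter or underscore, then letters/digits/underscores)
# or any single other character, which is copied through unchanged.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|.", re.DOTALL)


def expand(text, macros, max_steps=10000):
    """Expand argument-less macros, rescanning every replacement text."""
    pending = TOKEN.findall(text)[::-1]   # used as a stack: next token is last
    out = []
    steps = 0
    while pending:
        token = pending.pop()
        if token in macros:
            steps += 1
            if steps > max_steps:
                raise RuntimeError("expansion limit reached (recursive macro?)")
            # Push the defining text back onto the input so it is rescanned.
            pending.extend(TOKEN.findall(macros[token])[::-1])
        else:
            out.append(token)
    return "".join(out)


if __name__ == "__main__":
    table = {"N": "10", "SIZE": "N * N"}
    print(expand("int a[SIZE];", table))
```

    Because the replacement text goes back onto the input rather than straight to the output, SIZE is first replaced by "N * N" and each N is then replaced in turn, which is the rescanning behaviour the overview describes.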


    Memory Leaks

    Slashdot Memory Leaks

    G3ck0G33k writes: "Is there any free software version/clone of Rational's programs PureCoverage and/or Purify? I have worked with both of them on fairly large projects (>150,000 lines of code) and they were great to work with. When the first runs of Purify found nearly fifty instances of minor memory leaks, I was deeply frustrated/impressed. A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget. Of course, the more kinds of leaks it may detect, the better. GeckoGeek" We had a similar question last year but there's no harm in seeing what the current answers are.

    Random Findings

    Java

    Algorithmic and Automatic Debugging Home Page  -- very interesting

    NERSC Tutorials Tools

    Introduction to the Cray Totalview Debugger

    Tutorial Five Debugging a Project

    A Beginner's Guide to Unix vi and X-Windows

    Chapters 5 and 6 of The Practice of Programming cover debugging and testing. There, the rules of debugging and testing are illustrated with detailed examples. An excerpt of chapter 5 from The Practice of Programming, with some of the rules explained in greater detail is available at: http://cm.bell-labs.com/cm/cs/tpop/debugging.html

    The Practice of Programming, Brian W. Kernighan, Rob Pike. Addison-Wesley, 1999, (http://cm.bell-labs.com/cm/cs/tpop/).

    Debugging with GDB: The GNU Source-Level Debugger, Richard M. Stallman. Free Software Foundation, 1998, (http://www.gnu.org/doc/doc.html).

    The New Hacker's Dictionary, Eric S. Raymond ed., MIT Press, 1996. Available on-line as The Jargon File, (http://www.tuxedo.org/~esr/jargon/).

    The haikus were submissions to a contest held by Salon Magazine in 1998 (http://www.salon.com).

    Definitions of "Segmentation fault" and "bus error", from The C Programmer's Notebook at: (http://www.skwc.com/essent/prognotebook.html).



    Etc

    FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusively for research and educational purposes. If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner.


    Copyright © 1996-2016 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.

    Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

    This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

    Disclaimer:

    The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

    Last modified: February 11, 2017