Nikolai Bezroukov. Portraits of Open Source Pioneers
For readers with high sensitivity to grammar errors access to this page is not recommended :-)
GCC was and still is the main programming achievement of RMS. I think that the period when RMS wrote gcc, starting from the Pastel compiler developed at the Lawrence Livermore Lab, constitutes the most productive part of RMS's programming career, much more important than his Emacs efforts, although he is still fanatically attached to Emacs.
He started writing the compiler when he was just over thirty years old, that is, at a time when his programming abilities had started to decline. Therefore gcc can be considered his "swan song" as a programmer. The prototype that he used was the Pastel compiler developed at the Lawrence Livermore Lab. Here is his version of the history of GCC's creation:
Shortly before beginning the GNU project, I heard about the Free University Compiler Kit, also known as VUCK. (The Dutch word for "free" is written with a V.) This was a compiler designed to handle multiple languages, including C and Pascal, and to support multiple target machines. I wrote to its author asking if GNU could use it.
He responded derisively, stating that the university was free but the compiler was not. I therefore decided that my first program for the GNU project would be a multi-language, multi-platform compiler.
Hoping to avoid the need to write the whole compiler myself, I obtained the source code for the Pastel compiler, which was a multi-platform compiler developed at Lawrence Livermore Lab. It supported, and was written in, an extended version of Pascal, designed to be a system-programming language. I added a C front end, and began porting it to the Motorola 68000 computer. But I had to give that up when I discovered that the compiler needed many megabytes of stack space, and the available 68000 Unix system would only allow 64k.
I then realized that the Pastel compiler functioned by parsing the entire input file into a syntax tree, converting the whole syntax tree into a chain of "instructions", and then generating the whole output file, without ever freeing any storage. At this point, I concluded I would have to write a new compiler from scratch. That new compiler is now known as GCC; none of the Pastel compiler is used in it, but I managed to adapt and use the C front end that I had written.
Of course there were collaborators, because RMS seems to be far from the cutting edge of compiler technology and never wrote any significant paper about compiler construction. Still, he was the major driving force behind the project, and its success was due to a large extent to his personal efforts.
From the political standpoint it was a very bright idea: a person who controls the compiler has enormous influence on everybody who uses it. Although at the beginning it was just a C compiler, eventually the "after-Stallman" GCC (GNU cc) became one of the most flexible, most powerful and most portable compilers for C, C++, FORTRAN and (now) even Java. Together with its libraries, it constitutes a powerful development platform that makes it possible to write code that is portable to almost any computer platform you can imagine, from handhelds to supercomputers.
A first working version of GCC seems to have existed around 1985, when RMS was 32 years old (the first public beta release followed in 1987). Here is the first mention of gcc, in the GNU Manifesto:
So far we have an Emacs text editor with Lisp for writing editor commands, a source level debugger, a yacc-compatible parser generator, a linker, and around 35 utilities. A shell (command interpreter) is nearly completed. A new portable optimizing C compiler has compiled itself and may be released this year. An initial kernel exists but many more features are needed to emulate Unix. When the kernel and compiler are finished, it will be possible to distribute a GNU system suitable for program development. We will use TeX as our text formatter, but an nroff is being worked on. We will use the free, portable X window system as well. After this we will add a portable Common Lisp, an Empire game, a spreadsheet, and hundreds of other things, plus on-line documentation. We hope to supply, eventually, everything useful that normally comes with a Unix system, and more.
GNU will be able to run Unix programs, but will not be identical to Unix. We will make all improvements that are convenient, based on our experience with other operating systems. In particular, we plan to have longer file names, file version numbers, a crashproof file system, file name completion perhaps, terminal-independent display support, and perhaps eventually a Lisp-based window system through which several Lisp programs and ordinary Unix programs can share a screen. Both C and Lisp will be available as system programming languages. We will try to support UUCP, MIT Chaosnet, and Internet protocols for communication.
Compilers take a long time to mature. The first more or less stable release seems to be 1.17 (January 9, 1988). The timing was pure luck, because in 1989 the Emacs/XEmacs split started and consumed all his energy. RMS used a kind of Microsoft-style "embrace and extend" policy in GCC development: extensions to the C language were enabled by default.
It looks like RMS personally participated in the development of the compiler at least until the GCC/egcs split (i.e., 1997). Being a former compiler writer myself, I can attest that it is physically challenging to run such a project for more than ten years, even if you are mostly a manager.
Development was not without problems, with Cygnus emerging as an alternative force. Cygnus, the first commercial company devoted to providing commercial support for GNU software and especially the GCC compiler, was co-founded by Michael Tiemann in 1989, and Tiemann has been the major driving force behind GCC since then.
In 1997 RMS was so tired of his marathon that he just wanted the compiler to be stable, because it was the most important publicity force for the FSF. But with the GNU license you cannot stop: it enforces the law of the jungle, and the strongest takes it all. The interest of other, fresher and more motivated developers (as well as RMS's personal qualities, first demonstrated in the Emacs/XEmacs saga) led to a painful fork:
Subject: A new compiler project to merge the existing GCC forks

A bunch of us (including Fortran, Linux, Intel and RTEMS hackers) have decided to start a more experimental development project, just like Cygnus and the FSF started the gcc2 project about 6 years ago. Only this time the net community with which we are working is larger! We are calling this project 'egcs' (pronounced 'eggs').

Why are we doing this? It's become increasingly clear in the course of hacking events that the FSF's needs for gcc2 are at odds with the objectives of many in the community who have done lots of hacking and improvement over the years. GCC is part of the FSF's publicity for the GNU project, as well as being the GNU system's compiler, so stability is paramount for them. On the other hand, Cygnus, the Linux folks, the pgcc folks, the Fortran folks and many others have done development work which has not yet gone into the GCC2 tree despite years of efforts to make it possible. This situation has resulted in a lot of strong words on the gcc2 mailing list which really is a shame since at the heart we all want the same thing: the continued success of gcc, the FSF, and Free Software in general.

Apart from ill will, this is leading to great divergence which is increasingly making it harder for us all to work together -- It is almost as if we each had a proprietary compiler! Thus we are merging our efforts, building something that won't damage the stability of gcc2, so that we can have the best of both worlds.

As you can see from the list below, we represent a diverse collection of streams of GCC development. These forks are painful and waste time; we are bringing our efforts together to simplify the development of new features. We expect that the gcc2 and egcs communities will continue to overlap to a great extent, since they're both working on GCC and both working on Free Software.

All code will continue to be assigned to the FSF exactly as before and will be passed on to the gcc2 maintainers for ultimate inclusion into the gcc2 tree. Because the two projects have different objectives, there will be different sets of maintainers. Provisionally we have agreed that Jim Wilson is to act as the egcs maintainer and Jason Merrill as the maintainer of the egcs C++ front end. Craig Burley will continue to maintain the Fortran front end code in both efforts.

What new features will be coming up soon? There is such a backlog of tested, un-merged-in features that we have been able to pick a useful initial set:
- New alias analysis support from John F. Carr.
- g77 (with some performance patches).
- A C++ repository for G++.
- A new instruction scheduler from IBM Haifa.
- A regmove pass (2-address machine optimizations that in future will help with compilation for the x86 and for now will help with some RISC machines).

This will use the development snapshot of 3 August 97 as its base -- in other words we're not starting from the 18 month old gcc-2.7 release, but from a recent development snapshot with all the last 18 months' improvements, including major work on G++. We plan an initial release for the end of August. The second release will include some subset of the following:
- global cse and partial redundancy elimination.
- live range splitting.
- More features of IBM Haifa's instruction scheduling, including software pipelineing, and branch scheduling.
- sibling call opts.
- various new embedded targets.
- Further work on regmove.

The egcs mailing list at cygnus.com will be used to discuss and prioritize these features.

How to join: send mail to egcs-request at cygnus.com. That list is under majordomo. We have a web page that describes the various mailing lists and has this information at: http://www.cygnus.com/egcs. Alternatively, look for these releases as they spread through other projects such as RTEMS, Linux, etc. Come join us!

David Henkel-Wallace (for the egcs members, who currently include, among others): Per Bothner, Joe Buck, Craig Burley, John F. Carr, Stan Cox, David Edelsohn, Kaveh R. Ghazi, Richard Henderson, David Henkel-Wallace, Gordon Irlam, Jakub Jelinek, Kim Knuttila, Gavin Koch, Jeff Law, Marc Lehmann, H.J. Lu, Jason Merrill, Michael Meissner, David S. Miller, Toon Moene, Jason Molenda, Andreas Schwab, Joel Sherrill, Ian Lance Taylor, Jim Wilson
In 1997, after the gcc 2.7 series, GCC development split into FSF GCC on the one hand, and Cygnus EGCS on the other. The first EGCS version (1.0) was released by Cygnus on December 3, 1997, and that instantly put the FSF version on the back burner:
- March 15, 1999: egcs-1.1.2 is released.
- March 10, 1999: Cygnus donates improved global constant propagation and lazy code motion optimizer framework.
- March 7, 1999: The egcs project now has additional online documentation.
- February 26, 1999: Richard Henderson of Cygnus Solutions has donated a major rewrite of the control flow analysis pass of the compiler.
- February 25, 1999: Marc Espie has donated support for OpenBSD on the Alpha, SPARC, x86, and m68k platforms. Additional targets are expected in the future.
- January 21, 1999: Cygnus donates support for the PowerPC 750 processor. The PPC750 is a 32bit superscalar implementation of the PowerPC family manufactured by both Motorola and IBM. The PPC750 is targeted at high end Macs as well as high end embedded applications.
- January 18, 1999: Christian Bruel and Jeff Law donate improved local dead store elimination.
- January 14, 1999: Cygnus donates support for Hypersparc (SS20) and Sparclite86x (embedded) processors.
- December 7, 1998: Cygnus donates support for demangling of HP aCC symbols.
- December 4, 1998: egcs-1.1.1 is released.
- November 26, 1998: A database with test results is now available online, thanks to Marc Lehmann.
- November 23, 1998: egcs now can dump flow graph information usable for graphical representation. Contributed by Ulrich Drepper.
- November 21, 1998: Cygnus donates support for the SH4 processor.
- November 10, 1998: An official steering committee has been formed. Here is the original announcement.
- November 5, 1998: The third snapshot of the rewritten libstdc++ is available. You can read some more on http://sources.redhat.com/libstdc++/.
- October 27, 1998: Bernd Schmidt donates localized spilling support.
- September 22, 1998: IBM Corporation delivers an update to the IBM Haifa instruction scheduler and new software pipelining and branch optimization support.
- September 18, 1998: Michael Hayes donates c4x port.
- September 6, 1998: Cygnus donates Java front end.
- September 3, 1998: egcs-1.1 is released.
- August 29, 1998: Cygnus donates Chill front end and runtime.
- August 25, 1998: David Miller donates rewritten sparc backend.
- August 19, 1998: Mark Mitchell donates load hoisting and store sinking support.
- July 15, 1998: The first snapshot of the rewritten libstdc++ is available. You can read some more here.
- June 29, 1998: Mark Mitchell donates alias analysis framework.
- May 26, 1998: We have added two new mailing lists for the egcs project: gcc-cvs and egcs-patches. When a patch is checked into the CVS repository, a check-in notification message is automatically sent to the gcc-cvs mailing list. This will allow developers to monitor changes as they are made. Patch submissions should be sent to egcs-patches instead of the main egcs list. This is primarily to help ensure that patch submissions do not get lost in the large volume of the main mailing list.
- May 18, 1998: Cygnus donates gcse optimization pass.
- May 15, 1998: egcs-1.0.3 released!
- March 18, 1998: egcs-1.0.2 released!
- February 26, 1998: The egcs web pages are now supported by egcs project hardware and are searchable with webglimpse. The CVS sources are browsable with the free cvsweb package.
- February 7, 1998: Stanford has volunteered to host a high speed mirror for egcs. This should significantly improve download speeds for releases and snapshots. Thanks Stanford and Tobin Brockett for the use of their network, disks and computing facilities!
- January 12, 1998: Remote access to CVS sources is available!
- January 6, 1998: egcs-1.0.1 released!
- December 3, 1997: egcs-1.0 released!
- August 15, 1997: The egcs project is announced publicly and the first snapshot is put on-line.
See the egcs mailing list archives for details. I've also heard assertions that the only reason gcc-2.8 was released as quickly as it was is the pressure of the egcs release. Here is a Slashdot discussion that contains some additional info. After the fork, the egcs team proved to be definitely stronger and the development of the original branch stagnated.
This was a pretty painful fork, especially for RMS personally, and the consequences are still felt today. For example, Linus Torvalds still prefers the old GCC version, and recompiling the kernel with a newer version leads to some subtle bugs because the old GCC compiler was not fully standard-compliant. Alan Cox has said for years that 2.0.x kernels are to be compiled by gcc, not egcs.
As FSF GCC died a silent death from malnutrition, both were (formally) reunited in April 1999, and the merged compiler was released as version 2.95 in July 1999. With a simple renaming trick, egcs became gcc, and formally the split was over:
Re: egcs to take over gcc maintenance
- To: email@example.com
- Subject: Re: egcs to take over gcc maintainance
- From: Theodore Papadopoulo <Theodore.Papadopoulo@sophia.inria.fr>
- Date: Fri, 16 Apr 1999 18:35:00 +0200
> I'm pleased to announce that the egcs team is taking over as the collective GCC maintainer of GCC. This means that the egcs steering committee is changing its name to the gcc steering committee and future gcc releases will be made by the egcs (then gcc) team. This also means that the open development style is also carried over to gcc (a good thing).
That's a great piece of news...
Email: Theodore.Papadopoulo@sophia.inria.fr Tel: (33) 04 92 38 76 01
More information about the event can be found in the following Slashdot post:
yes, it's true; egcs is gcc. Some details (Score:4)
by JoeBuck (7947) on Tuesday April 20, @12:22PM (#1925069)
As a member of the egcs steering committee, which will become the gcc steering committee, I can confirm that yes, the merger is official ... sometime in the near future there will be a gcc 3.0 from the egcs code base. The steering committee has been talking to RMS about doing this for months now; at times it's been contentious but now that we understand each other better, things are going much better.
The important thing to understand is that when we started egcs, this is what we were planning all along (well, OK, what some of us were planning). We wanted to change the way gcc worked, not just create a variant. That's why assignments always went to the FSF, why GNU coding style is rigorously followed.
Technically, egcs/gcc will run the same way as before. Since we are now fully GNU, we'll be making some minor changes to reflect that, but we've been doing them gradually in the past few months anyway so nothing that significant will change. Jeff Law remains the release manager; a number of other people have CVS write access; the steering committee handles the "political" and other nontechnical stuff and "hires" the release manager.
egcs/gcc is at this point considerably more bazaar-like than the Linux kernel in that many more people have the ability to get something into the official code (for Linux, only Linus can do that). Jeff Law decides what goes in the release, but he delegates major areas to other maintainers.
The reason for the delay in the announcement is that we were waiting for RMS to announce it (he sent a message to the gnu.*.announce lists), but someone cracked an important FSF machine and did an rm -rf / command. It was noticed and someone powered off the machine, but it appears that this machine hosted the GNU mailing lists, if I understand correctly, so there's nothing on gnu.announce. I don't know why there's still nothing on www.gnu.org (which was not cracked). Why do people do things like this?
The current GCC Release Manager is Mark Mitchell, CodeSourcery's President and Chief Technical Officer. He received an MS in Computer Science from Stanford in 1999 and a BA from Harvard in 1994. His research interests centered around computational complexity and computer security. Mark worked at CenterLine Software as a software engineer before co-founding CodeSourcery. In a recent interview he provided some interesting facts about the current problems and prospects of GCC development, as well as the reasons for the growing independence of the product from the FSF (for example, the interesting fact that version 2.96 of GCC was not an FSF version at all):
JB: There has been a problem with so called gcc-2.96. Why did several distributors create this version?
It's important for everyone to know that there was no version of GCC 2.96 from the FSF. I know Red Hat distributed a version that it called 2.96, and other companies may have done that too. I only know about the Red Hat version.
It is too bad that this version was released. It is essentially a development snapshot of the GCC tree, with a lot of Red Hat fixes. There are a lot of bugs in it, relative to either 2.95 or 3.0, and the C++ code generated is incompatible (at the binary level) with either 2.95 or 3.0. It's been very confusing to users, especially because the error messages generated by the 2.96 release refer users back to the FSF GCC bug-reporting address, even though a lot of the bugs in that release don't exist in the FSF releases. The saddest part is that a lot of people at Red Hat knew that using that release was a bad idea, and they couldn't convince their management of that fact.
Partly, this release is our fault, as GCC maintainers. There was a lot of frustration because it took so long to produce a new GCC release. I'm currently leading an effort to reduce the time between GCC releases so that this kind of thing is less likely to happen again. I can understand why a company might need to put out an intermediate release of GCC if we are not able to do it ourselves. That's why I think it's important for people to support independent development of GCC, which is, of course, what CodeSourcery does. We're not affiliated with any of the distributors, and so we can act to try to improve the FSF version of GCC directly. When people financially support our work on the releases, that helps to make sure that there are frequent enough releases to avoid these problems.
JB: Do you feel, that the 2.96 release speeded up the development and allowed gcc-3.0 to be ready faster?
That is a very difficult question to answer. On the one hand, Red Hat certainly fixed some bugs in Red Hat's 2.96 version, and some of those improvements were contributed back for GCC 3.0. (I do not know if all of them were contributed back, or not.) On the other hand, GCC developers at Red Hat must have spent a lot of time on testing and improving their 2.96 version, and therefore that time was not spent on GCC 3.0.
The problem is that for a company like Red Hat (or CodeSourcery) you can't choose between helping out with the FSF release of GCC and doing something else based just on what would be good for the FSF release. You have to try to make the best business decision, which might mean that you have to do something to please a customer, even though it doesn't help the FSF release.
If people would like to keep companies from making their own releases, there are two things to do: a) make that sentiment known to the companies, since companies like to please their customers, and b) hire companies like CodeSourcery to help work on the FSF releases.
JB: How many developers are currently working on GCC?
It's impossible to count. Hundreds, probably -- but there is definitely a group of ten or twenty that is responsible for most of the changes.
... ... ...
JB: Do you see compiling java to native code as a drawback when using free (speech) code? (compared to using p-code only)
I'm not so moralistic about these issues as some people. I think it's good that we support compiling from byte-code because that's how lots of Java code is distributed. Whether or not that code is free, we're providing a useful product. I suspect the FSF has a different viewpoint.
JB: What are future plans in gcc development?
I think the number one issue is the performance of the generated code. People are looking into different ways to optimize better. I'd also like to see a more robust compiler that issues better error messages and never crashes on incorrect input.
JB: Often, the major problem with hardware vendors is that they don't want to provide the technical documentation for their hardware (forcing you to use their or third party proprietary code). Is this also true with processor documentation?
Most popular workstation processors are well-documented from the point of view of their instruction set. There often isn't as much information available about timing and scheduling information. And some embedded vendors never make any information about their chips available, which means that they can't really distribute a version of GCC for their chips because the GCC source code would give away information about the chip.
AMD is a great example of a company trying to work closely with GCC up front. They made a lot of information about their new chip available very early in the process.
JB: Which systems do currently use GCC as their primary compiler set (not counting *BSD and GNU/Linux)?
Apple's OS X. If Apple succeeds, there will probably be more OS X developers using GCC than there are GNU/Linux developers.
... ... ...
Having RMS as a member of the gcc steering committee (SC) has its problems and still invites forking ;-). As one of the participants of the Slashdot discussion noted:
Re:Speaking as a GCC maintainer, I call bullshit (Score:3, Informative)
by devphil (51341) on Sunday August 15, @02:16PM (#9974914)
You're not completely right, and not completely wrong. The politics are exceedingly complicated, and I regret it every time I learn more about them.
RMS doesn't have dictatorial power over the SC, nor a formal veto vote.
He does hold the copyright to GCC. (Well, the FSF holds the copyright, but he is the FSF.) That's a lot more important than many people realize.
Choice of implementation language is, strictly speaking, a purely technical issue. But it has so many consequences that it gets special attention.
The SC specifically avoids getting involved in technical issues whenever possible. Even when the SC is asked to decide something, they never go to RMS when they can help it, because he's so unaware of modern real-world technical issues and the bigger picture. It's far, far better to continue postponing a question than to ask it, when RMS is involved, because he will make a snap decision based on his own bizarre technical ideas, and then never change his mind in time for the new decision to be worth anything.
He can be convinced. Eventually. It took the SC over a year to explain and demonstrate that Java bytecode could not easily be used to subvert the GPL, therefore permitting GCJ to be checked in to the official repository was okay. I'm sure that someday we'll be using C++ in core code. Just not anytime soon.
As for forking again... well, yeah, I personally happen to be a proponent of that path. But I'm keenly aware of the damage that would do to GCC's reputation -- beyond the short-sighted typical /. viewpoint of "always disobey every authority" -- and I'm still probably underestimating the problems.
Some additional information about gcc development can be found at History - GCC Wiki
Last Modified: July 07, 2013