May the source be with you,
but remember the KISS principle ;-)
If you are a real assembler language
guru and like algorithms
please volunteer for the MMIX project.
One of the greatest honors among assembler language programmers
is to be listed as a contributor in TAOCP !!!
I am convinced that assembler is a fundamental part of a programmer's education, and that a person who has learned assembler has a better chance of becoming a good programmer than a person who spent the same amount of time learning some fancy languages and/or programming methodologies like object-oriented programming ;-). You need to find a good book to start.
The still unmatched introductory book for learning assembler is:
- ***** [Socha1992] Peter Norton's Assembly Language Book for the IBM PC
- Brady, December 1, 1992, 3rd/bk&dsk edition, cover price $39.95
This is a great introductory book. Probably still unmatched.
It teaches assembly language starting with the now forgotten debug program. That really helps, because debug can act as an interpreter for simple assembly programs. I would suggest replacing it with the free full-screen debugger AFD.EXE, but this is still the best way to learn assembler. I recommend running it from an OFM (for example VC, the Volkov Commander) and using hiew as a viewer (hiew has a built-in disassembler). See the Softpanorama archive for more details of this semi-forgotten world of DOS programming. The archives contain a lot of information and programs covering the fifteen-year period from 1989 to 2004.
Microsoft's original DOS debugger debug.exe is still available with Windows 98 (if you do not have Windows 98 you can get the free DR-DOS). You can also download AFD.EXE, which is a nice debugger, from the Web (it is available from some abandonware sites).
You can usually get Microsoft's Macro Assembler free. In the past Microsoft made its Windows Driver Development Kits (DDK) freely downloadable, and they contain both an assembler and a linker. Many used books include an assembler, linker and debugger on the disk. You can download Microsoft's Debugging Tools for Windows 32-bit Version. Borland versions (tasm and tdebug) can be downloaded from the Borland museum with C++ 1.01 (free registration required).
OK, I will be the first to admit that fashion rules in programming and fancy languages and programming methodologies are still important if you want a job in the industry. Still it is important to distinguish between things that are important as a cosmetic addition to your education and things that are fundamental and determine the level and quality of education as such. IMHO assembler is one of the cornerstones of programmer education, a gateway to several important programming methodologies.
Among them I would like to mention coroutines, compiler construction techniques and algorithms (which can be called an important programming paradigm and have a much wider spectrum of applicability than most people suspect), program generation techniques (although the latter can be studied using scripting languages too), decompilation, reverse engineering and other methods of binary program understanding.
To a certain extent we can say that without a solid base of assembler language programming the whole building of programmer education is built on sand.
The question of why to use assembler language is a very important one. It is a philosophical question that every programmer needs to decide for himself. First of all, assembler is not just another computer language. It is the low-level language par excellence. I can only quote here the opinion of one of the greatest computer scientists, Professor Donald Knuth, the computer scientist whom I respect more than any other. He is the author of the programmers' bible, The Art of Computer Programming. This is especially important for assembler programmers, as Donald Knuth is one of the few computer scientists who was able to see through the "object-oriented programming" smog and defend the value of assembler language both for computer science education and for real and complex programming projects. In his description of MMIX he wrote (bold italic is mine -- Nikolai Bezroukov):
Many readers are no doubt thinking, "Why does Knuth replace MIX by another machine instead of just sticking to a high-level programming language? Hardly anybody uses assemblers these days."
Such people are entitled to their opinions, and they need not bother reading the machine-language parts of my books. But the reasons for machine language that I gave in the preface to Volume 1, written in the early 1960s, remain valid today:
- One of the principal goals of my books is to show how high-level constructions are actually implemented in machines, not simply to show how they are applied. I explain coroutine linkage, input-output buffering, random number generation, high-precision arithmetic, radix conversion, packing of data, combinatorial searching, recursion, etc., from the ground up.
- The programs needed in my books are generally so short that their main points can be grasped easily.
- People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird.
- Machine language is necessary in any case, as output of many of the software programs I describe.
- Expressing basic methods like algorithms for sorting and searching in machine language makes it possible to make meaningful studies of the effects of cache and RAM size and other hardware characteristics (memory speed, pipelining, multiple issue, lookaside buffers, the size of cache-lines, etc.) when comparing different schemes.
Moreover, if I did use a high-level language, what language should it be? In the 1960s I would probably have chosen Algol W; in the 1970s, I would then have had to rewrite my books using Pascal; in the 1980s, I would surely have changed everything to C; in the 1990s, I would have had to switch to C++ and then probably to Java. In the 2000s, yet another language will no doubt be de rigueur. I cannot afford the time to rewrite my books as languages go in and out of fashion; languages aren't the point of my books, the point is rather what you can do in your favorite language. My books focus on timeless truths.
Therefore I will continue to use English as the high-level language in TAOCP, and I will continue to use a low-level language to indicate how machines actually compute. Readers who only want to see algorithms that are already packaged in a plug-in way, using a trendy language, should buy other people's books.
The good news is that programming for RISC machines is pleasant and simple, when the RISC machine has a nice clean design...
What Donald Knuth missed here is that the instruction set itself can be considered an artistic object, and we can talk about beautiful and ugly CPUs. I think that the S/360 CPU architecture is an immortal monument to Gene Amdahl. And the beauty of a CPU instruction set is most adequately expressed in the beauty of the assembler programs for that CPU. Anybody who worked on mainframes and then switched to Intel 80xx CPUs can attest to that. The predominant feeling was that there was not much beauty in the 80xx architecture and everything was crammed together in a rather ugly way.
I would add that when "pedal to the metal" performance is required, there is a significant (100% or more) performance margin to be gained by using assembly language. Ordinarily, one combines C/C++ and assembly by using the compiler's inline assembly feature, or by linking to a library of assembly routines. Such optimization should be based on profiling, because for any more or less complex program it is impossible to guess beforehand where the bottleneck will be.
Also, for many years assembly language programmers were the only ones to benefit from some advanced concepts like coroutines, because with the exception of Modula-2 no mainstream high-level programming language supported them (Python is now another exception). It's interesting to note that coroutines are available in ksh93 but not in Perl.
What is really important is that assembler opens the door to studying compiler design -- a very powerful and underutilized programming paradigm (i.e. structuring a programming system as if it were a compiler for some new language). There are several interesting algorithms used in compiler construction that are almost forgotten by the current programming mainstream.
I think experience with compiler writing is an eye-opening experience for any assembler programmer, and if you are a student I strongly recommend that you take a compiler construction course if one is available at your university. It's really unfortunate that many students today study only a couple of complex and boring languages (C++ and Java) and never learn the powerful programming techniques used in compilers. I suspect only the most gifted students can overcome the overdose of OO, Java, and other fancy topics in the current university curriculum ;-).
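To make the idea concrete for readers who have never met a coroutine, here is a minimal sketch in Python, the language mentioned above as supporting them. The names `averager` and `feed` are invented for this illustration. The point is only that the routine suspends at `yield`, keeps its local state, and resumes exactly where it left off when the caller sends it the next value; in assembler the same effect is obtained by saving and swapping the return addresses of the two routines.

```python
def averager():
    """Coroutine: receives numbers via send() and yields the running mean."""
    total = 0.0
    count = 0
    mean = None
    while True:
        value = yield mean   # suspend here; resume when the caller sends a value
        total += value
        count += 1
        mean = total / count


def feed(values):
    """Drive the coroutine with a sequence of values and collect its answers."""
    co = averager()
    next(co)                 # prime the coroutine: run it up to the first yield
    return [co.send(v) for v in values]


print(feed([10, 20, 60]))    # prints [10.0, 15.0, 30.0]
```

Note that `total` and `count` survive between calls without any global variables; that preserved local state is what distinguishes a coroutine from an ordinary subroutine.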
Assembler is often perfect for programming in the small, especially for the time-critical parts of your system. BTW such parts of the system are usually really small. In 1971 Donald Knuth published his groundbreaking paper "An empirical study of FORTRAN programs" (Software---Practice and Experience, Vol 1, pages 105-133, 1971). The paper was an empirical study of FORTRAN programs selected randomly from semi-protected sources stored on Stanford University's computer disks. In this paper he laid the foundation for the empirical analysis of computer languages by providing convincing empirical evidence of the critical influence of the level of optimization of "inner loops" on performance, and of the fact that programs appear to exhibit a property termed locality of reference. He also provided a powerful argument against orthogonal languages by observing that only a small, rather primitive subset of the language is used in 90% of all statements. For example, most arithmetic expressions on the right side of assignment statements are simple increments or decrements of the form a+1 or a-1. That means that the difference in expressiveness between high-level languages and assembler is often overstated. Moreover, he discovered the amazing fact that among assignment statements, the right-hand sides of most have no operations at all (i.e. they have the form a=b); of those that do have operations, most have exactly one, the most common being a+1 and a-1; and only a tiny percentage have two or more operations. Knuth was also the first to suggest profile-based optimization, and today it is part of many research systems as well as production compilers.
This classic paper also describes how run-time statistics gathered from the execution of a program can be used to suggest optimizations. He statically and dynamically analyzed a large number of Fortran programs and measured the speedup that could be obtained by hand-optimizing them. He found that on average a program could be improved by as much as a factor of 4. Among other things, he concluded that programmers had poor intuition about which parts of their programs were the most time-consuming, and that execution profiles would significantly help programmers improve the performance of their programs: the best way to improve a program's performance is to concentrate on the parts or features of the program that, according to the obtained profile, are eating the lion's share of the machine resources. Generally most programs behave according to the Pareto law, with 10% of the code responsible for 90% of the execution time. The same line of reasoning later led Knuth to the classic programming maxim "premature optimization is the root of all evil" (stated in his 1974 paper "Structured Programming with go to Statements"). Among other things, the paper may have inspired the introduction of increment statements into C (i++; a+=b) and the development of RISC computers.
But you need some way to integrate all the small bricks into a building (this is called "programming in the large" vs "programming in the small"). That's why I think it's very important for an assembler programmer to learn at least one scripting language. I suggest Korn shell (on Unix), Tcl or REXX; all three are very simple but powerful scripting languages available on both Unix and Windows. Tcl has a lot of useful tools that can be used by an assembler programmer, including a Tcl-programmable OFM. Please note that this is just my recommendation, and other scripting languages may be as good as or better than those recommended in your particular circumstances. Your mileage may vary.
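The kind of static count Knuth made is easy to repeat on a small scale. The sketch below is only an illustration and not his method: it uses Python's standard `ast` module on Python source as a stand-in for FORTRAN, the helper names `count_ops` and `rhs_histogram` are invented here, and the six-line `SAMPLE` is made up, so the numbers prove nothing by themselves. It classifies each assignment by the number of arithmetic operators on its right-hand side.

```python
import ast
from collections import Counter


def count_ops(node):
    """Number of arithmetic operators (binary or unary) in an expression tree."""
    return sum(isinstance(n, (ast.BinOp, ast.UnaryOp)) for n in ast.walk(node))


def rhs_histogram(source):
    """Map 'operators on the right-hand side' -> number of assignments."""
    hist = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            hist[count_ops(node.value)] += 1
        elif isinstance(node, ast.AugAssign):
            # 'a += b' is counted like 'a = a + b': one operator plus those in b
            hist[1 + count_ops(node.value)] += 1
    return hist


# Invented sample program, used only to exercise the counter.
SAMPLE = """
a = b
i = i + 1
n = n - 1
total += x
y = a * b + c
z = w
"""

hist = rhs_histogram(SAMPLE)
for ops in sorted(hist):
    print(ops, "operator(s):", hist[ops], "assignment(s)")
```

Pointing `rhs_histogram` at the text of a real source file instead of `SAMPLE` gives a rough feel for how primitive most assignment statements are.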
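The advice to measure before optimizing can be tried with a few lines of Python and its standard `cProfile` and `pstats` modules. This is a toy sketch: the functions `inner_loop`, `setup`, `main` and `hottest_function` are invented for the illustration, and the workload is deliberately lopsided so that the profile has an obvious winner.

```python
import cProfile
import pstats


def inner_loop(n):
    """Deliberately expensive: the 'inner loop' of this toy program."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def setup():
    """Cheap bookkeeping that costs very little."""
    return list(range(100))


def main():
    data = setup()
    return sum(inner_loop(20_000) for _ in data)


def hottest_function(func):
    """Run func under cProfile and return the name with the most own time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    (_, _, name), _ = max(stats.stats.items(), key=lambda kv: kv[1][2])
    return name


print(hottest_function(main))
```

In a real program the hot spot is rarely this obvious, which is exactly why the profile, and not the programmer's intuition, should decide which few lines deserve hand optimization or a rewrite in assembler.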
As for assembler programmer toolset I would strongly recommend
An orthodox file manager can serve as a Swiss Army knife for assembler language programmers and be as productive an environment as any complex IDE (or more productive). It's a quintessential system programming tool.
I would like to stress again that while every decent programmer should know at least one assembly language, it need not necessarily be assembler for the Intel x86 architecture. The instruction set of the latter is a little too complex and convoluted in comparison with, say, the classic S/360 instruction set, or with classic RISC CPUs (UltraSparc, PowerPC, etc). Intel assembler has the tremendously important advantage that both Microsoft and Borland assemblers and debuggers are free and widely available, and it has the largest number of tutorials, books, etc.
But for those who are interested in alternatives I would really advise taking a look at the S/360 instruction set -- this is the oldest instruction set that is still in production after more than 40 years! And S/360 was the first system that got a "structured assembler language": the famous PL360, which many consider superior to traditional assembler languages. S/360 has a wealth of freely available documentation and the oldest and most respectable assembler programming culture. On the other hand, it's really cheap to buy an old, used Mac or Sun Sparc machine and learn the corresponding, much more elegant assembler language. Here the lack of documentation can be a problem, but it's a problem you can probably overcome...
It is also important to learn some tools. The first tool for any assembler programmer is a debugger. You can also benefit from a hex calculator (HP produces some cheap and good ones) and a printed assembler reference card.
Assembler programmers usually spend more time in a command-line environment than programmers in other languages. And that is not bad if you have adequate tools at hand. The two other most important tools are a file manager and an editor. I strongly recommend an orthodox file manager. After overcoming the learning curve you will see that it increases your productivity to a level close to that of those who work in a GUI environment. As for the editor, this is a more complex and personal question. I will not recommend anything, but you can see my opinion about editors on my editor page. Personally I prefer foldable, scriptable editors like THE or VIM 6. Your mileage may vary.
Anyway, assembler programming is real programming -- not this OO-oriented traveling in Microsoft darkness that so many suckers prefer ;-).
Good luck !!!
Important Note: During 1989-1996 I was the editor-in-chief of the Softpanorama bulletin. During those seven years I collected a lot of interesting assembler programs in the Forum/Assembler section of the bulletin. See Softpanorama Contents. Bulletins are zipped and you need to dig out relevant parts of FORUM subdirectories yourself -- sorry there is no easy way. It's a real pity that one of the most talented contributors to the bulletin recently died -- see my Tribute to Dmitry Gurtyak (1971-1998). Here is the list of some of his assembler programs from his memorial page:
Dr. Nikolai Bezroukov
Dissy is a disassembler for multiple architectures. It is implemented as a graphical frontend to objdump. It allows fast navigation through the disassembled code and easy searching for addresses and symbols.
mixal is an assembler and interpreter for Donald Knuth's mythical MIX computer, defined in his book, "The Art of Computer Programming, Vol. 1: Fundamental Algorithms."
The MIX computer will soon be replaced by a RISC machine called MMIX. Meanwhile if you want to try out the existing programs for the original 60s-era machine, you might be able to find suitable software at the following sites:
- GNU's MIX Development Kit
- JMixSim, an OS-independent assembler and simulator, by Christian Kandeler
- MixIDE, another OS-independent assembler and simulator, by Andrea Tettamanzi
- MIXBuilder: an editor, assembler, simulator, and interactive debugger for Win32 platforms, by Bill Menees
- EMIX: an expandable MIX emulator for the Win32 platform, by Daniel Andrade and Marcus Pereira
- MIX/MIXAL in C with lex and CWEB documentation and a source debug facility, by Douglas Laing and Sergey Poznyakoff
- Allan Adler's "swiss" version that can be compiled for Linux
- Darius Bacon and Eric Raymond's open-source load-and-go assembler and simulator, from The Retrocomputing Museum
- John R. Ashmun's MIXware for the Be [Haiku] operating system, with extended support for interrupts
- Rutger van Bergen's MIX emulator in .NET/C#
- Chaoji Li's MIX assembler and simulator, in Perl
(Please let me know of any other sites that I should add to this list.)
IEEE Software Nov/Dec 2008, pp.18-19
As I write this column, I'm in the middle of two summer projects; with luck, they'll both be finished by the time you read it. One involves a forensic analysis of over 100,000 lines of old C and assembly code from about 1990, and I have to work on Windows XP. The other is a hack to translate code written in weird language L1 into weird language L2 with a program written in scripting language L3, where none of the L's even existed in 1990; this one uses Linux. Thus it's perhaps a bit surprising that I find myself relying on much the same toolset for these very different tasks.
... ... ..
There has surely been much progress in tools over the 25 years that IEEE Software has been around, and I wouldn't want to go back in time. But the tools I use today are mostly the same old ones -- grep, diff, sort, awk, and friends. This might well mean that I'm a dinosaur stuck in the past. On the other hand, when it comes to doing simple things quickly, I can often have the job done while experts are still waiting for their IDE to start up. Sometimes the old ways are best, and they're certainly worth knowing well.
This article is intended to help C & C++ programmers understand the essentials of what the linker does. I've explained this to a number of colleagues over the years, so I decided it was time to write it down so that it's more widely available (and so that I don't have to explain it again). [Updated March 2009 to include more information on the peculiarities of linking on Windows, plus some clarification on the one definition rule.]
These are the manuscript chapters for my Linkers and Loaders, published by Morgan Kaufmann. See the book's web site for ordering information.
The text in these files is the unedited original manuscript. M-K has fine copy editors, who have fixed all the minor typos, spelling, and grammar errors in the printed book, but if you come across factual errors I'd still appreciate help getting all the details of linking and loading as complete and correct as possible. I will collect errata and fix them in subsequent printings.
The figures here are scans of hand-drawn sketches which have been redrawn for the book. You don't need to tell me I'm a lousy artist. I already know that.
20 Dec 2005 | developerWorks
Summary: The ELF object module format has had wide-ranging effects on software development for multiple platforms. Peter Seebach looks at the history of the ELF specification and why it's been so useful.
Apr 25, 2008 | informit.com
Andrew Binstock and Donald Knuth converse on the success of open source, the problem with multicore architecture, the disappointing lack of interest in literate programming, the menace of reusable code, and that urban legend about winning a programming contest with a single compilation.
Andrew Binstock: You are one of the fathers of the open-source revolution, even if you aren't widely heralded as such. You previously have stated that you released TeX as open source because of the problem of proprietary implementations at the time, and to invite corrections to the code -- both of which are key drivers for open-source projects today. Have you been surprised by the success of open source since that time?
Donald Knuth: The success of open source code is perhaps the only thing in the computer field that hasn't surprised me during the past several decades. But it still hasn't reached its full potential; I believe that open-source programs will begin to be completely dominant as the economy moves more and more from products towards services, and as more and more volunteers arise to improve the code.
For example, open-source code can produce thousands of binaries, tuned perfectly to the configurations of individual users, whereas commercial software usually will exist in only a few versions. A generic binary executable file must include things like inefficient "sync" instructions that are totally inappropriate for many installations; such wastage goes away when the source code is highly configurable. This should be a huge win for open source.
Yet I think that a few programs, such as Adobe Photoshop, will always be superior to competitors like the Gimp -- for some reason, I really don't know why! I'm quite willing to pay good money for really good software, if I believe that it has been produced by the best programmers.
Remember, though, that my opinion on economic questions is highly suspect, since I'm just an educator and scientist. I understand almost nothing about the marketplace.
Andrew: A story states that you once entered a programming contest at Stanford (I believe) and you submitted the winning entry, which worked correctly after a single compilation. Is this story true? In that vein, today's developers frequently build programs writing small code increments followed by immediate compilation and the creation and running of unit tests. What are your thoughts on this approach to software development?
Donald: The story you heard is typical of legends that are based on only a small kernel of truth. Here's what actually happened: John McCarthy decided in 1971 to have a Memorial Day Programming Race. All of the contestants except me worked at his AI Lab up in the hills above Stanford, using the WAITS time-sharing system; I was down on the main campus, where the only computer available to me was a mainframe for which I had to punch cards and submit them for processing in batch mode. I used Wirth's ALGOL W system (the predecessor of Pascal). My program didn't work the first time, but fortunately I could use Ed Satterthwaite's excellent offline debugging system for ALGOL W, so I needed only two runs. Meanwhile, the folks using WAITS couldn't get enough machine cycles because their machine was so overloaded. (I think that the second-place finisher, using that "modern" approach, came in about an hour after I had submitted the winning entry with old-fangled methods.) It wasn't a fair contest.
As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I'm feeling my way in a totally unknown environment and need feedback about what works and what doesn't. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."
Andrew: One of the emerging problems for developers, especially client-side developers, is changing their thinking to write programs in terms of threads. This concern, driven by the advent of inexpensive multicore PCs, surely will require that many algorithms be recast for multithreading, or at least to be thread-safe. So far, much of the work you've published for Volume 4 of The Art of Computer Programming (TAOCP) doesn't seem to touch on this dimension. Do you expect to enter into problems of concurrency and parallel programming in upcoming work, especially since it would seem to be a natural fit with the combinatorial topics you're currently working on?
Donald: The field of combinatorial algorithms is so vast that I'll be lucky to pack its sequential aspects into three or four physical volumes, and I don't think the sequential methods are ever going to be unimportant. Conversely, the half-life of parallel techniques is very short, because hardware changes rapidly and each new machine needs a somewhat different approach. So I decided long ago to stick to what I know best. Other people understand parallel machines much better than I do; programmers should listen to them, not me, for guidance on how to deal with simultaneity.
Andrew: Vendors of multicore processors have expressed frustration at the difficulty of moving developers to this model. As a former professor, what thoughts do you have on this transition and how to make it happen? Is it a question of proper tools, such as better native support for concurrency in languages, or of execution frameworks? Or are there other solutions?
Donald: I don't want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they're trying to pass the blame for the future demise of Moore's Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won't be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Itanium" approach that was supposed to be so terrific -- until it turned out that the wished-for compilers were basically impossible to write.
Let me put it this way: During the past 50 years, I've written well over a thousand programs, many of which have substantial size. I can't think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX.
How many programmers do you know who are enthusiastic about these promised machines of the future? I hear almost nothing but grief from software people, although the hardware folks in our department assure me that I'm wrong.
I know that important applications for parallelism exist -- rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.
Even if I knew enough about such methods to write about them in TAOCP, my time would be largely wasted, because soon there would be little reason for anybody to read those parts. (Similarly, when I prepare the third edition of Volume 3 I plan to rip out much of the material about how to sort on magnetic tapes. That stuff was once one of the hottest topics in the whole software field, but now it largely wastes paper when the book is printed.)
The machine I use today has dual processors. I get to use them both only when Iím running two independent jobs at the same time; thatís nice, but it happens only a few minutes every week. If I had four processors, or eight, or more, I still wouldnít be any better off, considering the kind of work I doóeven though Iím using my computer almost every day during most of the day. So why should I be so happy about the future that hardware vendors promise? They think a magic bullet will come along to make multicores speed up my kind of work; I think itís a pipe dream. (Noóthatís the wrong metaphor! "Pipelines" actually work for me, but threads donít. Maybe the word I want is "bubble.")
From the opposite point of view, I do grant that web browsing probably will get better with multicores. Iíve been talking about my technical work, however, not recreation. I also admit that I havenít got many bright ideas about what I wish hardware designers would provide instead of multicores, now that theyíve begun to hit a wall with respect to sequential computation. (But my MMIX design contains several ideas that would substantially improve the current performance of the kinds of programs that concern me mostóat the cost of incompatibility with legacy x86 programs.)
Andrew: One of the few projects of yours that hasnít been embraced by a widespread community is literate programming. What are your thoughts about why literate programming didnít catch on? And is there anything youíd have done differently in retrospect regarding literate programming?
Donald: Literate programming is a very personal thing. I think itís terrific, but that might well be because Iím a very strange person. It has tens of thousands of fans, but not millions.
In my experience, software created with literate programming has turned out to be significantly better than software developed in more traditional ways. Yet ordinary software is usually okayóIíd give it a grade of C (or maybe C++), but not F; hence, the traditional methods stay with us. Since theyíre understood by a vast community of programmers, most people have no big incentive to change, just as Iím not motivated to learn Esperanto even though it might be preferable to English and German and French and Russian (if everybody switched).
Jon Bentley probably hit the nail on the head when he once was asked why literate programming hasnít taken the whole world by storm. He observed that a small percentage of the worldís population is good at programming, and a small percentage is good at writing; apparently I am asking everybody to be in both subsets.
Yet to me, literate programming is certainly the most important thing that came out of the TeX project. Not only has it enabled me to write and maintain programs faster and more reliably than ever before, and been one of my greatest sources of joy since the 1980sóit has actually been indispensable at times. Some of my major programs, such as the MMIX meta-simulator, could not have been written with any other methodology that Iíve ever heard of. The complexity was simply too daunting for my limited brain to handle; without literate programming, the whole enterprise would have flopped miserably.
If people do discover nice ways to use the newfangled multithreaded machines, I would expect the discovery to come from people who routinely use literate programming. Literate programming is what you need to rise above the ordinary level of achievement. But I donít believe in forcing ideas on anybody. If literate programming isnít your style, please forget it and do what you like. If nobody likes it but me, let it die.
On a positive note, Iíve been pleased to discover that the conventions of CWEB are already standard equipment within preinstalled software such as Makefiles, when I get off-the-shelf Linux these days.
Andrew: In Fascicle 1 of Volume 1, you reintroduced the MMIX computer, which is the 64-bit upgrade to the venerable MIX machine comp-sci students have come to know over many years. You previously described MMIX in great detail in MMIXware. Iíve read portions of both books, but canít tell whether the Fascicle updates or changes anything that appeared in MMIXware, or whether itís a pure synopsis. Could you clarify?
Donald: Volume 1 Fascicle 1 is a programmerís introduction, which includes instructive exercises and such things. The MMIXware book is a detailed reference manual, somewhat terse and dry, plus a bunch of literate programs that describe prototype software for people to build upon. Both books define the same computer (once the errata to MMIXware are incorporated from my website). For most readers of TAOCP, the first fascicle contains everything about MMIX that theyíll ever need or want to know.
I should point out, however, that MMIX isn't a single machine; it's an architecture with almost unlimited varieties of implementations, depending on different choices of functional units, different pipeline configurations, different approaches to multiple-instruction-issue, different ways to do branch prediction, different cache sizes, different strategies for cache replacement, different bus speeds, etc. Some instructions and/or registers can be emulated with software on "cheaper" versions of the hardware. And so on. It's a test bed, all simulatable with my meta-simulator, even though advanced versions would be impossible to build effectively until another five years go by (and then we could ask for even further advances just by advancing the meta-simulator specs another notch).
Suppose you want to know if five separate multiplier units and/or three-way instruction issuing would speed up a given MMIX program. Or maybe the instruction and/or data cache could be made larger or smaller or more associative. Just fire up the meta-simulator and see what happens.
Andrew: As I suspect you don't use unit testing with MMIXAL, could you step me through how you go about making sure that your code works correctly under a wide variety of conditions and inputs? If you have a specific work routine around verification, could you describe it?
Donald: Most examples of machine language code in TAOCP appear in Volumes 1-3; by the time we get to Volume 4, such low-level detail is largely unnecessary and we can work safely at a higher level of abstraction. Thus, I've needed to write only a dozen or so MMIX programs while preparing the opening parts of Volume 4, and they're all pretty much toy programs -- nothing substantial. For little things like that, I just use informal verification methods, based on the theory that I've written up for the book, together with the MMIXAL assembler and MMIX simulator that are readily available on the Net (and described in full detail in the MMIXware book).
That simulator includes debugging features like the ones I found so useful in Ed Satterthwaite's system for ALGOL W, mentioned earlier. I always feel quite confident after checking a program with those tools.
Andrew: Despite its formulation many years ago, TeX is still thriving, primarily as the foundation for LaTeX. While TeX has been effectively frozen at your request, are there features that you would want to change or add to it, if you had the time and bandwidth? If so, what are the major items you would add or change?
Donald: I believe changes to TeX would cause much more harm than good. Other people who want other features are creating their own systems, and I've always encouraged further development -- except that nobody should give their program the same name as mine. I want to take permanent responsibility for TeX and Metafont, and for all the nitty-gritty things that affect existing documents that rely on my work, such as the precise dimensions of characters in the Computer Modern fonts.
Andrew: One of the little-discussed aspects of software development is how to do design work on software in a completely new domain. You were faced with this issue when you undertook TeX: No prior art was available to you as source code, and it was a domain in which you weren't an expert. How did you approach the design, and how long did it take before you were comfortable entering into the coding portion?
Donald: That's another good question! I've discussed the answer in great detail in Chapter 10 of my book Literate Programming, together with Chapters 1 and 2 of my book Digital Typography. I think that anybody who is really interested in this topic will enjoy reading those chapters. (See also Digital Typography Chapters 24 and 25 for the complete first and second drafts of my initial design of TeX in 1977.)
Andrew: The books on TeX and the program itself show a clear concern for limiting memory usage -- an important problem for systems of that era. Today, the concern for memory usage in programs has more to do with cache sizes. As someone who has designed a processor in software, the issues of cache-aware and cache-oblivious algorithms surely must have crossed your radar screen. Is the role of processor caches on algorithm design something that you expect to cover, even if indirectly, in your upcoming work?
Donald: I mentioned earlier that MMIX provides a test bed for many varieties of cache. And it's a software-implemented machine, so we can perform experiments that will be repeatable even a hundred years from now. Certainly the next editions of Volumes 1-3 will discuss the behavior of various basic algorithms with respect to different cache parameters.
In Volume 4 so far, I count about a dozen references to cache memory and cache-friendly approaches (not to mention a "memo cache," which is a different but related idea in software).
Andrew: What set of tools do you use today for writing TAOCP? Do you use TeX? LaTeX? CWEB? Word processor? And what do you use for the coding?
Donald: My general working style is to write everything first with pencil and paper, sitting beside a big wastebasket. Then I use Emacs to enter the text into my machine, using the conventions of TeX. I use tex, dvips, and gv to see the results, which appear on my screen almost instantaneously these days. I check my math with Mathematica.
I program every algorithm that's discussed (so that I can thoroughly understand it) using CWEB, which works splendidly with the GDB debugger. I make the illustrations with MetaPost (or, in rare cases, on a Mac with Adobe Photoshop or Illustrator). I have some homemade tools, like my own spell-checker for TeX and CWEB within Emacs. I designed my own bitmap font for use with Emacs, because I hate the way the ASCII apostrophe and the left open quote have morphed into independent symbols that no longer match each other visually. I have special Emacs modes to help me classify all the tens of thousands of papers and notes in my files, and special Emacs keyboard shortcuts that make bookwriting a little bit like playing an organ. I prefer rxvt to xterm for terminal input. Since last December, I've been using a file backup system called backupfs, which meets my need beautifully to archive the daily state of every file.
According to the current directories on my machine, I've written 68 different CWEB programs so far this year. There were about 100 in 2007, 90 in 2006, 100 in 2005, 90 in 2004, etc. Furthermore, CWEB has an extremely convenient "change file" mechanism, with which I can rapidly create multiple versions and variations on a theme; so far in 2008 I've made 73 variations on those 68 themes. (Some of the variations are quite short, only a few bytes; others are 5KB or more. Some of the CWEB programs are quite substantial, like the 55-page BDD package that I completed in January.) Thus, you can see how important literate programming is in my life.
I currently use Ubuntu Linux, on a standalone laptop -- it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux. Incidentally, with Linux I much prefer the keyboard focus that I can get with classic FVWM to the GNOME and KDE environments that other people seem to like better. To each his own.
Andrew: You state in the preface of Fascicle 0 of Volume 4 of TAOCP that Volume 4 surely will comprise three volumes and possibly more. It's clear from the text that you're really enjoying writing on this topic. Given that, what is your confidence in the note posted on the TAOCP website that Volume 5 will see light of day by 2015?
Donald: If you check the Wayback Machine for previous incarnations of that web page, you will see that the number 2015 has not been constant.
You're certainly correct that I'm having a ball writing up this material, because I keep running into fascinating facts that simply can't be left out -- even though more than half of my notes don't make the final cut.
Precise time estimates are impossible, because I can't tell until getting deep into each section how much of the stuff in my files is going to be really fundamental and how much of it is going to be irrelevant to my book or too advanced. A lot of the recent literature is academic one-upmanship of limited interest to me; authors these days often introduce arcane methods that outperform the simpler techniques only when the problem size exceeds the number of protons in the universe. Such algorithms could never be important in a real computer application. I read hundreds of such papers to see if they might contain nuggets for programmers, but most of them wind up getting short shrift.
From a scheduling standpoint, all I know at present is that I must someday digest a huge amount of material that I've been collecting and filing for 45 years. I gain important time by working in batch mode: I don't read a paper in depth until I can deal with dozens of others on the same topic during the same week. When I finally am ready to read what has been collected about a topic, I might find out that I can zoom ahead because most of it is eminently forgettable for my purposes. On the other hand, I might discover that it's fundamental and deserves weeks of study; then I'd have to edit my website and push that number 2015 closer to infinity.
Andrew: In late 2006, you were diagnosed with prostate cancer. How is your health today?
Donald: Naturally, the cancer will be a serious concern. I have superb doctors. At the moment I feel as healthy as ever, modulo being 70 years old. Words flow freely as I write TAOCP and as I write the literate programs that precede drafts of TAOCP. I wake up in the morning with ideas that please me, and some of those ideas actually please me also later in the day when I've entered them into my computer.
On the other hand, I willingly put myself in God's hands with respect to how much more I'll be able to do before cancer or heart disease or senility or whatever strikes. If I should unexpectedly die tomorrow, I'll have no reason to complain, because my life has been incredibly blessed. Conversely, as long as I'm able to write about computer science, I intend to do my best to organize and expound upon the tens of thousands of technical papers that I've collected and made notes on since 1962.
Andrew: On your website, you mention that the Peoples Archive recently made a series of videos in which you reflect on your past life. In segment 93, "Advice to Young People," you advise that people shouldn't do something simply because it's trendy. As we know all too well, software development is as subject to fads as any other discipline. Can you give some examples that are currently in vogue, which developers shouldn't adopt simply because they're currently popular or because that's the way they're currently done? Would you care to identify important examples of this outside of software development?
Donald: Hmm. That question is almost contradictory, because I'm basically advising young people to listen to themselves rather than to others, and I'm one of the others. Almost every biography of every person whom you would like to emulate will say that he or she did many things against the "conventional wisdom" of the day.
Still, I hate to duck your questions even though I also hate to offend other people's sensibilities -- given that software methodology has always been akin to religion. With the caveat that there's no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I've ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other's code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.
I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you're totally convinced that reusable code is wonderful, I probably won't be able to sway you anyway, but you'll never convince me that reusable code isn't mostly a menace.
Here's a question that you may well have meant to ask: Why is the new book called Volume 4 Fascicle 0, instead of Volume 4 Fascicle 1? The answer is that computer programmers will understand that I wasn't ready to begin writing Volume 4 of TAOCP at its true beginning point, because we know that the initialization of a program can't be written until the program itself takes shape. So I started in 2005 with Volume 4 Fascicle 2, after which came Fascicles 3 and 4. (Think of Star Wars, which began with Episode 4.)
About: nwbintools is a machine code tool-chain containing an assembler and various related development tools. It will thus be similar to GNU's binutils, but no attempts are made to duplicate its functionality, organization, or interfaces. The assembler works on x86 ELF-based Linux and FreeBSD systems.
Changes: This release fixes various critical bugs. Additionally, assembler performance has more than doubled, and approaches GNU assembler speed levels now.
Assembler Book: Table of Contents
Volume One - Data Representation
- Chapter One: Foreword
- Chapter Two: Hello, World of Assembly
- Chapter Three: Data Representation
- Chapter Four: More Data Representation
- Chapter Five: Questions, Projects, and Lab Exercises
Volume Two - Introduction to Machine Architecture
- Chapter One: System Organization
- Chapter Two: Memory Access and Organization
- Chapter Three: Introduction to Digital Design
- Chapter Four: CPU Architecture
- Chapter Five: Instruction Set Architecture
- Chapter Six: Memory Architecture
- Chapter Seven: The I/O Subsystem
- Chapter Eight: Questions, Projects, and Lab Exercises
Volume Three - Basic Assembly Language Programming
- Chapter One: Constants, Variables, and Data Types
- Chapter Two: Character Strings
- Chapter Three: Characters and Character Sets
- Chapter Four: Arrays
- Chapter Five: Records, Unions, and Namespaces
- Chapter Six: Dates and Times
- Chapter Seven: File I/O
- Chapter Eight: Introduction to Procedures
- Chapter Nine: Managing Large Programs
- Chapter Ten: Integer Arithmetic
- Chapter Eleven: Real Arithmetic
- Chapter Twelve: Calculation Via Table Lookup
- Chapter Thirteen: Questions, Projects, and Lab Exercises
Volume Four - Intermediate Assembly Language Programming
- Chapter One: Advanced High Level Control Structures
- Chapter Two: Low Level Control Structures
- Chapter Three: Intermediate Procedures
- Chapter Four: Advanced Arithmetic
- Chapter Five: Bit Manipulation
- Chapter Six: String Instructions
- Chapter Seven: The HLA Compile-Time Language
- Chapter Eight: Macros
- Chapter Nine: Domain Specific Languages
- Chapter Ten: Classes and Objects
- Chapter Eleven: The MMX Instruction Set
- Chapter Twelve: Mixed Language Programming
- Chapter Thirteen: Questions, Projects, and Lab Exercises
Volume Five - Advanced Procedures
- Chapter One: Thunks
- Chapter Two: Iterators
- Chapter Three: Coroutines
- Chapter Four: Low Level Parameter Implementation
- Chapter Five: Lexical Nesting
- Chapter Six: Questions, Projects, and Lab Exercises
Appendices
- Appendix A: Solutions to Selected Exercises (not available)
- Appendix B: Console Graphic Characters
- Appendix C: HLA Programming Style Guidelines
- Appendix D: The 80x86 Instruction Set
- Appendix E: HLA Language Reference (not available)
- Appendix F: HLA Standard Library Reference (not available)
- Appendix G: HLA Exceptions
- Appendix H: HLA Compile-Time Functions
- Appendix I: Installing HLA on Your System
- Appendix J: Debugging HLA Programs
- Appendix K: Comparison of HLA and MASM (not available)
- Appendix L: Code Generation for HLA High Level Statements
About: GXemul is an instruction-level machine emulator. In addition to simulating CPUs, surrounding hardware components are also simulated, in some cases well enough to run real (unmodified) operating systems.
Changes: Stabilization fixes were made in the dyntrans core, and performance fixes were made for the new MIPS emulation mode.
Reminds me of chess, May 8, 2005
W Boudville (US)
Decades ago, when Knuth wrote the first edition of his classic Art of Computer Programming, he invented an assembly language in which to implement the many algorithms of the books. He called it MIX. It was quite representative of the actual assemblers of the time [late 60s]. But time and Moore's Law marched on. The 8 bit nature of MIX grew increasingly outdated.
In response, Knuth gives us here a massively upgraded version, called MMIX. It operates on 64 bit wide data. Yay! Still a classic von Neumann architecture, mind you. But very spiffy. MMIX also has 256 general purpose registers and 32 special purpose registers, where these all are 64 bits wide, naturally. Plus, MMIX lives in an address space of 2**64 bytes of memory.
Unlike the Intel or AMD chips, which are CISC, Knuth opted for a RISC MMIX. So learning the opcodes is very rapid, if you have dealt with assemblers before.
This little text gets you up to speed in MMIX. Consider it as prep for the full volume 4, when that comes out. [Prof. Knuth, it's late.]
But this MMIX book is utterly unlike any other assembler book. It comes replete with programming problems (and answers) of considerable intellectual heft. Conventional assembler books simply don't do this. Their problems tend to be mundane and trivial. This book lets you find surprising conceptual depths hidden under a deceptively simple language. Compare this to chess.
[Jun 20, 2005] The IEEE standard for floating point arithmetic
Microsoft/INFO Why Floating Point Numbers May Lose Precision
Floating point decimal values generally do not have an exact binary representation. This is a side effect of how the CPU represents floating point data. For this reason, you may experience some loss of precision, and some floating point operations may produce unexpected results. This behavior is the end result of one of the following:
- The binary representation of the decimal number may not be exact.
- There is a type mismatch between the numbers used (for example, mixing float and double).
To resolve the behavior, you can either ensure that the value is greater or less than what you need, or you can get and use a Binary Coded Decimal (BCD) library that will maintain the precision.
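The effects described above are easy to reproduce. The following is a minimal sketch in Python, not taken from the Microsoft article: the helper `to_float32` is a hypothetical name introduced here to imitate mixing `float` and `double`, and Python's standard `decimal` module is used as a stand-in for the BCD library the article mentions (it is decimal arithmetic, but not literally BCD).

```python
import math
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a Python float (a 64-bit double) to 32-bit precision and back.

    Hypothetical helper for illustration; it mimics storing a value in a
    C `float` and then comparing it against a `double`.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


# 1. The binary representation of a decimal number may not be exact.
total = 0.1 + 0.2
exact_equal = (total == 0.3)                            # exact comparison fails

# Workaround: test whether the value is within a tolerance of the target
# instead of testing for exact equality.
close_enough = math.isclose(total, 0.3, rel_tol=1e-9)

# 2. Type mismatch: 0.1 rounded to 32 bits is not the same number as
#    0.1 rounded to 64 bits.
mixed_equal = (to_float32(0.1) == 0.1)

# 3. Decimal arithmetic (assumed stand-in for a BCD library) keeps the
#    decimal digits exact.
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(exact_equal, close_enough, mixed_equal, decimal_equal)
```

The tolerance `1e-9` is an arbitrary choice for this sketch; a real program should pick one that matches the magnitude and accuracy of its own data.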
Assembly Language Tutor. Very Short Intro. June 12th 1995 Copyright(C)1995-1996
Assembly Language Copyright Brian Brown, 1988-2000. All rights reserved.
This courseware is subject to copyright and may not be reproduced or copied without the written permission of the author. You may not redistribute this courseware without permission. If you are an educator, you may reference this material for use by your students, and if you purchase the CD, may host the files locally on your own network and print them out for student use or reference.
- Assembly language programming, history, operation
- Asm, directives and data types
- Asm, HLL constructs
- Asm, Data conversion routines
- Asm, intel iAPX8086
- Asm, Parameter Passing
- Asm, Interfacing to HLL, Linking, Options, Debug output
- Asm, Make, Map files, Segment types
- iAPX386, part 1, basic principles
- iAPX386, part 2, programming model, segmentation and paging
- iAPX386, part 3, addressing modes
- iAPX386, part 4, instruction sets
- iAPX386, part 5, protected mode programming examples
- Interrupts and Real-Time Considerations
- MC68000 Addressing Modes
- PC-Multi-Tasking Executive
Linux.com Some Linux apps are small wonders
At 13K, this editor has a size that is something from the DOS era :-). It is written in NASM assembler. It's refreshing to see a useful program that is less than a megabyte ;-)
e3 text editor
The e3 console text editor takes minimalism to the max - the binary is a minuscule 13KB in size! So why use this instead of [insert the name of your favorite editor here]? If you're anything like me, you'll find a lot of your editing tasks are very short -- little more than tweaks. e3 starts instantly and has all the basic features you could want, including find/replace, block cut/copy/paste, and undo. For complex tasks I use a more feature-packed program, but for a quick change to /etc/fstab or something similar, this little editor wins every time.
e3 also does its best to be ubiquitous. It works on a whole host of operating systems, and perhaps best of all, it supports keyboard mappings that emulate WordStar, Pico, emacs, vi, and Nedit. You can hardly fail to feel at home with it.
Ketman Assembly Language Tutorial
An assembly-language tutorial is not the same thing as an assembler tutorial - or not usually. Still less usually would it be an interpreter tutorial. But in this case that's what it will turn out to be.
Because in my view there is nothing worse than trying to follow a technical explanation off the bare page, with nothing to illustrate the logical or arithmetical operations except your own imagination. I remember trying to learn ASM myself out of a book, and often wondered aloud why there was no tool that could help me understand what I was reading.
Assembly Language Techniques for the Solaris OS, x86 Platform Edition
It is an interesting exercise to read compiler-generated assembly code and find ways to make it more efficient.
Long ago, I took an assembly language course in college that focused on the x86 instruction set (286 at the time) and used MSDOS as the foundation. A lot has changed since then. Processors are much faster, compilers are much smarter at generating code, and software engineers are writing much larger programs. Has the software world changed enough over the years that application programmers don't need to worry about assembly language any more? Yes and no.
Yes, because companies tend to be more and more concerned with portability and look to new hardware to provide them with the performance they require.
No, because many enterprise solutions are judged on their price/performance ratio, in which any advantage in performance could be rewarded with a higher profit margin.
I have found that many of Sun's partners still use assembly language in their products to ensure that hot code paths are as efficient as possible. While compilers are able to generate much more efficient code today, the resulting code still doesn't always compete with hand-coded assembly written by an engineer that knows how to squeeze performance out of each microprocessor instruction. Assembly language remains a powerful tool for optimization, granting the programmer greater control, and with judicious use can enhance performance. Assembly language can also be a liability. It requires more specialized talent to maintain than higher-level languages and is not portable.
This paper discusses assembly language techniques for the Solaris Operating System running on the x86 architecture. My focus is to help others not just figure out how to integrate assembly language into their projects, but also help demonstrate that assembly language is not always the answer for better performance. I will not be covering the x86 instruction set or how to write assembly code. There are many books on the market that cover those topics extensively.
All of the examples provided in this paper can be compiled with either the compiler in Sun Studio software or GCC, except as noted in the text.
Coroutines are functions or procedures that save control state between calls (as opposed to, but very similar to, Generators, such as Random Number Generators, that save data state between calls). The most common example is a lexical analyser that behaves differently on different calls because it is tracking e.g. whether it is currently inside a function definition or outside a function definition.
Coroutines are very important as one of the very few examples of a form of concurrency that is useful, and yet constrained enough to completely avoid the typical difficulties (race conditions, deadlock, etc). Synchronization is built-in to the paradigm. It therefore cannot in general replace more general unconstrained forms of concurrency, but for some things it appears to be the ideal solution.
Lex and GNU Flex implement coroutine-based lexical analyzers, as can be seen by their support for entering and leaving user-defined states that allow for state-dependent pattern recognition.
In their most intuitive implementation (e.g. in languages that directly support them), they return control (and optionally a value) to the caller with a Yield statement, similarly to a return from a function, but when called (e.g. with "resume coroutine_name"), they resume execution at the point immediately following the previous yield.
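The yield/resume behaviour can be sketched with a Python generator, which is a restricted form of coroutine (it can only yield back to its caller). The example is an illustration under assumptions, not code from the quoted page: `brace_tracker` and `classify` are hypothetical names, and "inside a function definition" is simplified to "inside a pair of braces".

```python
def brace_tracker():
    """Coroutine that remembers, between calls, whether we are inside {...}.

    The control state lives in the suspended generator frame: each send()
    resumes execution immediately after the previous yield.
    """
    depth = 0
    state = "outside"
    while True:
        token = yield state          # suspend here; resume on the next send()
        if token == "{":
            depth += 1
        elif token == "}" and depth > 0:
            depth -= 1
        state = "inside" if depth > 0 else "outside"


def classify(tokens):
    """Feed tokens to the coroutine and pair each token with the state after it."""
    tracker = brace_tracker()
    next(tracker)                    # run the coroutine up to its first yield
    return [(tok, tracker.send(tok)) for tok in tokens]


print(classify(["f", "{", "x", "}", "g"]))
```

The same token is reported differently depending on what the coroutine saw earlier, which is the state-dependent behaviour the lexical-analyser example describes. No locking is needed because only one of the two routines runs at any moment.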
Processes, Coroutines, and Concurrency Chapter 19
Comp.compilers Re Assembly verses a high-level language.
Newsgroups: comp.compilers From: firstname.lastname@example.org (Stavros Macrakis) In-Reply-To: email@example.com's message of Mon, 20 Nov 1995 03:53:54 GMT Keywords: performance, assembler Organization: OSF Research Institute References: 95-11-166 Date: Wed, 22 Nov 1995 21:21:12 GMT
firstname.lastname@example.org (Tom Powell ) writes:
How come programs written in assembly are so much faster than any
other high-level language. I know that it is a low-level language
and that it "speaks" directly to the hardware so it is faster, but
why can't high-level languages compile programs just as fast as
First of all, assembler is not an "other" high-level language. It is
the low-level language par excellence, lower-level even than C :-).
There are several reasons that assembly programs can be faster than compiled programs:
The assembly programmer can design data structures which take maximum advantage of the instruction set. To a certain extent, you can do this in languages like C if you're willing to write code that is specific to the architecture. But there are some instructions which are so specialized that it is very hard for compilers to recognize that they're the best way to do things; this is mostly true in CISC architectures.
The assembly programmer typically can estimate which parts of the program need the most optimization, and apply a variety of tricks which it would be a bad idea to apply everywhere, because they make the code larger. I don't know of any compilers that allow "turning up" or "turning down" optimization for code fragments, although most will allow it for compilation modules.
The assembly programmer can sometimes use specialized runtime structures, such as for instance reserving some registers globally for things that are often used, or designing special conventions for register use and parameter passing in a group of procedures. Another example is using the top of the stack as a local, unbounded stack without respecting frame conventions.
Some control structures are not widely supported by commonly-used higher-level languages, or are too general. For instance, coroutines are provided by very few languages. Many languages now provide threads, which are a generalization of coroutines, but often have more overhead.
The assembly programmer is sometimes willing to do global analysis which most compilers currently don't do.
Finally, the assembly programmer is more immediately aware of the cost of operations, and thus tends to choose more carefully as a function of cost. As language level rises, the cost of a given operation generally becomes less and less predictable.
All this said, there is no guarantee that an assembly program will be faster than a compiled program. A program of a given functionality will take longer to develop in assembler than in a higher-level language, so less time is available for design and performance tuning. Re-design is particularly painful in assembler since many decisions are written into the code. In many programs, large improvements can be made in performance by improving algorithms rather than coding; assembler is a disadvantage here since coding time is larger, and flexibility is less. Finally, it is harder to write reliable assembly code than reliable higher-level language code; getting a core dump faster is not much use.
Compiler writers have tried, over time, to incorporate some of these advantages of assembler. The "coalescing" style of compiler in particular in many ways resembles the work of a good assembly programmer: design your data structures and inner loops together, and early on in the design process. Various kinds of optimization and global analysis are done by compilers, but in the absence of application knowledge, it is hard to bound their runtime. (Another thread in this group talked about the desirability of turning optimization up very high in some cases.)
The Simple Machine Language Interpreter implements a "toy" machine language, which is intended to teach basic processor and computing concepts. It includes definitions of the relevant concepts and the language and an example program showing how to use the interpreter.
The name MASM originally referred to MACRO ASSEMBLER but over the years it has become synonymous with Microsoft Assembler.
MASM is a programming tool with a very long history and has been in constant development since its earliest version in 1981. Below is the copyright notice when ML.EXE version 6.14 is run from the command line.
Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997. All rights reserved.
MASM preserves the historical Intel syntax for writing x86 assembler and it is a de facto industrial standard because of the length of time it has been a premier programming tool. From the late 80s onwards it has been compatible with whatever has been the current Microsoft object file format and this has made assembler modules written in MASM directly compatible with other Microsoft languages.
The current versions of ML.EXE (the assembler in MASM) range in the series 7.??? but the architecture and format of current and recent versions of ML.EXE are based on version 6.00, which was released in 1990. The technical data published in the manuals for version 6.00 and for MASM 6.11d, the last separate release as a commercial product, still mainly apply to the most recent versions.
Microsoft released version 6.11d in 1993, including 32 bit Windows operating system capacity. It is the original Win32 assembler and was supported in that release with 32 bit assembler code examples that would run on the early 32 bit versions of Windows NT.
Debunking the myths about MASM
1. MASM is no longer supported or upgraded by Microsoft.
Nobody has told this to Microsoft, who supply ML.EXE 7.??? in the XPDDK. They also supply version 6.15 of ML.EXE in a processor pack for their flagship Visual C/C++ product. It is upgraded on a needs basis by Microsoft, noting that it has been in constant development since 1981 and does not need much done to it apart from the occasional upgrade. It is well known that Microsoft use MASM to write critical parts of their own operating systems.
2. MASM produces bloated files that are larger than true low level assemblers.
This folklore does not fit the facts. In 16 bit code MASM can produce a 2 byte com file (int 19h) and in modern 32 bit PE (Portable Executable) files, it can produce a 1024 byte working window which is the PE specification minimum size for a PE file. Much of this folklore was invented in the middle 90s by people who could not ever write assembler code in MASM.
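The 2 byte figure is easy to check by hand. A .COM file is a raw memory image with no header, and the x86 `INT imm8` instruction encodes as the opcode byte CDh followed by the interrupt number. The sketch below does not involve MASM at all; it only builds the bytes an assembler would be expected to emit for `int 19h`, and the function name `int_instruction` is a hypothetical helper introduced for illustration.

```python
def int_instruction(vector: int) -> bytes:
    """Encode the x86 `INT imm8` instruction: opcode 0xCD, then the vector."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must fit in one byte")
    return bytes([0xCD, vector])


# A .COM file has no header, so the whole file is just the code itself.
com_image = int_instruction(0x19)

print(len(com_image))    # 2
print(com_image.hex())   # cd19
```

Writing `com_image` to a file with a .com extension would give the two-byte program described above; since INT 19h invokes the BIOS bootstrap loader, it should only be run inside a DOS emulator or a disposable machine.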
3. MASM is not a true low level assembler.
MASM can write at the lowest possible level in an assembler with direct DB sequences in the code section. The recent versions can write the entire Intel instruction set up to SSE2. Many confuse the higher level simulations in MASM with an incapacity to write true low level code. MASM is the archetypal MACRO assembler for the Windows platform, and this macro capacity is powerful enough to emulate many forms of higher level constructions.
4. MASM is only an accessory for C programming.
Nobody seems to have told this to the hundreds of thousands of assembler programmers who write executable programs with it. The difference with MASM is that it is powerful enough to use a C compiler as an accessory to assembler programming.
5. MASM modifies your code when it assembles it.
What you write is what you get with MASM. If you use some of the more common higher level capacities in MASM, you get the format they are written to create. This applies to the use of a stack frame for procedures and to characteristics like jump length extension if a jump is too far away from the label it jumps to. In both instances you have options to change this: you can disable jump extensions and specify the jump range with SHORT or NEAR, and for a procedure where you don't want a stack frame, you use standard MASM syntax to turn the stack frame off and re-enable it after the procedure is finished. This technique is commonly used when writing reusable code for libraries where an extra register is required or where the stack overhead may slightly slow down the procedure.
6. MASM code is too complex for a beginner to learn.
This is another myth from people trying to support other assemblers that don't have the code parsing power of MASM. MASM can be used to write anything from very simple code to extremely complex code. The difference is that it is powerful enough to do both without the ugly and complicated syntax of some of the other assemblers around that don't have the parsing power of MASM. Unlike some of the less powerful assemblers, MASM uses the historical Intel notation when it uses keywords like OFFSET, as this correctly distinguishes between a fixed address within an executable image and a stack variable which is created at run time when a procedure is called. When MASM requires operators like BYTE PTR, it is to resolve ambiguous notation forms where the size of the data cannot be determined any other way.
7. MASM is really a compiler because it has high level code.
MASM is capable of writing many normal high level constructions like structures, unions, pointers and high level style procedures that have both parameter size checking and parameter count checking, but the difference again is that it is powerful enough to do this where many of the others are not. While you can write unreliable and impossible to debug code like some other assemblers must do, with MASM you have the choice to write highly reliable code that is subject to type checking. The real problem is that such capacities are technically hard to write into an assembler, and while some of the authors of other assemblers are very highly skilled programmers, an individual does not have the resources or capacity of a large software corporation that has developed MASM over 23 years.
[Dec 7, 2005] flat assembler This is a place dedicated to assembly language programming for x86 and x86-64 systems and contains many resources for both beginners and advanced assembly programmers. This site is constantly being improved, and hopefully you'll find here some useful materials, no matter whether you are trying to learn the assembly language, or just are looking for the solution for some particular problem.
[Jun 4, 2005] Assembly Language Windows Applications
[Jun 4, 2002] assemblylanguage.net - Assembly Language Resources -- nice collection of links
[Mar 4, 2002] BURKS Assembly languages -- the online version of the 6th edition of BURKS, a non-profit set of CD-ROMs for students of Computer Science.
[Feb 02, 2002] VASM - Visual Assembler IDE
Every Linux distribution includes gas (GNU Assembler), which uses AT&T syntax that is inconvenient for PC programmers. There is an assembler with Intel syntax: nasm. Download and install the binary packages for Linux and the docs (note: the Stampede Linux distribution includes nasm). All distributions include ld; it is contained in the binutils package.
Floating point decimal values generally do not have an exact binary representation. This is a side effect of how the CPU represents floating point data. For this reason, you may experience some loss of precision, and some floating point operations may produce unexpected results. This behavior is the end result of one of the following:
- The binary representation of the decimal number may not be exact.
- There is a type mismatch between the numbers used (for example, mixing float and double).
To resolve the behavior, you can either ensure that the value is greater or less than what you need, or you can get and use a Binary Coded Decimal (BCD) library that will maintain the precision.
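Both causes and both remedies can be seen in a few lines. The sketch below is in Python, whose `float` is an IEEE 754 double; single precision is simulated by a round trip through `struct`, and the standard `decimal` module stands in for the BCD library mentioned above (it is a decimal arithmetic library, not literally BCD, so treat that substitution as an assumption). The helper name `as_float32` is made up for this example.

```python
import math
import struct
from decimal import Decimal


def as_float32(x: float) -> float:
    """Round a Python float (a C double) to single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Cause 1: 0.1, 0.2 and 0.3 have no exact binary representation.
total = 0.1 + 0.2
exact_sum_matches = (total == 0.3)

# Cause 2: the same literal stored as float and as double is not the same value.
mixed_matches = (as_float32(0.1) == 0.1)

# Remedy A: compare within a tolerance instead of testing for equality.
close_enough = math.isclose(total, 0.3, rel_tol=1e-9)

# Remedy B: use decimal arithmetic, which represents 0.1 exactly.
decimal_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(exact_sum_matches, mixed_matches, close_enough, decimal_matches)
# False False True True
```

The tolerance in remedy A is a judgment call that depends on the magnitude of the values being compared; `1e-9` is only a placeholder.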
This is probably the oldest surviving assembler culture in existence
The Assembler Connection System 360/370 assembler site
The Assembler Connection provides a suite of sample programs that are written to assemble and link using Assembler/H or High Level Assembler (HLASM) when possible. If a technique is used that is unique to a specific dialect it will be noted. JCL members are provided to run the jobs as MVS batch jobs on an IBM mainframe or within a project using Micro Focus Mainframe Express (MFE) running on a PC with Windows. The 370 Assembler Option for MFE is required to run on the PC.
See also Softpanorama Donald Knuth page -- this is my modest tribute to the greatest programmer I have ever known.
If you are a real assembler language guru and like algorithms, please volunteer for the MMIX project.
MMIXmasters Home Page -- MMIX is the new computer and associated assembly language that Donald Knuth will be using to specify the algorithms in the next edition of his series, The Art of Computer Programming (TAOCP). Currently (ca 1999) three volumes have been published, and more are in preparation. Knuth will use MMIX as the low-level programming language in the "ultimate" edition of his opus.
This web site is for the volunteers---the MMIXmasters---who are converting all of the programs in TAOCP, Volumes 1 - 3 from the old language MIX to the new language MMIX. Although the primary purpose of this site is to serve the MMIXmasters, non-volunteers are still welcome
MMIXware: A RISC Computer for the Third Millennium
Definition of architecture details (142KB of compressed PostScript) (version of 1 January 2001)
Definition of the assembly language and loader format (60KB of compressed PostScript) (version of 24 July 2000)
Definition of simple I/O, the runtime environment, and the simulator's online/offline interaction commands (45KB) (version of 24 July 2000)
People have been accumulating several months of experience with a straightforward MMIX assembler and simulator, and I know that both programs work reasonably well on three platforms. The pipeline meta-simulator is also up and running, but with a user interface that is not for beginners. (This is one of the most difficult programs I've ever written, and surely one of the most interesting, for people who have time to explore low-level details.)
Click here to download MMIXware: the simple simulator, assembler, test programs, and full documentation, plus the meta-simulator: mmix.tar.gz (Version of 18 September 2000)
PL360 was the first structured assembly language, created by N. Wirth for the IBM 360 and IBM 370, with several high-level control constructs. Syntactically it slightly resembles ALGOL 60. Its grammar is defined entirely by operator precedence. The classic reference is "PL/360, A Programming Language for the 360 Computers", N. Wirth, J ACM 15(1):37-74 (Jan 1968) (see Journal of the ACM -- 1968). The electronic text of this famous paper is available from the ACM Digital Library (pay per view or subscription), but there is a free electronic book written at Stanford: PL360 TEXTBOOK. See the homepage of the author: Guertin's Home Page
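To give a feel for what "defined entirely by operator precedence" means in practice, here is a toy operator-precedence (shift-reduce) evaluator for arithmetic expressions, written in Python. It is only an illustration of the general technique: the precedence table, the token set and the function names are all made up for this sketch and have nothing to do with the actual PL360 grammar or compiler, and malformed input is not handled.

```python
import operator
import re

# Hypothetical table: a higher number binds tighter; all operators are left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def tokenize(text):
    """Split the input into integer literals, operators and parentheses."""
    return re.findall(r"\d+|[-+*/()]", text)


def evaluate(text):
    """Evaluate an expression using only precedence comparisons between operators."""
    operands, operators = [], []

    def reduce_top():
        # Reduce: apply the operator on top of the stack to the two top operands.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append(APPLY[op](left, right))

    for token in tokenize(text):
        if token.isdigit():
            operands.append(int(token))
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators[-1] != "(":
                reduce_top()
            operators.pop()  # discard the matching "("
        else:
            # Reduce while the stacked operator binds at least as tightly,
            # then shift the incoming operator.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                reduce_top()
            operators.append(token)

    while operators:
        reduce_top()
    return operands[0]


print(evaluate("2+3*4"))    # 14
print(evaluate("(2+3)*4"))  # 20
print(evaluate("10-4-3"))   # 3
```

The whole "grammar" lives in the small `PRECEDENCE` table, which is why tables of this kind can be worked out by hand, and also why the method cannot describe languages whose structure is not driven by operators.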
PL360 from FOLDOC -- definition
Niklaus Wirth -- a Pioneer of Computer Science (pdf) by Gustav Pomberger, Hanspeter Mössenböck, Peter Rechenberg (Johannes Kepler University of Linz)
Niklaus Wirth is one of the most influential scientists of the early computer age. His ideas and especially his programming languages have shaped generations of programmers worldwide. This paper tries to acknowledge the scientific achievements of Niklaus Wirth and to honor him as a person. A small part of the paper is also devoted to Wirth's influence on computer science at the Johannes Kepler University of Linz.
Ask computer specialists anywhere, be it in Europe or America, in Asia or Australia, who the most important computer scientists in the world are, and you can bet that the name Niklaus Wirth will be on every list. Ask programming language experts about the person with the greatest influence on the development of programming languages, and they would agree on Niklaus Wirth. Ask students, teachers and computer hobbyists about the most important programming languages, and you can be sure that even today Pascal will be on every list. Finally, examine the literature of computer science in the elite circle of books with the greatest number of copies printed, the widest distribution, and the most translations, and you will find books by Niklaus Wirth, especially the prominent book on
These few examples show that professor Niklaus Wirth is one of the world's leading computer scientists. It is our honor and pleasure to acknowledge his achievements as a scientist, educator and person as well as his influence on research and teaching all over the world. We begin by reviewing Wirth's scientific career and by recognizing his most important works as well as the honors that the computer world has bestowed upon him. Wirth's work had significant impact on the development of computer science and the work of other researchers worldwide. Needless to say, he has also influenced research and teaching at the Johannes Kepler University of Linz, and part of this paper sketches this influence. Finally, we take particular pleasure in acknowledging Niklaus Wirth as a person and a contemporary.
Compilers for MVS
Ron Tatum gave me the "push" I needed to add PL360 to my collection of compilers. The archive contains the following files:
manual.prn - formatted print file containing PL360 manual
bcdval.pl3 - PL360 routines to convert numbers to/from real, complex, and double precision numbers as human readable strings in a format more familiar to FORTRAN and PL/1 programmers
bisearch.pl3 - PL360 routine implementing a binary search on a table with keys
pl360src.pl3 - PL360 source for the January, 1990 version of the Stanford PL360 compiler
runlib.pl3 - PL360 routine providing the functions of bcdval.pl3, bisearch.pl3, and shelsort.pl3 combined into a single module
shelsort.pl3 - PL360 routine implementing a simple shell sort algorithm
synproc.pl3 - PL360 routine implementing a simple precedence syntax processor
cmsinter.bal - Assembler language routines to interface the PL360 compiler to CMS
dosinter.bal - Assembler language routines to interface the PL360 compiler to DOS
dospl360.bal - Assembler language I/O routines for simple read/punch/write input/output from compiled PL360 programs to DOS
dosplio.bal - Assembler language I/O routines for tape/disk input/output from compiled PL360 programs to DOS
mtsinter.bal - Assembler language routines to interface the PL360 compiler to MTS
mvsinter.bal - Assembler language routines to interface the PL360 compiler to MVS
mvspl360.bal - Assembler language I/O routines for tape/disk input/output from compiled PL360 programs to MVS
mvsplio.bal - Assembler language I/O routines for tape/disk input/output from compiled PL360 programs to MVS
orvinter.bal - Assembler language routines to interface the PL360 compiler to Orvyl
orvrunti.bal - Assembler language routines to interface compiled PL360 programs to Orvyl
pl3link.jcl - JCL to link-edit compiled PL360 programs
proclib.upd - IEBUPDTE input to add procedure library members for PL360
grammar.bnf - a formal BNF type grammar for the PL360 language which may be processed by synproc.pl3
pl360n1.txt and pl360n2.txt - notes on Stanford modifications to PL360
I have converted each of the files from EBCDIC to ASCII and added CR/LF pairs so that the files may be opened and viewed more easily using the tools most people will have at hand. The archive also contains a couple of compiled object modules (suitable for input into the Link Editor or Loader) which I have left in their EBCDIC form. If you would prefer to obtain the original archive, use the link:
and download pl360.tar.gz from the public directory.
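For anyone who does fetch the original EBCDIC files, the conversion described above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the procedure actually used for the archive: it assumes fixed-length 80-byte card-image records and the `cp037` EBCDIC code page (the archive's real record format and code page are not stated here), and the function name is hypothetical.

```python
def ebcdic_records_to_text(data: bytes, record_length: int = 80,
                           codec: str = "cp037") -> str:
    """Decode fixed-length EBCDIC records and terminate each with CR/LF.

    Assumptions: records are `record_length` bytes long and padded with
    EBCDIC spaces; `codec` names the EBCDIC code page of the source files.
    """
    text = data.decode(codec)
    records = [text[i:i + record_length].rstrip()
               for i in range(0, len(text), record_length)]
    return "".join(record + "\r\n" for record in records)


# Build two fake 80-byte card images to exercise the function.
cards = ["BEGIN", "END."]
sample = b"".join(card.ljust(80).encode("cp037") for card in cards)

converted = ebcdic_records_to_text(sample)
print(repr(converted))   # 'BEGIN\r\nEND.\r\n'
```

The compiled object modules in the archive are binary and must not be passed through a conversion like this one, which is presumably why they were left in their EBCDIC form.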
Terse is a programming tool that provides THE most compact assembler syntax for the x86 family!
However, it is evil proprietary software. It is said that there was a project for a free clone somewhere, that was abandoned after worthless pretenses that the syntax would be owned by the original author. Thus, if you're looking for a nifty programming project related to assembly hacking, I invite you to develop a terse-syntax frontend to NASM, if you like that syntax. As an interesting historic remark, on comp.compilers,
1999/07/11 19:36:51, the moderator wrote:
"About 30 years ago I used Niklaus Wirth's PL360, which was basically a S/360 assembler with Algol syntax and a little syntactic sugar like while loops that turned into the obvious branches. It really was an assembler, e.g., you had to write out your expressions with explicit assignments of values to registers, but it was nice. Wirth used it to write Algol W, a small fast Algol subset, which was a predecessor to Pascal. As is so often the case, Algol W was a significant improvement over many of its successors. -John"
PL_TDF is a language in the lineage of Wirth's PL360 and its later derivatives. The basic idea in PL360 was to give one an assembler in which one could express all of the order-code of the IBM 360 while still preserving the logical structure of the program using familiar programming constructs. If one had to produce a program at the code level, this approach was much preferable to writing "flat" assembly code using a traditional assembler, as anyone who has used both can testify.
In the TDF "machine" the problem is not lack of structure at its "assembly" level, but rather too much of it; one loses the sense of a TDF program because of its deeply nested structure. Also the naming conventions of TDF are designed to make them tractable to machine manipulation, rather than human reading and writing. However, the approach is basically the same. PL_TDF provides shorthand notations for the commonly occurring control structures and operations while still allowing one to use the standard TDF constructors which, in turn, may have shorthand notations for their parameters. The naming is always done by identifiers where the sort of the name is determined by its declaration, or by context.
The TDF derived from PL_TDF is guaranteed to be SORT correct; however, there is no SHAPE checking, so one can still make illegal TDF.
LINOLEUM Homepage - sample source code
Currently you cannot get much more from an IDE than from the combination of an orthodox file manager, a good programming editor and a Web browser (for reference materials). You can also use Microsoft Visual Studio as an assembler IDE. If you want something different you can try some of the following projects:
Iczelion's Win32 Assembly Homepage some links
Tasm IDE Homepage Dead
Version 1.3 development (09-04-2001)
Of course development will continue forever. To avoid doing useless work I'd like to know what you want to see in future versions. Please e-mail me your requests at the address below.
Version 1.2 released (09-04-2001)
More than a year of development! Huge changes? Erm.. no. I've been very busy and have done nothing about it for more than a half a year or so. But this new version is quite worth the waiting. More oriented at Win32 programs you can actually make Win32 programs in ideal mode (!). It's not complete (yet) but most simple programs are more than possible to make. Check it out in the download section. I will add more header files and tools for Win32 programming in the future (fingers crossed :-). Oh yeah: thanks everyone for clicking on my support link!
Version 1.1 released (05-02-2000)
http://www.vasm.org/ Integrated development environment for Assembly language:
VASM: Visual Assembler IDE is being designed from the ground up to simplify development in assembly. After searching the web, there is no integrated development environment available that can be used professionally, as a learning tool, or just for fun. There are a few attempts to create visual assemblers, but these projects have died or existing developers have lost interest in continuing the development. I don't blame them. It's a huge undertaking. VASM is being developed at this time with Borland Delphi for Windows simply because it's the best visual RAD tool out there. Period.
Softpanorama Tribute to Dmitry Gurtyak (1971-1998) [updated Dec.25, 1999]
[Dec. 10, 1999] Operating Systems The Boot Process -- a very good page. Please visit it
[Oct. 17, 1999] Windows Disassembler Disassemble Windows files - ZDNet Software Library by Eric Grass
Windows Disassembler is for program developers working in the Windows environment. Users can disassemble small Windows executables and dynamic link libraries. In addition, you can open a window and browse the source code of a program without having to write it to a file. It allows one to create the assembly language source code files if desired. This version disassembles all 486 instructions, assuming that the instructions are intended for 16-bit mode operation. It also includes good technical documentation.
[June 25, 1999] Assembly Language (x86) Resources by Michael Somos -- good
[June 7, 1999] assembly resources assembly, assembly language, assembler, machine language.
[April 18, 1999] All direct links to files in the Programmers Heaven - Assembler Zone removed at the request of the maintainer of the site... Also some files mentioned below are no longer online on this site. In such cases one can probably find them by direct search via www.filez.com or www.shareware.com. I will correct this later...
[Feb.12,1999] GeoCities: Computers & Technology, Programming, Assembly GeoAvenues
[ Nov.4,1998] Some new links
The 80x86 Assembly Pages by Jannes Faber. Useful info
Kibernetica9619 -- Dima Samsonov page
This document is intended to be a tutorial, showing how to write a simple assembly program in several UNIX operating systems on the IA32 (i386) platform. Included material may or may not be applicable to other hardware and/or software platforms. The document explains program layout, system call convention, and the build process. It accompanies the Linux Assembly HOWTO, which may be of interest to you as well, though it is more Linux specific.
v0.3, April 09, 2000
Dr. Dobbs Journal Articles -- nice collection of papers
Configuring MASM 6.11 on your home computer
MoonWare Home Page -- x86 assembly FAQ is maintained by Ray Moon and consists of 6 parts:
You can also download a zipped version.
Linux Assembly HOWTO at Caldera (and mirrors) about programming for Linux.
Lynn Larrow's Communications related FAQ List
Filip Gieszczykiewicz's FAQs List
The Computer Journal FAQ List
BIOS Interrupts / Device Drivers
All major archives (Simtel, Hobbes, Garbo, etc.) contain assembler language sections. Visit them first.
Other useful files from the Programmers Heaven. See also Programmers Heaven - Assembler Zone - SourceCode Filelist
|XLIB61.ZIP||262229||XLIB is a library of procedures which can greatly simplify protected mode programming under DOS. XLIB provides the simplest and most reliable method for accessing extended memory from real mode languages. A tutorial on protected mode is included. XLIB procedures handle mode switching, extended memory, memory-mapped IO, interrupts, and files. XLIB also handles CPU exceptions and performs debugging functions. XLIB operates under DPMI, VCPI, XMS and clean configurations. Both Microsoft and Borland formats are included.|
|ASMFILES.ZIP||8948||Arithmetic assembly sources for X86|
|ASMXMPLE.ZIP||5056||Assembler example for PC 8086|
|ASPIPRG.ZIP||24251||Code to assist writing code to the Advanced SCSI Programming Interface (ASPI). Support for programmers from Adaptec.|
|ASPIPROG.ZIP||25260||Some pc-SCSI programming files|
|CMCRC10.ZIP||15190||Compute CRC-16 and CRC-32. Written in 386 and 8088 ASM for maximum performance. W/full source code.|
|CMDSRC.ZIP||63533||Source To A Good Command-line Editor|
|GETSECT.ZIP||1922||Absolute Disk Sector Reader|
|KERNEL.ZIP||16487||Real-time OS kernel (asm)|
|PT.ZIP||1753||Shows Use Of Return & Call Functions|
|RAMSPY.ZIP||3363||Code To View Memory, Whats Left, Address|
|RANDOMGN.ZIP||6602||Asm Source To Generate Random Number|
|STRUCA86.ZIP||3536||Macros for structured programming in A86|
|TBONES07.ZIP||35343||Skeletal ASM programs for programming TSRs|
|ASM_GOOD.ZIP||117103||A Collection of Asm Sources: Memory, Interrupts, Video & More|
Comp.compilers Re XPL Language
From: firstname.lastname@example.org (Steve Meyer) Newsgroups: comp.compilers Date: 27 Aug 2000 22:27:57 -0400 Organization: Compilers Central References: 00-06-118 00-07-016 00-07-075 00-08-018 00-08-028 00-08-055 00-08-083 Keywords: history
I am not sure it makes sense to continue this historical discussion, but I think there is a lot more to the story. The roots of modern computing lie in this story. For example, although both PL360 and XPL were available, why did Professor Knuth use assembler in his Art of Programming books? Also, these original languages (and the Bell Labs counterparts) arose in academic (School of Letters and Science) computer science departments, but now computing is studied in EE departments.
On 13 Aug 2000 19:10:55 -0400, Duane Sand <email@example.com> wrote:
>Steve Meyer wrote in message 00-08-055...
>>>>>: Peter Flass <firstname.lastname@example.org> wrote:
>>>>>: > XPL, developed in the 1970's was one of the earliest "compiler compilers", was widely ported, and was the basis for a number of other languages such as the PL/M family.
>>I think PL/M and XPL came from different worlds that did not
>>communicate. I think people saw XPL as too high level. I think PL/M
>>came from other system level languages such as PL/360 (?). My
>>recollection may not be right.
>Niklaus Wirth developed PL360 as an alternative to writing IBM360
>assembly code directly. It was a quick one-person project. The
>parser used "operator precedence" techniques which predated practical
>LR methods. The tables could be worked out by hand in no time but the
>method couldn't handle BNFs of most languages. It was quite low
>level, mapping infix syntactic forms directly to single 360
Parsing may have gotten tenure for lots of professors but the most advanced programming language areas such as HDLs (hardware description languages) now use the "predated" operator precedence methods. Also Professor Wirth's languages have remained at the forefront of academic programming languages.
>instructions without any optimizations. The PL360 paper inspired lots
>of people to develop their own small languages.
I think PL360 was very popular within IBM and among the back then "modernist" movement away from assembly language. I know it was very popular at SLAC.
>McKeeman etc developed XPL on 360 as a tidy subset of PL/I that could
>be implemented by a few people and be useful in coding biggish things,
>including the compiler and parser generator. The parser was initially
>based on their extensions to operator precedence, which relaxed BNF
>restrictions but required use of a parser generator tool and was still
>limited compared to LR. XPL was "high level" only in having built-in
>a varying-length string data type supported by a garbage collector.
>There were no struct types.
I think it was hard back then to differentiate XPL from McKeeman's advocacy of the Burroughs B5500 style stack machine research program.
>Univ of Washington ported XPL onto SDS/Xerox systems that were like
>360 but with one instruction format.
>UW graduate Gary Kildall developed Intel's first programming tools for
>the 8008 and 8080, in trade for a very early portable computer: an
>8008 without keyboard or monitor, installed in a briefcase. Kildall
>used these (plus a floppy drive adapted by UW grad John Torode) to
>develop CP/M, the precursor to MS DOS. The Intel tools included an
>assembler and PL/M, both coded in Fortran. PL/M was inspired by the
>example of PL360 and the implementation methods of XPL. Kildall left
>before UW's XPL project but was likely very aware of it.
>PL/M's level was limited by the 8008's near inability to support proc
>calls. The first micro language to see significant use was Basic,
>implemented by assembler-coded interpreters. Implementing real
>applications in real compiled languages required later chips with
>nicer instruction sets, eg 8088 (gag) and M6800.
As the Z80 showed, the 8088 was only one index register away from being a real computer.
The historical question, I think, is why there was so little communication between the currently most popular BCPL, B, C, C++ research program and the Stanford/Silicon Valley research program.
Just my two cents.
Steve Meyer Phone: (415) 296-7017
Pragmatic C Software Corp. Fax: (415) 296-0946
220 Montgomery St., Suite 925 email: email@example.com
San Francisco, CA 94104
The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: August 05, 2013