
Scriptorama: A Slightly Skeptical View on Scripting Languages


This is the central page of the Softpanorama website, because I am strongly convinced that the development of scripting languages, not the replication of the efforts of the BSD group undertaken by Stallman and Torvalds, is the central part of open source. See Scripting languages as VHLL for more details.

 
Ordinarily technology changes fast. But programming languages are different: programming languages are not just technology, but what programmers think in.

They're half technology and half religion. And so the median language, meaning whatever language the median programmer uses, moves as slow as an iceberg.

Paul Graham: Beating the Averages

Libraries are more important than the language.

Donald Knuth


Introduction

A fruitful way to think about language development is to consider it to be a special type of theory building. Peter Naur suggested that programming in general is a theory-building activity in his 1985 paper "Programming as Theory Building". But the idea is especially applicable to compilers and interpreters. What Peter Naur failed to understand was that the design of programming languages has religious overtones and sometimes represents an activity which is pretty close to the process of creating a new, obscure cult ;-). Clueless academics publishing junk papers at obscure conferences are the high priests of the church of programming languages. Some, like Niklaus Wirth and Edsger W. Dijkstra, (temporarily) reached a status close to that of (false) prophets :-).

On a deep conceptual level, building a new language is a human way of solving complex problems. That means that compiler construction is probably the most underappreciated paradigm of programming large systems, much more so than the greatly oversold object-oriented programming, whose benefits are greatly overstated.

For users, programming languages distinctly have religious aspects, so decisions about which language to use are often far from rational and are mainly cultural. Indoctrination at the university plays a very important role; recently universities were instrumental in making Java the new Cobol.

The second important observation about programming languages is that the language per se is just a tiny part of what can be called the language programming environment. The latter includes libraries, IDEs, books, the level of adoption at universities, popular and important applications written in the language, the level of support, the key players that back the language on major platforms such as Windows and Linux, and other similar things.

A mediocre language with a good programming environment can give a run for the money to languages of superior design that come "naked". This is the story behind the success of Java and PHP. A critical application is also very important, and this is the story of the success of PHP, which is nothing but a bastardized derivative of Perl (with all the most interesting Perl features surgically removed ;-) adapted to the creation of dynamic web sites using the so-called LAMP stack.

Progress in programming languages has been very uneven and contains several setbacks. Currently this progress is mainly limited to the development of so-called scripting languages. The field of traditional high-level languages has been stagnant for decades. From 2000 to 2017 we observed the huge success of JavaScript; Python encroached on Perl's territory (including genomics/bioinformatics), and R in turn started squeezing Python in several areas. At the same time Ruby, despite initial success, remained a niche language. PHP still holds its own in web-site design.

Some observations about scripting language design and usage

At the same time there are some mysterious, unanswered questions about the factors that help a particular scripting language increase its user base, or that cause it to fail in popularity.

Nothing succeeds like success

Those are difficult questions to answer without some way of classifying languages into different categories. Several such classifications exist. First of all, as with natural languages, the number of people who speak a given language is a tremendous force that can overcome any real or perceived deficiencies of the language. In programming languages, as in natural languages, nothing succeeds like success.

The second interesting category is the number of applications written in a particular language that became part of Linux or, at least, are included in the standard RHEL/Fedora/CentOS or Debian/Ubuntu repositories.

The third relevant category is the number and quality of books available for the particular language.

Complexity Curse

The history of programming languages raises interesting general questions about the limits of complexity of programming languages. There is strong historical evidence that a language with a simpler, or even simplistic, core (Basic, Pascal) has better chances of acquiring a high level of popularity.

The underlying fact here is probably that most programmers are at best mediocre, and such programmers tend, on an intuitive level, to avoid more complex, richer languages and prefer, say, Pascal to PL/1 and PHP to Perl. Or at least they avoid them at a particular phase of language development (C++ is not a simpler language than PL/1, but it was widely adopted because of progress in hardware, the availability of compilers and, not least, because it was associated with OO exactly at the time OO became mainstream fashion).

Complex, non-orthogonal languages can succeed only as a result of a long period of language development (which usually adds complexity; just compare Fortran IV with Fortran 90, or PHP 3 with PHP 5) from a smaller core. Attempts to ride some fashionable new trend by extending an existing popular language to the new "paradigm" have also proved relatively successful (OO programming in the case of C++, which is a superset of C).

Historically, few complex languages were successful (PL/1, Ada, Perl, C++), and even when they were successful, their success typically was temporary rather than permanent (PL/1, Ada, Perl). As Professor Wilkes noted (iee90):

Things move slowly in the computer language field but, over a sufficiently long period of time, it is possible to discern trends. In the 1970s, there was a vogue among system programmers for BCPL, a typeless language. This has now run its course, and system programmers appreciate some typing support. At the same time, they like a language with low level features that enable them to do things their way, rather than the compiler’s way, when they want to.

They continue to have a strong preference for a lean language. At present they tend to favor C in its various versions. For applications in which flexibility is important, Lisp may be said to have gained strength as a popular programming language.

Further progress is necessary in the direction of achieving modularity. No language has so far emerged which exploits objects in a fully satisfactory manner, although C++ goes a long way. ADA was progressive in this respect, but unfortunately it is in the process of collapsing under its own great weight.

ADA is an example of what can happen when an official attempt is made to orchestrate technical advances. After the experience with PL/1 and ALGOL 68, it should have been clear that the future did not lie with massively large languages.

I would direct the reader’s attention to Modula-3, a modest attempt to build on the appeal and success of Pascal and Modula-2 [12].

The complexity of the compiler/interpreter also matters, as it affects portability: this is one thing that probably doomed PL/1 (and later Ada), although these days a new language typically comes with an open-source compiler (or, in the case of scripting languages, an interpreter), so this is less of a problem.

Here is an interesting take on language design from the preface to The D Programming Language book (the D language failed to achieve any significant level of popularity):

Programming language design seeks power in simplicity and, when successful, begets beauty.

Choosing the trade-offs among contradictory requirements is a difficult task that requires good taste from the language designer as much as mastery of theoretical principles and of practical implementation matters. Programming language design is software-engineering-complete.

D is a language that attempts to consistently do the right thing within the constraints it chose: system-level access to computing resources, high performance, and syntactic similarity with C-derived languages. In trying to do the right thing, D sometimes stays with tradition and does what other languages do, and other times it breaks tradition with a fresh, innovative solution. On occasion that meant revisiting the very constraints that D ostensibly embraced. For example, large program fragments or indeed entire programs can be written in a well-defined memory-safe subset of D, which entails giving away a small amount of system-level access for a large gain in program debuggability.

You may be interested in D if the following values are important to you:

The role of fashion

At the initial, most difficult stage of language development, the language should solve an important problem that is inadequately solved by currently popular languages. But at the same time the language has few chances to succeed unless it fits perfectly into the current software fashion. This "fashion factor" is probably as important as several other factors combined, with the notable exception of the "language sponsor" factor. The latter can make or break a language.

As in women's dress, fashion rules in language design, and with time this trend has become more and more pronounced. A new language should represent the currently fashionable trend. For example, OO programming was a visiting card into the world of "big, successful languages" since probably the early 90s (C++, Java, Python). Before that, "structured programming" and "verification" (Pascal, Modula) played a similar role.

Programming environment and the role of "powerful sponsor" in language success

PL/1, Java, C#, Ada, Python are languages that had powerful sponsors. Pascal, Basic, Forth, partially Perl (O'Reilly was a sponsor for a short period of time)  are examples of the languages that had no such sponsor during the initial period of development.  C and C++ are somewhere in between.

But the language itself is not enough. Any language now needs a "programming environment" which consists of a set of libraries, a debugger and other tools (a make tool, lint, a pretty-printer, etc.). The set of "standard" libraries and the debugger are probably the two most important elements. They cost a lot of time (or money) to develop, and here the role of a powerful sponsor is difficult to overestimate.

While this is not a necessary condition for becoming popular, it really helps: other things being equal, the weight of the language's sponsor does matter. For example Java, being a weak, inconsistent language (C-- with garbage collection and OO), was pushed down people's throats on the strength of marketing and the huge amount of money spent on creating the Java programming environment.

The same was partially true for C# and Python. That's why Python, despite its "non-Unix" origin, is a more viable scripting language now than, say, Perl (which is better integrated with Unix and has pretty innovative, for scripting languages, support of pointers and regular expressions), or Ruby (which has had support for coroutines from day one, not as a "bolted on" feature as in Python).

As in political campaigns, negative advertising also matters. For example, Perl suffered greatly from the smear comparing programs in it to "white noise", and then from the withdrawal of O'Reilly from the role of sponsor of the language (although it continues to milk its Perl book publishing franchise ;-)

People proved to be pretty gullible, and in this sense language marketing is not that different from the marketing of women's clothing :-)

Language level and success

One very important classification of programming languages is based on the so-called level of the language. Essentially, once there is at least one successful language at a given level, the success of other languages at the same level becomes more problematic. Chances of success are higher for languages whose level is even slightly higher than that of their successful predecessors.

The level of the language can informally be described as the number of statements (or, more correctly, the number of lexical units (tokens)) needed to write a solution to a particular problem in one language versus another: the fewer tokens a typical solution needs, the higher the level of the language.
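As a rough illustration, here is a hypothetical word-frequency counter written in Perl; an equivalent C program, with its explicit memory management and hand-rolled hash table, would need several times as many tokens for the same job.

#!/usr/bin/perl
# Count word frequencies on standard input and print the ten most common words.
use strict;
use warnings;

my %count;
while ( my $line = <STDIN> ) {
    $count{ lc $_ }++ for $line =~ /\w+/g;    # the built-in regex engine does the tokenizing
}

my @top = ( sort { $count{$b} <=> $count{$a} } keys %count )[ 0 .. 9 ];
printf "%-15s %d\n", $_, $count{$_} for grep { defined } @top;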

 "Nanny languages" vs "Sharp razor" languages

Some people distinguish between "nanny languages" and "sharp razor" languages. The latter do not attempt to protect the user from his errors, while the former usually go too far... The right compromise is extremely difficult to find.

For example, I consider the explicit availability of pointers to be an important feature of a language that greatly increases its expressive power and far outweighs the risk of errors in the hands of unskilled practitioners. In other words, attempts to make a language "safer" often misfire.
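A small sketch of what this buys you in practice: Perl references (its tamer flavor of pointers) turn nested data structures and shared sub-structures into a few lines of code. The hostnames below are, of course, made up.

#!/usr/bin/perl
# Hash of hashes of arrays, built with references and autovivification.
use strict;
use warnings;

my %net;
push @{ $net{web1}{services} }, 'httpd', 'sshd';
push @{ $net{db1}{services} },  'postgres', 'sshd';
$net{db1}{peer} = $net{web1};    # two names, one shared underlying structure

for my $host (sort keys %net) {
    print "$host runs: @{ $net{$host}{services} }\n";
}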

Expressive style of the languages

Another useful typology is based on the expressive style of the language: procedural, object-oriented, functional, and so on.

Those categories are not pure and somewhat overlap. For example, it's possible to program in an object-oriented style in C, or even in assembler. Some scripting languages like Perl have built-in regular expression engines that are part of the language, so they have a functional component despite being procedural. Some relatively low-level languages (Algol-style languages) implement garbage collection; a good example is Java. There are also scripting languages that compile into a common language runtime designed for high-level languages; for example, IronPython compiles into .NET.
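For instance, a hypothetical log filter in Perl mixes a procedural loop, the functional flavor of map and grep, and the built-in regular expression engine in a handful of lines:

#!/usr/bin/perl
# Procedural loop, functional map/grep, and first-class regular expressions together.
use strict;
use warnings;

my @errors;
while (my $line = <>) {
    next unless $line =~ /\b(?:error|fail(?:ed|ure)?)\b/i;   # regex is part of the language
    push @errors, $line;
}

# Functional style: strip trailing whitespace, then keep only the short lines.
my @short = grep { length($_) < 120 } map { s/\s+$//r } @errors;
print "$_\n" for @short;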

Weak correlation between quality of design and popularity

The popularity of programming languages is not strongly connected to their quality. Some languages that look like a collection of language designer blunders (PHP, Java) became quite popular. Java became the new Cobol, and PHP dominates the construction of dynamic Web sites. The dominant technology for such Web sites is often called LAMP, which means Linux, Apache, MySQL, PHP. Being a highly simplified but badly constructed subset of Perl (a kind of new Basic for dynamic Web site construction), PHP provides the most depressing experience. I was unpleasantly surprised when I learned that the Wikipedia engine was rewritten from Perl to PHP some time ago, but this fact illustrates the trend quite well.

So language design quality has little to do with a language's success in the marketplace. Simpler languages have wider appeal, as the success of PHP (which at the beginning came at the expense of Perl) suggests. In addition, much depends on whether the language has a powerful sponsor, as was the case with Java (Sun and IBM) as well as Python (Google).

Progress in programming languages has been very uneven and contains several setbacks, like Java. Currently this progress is usually associated with scripting languages. The history of programming languages raises interesting general questions about the "laws" of programming language design. First let's reproduce several notable quotes:

  1. Knuth law of optimization: "Premature optimization is the root of all evil (or at least most of it) in programming." - Donald Knuth
  2. "Greenspun's Tenth Rule of Programming: any sufficiently complicated C or Fortran program contains an ad hoc informally-specified bug-ridden slow implementation of half of Common Lisp." - Phil Greenspun
  3. "The key to performance is elegance, not battalions of special cases." - Jon Bentley and Doug McIlroy
  4. "Some may say Ruby is a bad rip-off of Lisp or Smalltalk, and I admit that. But it is nicer to ordinary people." - Matz, LL2
  5. Most papers in computer science describe how their author learned what someone else already knew. - Peter Landin
  6. "The only way to learn a new programming language is by writing programs in it." - Kernighan and Ritchie
  7. "If I had a nickel for every time I've written "for (i = 0; i < N; i++)" in C, I'd be a millionaire." - Mike Vanier
  8. "Language designers are not intellectuals. They're not as interested in thinking as you might hope. They just want to get a language done and start using it." - Dave Moon
  9. "Don't worry about what anybody else is going to do. The best way to predict the future is to invent it." - Alan Kay
  10. "Programs must be written for people to read, and only incidentally for machines to execute." - Abelson & Sussman, SICP, preface to the first edition

Please note that it is one thing to read a language manual and appreciate how good the concepts are, and quite another to bet your project on a new, unproven language without good debuggers, manuals and, very importantly, libraries. The debugger is very important, but standard libraries are crucial: they represent the factor that makes or breaks new languages.

In this sense languages are much like cars. For many people a car is the thing they use to get to work and to the shopping mall, and they are not very interested in whether the engine is inline or V-type, or whether the transmission uses fuzzy logic. What they care about is safety, reliability, mileage, insurance and the size of the trunk. In this sense "worse is better" is very true. I have already mentioned the importance of the debugger. The other important criterion is the quality and availability of libraries. Actually, libraries account for about 80% of the usability of a language; moreover, in a sense libraries are more important than the language...

The popular belief that scripting is an "unsafe", "second rate" or "prototype-only" solution is completely wrong. If a project has died, it does not matter what the implementation language was; for a successful project on a tough schedule, a scripting language (especially in a dual scripting-language-plus-C combination, for example Tcl+C) is an optimal blend for a large class of tasks. Such an approach helps to separate architectural decisions from implementation details much better than any OO model does, as the sketch below tries to show.
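A minimal sketch of the dual-language approach: the compiled number-cruncher crunch and its one-"key value"-per-line output format are hypothetical; the point is that the scripting layer owns the architecture while the C program owns the inner loop.

#!/usr/bin/perl
# Glue layer: find the input files, farm the heavy lifting out to a compiled C tool,
# then aggregate and report. The architecture lives here; the hot loop lives in C.
use strict;
use warnings;

my $cruncher = './crunch';                 # hypothetical C binary doing the CPU-bound work
my @inputs   = glob('data/*.raw');
die "no input files\n" unless @inputs;

my %total;
for my $file (@inputs) {
    open my $out, '-|', $cruncher, $file
        or die "cannot run $cruncher: $!\n";
    while (<$out>) {                       # assume "key value" lines from the C tool
        my ($key, $value) = split;
        $total{$key} += $value;
    }
    close $out or warn "$cruncher failed on $file\n";
}

printf "%-12s %12.2f\n", $_, $total{$_} for sort keys %total;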

Moreover, even for tasks that handle a fair amount of computation and data (computationally intensive tasks), languages such as Python and Perl are often (but not always!) competitive with C++, C# and, especially, Java.



Programming Language Development Timeline

Here is a timeline of programming languages, modified from Byte (for the original, see BYTE.com, September 1995 / 20th Anniversary).

Forties: ca. 1946, 1949

Fifties: 1951, 1952, 1957, 1958, 1959

Sixties: 1960, 1962, 1963, 1964, 1965, 1966, 1967, 1969

Seventies: 1970, 1972, 1974, 1975, 1976, 1977, 1978, 1979

Eighties: 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989

Nineties: 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997

2000 and later: 2000, 2006, 2007, 2008, 2009, 2011, 2017

Special note on Scripting languages

Scripting helps to avoid the OO trap that is pushed by
  "a horde of practically illiterate researchers
publishing crap papers in junk conferences."

Despite the fact that scripting languages are a really important computer science phenomenon, they are usually happily ignored in university curricula. Students are instead indoctrinated (or, in less politically correct terms, "brainwashed") in Java and OO programming ;-)

This site tries to give scripting languages proper emphasis and promotes them as an alternative to the mainstream reliance on the "Java as the new Cobol" approach to software development. Please read my introduction to the topic, which was recently converted into the article A Slightly Skeptical View on Scripting Languages.

The tragedy of the scripting language designer is that there is no way to overestimate the level of abuse of any feature of the language. Half of all programmers are by definition below average, and it is this half that matters most in an enterprise environment. In a way, the higher the level of the programmer, the less relevant the limitations of the language become. That's why statements like "Perl is poorly suited to large project development" are plain silly. With proper discipline it is perfectly suitable, and programmers can be more productive with Perl than with Java. The real question is "What is the quality and size of the team?"

Scripting is part of the Unix cultural tradition, and Unix was the initial development platform for most mainstream scripting languages, with the exception of REXX. But they are portable, and all of them can now be used on Windows and other OSes.


Different scripting languages provide different levels of integration with the base OS API (for example, Unix or Windows). For example, IronPython compiles into .NET and provides a pretty high level of integration with Windows. The same is true of Perl and Unix: almost all Unix system calls are available directly from Perl. Moreover, Perl integrates most of the Unix API in a very natural way, making it a perfect replacement for the shell when coding complex scripts. It also has a very good debugger; the latter is a weak point of shells like bash and ksh93.
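As a small illustration of that integration, the sketch below uses nothing but Perl built-ins that map directly onto Unix calls (lstat, getpwuid, readlink); no modules are required.

#!/usr/bin/perl
# Print owner, size and symlink target for each path on the command line,
# using Perl built-ins that wrap the corresponding Unix system calls.
use strict;
use warnings;

for my $path (@ARGV) {
    my @st = lstat $path or do { warn "$path: $!\n"; next };
    my ($uid, $size) = @st[4, 7];
    my $owner = getpwuid($uid) // $uid;            # getpwuid(3) via a built-in
    my $extra = -l _ ? ' -> ' . readlink($path)    # readlink(2)
                     : '';
    printf "%-30s %-10s %10d bytes%s\n", $path, $owner, $size, $extra;
}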

Unix proved that treating everything like a file is a powerful OS paradigm. In a similar way scripting languages proved that "everything is a string" is also an extremely powerful programming paradigm.
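A tiny sketch of what that paradigm means in practice (the df-style line below is made up): a value is a string until the moment arithmetic or a comparison needs a number, and it becomes a string again as soon as it is printed.

#!/usr/bin/perl
# "Everything is a string": scalars convert between string and number on demand.
use strict;
use warnings;

my $df_line = "/dev/sda1 95%";              # pretend this came from df output
my ($fs, $pct) = split ' ', $df_line;
$pct =~ s/%$//;                              # still a string...
warn "$fs is almost full\n" if $pct > 90;    # ...used as a number here

my $next = $pct + 1;                         # plain arithmetic on the "string" 95
print "at the current rate: $next%\n";       # and back into a string for output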


There are also several separate pages devoted to scripting in different applications. The main emphasis is on shells and Perl. Right now I am trying to convert my old Perl lecture notes into an e-book: Nikolai Bezroukov, Introduction to Perl for Unix System Administrators.

Along with pages devoted to major scripting languages, this site has many pages devoted to scripting in different applications. There are more than a dozen "Perl/scripting tools for a particular area" pages. The most well-developed and up-to-date pages of this set are probably Shells and Perl. This page's main purpose is to follow the changes in programming practice that can be called the "rise of scripting", as predicted in the famous John Ousterhout article Scripting: Higher Level Programming for the 21st Century in IEEE Computer (1998). In this brilliant paper he wrote:

...Scripting languages such as Perl and Tcl represent a very different style of programming than system programming languages such as C or Java. Scripting languages are designed for "gluing" applications; they use typeless approaches to achieve a higher level of programming and more rapid application development than system programming languages. Increases in computer speed and changes in the application mix are making scripting languages more and more important for applications of the future.

...Scripting languages and system programming languages are complementary, and most major computing platforms since the 1960's have provided both kinds of languages. The languages are typically used together in component frameworks, where components are created with system programming languages and glued together with scripting languages. However, several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces and component architectures, and the growth of the Internet, have greatly increased the applicability of scripting languages. These trends will continue over the next decade, with more and more new applications written entirely in scripting languages and system programming languages used primarily for creating components.

My e-book Portraits of Open Source Pioneers contains several chapters on scripting (most are in early draft stage) that expand on this topic. 

The reader must understand that the treatment of scripting languages in the press, and especially the academic press, is far from fair: entrenched academic interests often promote old or commercially supported paradigms until they retire, so a change of paradigm is often possible only with a change of generations. And people tend to live longer these days... Please also be aware that even respectable academic magazines like Communications of the ACM and IEEE Software often promote "cargo cult software engineering" such as the Capability Maturity Model (CMM).

Dr. Nikolai Bezroukov



Old News ;-)


[Sep 17, 2018] Which HTML5 tag should I use to mark up an author's name - Stack Overflow

Sep 17, 2018 | stackoverflow.com

Which HTML5 tag should I use to mark up an author's name?


Quang Van ,Sep 3, 2011 at 1:03

For example of a blog-post or article.
<article>
<h1>header</h1>
<time>09-02-2011</time>
<author>John</author>
My article....
</article>

The author tag doesn't exist though... So what is the commonly used HTML5 tag for authors? Thanks.

(If there isn't, shouldn't there be one?)

Joseph Marikle ,Sep 3, 2011 at 1:10

<cite> maybe? I don't know lol. :P Doesn't make very much of a difference in style though. – Joseph Marikle Sep 3 '11 at 1:10

jalgames ,Apr 23, 2014 at 12:53

It's not about style. Technically, you can use a <p> to create a heading just by increasing the font size. But search engines won't understand it like that. – jalgames Apr 23 '14 at 12:53

Andreas Rejbrand ,Nov 17, 2014 at 18:58

You are not allowed to use the time element like that. Since dd-mm-yyy isn't one of the recognised formats, you have to supply a machine-readable version (in one of the recognised formats) in a datetime attribute of the time element. See w3.org/TR/2014/REC-html5-20141028/Andreas Rejbrand Nov 17 '14 at 18:58

Dan Dascalescu ,Jun 13, 2016 at 0:23

There's a better answer now than the accepted (robertc's) one. – Dan Dascalescu Jun 13 '16 at 0:23

robertc ,Sep 3, 2011 at 1:13

HTML5 has an author link type :
<a href="http://johnsplace.com" rel="author">John</a>

The weakness here is that it needs to be on some sort of link, but if you have that there's a long discussion of alternatives here . If you don't have a link, then just use a class attribute, that's what it's for:

<span class="author">John</span>

robertc ,Sep 3, 2011 at 1:21

@Quang Yes, I think a link type without an actual link would defeat the purpose of trying to mark it up semantically. – robertc Sep 3 '11 at 1:21

Paul D. Waite ,Sep 5, 2011 at 7:19

@Quang: the rel attribute is there to describe what the link's destination is. If the link has no destination, rel is meaningless. – Paul D. Waite Sep 5 '11 at 7:19

Michael Mior ,Jan 19, 2013 at 14:36

You might also want to look at schema.org for ways of expressing this type of information. – Michael Mior Jan 19 '13 at 14:36

robertc ,Jan 1, 2014 at 15:09

@jdh8 Because John is not the title of a workrobertc Jan 1 '14 at 15:09

Dan Dascalescu ,Jun 13, 2016 at 0:19

This answer just isn't the best any longer. Google no longer supports rel="author" , and as ryanve and Jason mention, the address tag was explicitly design for expressing authorship as well. – Dan Dascalescu Jun 13 '16 at 0:19

ryanve ,Sep 3, 2011 at 18:28

Both rel="author" and <address> are designed for this exact purpose. Both are supported in HTML5. The spec tells us that rel="author" can be used on <link> <a> , and <area> elements. Google also recommends its usage . Combining use of <address> and rel="author" seems optimal. HTML5 best affords wrapping <article> headlines and bylines info in a <header> like so:
<article>
    <header>
        <h1 class="headline">Headline</h1>
        <div class="byline">
            <address class="author">By <a rel="author" href="/author/john-doe">John Doe</a></address> 
            on <time pubdate datetime="2011-08-28" title="August 28th, 2011">8/28/11</time>
        </div>
    </header>

    <div class="article-content">
    ...
    </div>
</article>

If you want to add the hcard microformat , then I would do so like this:

<article>
    <header>
        <h1 class="headline">Headline</h1>
        <div class="byline vcard">
            <address class="author">By <a rel="author" class="url fn n" href="/author/john-doe">John Doe</a></address> 
            on <time pubdate datetime="2011-08-28" title="August 28th, 2011">on 8/28/11</time>
        </div>
    </header>

    <div class="article-content">
    ...
    </div>
</article>

aridlehoover ,Jun 24, 2013 at 18:12

Shouldn't "By " precede the <address> tag? It's not actually a part of the address. – aridlehoover Jun 24 '13 at 18:12

ryanve ,Jun 24, 2013 at 20:40

@aridlehoover Either seems correct according to whatwg.org/specs/web-apps/current-work/multipage/ - If outside, use .byline address { display:inline; font-style:inherit } to override the block default in browsers. – ryanve Jun 24 '13 at 20:40

ryanve ,Jun 24, 2013 at 20:46

@aridlehoover I also think that <dl> is viable. See the byline markup in the source of demo.actiontheme.com/sample-page for example. – ryanve Jun 24 '13 at 20:46

Paul Kozlovitch ,Apr 16, 2015 at 11:36

Since the pubdate attribute is gone from both the WHATWG and W3C specs, as Bruce Lawson writes here , I suggest you to remove it from your answer. – Paul Kozlovitch Apr 16 '15 at 11:36

Nathan Hornby ,May 28, 2015 at 12:41

This should really be the accepted answer. – Nathan Hornby May 28 '15 at 12:41

Jason Gennaro ,Sep 3, 2011 at 2:18

According to the HTML5 spec, you probably want address .

The address element represents the contact information for its nearest article or body element ancestor.

The spec further references address in respect to authors here

Under 4.4.4

Author information associated with an article element (q.v. the address element) does not apply to nested article elements.

Under 4.4.9

Contact information for the author or editor of a section belongs in an address element, possibly itself inside a footer.

All of which makes it seems that address is the best tag for this info.

That said, you could also give your address a rel or class of author .

<address class="author">Jason Gennaro</address>

Read more: http://dev.w3.org/html5/spec/sections.html#the-address-element

Quang Van ,Feb 29, 2012 at 9:24

Thanks Jason, do you know what "q.v." means? Under >4.4.4 >Author information associated with an article element (q.v. the address element) does not apply to nested article elements. – Quang Van Feb 29 '12 at 9:24

pageman ,Feb 10, 2014 at 5:44

@QuangVan - (wait, your initials are ... q.v. hmm) - q.v. means "quod vide" or "on this (matter) go see" - son on the matter of "q.v." go see english.stackexchange.com/questions/25252/ (q.v.) haha – pageman Feb 10 '14 at 5:44

Jason Gennaro ,Feb 10, 2014 at 13:19

@pageman well done rocking the Latin! – Jason Gennaro Feb 10 '14 at 13:19

pageman ,Feb 11, 2014 at 1:10

@JasonGennaro haha nanos gigantum humeris insidentes! – pageman Feb 11 '14 at 1:10

Jason Gennaro ,Feb 11, 2014 at 13:16

@pageman aren't we all. – Jason Gennaro Feb 11 '14 at 13:16

remyActual ,Sep 7, 2015 at 0:49

Google support for rel="author" is deprecated :

"Authorship markup is no longer supported in web search."

Use a Description List (Definition List in HTML 4.01) element.

From the HTML5 spec :

The dl element represents an association list consisting of zero or more name-value groups (a description list). A name-value group consists of one or more names (dt elements) followed by one or more values (dd elements), ignoring any nodes other than dt and dd elements. Within a single dl element, there should not be more than one dt element for each name.

Name-value groups may be terms and definitions, metadata topics and values, questions and answers, or any other groups of name-value data.

Authorship and other article meta information fits perfectly into this key:value pair structure:

An opinionated example:

<article>
  <header>
    <h1>Article Title</h1>
    <p class="subtitle">Subtitle</p>
    <dl class="dateline">
      <dt>Author:</dt>
      <dd>Remy Schrader</dd>
      <dt>All posts by author:</dt>
      <dd><a href="http://www.blog.net/authors/remy-schrader/">Link</a></dd>
      <dt>Contact:</dt>
      <dd><a mailto="remy@blog.net"><img src="email-sprite.png"></a></dd>
    </dl>
  </header>
  <section class="content">
    <!-- article content goes here -->
  </section>
</article>

As you can see when using the <dl> element for article meta information, we are free to wrap <address>, <a> and even <img> tags in <dt> and/or <dd> tags according to the nature of the content and its intended function.
The <dl>, <dt> and <dd> tags are free to do their job -- semantically -- conveying information about the parent <article>; <a>, <img> and <address> are similarly free to do their job -- again, semantically -- conveying information regarding where to find related content, non-verbal visual presentation, and contact details for authoritative parties, respectively.

steveax ,Sep 3, 2011 at 1:39

How about microdata :
<article>
<h1>header</h1>
<time>09-02-2011</time>
<div id="john" itemscope itemtype="http://microformats.org/profile/hcard">
 <h2 itemprop="fn">
  <span itemprop="n" itemscope>
   <span itemprop="given-name">John</span>
  </span>
 </h2>
</div>
My article....
</article>

Raphael ,Feb 26, 2016 at 11:29

You can use
<meta name="author" content="John Doe">

in the header as per the HTML5 specification .


If you were including contact details for the author, then the <address> tag is appropriate:

But if it's literally just the author's name, there isn't a specific tag for that. HTML doesn't include much related to people.

[Aug 25, 2018] What are the various link types in Windows? How do I create them?

Notable quotes:
"... SeCreateSymbolicLinkPrivilege ..."
"... shell links ..."
"... directory junctions ..."
"... indirectly accessed ..."
"... junction point ..."
"... on the local drives ..."
"... will be gone ..."
"... volume mount point ..."
"... Why should you even be viewing anything but your personal files on a daily basis? ..."
"... and that folder ..."
"... one-to-many relationship ..."
"... My brain explodes... ..."
Aug 25, 2018 | superuser.com

Cookie ,Oct 18, 2011 at 17:43

Is it possible to link two files or folders without having a different extension under Windows?

I'm looking for functionality equivalent to the soft and hard links in Unix.

barlop ,Jun 11, 2016 at 17:30

this is related: superuser.com/questions/343074 – barlop Jun 11 '16 at 17:30

barlop ,Jun 11, 2016 at 21:07

great article here cects.com/ watch out for juctions pre w7 – barlop Jun 11 '16 at 21:07

grawity ,Oct 18, 2011 at 18:00

Please note that the only unfortunate difference is that you need Administrator rights to create symbolic links. IE, you need an elevated prompt. (A workaround is the SeCreateSymbolicLinkPrivilege can be granted to normal Users via secpol.msc .)

Note in terminology: Windows shortcuts are not called "symlinks"; they are shell links , as they are simply files that the Windows Explorer shell treats specially.


Symlinks: How do I create them on NTFS file system?

Windows Vista and later versions support Unix-style symlinks on NTFS filesystems. Remember that they also follow the same path resolution – relative links are created relative to the link's location, not to the current directory. People often forget that. They can also be implemented using an absolute path; EG c:\windows\system32 instead of \system32 (which goes to a system32 directory connected to the link's location).
Symlinks are implemented using reparse points and generally have the same behavior as Unix symlinks.

For files you can execute:

mklink linkname targetpath

For directories you can execute:

mklink /d linkname targetpath

Hardlinks: How do I create them on NTFS file systems?

All versions of Windows NT support Unix-style hard links on NTFS filesystems. Using mklink on Vista and up:

mklink /h linkname targetpath

For Windows 2000 and XP, use fsutil .

fsutil hardlink create linkname targetpath

These also work the same way as Unix hard links – multiple file table entries point to the same inode .


Directory Junctions: How do I create them on NTFS file systems?

Windows 2000 and later support directory junctions on NTFS filesystems. They are different from symlinks in that they are always absolute and only point to directories, never to files.

mklink /j linkname targetpath

On versions which do not have mklink , download junction from Sysinternals:

junction linkname targetpath

Junctions are implemented using reparse points .


How can I mount a volume using a reparse point in Windows?

For completeness, on Windows 2000 and later , reparse points can also point to volumes , resulting in persistent Unix-style disk mounts :

mountvol mountpoint \\?\Volume{volumeguid}

Volume GUIDs are listed by mountvol ; they are static but only within the same machine.


Is there a way to do this in Windows Explorer?

Yes, you can use the shell extension Link Shell Extension which makes it very easy to make the links that have been described above. You can find the downloads at the bottom of the page .

The NTFS file system implemented in NT4, Windows 2000, Windows XP, Windows XP64, and Windows7 supports a facility known as hard links (referred to herein as Hardlinks ). Hardlinks provide the ability to keep a single copy of a file yet have it appear in multiple folders (directories). They can be created with the POSIX command ln included in the Windows Resource Kit, the fsutil command utility included in Windows XP or my command line ln.exe utility.

The extension allows the user to select one or many files or folders, then using the mouse, complete the creation of the required Links - Hardlinks, Junctions or Symbolic Links or in the case of folders to create Clones consisting of Hard or Symbolic Links. LSE is supported on all Windows versions that support NTFS version 5.0 or later, including Windows XP64 and Windows7. Hardlinks, Junctions and Symbolic Links are NOT supported on FAT file systems, and nor is the Cloning and Smart Copy process supported on FAT file systems.

The source can simply be picked using a right-click menu.

And depending on what you picked, you right-click on a destination folder and get a menu with options.

This makes it very easy to create links. For an extensive guide, read the LSE documentation .

Downloads can be found at the bottom of their page .

Relevant MSDN URLs:

Tom Wijsman ,Dec 31, 2011 at 3:42

In this answer I will attempt to outline what the different types of links in directory management are as well as why they are useful as well as when they could be used. When trying to achieve a certain organization on your file volumes, knowing the various different types as well as creating them is valuable knowledge.

For information on how a certain link can be made, refer to grawity 's answer .

What is a link?

A link is a relationship between two entities; in the context of directory management, a link can be seen as a relationship between the following two entities:

  1. Directory Table

    This table keeps track of the files and folders that reside in a specific folder.

    A directory table is a special type of file that represents a directory (also known as a folder). Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of last modification, the address of the first cluster of the file/directory's data and finally the size of the file/directory.

  2. Data Cluster

    More specifically, the first cluster of the file or directory.

    A cluster is the smallest logical amount of disk space that can be allocated to hold a file.

The special thing about this relationship is that it allows one to have only one data cluster but many links to that data cluster, this allows us to show data as being present in multiple locations. However, there are multiple ways to do this and each method of doing so has its own effects.

To see where this roots from, lets go back to the past...

What is a shell link and why is in not always sufficient?

Although it might not sound familiar, we all know this one! File shortcuts are undoubtedly the most frequently used way of linking files. These were found in some of the early versions of Windows 9x and have been there for a long while.

These allow you to quickly create a shortcut to any file or folder, they are more specifically made to store extra information along just the link like for example the working directory the file is executed in, the arguments to provide to the program as well as options like whether to maximize the program.

The downside to this approach of linking is exactly that: the extra information requires this type of link to have a data cluster of its own to contain that file. The problem then is not necessarily that it takes disk space, but rather that the link is indirectly accessed, as the data cluster first has to be requested before we get to the actual link. If the path referred to in the actual link is gone, the shell link will still exist.

If you were to operate on the file being referred to, you would actually first have to figure out in which directory the file is. You can't simply open the link in an editor as you would then be editing the .lnk file rather than the file being linked to. This locks out a lot of possible use cases for shell links.

How does a junction point link try to solve these problems?

A NTFS junction point allows one to create a symbolic link to a directory on the local drives , in such a way that it behaves just like a normal directory. So, you have one directory of files stored on your disk but can access it from multiple locations.

When removing the junction point, the original directory remains. When removing the original directory, the junction point remains. It is very costly to enumerate the disk to check for junction points that have to be deleted. This is a downside as a result of its implementation.

The NTFS junction point is implemented using NTFS reparse points , which are NTFS file system objects introduced with Windows 2000.

An NTFS reparse point is a type of NTFS file system object. Reparse points provide a way to extend the NTFS filesystem by adding extra information to the directory entry, so a file system filter can interpret how the operating system will treat the data. This allows the creation of junction points, NTFS symbolic links and volume mount points, and is a key feature to Windows 2000's Hierarchical Storage System.

That's right, the invention of the reparse point allows us to do more sophisticated ways of linking.

The NTFS junction point is a soft link , which means that it just links to the name of the file. This means that whenever the link is deleted the original data stays intact ; but, whenever the original data is deleted the original data will be gone .

Can I also soft link files? Are there symbolic links?

Yes, when Windows Vista came around they decided to extend the functionality of the NTFS file system object(s) by providing the NTFS symbolic link , which is a soft link that acts in the same way as the NTFS junction point. But can be applied to file and directories.

They again share the same deletion behavior, in some use cases this can be a pain for files as you don't want to have an useless copy of a file hanging around. This is why also the notion of hard links have been implemented.

What is a hard link and how does it behave as opposed to soft links?

Hard links are not NTFS file system objects, but they are instead a link to a file (in detail, they refer to the MFT entry as that stores extra information about the actual file). The MFT entry has a field that remembers the amounts of time a file is being hard linked to. The data will still be accessible as long as at least one link that points to it still exists.

So, the data does no longer depend on a single MFT entry to exist . As long as there is a hard link around, the data will survive. This prevents accidental deleting for cases where one does not want to remember where the original file was.

You could for example make a folder with "movies I still have to watch" as well as a folder "movies that I take on vacation" as well as a folder "favorite movies". Movies that are none of these will be properly deleted while movies that are any of these will continue to exist even when you have watched a movie.

What is a volume mount point link for?

Some IT or business people might dislike having to remember or type the different drive letters their system has. What does M: really mean anyway? Was it Music? Movies? Models? Maps?

Microsoft has made efforts over the years to try to migrate users away from working in drive C: toward working in their user folder. I could undoubtedly say that the users with UAC and permission problems are those that do not follow these guidelines, but doesn't that make them wonder:

Why should you even be viewing anything but your personal files on a daily basis?

Volume mount points are the professional IT way of not being limited by drive letters as well as having a directory structure that makes sense for them, but...

My files are in different places, can I use links to get them together?

In Windows 7, Libraries were introduced exactly for this purpose. Done with music files that are located in this folder, and that folder and that folder . From a lower level of view, a library can be viewed as multiple links. They are again implemented as a file system object that can contain multiple references. It is in essence a one-to-many relationship ...

My brain explodes... Can you summarize when to use them?

Medinoc ,Sep 29, 2014 at 16:28

Libraries are shell-level like shortcut links, right? – Medinoc Sep 29 '14 at 16:28

Tom Wijsman ,Sep 29, 2014 at 21:53

@Medinoc: No, they aggregate the content of multiple locations. – Tom Wijsman Sep 29 '14 at 21:53

Medinoc ,Sep 30, 2014 at 20:52

But do they do so at filesystem level in such way that, say, cmd.exe and dir can list the aggregated content (in which case, where in the file system are they, I can't find it), or do they only aggregate at shell level, where only Windows Explorer and file dialogs can show them? I was under the impression it was the latter, but your "No" challenges this unless I wrote my question wrong (I meant to say "Libraries are shell-level like shortcut links are , right?" ). – Medinoc Sep 30 '14 at 20:52

Tom Wijsman ,Oct 1, 2014 at 9:32

@Medinoc: They are files at C:\Users\{User}\AppData\Roaming\Microsoft\Windows\Libraries . – Tom Wijsman Oct 1 '14 at 9:32

Tom Wijsman ,Apr 25, 2015 at 10:23

@Pacerier: Windows uses the old location system, where you can for example move a music folder around from its properties. Libraries are a new addition, which the OS itself barely uses as a result. Therefore I doubt if anything would break; as they are intended solely for display purposes, ... – Tom Wijsman Apr 25 '15 at 10:23

GeminiDomino ,Oct 18, 2011 at 17:49

If you're on Windows Vista or later, and have admin rights, you might check out the mklink command (it's a command line tool). I'm not sure how symlink-y it actually is since windows gives it the little arrow icon it puts on shortcuts, but a quick notepad++ test on a text file suggests it might work for what you're looking for.

You can run mklink with no arguments for a quick usage guide.

I hope that helps.

jcrawfordor ,Oct 18, 2011 at 17:58

mklink uses NTFS junction points (I believe that's what they're called) to more or less perfectly duplicate Unix-style linking. Windows can tell that it's a junction, though, so it'll give it the traditional arrow icon. iirc you can remove this with some registry fiddling, but I don't remember where. – jcrawfordor Oct 18 '11 at 17:58

grawity ,Oct 18, 2011 at 18:01

@jcrawfordor: The disk structures are "reparse points" . Junctions and symlinks are two different types of reparse points; volume mountpoints are third. – grawity Oct 18 '11 at 18:01

grawity ,Oct 18, 2011 at 18:08

And yes, @Gemini, mklink -made symlinks were specifically implemented to work just like Unix ones . – grawity Oct 18 '11 at 18:08

GeminiDomino ,Oct 18, 2011 at 18:13

Thanks grawity, for the confirmation. I've never played around with them much, so I just wanted to include disclaim.h ;) – GeminiDomino Oct 18 '11 at 18:13

barlop ,Jun 11, 2016 at 17:32

this article has some distinctions

one important distinction is that in a sense, junctions pre win7 were a bit unsafe, in that deleting them would delete the target directory.

http://cects.com/overview-to-understanding-hard-links-junction-points-and-symbolic-links-in-windows/

A Junction Point should never be removed in Win2k, Win2003 and WinXP with Explorer, the del or del /s commands, or with any utility that recursively walks directories since these will delete the target directory and all its subdirectories. Instead, use the rmdir command, the linkd utility, or fsutil (if using WinXP or above) or a third party tool to remove the junction point without affecting the target. In Vista/Win7, it's safe to delete Junction Points with Explorer or with the rmdir and del commands.

[Aug 07, 2018] May I sort the -etc-group and -etc-passwd files

Aug 07, 2018 | unix.stackexchange.com

Ned64 ,Feb 18 at 13:52

My /etc/group has grown by adding new users as well as installing programs that have added their own user and/or group. The same is true for /etc/passwd . Editing has now become a little cumbersome due to the lack of structure.

May I sort these files (e.g. by numerical id or alphabetical by name) without negative effect on the system and/or package managers?

I would guess that it does not matter, but just to be sure I would like to get a second opinion. Maybe root needs to be the first line or within the first 1k lines or something?

The same goes for /etc/*shadow .

Kevin ,Feb 19 at 23:50

"Editing has now become a little cumbersome due to the lack of structure" Why are you editing those files by hand? – Kevin Feb 19 at 23:50

Barmar ,Feb 21 at 20:51

How does sorting the file help with editing? Is it because you want to group related accounts together, and then do similar changes in a range of rows? But will related account be adjacent if you sort by uid or name? – Barmar Feb 21 at 20:51

Ned64 ,Mar 13 at 23:15

@Barmar It has helped mainly because user accounts are grouped by ranges and separate from system accounts (when sorting by UID). Therefore it is easier e.g. to spot the correct line to examine or change when editing with vi . – Ned64 Mar 13 at 23:15

ErikF ,Feb 18 at 14:12

You should be OK doing this: in fact, according to the article and reading the documentation, you can sort /etc/passwd and /etc/group by UID/GID with pwck -s and grpck -s, respectively.

hvd ,Feb 18 at 22:59

@Menasheh This site's colours don't make them stand out as much as on other sites, but "OK doing this" in this answer is a hyperlink. – hvd Feb 18 at 22:59

mickeyf ,Feb 19 at 14:05

OK, fine, but... In general, are there valid reasons to manually edit /etc/passwd and similar files? Isn't it considered better to access these via the tools that are designed to create and modify them? – mickeyf Feb 19 at 14:05

ErikF ,Feb 20 at 21:21

@mickeyf I've seen people manually edit /etc/passwd when they're making batch changes, like changing the GECOS field for all users due to moving/restructuring (global room or phone number changes, etc.) It's not common anymore, but there are specific reasons that crop up from time to time. – ErikF Feb 20 at 21:21

hvd ,Feb 18 at 17:28

Although ErikF is correct that this should generally be okay, I do want to point out one potential issue:

You're allowed to map different usernames to the same UID. If you make use of this, tools that map a UID back to a username will generally pick the first username they find for that UID in /etc/passwd . Sorting may cause a different username to appear first. For display purposes (e.g. ls -l output), either username should work, but it's possible that you've configured some program to accept requests from username A, where it will deny those requests if it sees them coming from username B, even if A and B are the same user.
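For illustration, a quick hypothetical check (written here in Perl, whose getpwuid built-in goes through the same lookup that ls and friends use) shows which of the duplicate names wins:

#!/usr/bin/perl
# If two passwd entries share a UID, getpwuid() returns whichever entry the
# system lookup finds first -- the name that tools like `ls -l` will display.
use strict;
use warnings;

my $uid  = shift // die "usage: $0 uid\n";
my $name = getpwuid($uid);
print defined $name
    ? "UID $uid resolves to '$name'\n"
    : "UID $uid has no passwd entry\n";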

Rui F Ribeiro ,Feb 19 at 17:53

Having root on the first line has long been a de facto "standard", and it is very convenient if you ever have to fix its shell or delete its password when dealing with problems or recovering systems.

Likewise I prefer to have daemons/utils users in the middle and standard users at the end of both passwd and shadow .

hvd's answer also makes a very good point about disturbing the order of users, especially on systems with many users maintained by hand.

If you somewhat manage to sort the files, for instance, only for standard users, it would be more sensible than changing the order of all users, imo.

Barmar ,Feb 21 at 20:13

If you sort numerically by UID, you should get your preferred order. Root is always 0 , and daemons conventionally have UIDs under 100. – Barmar Feb 21 at 20:13

Rui F Ribeiro ,Feb 21 at 20:16

@Barmar If sorting by UID and not by name, indeed, thanks for remembering. – Rui F Ribeiro Feb 21 at 20:16

[Jul 30, 2018] Non-root user getting root access after running sudo vi -etc-hosts

Notable quotes:
"... as the original user ..."
Jul 30, 2018 | unix.stackexchange.com

Gilles, Mar 10, 2018 at 10:24

If sudo vi /etc/hosts is successful, it means that the system administrator has allowed the user to run vi /etc/hosts as root. That's the whole point of sudo: it lets the system administrator authorize certain users to run certain commands with extra privileges.

Giving a user the permission to run vi gives them the permission to run any vi command, including :sh to run a shell and :w to overwrite any file on the system. A rule allowing only to run vi /etc/hosts does not make any sense since it allows the user to run arbitrary commands.

There is no "hacking" involved. The breach of security comes from a misconfiguration, not from a hole in the security model. Sudo does not particularly try to prevent against misconfiguration. Its documentation is well-known to be difficult to understand; if in doubt, ask around and don't try to do things that are too complicated.

It is in general a hard problem to give a user a specific privilege without giving them more than intended. A bulldozer approach like giving them the right to run an interactive program such as vi is bound to fail. A general piece of advice is to give the minimum privileges necessary to accomplish the task. If you want to allow a user to modify one file, don't give them the permission to run an editor. Instead, either:

Note that allowing a user to edit /etc/hosts may have an impact on your security infrastructure: if there's any place where you rely on a host name corresponding to a specific machine, then that user will be able to point it to a different machine. Consider that it is probably unnecessary anyway .
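
One hedged sketch of the "minimum privileges" advice (not taken from the original answer): a sudoers rule built around sudoedit grants edit access to a single file without handing out a full privileged editor. The username below is a placeholder.

# /etc/sudoers (edit with visudo): allow one user to edit one file safely.
# sudoedit copies the file to a temporary location, runs the user's own editor
# unprivileged, then installs the result back as root.
someuser ALL = sudoedit /etc/hosts

# The user then runs:
sudoedit /etc/hosts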

[Jul 05, 2018] Can rsync resume after being interrupted

Notable quotes:
"... as if it were successfully transferred ..."
Jul 05, 2018 | unix.stackexchange.com

Tim ,Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly.

After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder in my case if rsync can resume what was left last time?

Gilles ,Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim ,Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copied the same files again from its output on the terminal. (2) Options are same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, but source is an external HDD, and target is an internal HDD. (4) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley ,Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim ,Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles ,Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems which store times in 2-second increments, the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus ,Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run. So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the data that both ends already have before it can actually resume the transfer by appending to the target.
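
A minimal sketch of that resume workflow, with placeholder paths (rsync 3.x assumed, so --append-verify is available):

rsync -av --partial /data/big/ user@backup:/data/big/        # interrupted part-way through
rsync -av --append-verify /data/big/ user@backup:/data/big/  # next run verifies the partial file, then appends to it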

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences."

That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if, for example, you're frequently backing up very large, fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.
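
A sketch of the command this describes, with placeholder paths:

# Back up large, mostly-static VM images; compare file content rather than size/timestamp:
rsync -av --checksum /var/lib/libvirt/images/ backup:/srv/images/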

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why I do not know :)

So, in short:

If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .

If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex ,Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus ,Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman ,Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus ,May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. ,Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48

Alexander O'Mara ,Jan 3, 2016 at 6:34

TL;DR:

Just specify a partial directory as the rsync man pages recommends:

--partial-dir=.rsync-partial

Longer explanation:

There is actually a built-in feature for doing this using the --partial-dir option, which has several advantages over the --partial and --append-verify / --append alternative.

Excerpt from the rsync man pages:
--partial-dir=DIR
      A  better way to keep partial files than the --partial option is
      to specify a DIR that will be used  to  hold  the  partial  data
      (instead  of  writing  it  out to the destination file).  On the
      next transfer, rsync will use a file found in this dir  as  data
      to  speed  up  the resumption of the transfer and then delete it
      after it has served its purpose.

      Note that if --whole-file is specified (or  implied),  any  par-
      tial-dir  file  that  is  found for a file that is being updated
      will simply be removed (since rsync  is  sending  files  without
      using rsync's delta-transfer algorithm).

      Rsync will create the DIR if it is missing (just the last dir --
      not the whole path).  This makes it easy to use a relative  path
      (such  as  "--partial-dir=.rsync-partial")  to have rsync create
      the partial-directory in the destination file's  directory  when
      needed,  and  then  remove  it  again  when  the partial file is
      deleted.

      If the partial-dir value is not an absolute path, rsync will add
      an  exclude rule at the end of all your existing excludes.  This
      will prevent the sending of any partial-dir files that may exist
      on the sending side, and will also prevent the untimely deletion
      of partial-dir items on the receiving  side.   An  example:  the
      above  --partial-dir  option would add the equivalent of "-f '-p
      .rsync-partial/'" at the end of any other filter rules.

By default, rsync uses a random temporary file name which gets deleted when a transfer fails. As mentioned, using --partial you can make rsync keep the incomplete file as if it were successfully transferred , so that it is possible to later append to it using the --append-verify / --append options. However there are several reasons this is sub-optimal.

  1. Your backup files may not be complete, and without checking the remote file which must still be unaltered, there's no way to know.
  2. If you are attempting to use --backup and --backup-dir , you've just added a new version of this file that never even existed before to your version history.

However if we use --partial-dir , rsync will preserve the temporary partial file, and resume downloading using that partial file next time you run it, and we do not suffer from the above issues.
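
Putting it together, a minimal example with placeholder paths:

rsync -av --partial-dir=.rsync-partial /data/big/ user@backup:/data/big/
# If the transfer is interrupted, the partial data lands in the destination's .rsync-partial/
# directory, and the next identical invocation resumes from it and then removes it.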

trs ,Apr 7, 2017 at 0:00

This is really the answer. Hey everyone, LOOK HERE!! – trs Apr 7 '17 at 0:00

JKOlaf ,Jun 28, 2017 at 0:11

I agree this is a much more concise answer to the question. the TL;DR: is perfect and for those that need more can read the longer bit. Strong work. – JKOlaf Jun 28 '17 at 0:11

N2O ,Jul 29, 2014 at 18:24

You may want to add the -P option to your command.

From the man page:

--partial By default, rsync will delete any partially transferred file if the transfer
         is interrupted. In some circumstances it is more desirable to keep partially
         transferred files. Using the --partial option tells rsync to keep the partial
         file which should make a subsequent transfer of the rest of the file much faster.

  -P     The -P option is equivalent to --partial --progress.   Its  pur-
         pose  is to make it much easier to specify these two options for
         a long transfer that may be interrupted.

So instead of:

sudo rsync -azvv /home/path/folder1/ /home/path/folder2

Do:

sudo rsync -azvvP /home/path/folder1/ /home/path/folder2

Of course, if you don't want the progress updates, you can just use --partial , i.e.:

sudo rsync --partial -azvv /home/path/folder1/ /home/path/folder2

gaoithe ,Aug 19, 2015 at 11:29

@Flimm not quite correct. If there is an interruption (network or receiving side) then when using --partial the partial file is kept AND it is used when rsync is resumed. From the manpage: "Using the --partial option tells rsync to keep the partial file which should make a subsequent transfer of the rest of the file much faster." – gaoithe Aug 19 '15 at 11:29

DanielSmedegaardBuus ,Sep 1, 2015 at 14:11

@Flimm and @gaoithe, my answer wasn't quite accurate, and definitely not up-to-date. I've updated it to reflect version 3 + of rsync . It's important to stress, though, that --partial does not itself resume a failed transfer. See my answer for details :) – DanielSmedegaardBuus Sep 1 '15 at 14:11

guettli ,Nov 18, 2015 at 12:28

@DanielSmedegaardBuus I tried it and the -P is enough in my case. Versions: client has 3.1.0 and server has 3.1.1. I interrupted the transfer of a single large file with ctrl-c. I guess I am missing something. – guettli Nov 18 '15 at 12:28

Yadunandana ,Sep 16, 2012 at 16:07

I think you are forcibly calling the rsync and hence all data is getting downloaded when you recall it again. use --progress option to copy only those files which are not copied and --delete option to delete any files that were already copied but no longer exist in the source folder...
rsync -avz --progress --delete /home/path/folder1/ /home/path/folder2

If you are using ssh to login to other system and copy the files,

rsync -avz --progress --delete -e "ssh -o UserKnownHostsFile=/dev/null -o \
StrictHostKeyChecking=no" /home/path/folder1/ /home/path/folder2

let me know if there is any mistake in my understanding of this concept...

Fabien ,Jun 14, 2013 at 12:12

Can you please edit your answer and explain what your special ssh call does, and why you advice to do it? – Fabien Jun 14 '13 at 12:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:12

@Fabien He tells rsync to set two ssh options (rsync uses ssh to connect). The second one tells ssh to not prompt for confirmation if the host he's connecting to isn't already known (by existing in the "known hosts" file). The first one tells ssh to not use the default known hosts file (which would be ~/.ssh/known_hosts). He uses /dev/null instead, which is of course always empty, and as ssh would then not find the host in there, it would normally prompt for confirmation, hence option two. Upon connecting, ssh writes the now known host to /dev/null, effectively forgetting it instantly :) – DanielSmedegaardBuus Dec 7 '14 at 0:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:23

...but you were probably wondering what effect, if any, it has on the rsync operation itself. The answer is none. It only serves to not have the host you're connecting to added to your SSH known hosts file. Perhaps he's a sysadmin often connecting to a great number of new servers, temporary systems or whatnot. I don't know :) – DanielSmedegaardBuus Dec 7 '14 at 0:23

moi ,May 10, 2016 at 13:49

"use --progress option to copy only those files which are not copied" What? – moi May 10 '16 at 13:49

Paul d'Aoust ,Nov 17, 2016 at 22:39

There are a couple errors here; one is very serious: --delete will delete files in the destination that don't exist in the source. The less serious one is that --progress doesn't modify how things are copied; it just gives you a progress report on each file as it copies. (I fixed the serious error; replaced it with --remove-source-files .) – Paul d'Aoust Nov 17 '16 at 22:39

[Jul 04, 2018] How do I parse command line arguments in Bash

Notable quotes:
"... enhanced getopt ..."
Jul 04, 2018 | stackoverflow.com

Lawrence Johnston ,Oct 10, 2008 at 16:57

Say, I have a script that gets called with this line:
./myscript -vfd ./foo/bar/someFile -o /fizz/someOtherFile

or this one:

./myscript -v -f -d -o /fizz/someOtherFile ./foo/bar/someFile

What's the accepted way of parsing this such that in each case (or some combination of the two) $v , $f , and $d will all be set to true and $outFile will be equal to /fizz/someOtherFile ?

Inanc Gumus ,Apr 15, 2016 at 19:11

See my very easy and no-dependency answer here: stackoverflow.com/a/33826763/115363 – Inanc Gumus Apr 15 '16 at 19:11

dezza ,Aug 2, 2016 at 2:13

For zsh-users there's a great builtin called zparseopts which can do: zparseopts -D -E -M -- d=debug -debug=d and have both -d and --debug in the $debug array. echo $+debug[1] will return 0 or 1 if one of those is used. Ref: zsh.org/mla/users/2011/msg00350.html – dezza Aug 2 '16 at 2:13

Bruno Bronosky ,Jan 7, 2013 at 20:01

Preferred Method: Using straight bash without getopt[s]

I originally answered the question as the OP asked. This Q/A is getting a lot of attention, so I should also offer the non-magic way to do this. I'm going to expand upon guneysus's answer to fix the nasty sed and include Tobias Kienzler's suggestion .

Two of the most common ways to pass key value pair arguments are:

Straight Bash Space Separated

Usage ./myscript.sh -e conf -s /etc -l /usr/lib /etc/hosts

#!/bin/bash

POSITIONAL=()
while [[ $# -gt 0 ]]
do
key="$1"

case $key in
    -e|--extension)
    EXTENSION="$2"
    shift # past argument
    shift # past value
    ;;
    -s|--searchpath)
    SEARCHPATH="$2"
    shift # past argument
    shift # past value
    ;;
    -l|--lib)
    LIBPATH="$2"
    shift # past argument
    shift # past value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument
    ;;
    *)    # unknown option
    POSITIONAL+=("$1") # save it in an array for later
    shift # past argument
    ;;
esac
done
set -- "${POSITIONAL[@]}" # restore positional parameters

echo FILE EXTENSION  = "${EXTENSION}"
echo SEARCH PATH     = "${SEARCHPATH}"
echo LIBRARY PATH    = "${LIBPATH}"
echo DEFAULT         = "${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 "$1"
fi
Straight Bash Equals Separated

Usage ./myscript.sh -e=conf -s=/etc -l=/usr/lib /etc/hosts

#!/bin/bash

for i in "$@"
do
case $i in
    -e=*|--extension=*)
    EXTENSION="${i#*=}"
    shift # past argument=value
    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    shift # past argument=value
    ;;
    -l=*|--lib=*)
    LIBPATH="${i#*=}"
    shift # past argument=value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument with no value
    ;;
    *)
          # unknown option
    ;;
esac
done
echo "FILE EXTENSION  = ${EXTENSION}"
echo "SEARCH PATH     = ${SEARCHPATH}"
echo "LIBRARY PATH    = ${LIBPATH}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 $1
fi

To better understand ${i#*=} search for "Substring Removal" in this guide . It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.
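
A quick illustration of that parameter expansion:

i="--searchpath=/etc"
echo "${i#*=}"    # /etc           (strips everything up to and including the first '=')
echo "${i%%=*}"   # --searchpath   (the complementary expansion keeps the key part)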

Using getopt[s]

from: http://mywiki.wooledge.org/BashFAQ/035#getopts

Never use getopt(1). getopt cannot handle empty arguments strings, or arguments with embedded whitespace. Please forget that it ever existed.

The POSIX shell (and others) offer getopts which is safe to use instead. Here is a simplistic getopts example:

#!/bin/sh

# A POSIX variable
OPTIND=1         # Reset in case getopts has been used previously in the shell.

# Initialize our own variables:
output_file=""
verbose=0

while getopts "h?vf:" opt; do
    case "$opt" in
    h|\?)
        show_help
        exit 0
        ;;
    v)  verbose=1
        ;;
    f)  output_file=$OPTARG
        ;;
    esac
done

shift $((OPTIND-1))

[ "${1:-}" = "--" ] && shift

echo "verbose=$verbose, output_file='$output_file', Leftovers: $@"

# End of file

The advantages of getopts are:

  1. It's portable, and will work in e.g. dash.
  2. It can handle things like -vf filename in the expected Unix way, automatically.

The disadvantage of getopts is that it can only handle short options ( -h , not --help ) without trickery.

There is a getopts tutorial which explains what all of the syntax and variables mean. In bash, there is also help getopts , which might be informative.
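
One commonly cited piece of that "trickery" (not part of the answer above, and specific to bash rather than guaranteed by POSIX): declare - as an option that takes an argument, and getopts will hand you the text after -- in $OPTARG, which you can dispatch on yourself. A rough sketch:

#!/bin/bash

verbose=0
while getopts "hv-:" opt; do
    case "$opt" in
        -)  # long option: getopts puts everything after "--" into $OPTARG
            case "$OPTARG" in
                help)    echo "usage: $0 [-h|--help] [-v|--verbose]"; exit 0 ;;
                verbose) verbose=1 ;;
                *)       echo "unknown option --$OPTARG" >&2; exit 1 ;;
            esac ;;
        h)  echo "usage: $0 [-h|--help] [-v|--verbose]"; exit 0 ;;
        v)  verbose=1 ;;
    esac
done
shift $((OPTIND-1))

echo "verbose=$verbose, leftovers: $@"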

Livven ,Jun 6, 2013 at 21:19

Is this really true? According to Wikipedia there's a newer GNU enhanced version of getopt which includes all the functionality of getopts and then some. man getopt on Ubuntu 13.04 outputs getopt - parse command options (enhanced) as the name, so I presume this enhanced version is standard now. – Livven Jun 6 '13 at 21:19

szablica ,Jul 17, 2013 at 15:23

That something is a certain way on your system is a very weak premise to base asumptions of "being standard" on. – szablica Jul 17 '13 at 15:23

Stephane Chazelas ,Aug 20, 2014 at 19:55

@Livven, that getopt is not a GNU utility, it's part of util-linux . – Stephane Chazelas Aug 20 '14 at 19:55

Nicolas Mongrain-Lacombe ,Jun 19, 2016 at 21:22

If you use -gt 0 , remove the shift after the esac , increase all the shift s by 1 and add this case: *) break;; . That way you can handle non-optional arguments. Ex: pastebin.com/6DJ57HTc – Nicolas Mongrain-Lacombe Jun 19 '16 at 21:22

kolydart ,Jul 10, 2017 at 8:11

You do not echo --default . In the first example, I notice that if --default is the last argument, it is not processed (considered as non-opt), unless while [[ $# -gt 1 ]] is set as while [[ $# -gt 0 ]] – kolydart Jul 10 '17 at 8:11

Robert Siemer ,Apr 20, 2015 at 17:47

No answer mentions enhanced getopt . And the top-voted answer is misleading: It ignores -vfd style short options (requested by the OP), options after positional arguments (also requested by the OP) and it ignores parsing-errors. Instead:

The following calls

myscript -vfd ./foo/bar/someFile -o /fizz/someOtherFile
myscript -v -f -d -o/fizz/someOtherFile -- ./foo/bar/someFile
myscript --verbose --force --debug ./foo/bar/someFile -o/fizz/someOtherFile
myscript --output=/fizz/someOtherFile ./foo/bar/someFile -vfd
myscript ./foo/bar/someFile -df -v --output /fizz/someOtherFile

all return

verbose: y, force: y, debug: y, in: ./foo/bar/someFile, out: /fizz/someOtherFile

with the following myscript

#!/bin/bash

getopt --test > /dev/null
if [[ $? -ne 4 ]]; then
    echo "I'm sorry, `getopt --test` failed in this environment."
    exit 1
fi

OPTIONS=dfo:v
LONGOPTIONS=debug,force,output:,verbose

# -temporarily store output to be able to check for errors
# -e.g. use "--options" parameter by name to activate quoting/enhanced mode
# -pass arguments only via   -- "$@"   to separate them correctly
PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTIONS --name "$0" -- "$@")
if [[ $? -ne 0 ]]; then
    # e.g. $? == 1
    #  then getopt has complained about wrong arguments to stdout
    exit 2
fi
# read getopt's output this way to handle the quoting right:
eval set -- "$PARSED"

# now enjoy the options in order and nicely split until we see --
while true; do
    case "$1" in
        -d|--debug)
            d=y
            shift
            ;;
        -f|--force)
            f=y
            shift
            ;;
        -v|--verbose)
            v=y
            shift
            ;;
        -o|--output)
            outFile="$2"
            shift 2
            ;;
        --)
            shift
            break
            ;;
        *)
            echo "Programming error"
            exit 3
            ;;
    esac
done

# handle non-option arguments
if [[ $# -ne 1 ]]; then
    echo "$0: A single input file is required."
    exit 4
fi

echo "verbose: $v, force: $f, debug: $d, in: $1, out: $outFile"

1 enhanced getopt is available on most "bash-systems", including Cygwin; on OS X try brew install gnu-getopt
2 the POSIX exec() conventions have no reliable way to pass binary NULL in command line arguments; those bytes prematurely end the argument
3 first version released in 1997 or before (I only tracked it back to 1997)

johncip ,Jan 12, 2017 at 2:00

Thanks for this. Just confirmed from the feature table at en.wikipedia.org/wiki/Getopts , if you need support for long options, and you're not on Solaris, getopt is the way to go. – johncip Jan 12 '17 at 2:00

Kaushal Modi ,Apr 27, 2017 at 14:02

I believe that the only caveat with getopt is that it cannot be used conveniently in wrapper scripts where one might have few options specific to the wrapper script, and then pass the non-wrapper-script options to the wrapped executable, intact. Let's say I have a grep wrapper called mygrep and I have an option --foo specific to mygrep , then I cannot do mygrep --foo -A 2 , and have the -A 2 passed automatically to grep ; I need to do mygrep --foo -- -A 2 . Here is my implementation on top of your solution. – Kaushal Modi Apr 27 '17 at 14:02

bobpaul ,Mar 20 at 16:45

Alex, I agree and there's really no way around that since we need to know the actual return value of getopt --test . I'm a big fan of "Unofficial Bash Strict mode", (which includes set -e ), and I just put the check for getopt ABOVE set -euo pipefail and IFS=$'\n\t' in my script. – bobpaul Mar 20 at 16:45

Robert Siemer ,Mar 21 at 9:10

@bobpaul Oh, there is a way around that. And I'll edit my answer soon to reflect my collections regarding this issue ( set -e )... – Robert Siemer Mar 21 at 9:10

Robert Siemer ,Mar 21 at 9:16

@bobpaul Your statement about util-linux is wrong and misleading as well: the package is marked "essential" on Ubuntu/Debian. As such, it is always installed. – Which distros are you talking about (where you say it needs to be installed on purpose)? – Robert Siemer Mar 21 at 9:16

guneysus ,Nov 13, 2012 at 10:31

from : digitalpeer.com with minor modifications

Usage myscript.sh -p=my_prefix -s=dirname -l=libname

#!/bin/bash
for i in "$@"
do
case $i in
    -p=*|--prefix=*)
    PREFIX="${i#*=}"

    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    ;;
    -l=*|--lib=*)
    DIR="${i#*=}"
    ;;
    --default)
    DEFAULT=YES
    ;;
    *)
            # unknown option
    ;;
esac
done
echo PREFIX = ${PREFIX}
echo SEARCH PATH = ${SEARCHPATH}
echo DIRS = ${DIR}
echo DEFAULT = ${DEFAULT}

To better understand ${i#*=} search for "Substring Removal" in this guide . It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.

Tobias Kienzler ,Nov 12, 2013 at 12:48

Neat! Though this won't work for space-separated arguments à la mount -t tempfs ... . One can probably fix this via something like while [ $# -ge 1 ]; do param=$1; shift; case $param in; -p) prefix=$1; shift;; etc – Tobias Kienzler Nov 12 '13 at 12:48

Robert Siemer ,Mar 19, 2016 at 15:23

This can't handle -vfd style combined short options. – Robert Siemer Mar 19 '16 at 15:23

bekur ,Dec 19, 2017 at 23:27

link is broken! – bekur Dec 19 '17 at 23:27

Matt J ,Oct 10, 2008 at 17:03

getopt() / getopts() is a good option. Stolen from here :

The simple use of "getopt" is shown in this mini-script:

#!/bin/bash
echo "Before getopt"
for i
do
  echo $i
done
args=`getopt abc:d $*`
set -- $args
echo "After getopt"
for i
do
  echo "-->$i"
done

What we have said is that any of -a, -b, -c or -d will be allowed, but that -c is followed by an argument (the "c:" says that).

If we call this "g" and try it out:

bash-2.05a$ ./g -abc foo
Before getopt
-abc
foo
After getopt
-->-a
-->-b
-->-c
-->foo
-->--

We start with two arguments, and "getopt" breaks apart the options and puts each in its own argument. It also added "--".

Robert Siemer ,Apr 16, 2016 at 14:37

Using $* is broken usage of getopt . (It hoses arguments with spaces.) See my answer for proper usage. – Robert Siemer Apr 16 '16 at 14:37

SDsolar ,Aug 10, 2017 at 14:07

Why would you want to make it more complicated? – SDsolar Aug 10 '17 at 14:07

thebunnyrules ,Jun 1 at 1:57

@Matt J, the first part of the script (for i) would be able to handle arguments with spaces in them if you use "$i" instead of $i. The getopts does not seem to be able to handle arguments with spaces. What would be the advantage of using getopt over the for i loop? – thebunnyrules Jun 1 at 1:57

bronson ,Jul 15, 2015 at 23:43

At the risk of adding another example to ignore, here's my scheme.

Hope it's useful to someone.

while [ "$#" -gt 0 ]; do
  case "$1" in
    -n) name="$2"; shift 2;;
    -p) pidfile="$2"; shift 2;;
    -l) logfile="$2"; shift 2;;

    --name=*) name="${1#*=}"; shift 1;;
    --pidfile=*) pidfile="${1#*=}"; shift 1;;
    --logfile=*) logfile="${1#*=}"; shift 1;;
    --name|--pidfile|--logfile) echo "$1 requires an argument" >&2; exit 1;;

    -*) echo "unknown option: $1" >&2; exit 1;;
    *) handle_argument "$1"; shift 1;;
  esac
done

rhombidodecahedron ,Sep 11, 2015 at 8:40

What is the "handle_argument" function? – rhombidodecahedron Sep 11 '15 at 8:40

bronson ,Oct 8, 2015 at 20:41

Sorry for the delay. In my script, the handle_argument function receives all the non-option arguments. You can replace that line with whatever you'd like, maybe *) die "unrecognized argument: $1" or collect the args into a variable *) args+="$1"; shift 1;; . – bronson Oct 8 '15 at 20:41

Guilherme Garnier ,Apr 13 at 16:10

Amazing! I've tested a couple of answers, but this is the only one that worked for all cases, including many positional parameters (both before and after flags) – Guilherme Garnier Apr 13 at 16:10

Shane Day ,Jul 1, 2014 at 1:20

I'm about 4 years late to this question, but want to give back. I used the earlier answers as a starting point to tidy up my old adhoc param parsing. I then refactored out the following template code. It handles both long and short params, using = or space separated arguments, as well as multiple short params grouped together. Finally it re-inserts any non-param arguments back into the $1,$2.. variables. I hope it's useful.
#!/usr/bin/env bash

# NOTICE: Uncomment if your script depends on bashisms.
#if [ -z "$BASH_VERSION" ]; then bash $0 $@ ; exit $? ; fi

echo "Before"
for i ; do echo - $i ; done


# Code template for parsing command line parameters using only portable shell
# code, while handling both long and short params, handling '-f file' and
# '-f=file' style param data and also capturing non-parameters to be inserted
# back into the shell positional parameters.

while [ -n "$1" ]; do
        # Copy so we can modify it (can't modify $1)
        OPT="$1"
        # Detect argument termination
        if [ x"$OPT" = x"--" ]; then
                shift
                for OPT ; do
                        REMAINS="$REMAINS \"$OPT\""
                done
                break
        fi
        # Parse current opt
        while [ x"$OPT" != x"-" ] ; do
                case "$OPT" in
                        # Handle --flag=value opts like this
                        -c=* | --config=* )
                                CONFIGFILE="${OPT#*=}"
                                shift
                                ;;
                        # and --flag value opts like this
                        -c* | --config )
                                CONFIGFILE="$2"
                                shift
                                ;;
                        -f* | --force )
                                FORCE=true
                                ;;
                        -r* | --retry )
                                RETRY=true
                                ;;
                        # Anything unknown is recorded for later
                        * )
                                REMAINS="$REMAINS \"$OPT\""
                                break
                                ;;
                esac
                # Check for multiple short options
                # NOTICE: be sure to update this pattern to match valid options
                NEXTOPT="${OPT#-[cfr]}" # try removing single short opt
                if [ x"$OPT" != x"$NEXTOPT" ] ; then
                        OPT="-$NEXTOPT"  # multiple short opts, keep going
                else
                        break  # long form, exit inner loop
                fi
        done
        # Done with that param. move to next
        shift
done
# Set the non-parameters back into the positional parameters ($1 $2 ..)
eval set -- $REMAINS


echo -e "After: \n configfile='$CONFIGFILE' \n force='$FORCE' \n retry='$RETRY' \n remains='$REMAINS'"
for i ; do echo - $i ; done

Robert Siemer ,Dec 6, 2015 at 13:47

This code can't handle options with arguments like this: -c1 . And the use of = to separate short options from their arguments is unusual... – Robert Siemer Dec 6 '15 at 13:47

sfnd ,Jun 6, 2016 at 19:28

I ran into two problems with this useful chunk of code: 1) the "shift" in the case of "-c=foo" ends up eating the next parameter; and 2) 'c' should not be included in the "[cfr]" pattern for combinable short options. – sfnd Jun 6 '16 at 19:28

Inanc Gumus ,Nov 20, 2015 at 12:28

More succinct way

script.sh

#!/bin/bash

while [[ "$#" > 0 ]]; do case $1 in
  -d|--deploy) deploy="$2"; shift;;
  -u|--uglify) uglify=1;;
  *) echo "Unknown parameter passed: $1"; exit 1;;
esac; shift; done

echo "Should deploy? $deploy"
echo "Should uglify? $uglify"

Usage:

./script.sh -d dev -u

# OR:

./script.sh --deploy dev --uglify

hfossli ,Apr 7 at 20:58

This is what I am doing. I have to use while [[ "$#" > 1 ]] if I want to support ending the line with a boolean flag ./script.sh --debug dev --uglify fast --verbose . Example: gist.github.com/hfossli/4368aa5a577742c3c9f9266ed214aa58 – hfossli Apr 7 at 20:58

hfossli ,Apr 7 at 21:09

I sent an edit request. I just tested this and it works perfectly. – hfossli Apr 7 at 21:09

hfossli ,Apr 7 at 21:10

Wow! Simple and clean! This is how I'm using this: gist.github.com/hfossli/4368aa5a577742c3c9f9266ed214aa58 – hfossli Apr 7 at 21:10

Ponyboy47 ,Sep 8, 2016 at 18:59

My answer is largely based on the answer by Bruno Bronosky , but I sort of mashed his two pure bash implementations into one that I use pretty frequently.
# As long as there is at least one more argument, keep looping
while [[ $# -gt 0 ]]; do
    key="$1"
    case "$key" in
        # This is a flag type option. Will catch either -f or --foo
        -f|--foo)
        FOO=1
        ;;
        # Also a flag type option. Will catch either -b or --bar
        -b|--bar)
        BAR=1
        ;;
        # This is an arg value type option. Will catch -o value or --output-file value
        -o|--output-file)
        shift # past the key and to the value
        OUTPUTFILE="$1"
        ;;
        # This is an arg=value type option. Will catch -o=value or --output-file=value
        -o=*|--output-file=*)
        # No need to shift here since the value is part of the same string
        OUTPUTFILE="${key#*=}"
        ;;
        *)
        # Do whatever you want with extra options
        echo "Unknown option '$key'"
        ;;
    esac
    # Shift after checking all the cases to get the next option
    shift
done

This allows you to have both space separated options/values, as well as equal defined values.

So you could run your script using:

./myscript --foo -b -o /fizz/file.txt

as well as:

./myscript -f --bar -o=/fizz/file.txt

and both should have the same end result.

PROS:

CONS:

These are the only pros/cons I can think of off the top of my head

bubla ,Jul 10, 2016 at 22:40

I have found the matter to write portable parsing in scripts so frustrating that I have written Argbash - a FOSS code generator that can generate the arguments-parsing code for your script plus it has some nice features:

https://argbash.io

RichVel ,Aug 18, 2016 at 5:34

Thanks for writing argbash, I just used it and found it works well. I mostly went for argbash because it's a code generator supporting the older bash 3.x found on OS X 10.11 El Capitan. The only downside is that the code-generator approach means quite a lot of code in your main script, compared to calling a module. – RichVel Aug 18 '16 at 5:34

bubla ,Aug 23, 2016 at 20:40

You can actually use Argbash in a way that it produces tailor-made parsing library just for you that you can have included in your script or you can have it in a separate file and just source it. I have added an example to demonstrate that and I have made it more explicit in the documentation, too. – bubla Aug 23 '16 at 20:40

RichVel ,Aug 24, 2016 at 5:47

Good to know. That example is interesting but still not really clear - maybe you can change name of the generated script to 'parse_lib.sh' or similar and show where the main script calls it (like in the wrapping script section which is more complex use case). – RichVel Aug 24 '16 at 5:47

bubla ,Dec 2, 2016 at 20:12

The issues were addressed in recent version of argbash: Documentation has been improved, a quickstart argbash-init script has been introduced and you can even use argbash online at argbash.io/generate – bubla Dec 2 '16 at 20:12

Alek ,Mar 1, 2012 at 15:15

I think this one is simple enough to use:
#!/bin/bash
#

readopt='getopts $opts opt;rc=$?;[ $rc$opt == 0? ]&&exit 1;[ $rc == 0 ]||{ shift $[OPTIND-1];false; }'

opts=vfdo:

# Enumerating options
while eval $readopt
do
    echo OPT:$opt ${OPTARG+OPTARG:$OPTARG}
done

# Enumerating arguments
for arg
do
    echo ARG:$arg
done

Invocation example:

./myscript -v -do /fizz/someOtherFile -f ./foo/bar/someFile
OPT:v 
OPT:d 
OPT:o OPTARG:/fizz/someOtherFile
OPT:f 
ARG:./foo/bar/someFile

erm3nda ,May 20, 2015 at 22:50

I read all and this one is my preferred one. I don't like to use -a=1 as argc style. I prefer to put first the main option -options and later the special ones with single spacing -o option . Im looking for the simplest-vs-better way to read argvs. – erm3nda May 20 '15 at 22:50

erm3nda ,May 20, 2015 at 23:25

It's working really well but if you pass an argument to a non a: option all the following options would be taken as arguments. You can check this line ./myscript -v -d fail -o /fizz/someOtherFile -f ./foo/bar/someFile with your own script. -d option is not set as d: – erm3nda May 20 '15 at 23:25

unsynchronized ,Jun 9, 2014 at 13:46

Expanding on the excellent answer by @guneysus, here is a tweak that lets the user use whichever syntax they prefer, e.g.
command -x=myfilename.ext --another_switch

vs

command -x myfilename.ext --another_switch

That is to say the equals can be replaced with whitespace.

This "fuzzy interpretation" might not be to your liking, but if you are making scripts that are interchangeable with other utilities (as is the case with mine, which must work with ffmpeg), the flexibility is useful.

STD_IN=0

prefix=""
key=""
value=""
for keyValue in "$@"
do
  case "${prefix}${keyValue}" in
    -i=*|--input_filename=*)  key="-i";     value="${keyValue#*=}";; 
    -ss=*|--seek_from=*)      key="-ss";    value="${keyValue#*=}";;
    -t=*|--play_seconds=*)    key="-t";     value="${keyValue#*=}";;
    -|--stdin)                key="-";      value=1;;
    *)                                      value=$keyValue;;
  esac
  case $key in
    -i) MOVIE=$(resolveMovie "${value}");  prefix=""; key="";;
    -ss) SEEK_FROM="${value}";          prefix=""; key="";;
    -t)  PLAY_SECONDS="${value}";           prefix=""; key="";;
    -)   STD_IN=${value};                   prefix=""; key="";; 
    *)   prefix="${keyValue}=";;
  esac
done

vangorra ,Feb 12, 2015 at 21:50

getopts works great if #1 you have it installed and #2 you intend to run it on the same platform. OSX and Linux (for example) behave differently in this respect.

Here is a (non getopts) solution that supports equals, non-equals, and boolean flags. For example you could run your script in this way:

./script --arg1=value1 --arg2 value2 --shouldClean

# parse the arguments.
COUNTER=0
ARGS=("$@")
while [ $COUNTER -lt $# ]
do
    arg=${ARGS[$COUNTER]}
    let COUNTER=COUNTER+1
    nextArg=${ARGS[$COUNTER]}

    if [[ $skipNext -eq 1 ]]; then
        echo "Skipping"
        skipNext=0
        continue
    fi

    argKey=""
    argVal=""
    if [[ "$arg" =~ ^\- ]]; then
        # if the format is: -key=value
        if [[ "$arg" =~ \= ]]; then
            argVal=$(echo "$arg" | cut -d'=' -f2)
            argKey=$(echo "$arg" | cut -d'=' -f1)
            skipNext=0

        # if the format is: -key value
        elif [[ ! "$nextArg" =~ ^\- ]]; then
            argKey="$arg"
            argVal="$nextArg"
            skipNext=1

        # if the format is: -key (a boolean flag)
        elif [[ "$nextArg" =~ ^\- ]] || [[ -z "$nextArg" ]]; then
            argKey="$arg"
            argVal=""
            skipNext=0
        fi
    # if the format has not flag, just a value.
    else
        argKey=""
        argVal="$arg"
        skipNext=0
    fi

    case "$argKey" in 
        --source-scmurl)
            SOURCE_URL="$argVal"
        ;;
        --dest-scmurl)
            DEST_URL="$argVal"
        ;;
        --version-num)
            VERSION_NUM="$argVal"
        ;;
        -c|--clean)
            CLEAN_BEFORE_START="1"
        ;;
        -h|--help|-help|--h)
            showUsage
            exit
        ;;
    esac
done

akostadinov ,Jul 19, 2013 at 7:50

This is how I do it in a function, to avoid breaking a getopts run at the same time somewhere higher in the stack:
function waitForWeb () {
   local OPTIND=1 OPTARG OPTION
   local host=localhost port=8080 proto=http
   while getopts "h:p:r:" OPTION; do
      case "$OPTION" in
      h)
         host="$OPTARG"
         ;;
      p)
         port="$OPTARG"
         ;;
      r)
         proto="$OPTARG"
         ;;
      esac
   done
...
}

Renato Silva ,Jul 4, 2016 at 16:47

EasyOptions does not require any parsing:
## Options:
##   --verbose, -v  Verbose mode
##   --output=FILE  Output filename

source easyoptions || exit

if test -n "${verbose}"; then
    echo "output file is ${output}"
    echo "${arguments[@]}"
fi

Oleksii Chekulaiev ,Jul 1, 2016 at 20:56

I give you The Function parse_params that will parse params:
  1. Without polluting global scope.
  2. Effortlessly returns to you ready to use variables so that you could build further logic on them
  3. Amount of dashes before params does not matter ( --all equals -all equals all=all )

The script below is a copy-paste working demonstration. See show_use function to understand how to use parse_params .

Limitations:

  1. Does not support space delimited params ( -d 1 )
  2. Param names will lose dashes so --any-param and -anyparam are equivalent
  3. eval $(parse_params "$@") must be used inside bash function (it will not work in the global scope)

#!/bin/bash

# Universal Bash parameter parsing
# Parse equal sign separated params into named local variables
# Standalone named parameter value will equal its param name (--force creates variable $force=="force")
# Parses multi-valued named params into an array (--path=path1 --path=path2 creates ${path[*]} array)
# Parses un-named params into ${ARGV[*]} array
# Additionally puts all named params into ${ARGN[*]} array
# Additionally puts all standalone "option" params into ${ARGO[*]} array
# @author Oleksii Chekulaiev
# @version v1.3 (May-14-2018)
parse_params ()
{
    local existing_named
    local ARGV=() # un-named params
    local ARGN=() # named params
    local ARGO=() # options (--params)
    echo "local ARGV=(); local ARGN=(); local ARGO=();"
    while [[ "$1" != "" ]]; do
        # Escape asterisk to prevent bash asterisk expansion
        _escaped=${1/\*/\'\"*\"\'}
        # If equals delimited named parameter
        if [[ "$1" =~ ^..*=..* ]]; then
            # Add to named parameters array
            echo "ARGN+=('$_escaped');"
            # key is part before first =
            local _key=$(echo "$1" | cut -d = -f 1)
            # val is everything after key and = (protect from param==value error)
            local _val="${1/$_key=}"
            # remove dashes from key name
            _key=${_key//\-}
            # search for existing parameter name
            if (echo "$existing_named" | grep "\b$_key\b" >/dev/null); then
                # if name already exists then it's a multi-value named parameter
                # re-declare it as an array if needed
                if ! (declare -p _key 2> /dev/null | grep -q 'declare \-a'); then
                    echo "$_key=(\"\$$_key\");"
                fi
                # append new value
                echo "$_key+=('$_val');"
            else
                # single-value named parameter
                echo "local $_key=\"$_val\";"
                existing_named=" $_key"
            fi
        # If standalone named parameter
        elif [[ "$1" =~ ^\-. ]]; then
            # Add to options array
            echo "ARGO+=('$_escaped');"
            # remove dashes
            local _key=${1//\-}
            echo "local $_key=\"$_key\";"
        # non-named parameter
        else
            # Escape asterisk to prevent bash asterisk expansion
            _escaped=${1/\*/\'\"*\"\'}
            echo "ARGV+=('$_escaped');"
        fi
        shift
    done
}

#--------------------------- DEMO OF THE USAGE -------------------------------

show_use ()
{
    eval $(parse_params "$@")
    # --
    echo "${ARGV[0]}" # print first unnamed param
    echo "${ARGV[1]}" # print second unnamed param
    echo "${ARGN[0]}" # print first named param
    echo "${ARG0[0]}" # print first option param (--force)
    echo "$anyparam"  # print --anyparam value
    echo "$k"         # print k=5 value
    echo "${multivalue[0]}" # print first value of multi-value
    echo "${multivalue[1]}" # print second value of multi-value
    [[ "$force" == "force" ]] && echo "\$force is set so let the force be with you"
}

show_use "param 1" --anyparam="my value" param2 k=5 --force --multi-value=test1 --multi-value=test2

Oleksii Chekulaiev ,Sep 28, 2016 at 12:55

To use the demo to parse params that come into your bash script you just do show_use "$@" – Oleksii Chekulaiev Sep 28 '16 at 12:55

Oleksii Chekulaiev ,Sep 28, 2016 at 12:58

Basically I found out that github.com/renatosilva/easyoptions does the same in the same way but is a bit more massive than this function. – Oleksii Chekulaiev Sep 28 '16 at 12:58

galmok ,Jun 24, 2015 at 10:54

I'd like to offer my version of option parsing, that allows for the following:
-s p1
--stage p1
-w somefolder
--workfolder somefolder
-sw p1 somefolder
-e=hello

Also allows for this (could be unwanted):

-s--workfolder p1 somefolder
-se=hello p1
-swe=hello p1 somefolder

You have to decide before use if = is to be used on an option or not. This is to keep the code clean(ish).

while [[ $# > 0 ]]
do
    key="$1"
    while [[ ${key+x} ]]
    do
        case $key in
            -s*|--stage)
                STAGE="$2"
                shift # option has parameter
                ;;
            -w*|--workfolder)
                workfolder="$2"
                shift # option has parameter
                ;;
            -e=*)
                EXAMPLE="${key#*=}"
                break # option has been fully handled
                ;;
            *)
                # unknown option
                echo Unknown option: $key #1>&2
                exit 10 # either this: my preferred way to handle unknown options
                break # or this: do this to signal the option has been handled (if exit isn't used)
                ;;
        esac
        # prepare for next option in this key, if any
        [[ "$key" = -? || "$key" == --* ]] && unset key || key="${key/#-?/-}"
    done
    shift # option(s) fully processed, proceed to next input argument
done

Luca Davanzo ,Nov 14, 2016 at 17:56

what's the meaning for "+x" on ${key+x} ? – Luca Davanzo Nov 14 '16 at 17:56

galmok ,Nov 15, 2016 at 9:10

It is a test to see if 'key' is present or not. Further down I unset key and this breaks the inner while loop. – galmok Nov 15 '16 at 9:10

Mark Fox ,Apr 27, 2015 at 2:42

Mixing positional and flag-based arguments --param=arg (equals delimited)

Freely mixing flags between positional arguments:

./script.sh dumbo 127.0.0.1 --environment=production -q -d
./script.sh dumbo --environment=production 127.0.0.1 --quiet -d

can be accomplished with a fairly concise approach:

# process flags
pointer=1
while [[ $pointer -le $# ]]; do
   param=${!pointer}
   if [[ $param != "-"* ]]; then ((pointer++)) # not a parameter flag so advance pointer
   else
      case $param in
         # parameter-flags with arguments
         -e=*|--environment=*) environment="${param#*=}";;
                  --another=*) another="${param#*=}";;

         # binary flags
         -q|--quiet) quiet=true;;
                 -d) debug=true;;
      esac

      # splice out pointer frame from positional list
      [[ $pointer -gt 1 ]] \
         && set -- ${@:1:((pointer - 1))} ${@:((pointer + 1)):$#} \
         || set -- ${@:((pointer + 1)):$#};
   fi
done

# positional remain
node_name=$1
ip_address=$2
--param arg (space delimited)

It's usually clearer to not mix --flag=value and --flag value styles.

./script.sh dumbo 127.0.0.1 --environment production -q -d

This is a little dicey to read, but is still valid

./script.sh dumbo --environment production 127.0.0.1 --quiet -d

Source

# process flags
pointer=1
while [[ $pointer -le $# ]]; do
   if [[ ${!pointer} != "-"* ]]; then ((pointer++)) # not a parameter flag so advance pointer
   else
      param=${!pointer}
      ((pointer_plus = pointer + 1))
      slice_len=1

      case $param in
         # parameter-flags with arguments
         -e|--environment) environment=${!pointer_plus}; ((slice_len++));;
                --another) another=${!pointer_plus}; ((slice_len++));;

         # binary flags
         -q|--quiet) quiet=true;;
                 -d) debug=true;;
      esac

      # splice out pointer frame from positional list
      [[ $pointer -gt 1 ]] \
         && set -- ${@:1:((pointer - 1))} ${@:((pointer + $slice_len)):$#} \
         || set -- ${@:((pointer + $slice_len)):$#};
   fi
done

# positional remain
node_name=$1
ip_address=$2

schily ,Oct 19, 2015 at 13:59

Note that getopt(1) was a short-lived mistake from AT&T.

getopt was created in 1984 but already buried in 1986 because it was not really usable.

A proof for the fact that getopt is very outdated is that the getopt(1) man page still mentions "$*" instead of "$@" , that was added to the Bourne Shell in 1986 together with the getopts(1) shell builtin in order to deal with arguments with spaces inside.
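
A small demonstration of why that distinction matters for arguments containing spaces:

set -- "one arg" two
for a in "$*"; do printf '[%s]\n' "$a"; done   # one item:  [one arg two]
for a in "$@"; do printf '[%s]\n' "$a"; done   # two items: [one arg] [two]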

BTW: if you are interested in parsing long options in shell scripts, it may be of interest to know that the getopt(3) implementation from libc (Solaris) and ksh93 both added a uniform long option implementation that supports long options as aliases for short options. This causes ksh93 and the Bourne Shell to implement a uniform interface for long options via getopts .

An example for long options taken from the Bourne Shell man page:

getopts "f:(file)(input-file)o:(output-file)" OPTX "$@"

shows how long option aliases may be used in both Bourne Shell and ksh93.

See the man page of a recent Bourne Shell:

http://schillix.sourceforge.net/man/man1/bosh.1.html

and the man page for getopt(3) from OpenSolaris:

http://schillix.sourceforge.net/man/man3c/getopt.3c.html

and last, the getopt(1) man page to verify the outdated $*:

http://schillix.sourceforge.net/man/man1/getopt.1.html

Volodymyr M. Lisivka ,Jul 9, 2013 at 16:51

Use module "arguments" from bash-modules

Example:

#!/bin/bash
. import.sh log arguments

NAME="world"

parse_arguments "-n|--name)NAME;S" -- "$@" || {
  error "Cannot parse command line."
  exit 1
}

info "Hello, $NAME!"

Mike Q ,Jun 14, 2014 at 18:01

This also might be useful to know, you can set a value and if someone provides input, override the default with that value..

myscript.sh -f ./serverlist.txt or just ./myscript.sh (and it takes defaults)

    #!/bin/bash
    # --- set the value, if there is inputs, override the defaults.

    HOME_FOLDER="${HOME}/owned_id_checker"
    SERVER_FILE_LIST="${HOME_FOLDER}/server_list.txt"

    while [[ $# > 1 ]]
    do
    key="$1"
    shift

    case $key in
        -i|--inputlist)
        SERVER_FILE_LIST="$1"
        shift
        ;;
    esac
    done


    echo "SERVER LIST   = ${SERVER_FILE_LIST}"

phk ,Oct 17, 2015 at 21:17

Another solution without getopt[s], POSIX, old Unix style

Similar to the solution Bruno Bronosky posted this here is one without the usage of getopt(s) .

The main differentiating feature of my solution is that it allows options to be concatenated together, just like tar -xzf foo.tar.gz is equal to tar -x -z -f foo.tar.gz . And just like in tar , ps etc. the leading hyphen is optional for a block of short options (but this can be changed easily). Long options are supported as well (but when a block starts with one then two leading hyphens are required).

Code with example options
#!/bin/sh

echo
echo "POSIX-compliant getopt(s)-free old-style-supporting option parser from phk@[se.unix]"
echo

print_usage() {
  echo "Usage:

  $0 {a|b|c} [ARG...]

Options:

  --aaa-0-args
  -a
    Option without arguments.

  --bbb-1-args ARG
  -b ARG
    Option with one argument.

  --ccc-2-args ARG1 ARG2
  -c ARG1 ARG2
    Option with two arguments.

" >&2
}

if [ $# -le 0 ]; then
  print_usage
  exit 1
fi

opt=
while :; do

  if [ $# -le 0 ]; then

    # no parameters remaining -> end option parsing
    break

  elif [ ! "$opt" ]; then

    # we are at the beginning of a fresh block
    # remove optional leading hyphen and strip trailing whitespaces
    opt=$(echo "$1" | sed 's/^-\?\([a-zA-Z0-9\?-]*\)/\1/')

  fi

  # get the first character -> check whether long option
  first_chr=$(echo "$opt" | awk '{print substr($1, 1, 1)}')
  [ "$first_chr" = - ] && long_option=T || long_option=F

  # note to write the options here with a leading hyphen less
  # also do not forget to end short options with a star
  case $opt in

    -)

      # end of options
      shift
      break
      ;;

    a*|-aaa-0-args)

      echo "Option AAA activated!"
      ;;

    b*|-bbb-1-args)

      if [ "$2" ]; then
        echo "Option BBB with argument '$2' activated!"
        shift
      else
        echo "BBB parameters incomplete!" >&2
        print_usage
        exit 1
      fi
      ;;

    c*|-ccc-2-args)

      if [ "$2" ] && [ "$3" ]; then
        echo "Option CCC with arguments '$2' and '$3' activated!"
        shift 2
      else
        echo "CCC parameters incomplete!" >&2
        print_usage
        exit 1
      fi
      ;;

    h*|\?*|-help)

      print_usage
      exit 0
      ;;

    *)

      if [ "$long_option" = T ]; then
        opt=$(echo "$opt" | awk '{print substr($1, 2)}')
      else
        opt=$first_chr
      fi
      printf 'Error: Unknown option: "%s"\n' "$opt" >&2
      print_usage
      exit 1
      ;;

  esac

  if [ "$long_option" = T ]; then

    # if we had a long option then we are going to get a new block next
    shift
    opt=

  else

    # if we had a short option then just move to the next character
    opt=$(echo "$opt" | awk '{print substr($1, 2)}')

    # if block is now empty then shift to the next one
    [ "$opt" ] || shift

  fi

done

echo "Doing something..."

exit 0

For the example usage please see the examples further below.

Position of options with arguments

For what it's worth, options that take arguments don't have to come last in a block (only long options do). So while in tar (at least in some implementations) the f option needs to be last in the block because the file name follows it ( tar xzf bar.tar.gz works but tar xfz bar.tar.gz does not), that is not the case here (see the later examples).

Multiple options with arguments

As another bonus, the option arguments are consumed in the same order as the options that require them. Just look at the output of my script for the command line abc X Y Z (or -abc X Y Z ):

Option AAA activated!
Option BBB with argument 'X' activated!
Option CCC with arguments 'Y' and 'Z' activated!
Long options concatenated as well

You can also have long options inside an option block, provided they occur last in the block. So the following command lines are all equivalent (including the order in which the options and their arguments are processed):

All of these lead to:

Option CCC with arguments 'Z' and 'Y' activated!
Option BBB with argument 'X' activated!
Option AAA activated!
Doing something...
Not in this solution: Optional arguments

Options with optional arguments should be possible with a bit of work, e.g. by looking ahead to see whether the next block lacks a hyphen; the user would then need to put a hyphen in front of every block that follows a block containing an option with an optional argument. Maybe this is too complicated to communicate to the user, so it is better to simply require a leading hyphen in this case.

Things get even more complicated with multiple possible arguments. I would advise against making the options try to be smart by guessing whether an argument is meant for them (e.g. an option that takes a number as an optional argument), because this might break in the future.

I personally favor additional options instead of optional arguments.

Option arguments introduced with an equal sign

Just like with optional arguments, I am not a fan of this (BTW, is there a thread for discussing the pros and cons of different parameter styles?), but if you want it you could probably implement it yourself, just as done at http://mywiki.wooledge.org/BashFAQ/035#Manual_loop : add a --long-with-arg=?* case and then strip the equal sign. (That is, by the way, the site that says parameter concatenation is possible with some effort but was "left [...] as an exercise for the reader", which made me take them at their word, but I started from scratch.)
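
For the curious, a minimal sketch of that equal-sign style in a plain POSIX case loop (the option name --logfile is just an example, following the BashFAQ pattern mentioned above):

while [ $# -gt 0 ]; do
  case $1 in
    --logfile=?*)
      # strip everything up to and including the first '='
      logfile=${1#*=}
      ;;
    --logfile)
      echo "Error: --logfile requires a non-empty value (use --logfile=FILE)" >&2
      exit 1
      ;;
    --)
      # end of options
      shift
      break
      ;;
    *)
      break
      ;;
  esac
  shift
done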

Other notes

POSIX-compliant; it works even on the ancient Busybox setups I had to deal with (where, e.g., cut , head and getopts were missing).

Noah ,Aug 29, 2016 at 3:44

Solution that preserves unhandled arguments. Demos Included.

Here is my solution. It is very flexible and, unlike others, doesn't require external packages and handles leftover arguments cleanly.

Usage is: ./myscript -flag flagvariable -otherflag flagvar2

All you have to do is edit the validflags line. The script prepends a hyphen to each flag name and searches all arguments for it. It then assigns the following argument to that flag as its value, e.g.

./myscript -flag flagvariable -otherflag flagvar2
echo $flag $otherflag
flagvariable flagvar2

The main code (short version, verbose with examples further down, also a version with erroring out):

#!/usr/bin/env bash
#shebang.io
validflags="rate time number"
count=1
for arg in $@
do
    match=0
    argval=$1
    for flag in $validflags
    do
        sflag="-"$flag
        if [ "$argval" == "$sflag" ]
        then
            declare $flag=$2
            match=1
        fi
    done
        if [ "$match" == "1" ]
    then
        shift 2
    else
        leftovers=$(echo $leftovers $argval)
        shift
    fi
    count=$(($count+1))
done
#Cleanup then restore the leftovers
shift $#
set -- $leftovers

The verbose version with built in echo demos:

#!/usr/bin/env bash
#shebang.io
rate=30
time=30
number=30
echo "all args
$@"
validflags="rate time number"
count=1
for arg in $@
do
    match=0
    argval=$1
#   argval=$(echo $@ | cut -d ' ' -f$count)
    for flag in $validflags
    do
            sflag="-"$flag
        if [ "$argval" == "$sflag" ]
        then
            declare $flag=$2
            match=1
        fi
    done
        if [ "$match" == "1" ]
    then
        shift 2
    else
        leftovers=$(echo $leftovers $argval)
        shift
    fi
    count=$(($count+1))
done

#Cleanup then restore the leftovers
echo "pre final clear args:
$@"
shift $#
echo "post final clear args:
$@"
set -- $leftovers
echo "all post set args:
$@"
echo arg1: $1 arg2: $2

echo leftovers: $leftovers
echo rate $rate time $time number $number

The final one errors out if an invalid -argument is passed in.

#!/usr/bin/env bash
#shebang.io
rate=30
time=30
number=30
validflags="rate time number"
count=1
for arg in $@
do
    argval=$1
    match=0
        if [ "${argval:0:1}" == "-" ]
    then
        for flag in $validflags
        do
                sflag="-"$flag
            if [ "$argval" == "$sflag" ]
            then
                declare $flag=$2
                match=1
            fi
        done
        if [ "$match" == "0" ]
        then
            echo "Bad argument: $argval"
            exit 1
        fi
        shift 2
    else
        leftovers=$(echo $leftovers $argval)
        shift
    fi
    count=$(($count+1))
done
#Cleanup then restore the leftovers
shift $#
set -- $leftovers
echo rate $rate time $time number $number
echo leftovers: $leftovers

Pros: It handles what it does very well. It preserves unused arguments, which a lot of the other solutions here don't. It also allows variables to be set without being declared by hand in the script, and allows variables to be prepopulated with defaults if no corresponding argument is given (see the verbose example).

Cons: It can't parse a combined short-option string; e.g. -xcvf would be processed as a single argument. You could fairly easily add code to mine that handles this, though (as sketched below).
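
A sketch of one way to add that (my own addition, not part of the original answer): expand any combined short block into separate single-letter flags before the main loop runs.

# Hypothetical pre-pass: turn -xcvf into -x -c -v -f before parsing.
expanded=()
for arg in "$@"; do
    if [[ $arg =~ ^-[a-zA-Z]{2,}$ ]]; then
        # split a combined short block into individual flags
        for ((i = 1; i < ${#arg}; i++)); do
            expanded+=("-${arg:i:1}")
        done
    else
        expanded+=("$arg")
    fi
done
set -- "${expanded[@]}"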

Daniel Bigham ,Aug 8, 2016 at 12:42

The top answer to this question seemed a bit buggy when I tried it -- here's my solution which I've found to be more robust:
boolean_arg=""
arg_with_value=""

while [[ $# -gt 0 ]]
do
key="$1"
case $key in
    -b|--boolean-arg)
    boolean_arg=true
    shift
    ;;
    -a|--arg-with-value)
    arg_with_value="$2"
    shift
    shift
    ;;
    -*)
    echo "Unknown option: $1"
    exit 1
    ;;
    *)
    arg_num=$(( $arg_num + 1 ))
    case $arg_num in
        1)
        first_normal_arg="$1"
        shift
        ;;
        2)
        second_normal_arg="$1"
        shift
        ;;
        *)
        bad_args=TRUE
    esac
    ;;
esac
done

# Handy to have this here when adding arguments to
# see if they're working. Just edit the '0' to be '1'.
if [[ 0 == 1 ]]; then
    echo "first_normal_arg: $first_normal_arg"
    echo "second_normal_arg: $second_normal_arg"
    echo "boolean_arg: $boolean_arg"
    echo "arg_with_value: $arg_with_value"
    exit 0
fi

if [[ $bad_args == TRUE || $arg_num -lt 2 ]]; then
    echo "Usage: $(basename "$0") <first-normal-arg> <second-normal-arg> [--boolean-arg] [--arg-with-value VALUE]"
    exit 1
fi

phyatt ,Sep 7, 2016 at 18:25

This example shows how to use getopt, eval, a HEREDOC and shift to handle short and long parameters, with and without required values that follow. The case statement is also concise and easy to follow.
#!/usr/bin/env bash

# usage function
function usage()
{
   cat << HEREDOC

   Usage: $progname [--num NUM] [--time TIME_STR] [--verbose] [--dry-run]

   optional arguments:
     -h, --help           show this help message and exit
     -n, --num NUM        pass in a number
     -t, --time TIME_STR  pass in a time string
     -v, --verbose        increase the verbosity of the bash script
     --dry-run            do a dry run, don't change any files

HEREDOC
}  

# initialize variables
progname=$(basename $0)
verbose=0
dryrun=0
num_str=
time_str=

# use getopt and store the output into $OPTS
# note the use of -o for the short options, --long for the long name options
# and a : for any option that takes a parameter
OPTS=$(getopt -o "hn:t:v" --long "help,num:,time:,verbose,dry-run" -n "$progname" -- "$@")
if [ $? != 0 ] ; then echo "Error in command line arguments." >&2 ; usage; exit 1 ; fi
eval set -- "$OPTS"

while true; do
  # uncomment the next line to see how shift is working
  # echo "\$1:\"$1\" \$2:\"$2\""
  case "$1" in
    -h | --help ) usage; exit; ;;
    -n | --num ) num_str="$2"; shift 2 ;;
    -t | --time ) time_str="$2"; shift 2 ;;
    --dry-run ) dryrun=1; shift ;;
    -v | --verbose ) verbose=$((verbose + 1)); shift ;;
    -- ) shift; break ;;
    * ) break ;;
  esac
done

if (( $verbose > 0 )); then

   # print out all the parameters we read in
   cat <<-EOM
   num=$num_str
   time=$time_str
   verbose=$verbose
   dryrun=$dryrun
EOM
fi

# The rest of your script below

The most significant lines of the script above are these:

OPTS=$(getopt -o "hn:t:v" --long "help,num:,time:,verbose,dry-run" -n "$progname" -- "$@")
if [ $? != 0 ] ; then echo "Error in command line arguments." >&2 ; exit 1 ; fi
eval set -- "$OPTS"

while true; do
  case "$1" in
    -h | --help ) usage; exit; ;;
    -n | --num ) num_str="$2"; shift 2 ;;
    -t | --time ) time_str="$2"; shift 2 ;;
    --dry-run ) dryrun=1; shift ;;
    -v | --verbose ) verbose=$((verbose + 1)); shift ;;
    -- ) shift; break ;;
    * ) break ;;
  esac
done

Short, to the point, readable, and handles just about everything (IMHO).

Hope that helps someone.
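
For reference, a hypothetical invocation of the script above (the argument values are made up):

./myscript.sh --num 10 -t "2018-07-06 12:00" -v --dry-run
# with -v set, the script prints the parsed values, roughly:
#    num=10
#    time=2018-07-06 12:00
#    verbose=1
#    dryrun=1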

Emeric Verschuur ,Feb 20, 2017 at 21:30

I have written a bash helper for writing nice bash tools

project home: https://gitlab.mbedsys.org/mbedsys/bashopts

example:

#!/bin/bash -ei

# load the library
. bashopts.sh

# Enable backtrace display on error
trap 'bashopts_exit_handle' ERR

# Initialize the library
bashopts_setup -n "$0" -d "This is myapp tool description displayed on help message" -s "$HOME/.config/myapprc"

# Declare the options
bashopts_declare -n first_name -l first -o f -d "First name" -t string -i -s -r
bashopts_declare -n last_name -l last -o l -d "Last name" -t string -i -s -r
bashopts_declare -n display_name -l display-name -t string -d "Display name" -e "\$first_name \$last_name"
bashopts_declare -n age -l number -d "Age" -t number
bashopts_declare -n email_list -t string -m add -l email -d "Email address"

# Parse arguments
bashopts_parse_args "$@"

# Process argument
bashopts_process_args

will give help:

NAME:
    ./example.sh - This is myapp tool description displayed on help message

USAGE:
    [options and commands] [-- [extra args]]

OPTIONS:
    -h,--help                          Display this help
    -n,--non-interactive true          Non interactive mode - [$bashopts_non_interactive] (type:boolean, default:false)
    -f,--first "John"                  First name - [$first_name] (type:string, default:"")
    -l,--last "Smith"                  Last name - [$last_name] (type:string, default:"")
    --display-name "John Smith"        Display name - [$display_name] (type:string, default:"$first_name $last_name")
    --number 0                         Age - [$age] (type:number, default:0)
    --email                            Email address - [$email_list] (type:string, default:"")

enjoy :)

Josh Wulf ,Jun 24, 2017 at 18:07

I get this on Mac OS X: lib/bashopts.sh: line 138: declare: -A: invalid option ; declare: usage: declare [-afFirtx] [-p] [name[=value] ...] ; Error in lib/bashopts.sh:138. 'declare -x -A bashopts_optprop_name' exited with status 2 ; Call tree: 1: lib/controller.sh:4 source(...) ; Exiting with status 1 – Josh Wulf Jun 24 '17 at 18:07

Josh Wulf ,Jun 24, 2017 at 18:17

You need Bash version 4 to use this. On Mac, the default version is 3. You can use Homebrew to install bash 4. – Josh Wulf Jun 24 '17 at 18:17

a_z ,Mar 15, 2017 at 13:24

Here is my approach - using regexp.

script:

#!/usr/bin/env sh

help_menu() {
  echo "Usage:

  ${0##*/} [-h][-l FILENAME][-d]

Options:

  -h, --help
    display this help and exit

  -l, --logfile=FILENAME
    filename

  -d, --debug
    enable debug
  "
}

parse_options() {
  case $opt in
    h|help)
      help_menu
      exit
     ;;
    l|logfile)
      logfile=${attr}
      ;;
    d|debug)
      debug=true
      ;;
    *)
      echo "Unknown option: ${opt}\nRun ${0##*/} -h for help.">&2
      exit 1
  esac
}
options=$@

until [ "$options" = "" ]; do
  if [[ $options =~ (^ *(--([a-zA-Z0-9-]+)|-([a-zA-Z0-9-]+))(( |=)(([\_\.\?\/\\a-zA-Z0-9]?[ -]?[\_\.\?a-zA-Z0-9]+)+))?(.*)|(.+)) ]]; then
    if [[ ${BASH_REMATCH[3]} ]]; then # for --option[=attribute] or --option [attribute]
      opt=${BASH_REMATCH[3]}
      attr=${BASH_REMATCH[7]}
      options=${BASH_REMATCH[9]}
    elif [[ ${BASH_REMATCH[4]} ]]; then # for block options -qwert[=][attribute] or single short option -a[=][attribute]
      pile=${BASH_REMATCH[4]}
      while (( ${#pile} > 1 )); do
        opt=${pile:0:1}
        attr=""
        pile=${pile/${pile:0:1}/}
        parse_options
      done
      opt=$pile
      attr=${BASH_REMATCH[7]}
      options=${BASH_REMATCH[9]}
    else # leftovers that don't match
      opt=${BASH_REMATCH[10]}
      options=""
    fi
    parse_options
  fi
done

mauron85 ,Jun 21, 2017 at 6:03

Like this one. Maybe just add -e param to echo with new line. – mauron85 Jun 21 '17 at 6:03

John ,Oct 10, 2017 at 22:49

Assume we create a shell script named test_args.sh as follow
#!/bin/sh
until [ $# -eq 0 ]
do
  name=${1:1}; shift;
  if [[ -z "$1" || $1 == -* ]] ; then eval "export $name=true"; else eval "export $name=$1"; shift; fi  
done
echo "year=$year month=$month day=$day flag=$flag"

After we run the following command:

sh test_args.sh  -year 2017 -flag  -month 12 -day 22

The output would be:

year=2017 month=12 day=22 flag=true

Will Barnwell ,Oct 10, 2017 at 23:57

This takes the same approach as Noah's answer, but has fewer safety checks/safeguards. It allows us to write arbitrary arguments into the script's environment, and I'm pretty sure your use of eval here may allow command injection. – Will Barnwell Oct 10 '17 at 23:57
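
If you like the terseness of the approach but want to avoid eval, a sketch of a safer variation (my own, not from the thread) is to whitelist the variable name and let export do the assignment directly, so the value is never re-parsed by the shell:

#!/bin/bash
while [ $# -gt 0 ]; do
  name=${1#-}; shift
  # refuse anything that is not a plain identifier, e.g. "-x;rm -rf /"
  if ! [[ $name =~ ^[a-zA-Z_][a-zA-Z0-9_]*$ ]]; then
    echo "Illegal option name: -$name" >&2
    exit 1
  fi
  if [ -z "$1" ] || [[ $1 == -* ]]; then
    export "$name=true"
  else
    export "$name=$1"
    shift
  fi
done
echo "year=$year month=$month day=$day flag=$flag"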

Masadow ,Oct 6, 2015 at 8:53

Here is my improved version of Bruno Bronosky's answer, using bash arrays.

It lets you mix parameter positions and gives you a parameter array that preserves the order, with the options removed.

#!/bin/bash

echo $@

PARAMS=()
SOFT=0
SKIP=()
for i in "$@"
do
case $i in
    -n=*|--skip=*)
    SKIP+=("${i#*=}")
    ;;
    -s|--soft)
    SOFT=1
    ;;
    *)
        # unknown option
        PARAMS+=("$i")
    ;;
esac
done
echo "SKIP            = ${SKIP[@]}"
echo "SOFT            = $SOFT"
    echo "Parameters:"
    echo ${PARAMS[@]}

Will output for example:

$ ./test.sh parameter -s somefile --skip=.c --skip=.obj
parameter -s somefile --skip=.c --skip=.obj
SKIP            = .c .obj
SOFT            = 1
Parameters:
parameter somefile

Jason S ,Dec 3, 2017 at 1:01

You use shift on the known arguments and not on the unknown ones so your remaining $@ will be all but the first two arguments (in the order they are passed in), which could lead to some mistakes if you try to use $@ later. You don't need the shift for the = parameters, since you're not handling spaces and you're getting the value with the substring removal #*=Jason S Dec 3 '17 at 1:01

Masadow ,Dec 5, 2017 at 9:17

You're right, in fact, since I build a PARAMS variable, I don't need to use shift at all – Masadow Dec 5 '17 at 9:17

[Jul 04, 2018] file permissions - Bash test if a directory is writable by a given UID - Stack Overflow

Jul 04, 2018 | stackoverflow.com

> ,

You can use sudo to execute the test in your script. For instance:
sudo -u mysql -H sh -c "if [ -w $directory ] ; then echo 'Eureka' ; fi"

To do this, the user executing the script will need sudo privileges of course.

If you explicitly need the uid instead of the username, you can also use:

sudo -u \#42 -H sh -c "if [ -w $directory ] ; then echo 'Eureka' ; fi"

In this case, 42 is the uid of the mysql user. Substitute your own value if needed.

UPDATE (to support non-sudo-privileged users)
Getting a bash script to change users without sudo requires the ability to setuid ("set user id"). This, as pointed out by this answer , is a security restriction that requires a hack to work around. Check this blog for an example of how to work around it (I haven't tested/tried it, so I can't confirm its success).

My recommendation, if possible, would be to write a small program in C that is given setuid permission (try chmod 4755 file-name ). Then you can call setuid() from the C program to set the current user id and either continue execution in the C application, or have it execute a separate bash script that runs whatever commands you need. This is also a pretty hacky method, but as far as non-sudo alternatives go it's probably one of the easiest (in my opinion).

,

I've written a function can_user_write_to_file which will return 1 if the user passed to it either is the owner of the file/directory, or is member of a group which has write access to that file/directory. If not, the method returns 0 .
## Method which returns 1 if the user can write to the file or
## directory.
##
## $1 :: user name
## $2 :: file
function can_user_write_to_file() {
  if [[ $# -lt 2 || ! -r $2 ]]; then
    echo 0
    return
  fi

  local user_id=$(id -u ${1} 2>/dev/null)
  local file_owner_id=$(stat -c "%u" $2)
  if [[ ${user_id} == ${file_owner_id} ]]; then
    echo 1
    return
  fi

  local file_access=$(stat -c "%a" $2)
  local file_group_access=${file_access:1:1}
  local file_group_name=$(stat -c "%G" $2)
  local user_group_list=$(groups $1 2>/dev/null)

  if [ ${file_group_access} -ge 6 ]; then
    for el in ${user_group_list-nop}; do
      if [[ "${el}" == ${file_group_name} ]]; then
        echo 1
        return
      fi
    done
  fi

  echo 0
}

To test it, I wrote a wee test function:

function test_can_user_write_to_file() {
  echo "The file is: $(ls -l $2)"
  echo "User is:" $(groups $1 2>/dev/null)
  echo "User" $1 "can write to" $2 ":" $(can_user_write_to_file $1 $2)
  echo ""
}

test_can_user_write_to_file root /etc/fstab
test_can_user_write_to_file invaliduser /etc/motd
test_can_user_write_to_file torstein /home/torstein/.xsession
test_can_user_write_to_file torstein /tmp/file-with-only-group-write-access

At least from these tests, the method works as intended considering file ownership and group write access :-)

,

Because I had to make some changes to @chepner's answer in order to get it to work, I'm posting my ad-hoc script here for easy copy & paste. It's a minor refactoring only, and I have upvoted chepner's answer. I'll delete mine if the accepted answer is updated with these fixes. I have already left comments on that answer pointing out the things I had trouble with.

I wanted to do away with the Bashisms so that's why I'm not using arrays at all. The (( arithmetic evaluation )) is still a Bash-only feature, so I'm stuck on Bash after all.

for f; do
    set -- $(stat -Lc "0%a %G %U" "$f")
    (("$1" & 0002)) && continue
    if (("$1" & 0020)); then
        case " "$(groups "$USER")" " in *" "$2" "*) continue ;; esac
    elif (("$1" & 0200)); then
        [ "$3" = "$USER" ] && continue
    fi
    echo "$0: Wrong permissions" "$@" "$f" >&2
done

Without the comments, this is even fairly compact.
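
Assuming the loop above is saved as, say, check_writable.sh (the name is mine), usage would look like:

bash check_writable.sh /etc/passwd /var/log/syslog "$HOME"/*
# prints a "Wrong permissions" line for every argument the current
# user ($USER) cannot write to, and prints nothing for writable ones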

[Jul 02, 2018] How can I detect whether a symlink is broken in Bash - Stack Overflow

Jul 02, 2018 | stackoverflow.com

How can I detect whether a symlink is broken in Bash?


zoltanctoth ,Nov 8, 2011 at 10:39

I run find and iterate through the results with [ \( -L $F \) ] to collect certain symbolic links.

I am wondering if there is an easy way to determine if the link is broken (points to a non-existent file) in this scenario.

Here is my code:

FILES=`find /target/ | grep -v '\.disabled$' | sort`

for F in $FILES; do
    if [ -L $F ]; then
        DO THINGS
    fi
done

Roger ,Nov 8, 2011 at 10:45

# test if file exists (test actual file, not symbolic link)
if [ ! -e "$F" ] ; then
    # code if the symlink is broken
fi

Calimo ,Apr 18, 2017 at 19:50

Note that the code will also be executed if the file does not exist at all. It is fine with find but in other scenarios (such as globs) should be combined with -h to handle this case, for instance [ -h "$F" -a ! -e "$F" ] . – Calimo Apr 18 '17 at 19:50

Sridhar-Sarnobat ,Jul 13, 2017 at 22:36

You're not really testing the symbolic link with this approach. – Sridhar-Sarnobat Jul 13 '17 at 22:36

Melab ,Jul 24, 2017 at 15:22

@Calimo There is no difference. – Melab Jul 24 '17 at 15:22

Shawn Chin ,Nov 8, 2011 at 10:51

This should print out links that are broken:
find /target/dir -type l ! -exec test -e {} \; -print

You can also chain in operations to find command, e.g. deleting the broken link:

find /target/dir -type l ! -exec test -e {} \; -exec rm {} \;

Andrew Schulman ,Nov 8, 2011 at 10:43

readlink -q will fail silently if the link is bad:
for F in $FILES; do
    if [ -L $F ]; then
        if readlink -q $F >/dev/null ; then
            DO THINGS
        else
            echo "$F: bad link" >/dev/stderr
        fi
    fi
done

zoltanctoth ,Nov 8, 2011 at 10:55

this seems pretty nice as this only returns true if the file is actually a symlink. But even with adding -q, readlink outputs the name of the link on linux. If this is the case in general maybe the answer should be updated with 'readlink -q $F > /dev/null'. Or am I missing something? – zoltanctoth Nov 8 '11 at 10:55

Andrew Schulman ,Nov 8, 2011 at 11:02

No, you're right. Corrected, thanks. – Andrew Schulman Nov 8 '11 at 11:02

Chaim Geretz ,Mar 31, 2015 at 21:09

Which version? I don't see this behavior on my system readlink --version readlink (coreutils) 5.2.1 – Chaim Geretz Mar 31 '15 at 21:09

Aquarius Power ,May 4, 2014 at 23:46

This will work whether the symlink was pointing to a file or a directory, but is now broken:
if [[ -L "$strFile" ]] && [[ ! -a "$strFile" ]];then 
  echo "'$strFile' is a broken symlink"; 
fi

ACyclic ,May 24, 2014 at 13:02

This finds all files of type "link" that also resolve to type "link", i.e. broken symlinks:
find /target -type l -xtype l

cdelacroix ,Jun 23, 2015 at 12:59

variant: find -L /target -type lcdelacroix Jun 23 '15 at 12:59

Sridhar-Sarnobat ,Jul 13, 2017 at 22:38

Can't you have a symlink to a symlink that isn't broken?' – Sridhar-Sarnobat Jul 13 '17 at 22:38

,

If you don't mind traversing non-broken dir symlinks, to find all orphaned links:
$ find -L /target -type l | while read -r file; do echo $file is orphaned; done

To find all files that are not orphaned links:

$ find -L /target ! -type l

[Jul 02, 2018] command line - How can I find broken symlinks

Mar 15, 2012 | unix.stackexchange.com

gabe, Mar 15, 2012 at 16:29

Is there a way to find all symbolic links that don't point anywhere?

find ./ -type l

will give me all symbolic links, but makes no distinction between links that go somewhere and links that don't.

I'm currently doing:

find ./ -type l -exec file {} \; |grep broken

But I'm wondering what alternate solutions exist.

rozcietrzewiacz ,May 15, 2012 at 7:01

I'd strongly suggest not using find -L for this task (see below for the explanation). There are other ways to do this, for example the -xtype l approach discussed in the other answers below.

The find -L trick quoted by solo from commandlinefu looks nice and hacky, but it has one very dangerous pitfall : All the symlinks are followed. Consider directory with the contents presented below:

$ ls -l
total 0
lrwxrwxrwx 1 michal users  6 May 15 08:12 link_1 -> nonexistent1
lrwxrwxrwx 1 michal users  6 May 15 08:13 link_2 -> nonexistent2
lrwxrwxrwx 1 michal users  6 May 15 08:13 link_3 -> nonexistent3
lrwxrwxrwx 1 michal users  6 May 15 08:13 link_4 -> nonexistent4
lrwxrwxrwx 1 michal users 11 May 15 08:20 link_out -> /usr/share/

If you run find -L . -type l in that directory, all /usr/share/ would be searched as well (and that can take really long) 1 . For a find command that is "immune to outgoing links", don't use -L .


1 This may look like a minor inconvenience (the command will "just" take long to traverse all /usr/share ) – but can have more severe consequences. For instance, consider chroot environments: They can exist in some subdirectory of the main filesystem and contain symlinks to absolute locations. Those links could seem to be broken for the "outside" system, because they only point to proper places once you've entered the chroot. I also recall that some bootloader used symlinks under /boot that only made sense in an initial boot phase, when the boot partition was mounted as / .

So if you use a find -L command to find and then delete broken symlinks from some harmless-looking directory, you might even break your system...

quornian ,Nov 17, 2012 at 21:56

I think -type l is redundant since -xtype l will operate as -type l on non-links. So find -xtype l is probably all you need. Thanks for this approach. – quornian Nov 17 '12 at 21:56

qwertzguy ,Jan 8, 2015 at 21:37

Be aware that those solutions don't work for all filesystem types. For example it won't work for checking if /proc/XXX/exe link is broken. For this, use test -e "$(readlink /proc/XXX/exe)" . – qwertzguy Jan 8 '15 at 21:37

weakish ,Apr 8, 2016 at 4:57

@Flimm find . -xtype l means "find all symlinks whose (ultimate) target files are symlinks". But the ultimate target of a symlink cannot be a symlink, otherwise we can still follow the link and it is not the ultimate target. Since there are no such symlinks, we can define them as something else, i.e. broken symlinks. – weakish Apr 8 '16 at 4:57

weakish ,Apr 22, 2016 at 12:19

@JoóÁdám "which can only be a symbolic link in case it is broken". Give "broken symbolic link" or "non exist file" an individual type, instead of overloading l , is less confusing to me. – weakish Apr 22 '16 at 12:19

Alois Mahdal ,Jul 15, 2016 at 0:22

The warning at the end is useful, but note that this does not apply to the -L hack but rather to (blindly) removing broken symlinks in general. – Alois Mahdal Jul 15 '16 at 0:22

Sam Morris ,Mar 15, 2012 at 17:38

The symlinks command from http://www.ibiblio.org/pub/Linux/utils/file/symlinks-1.4.tar.gz can be used to identify symlinks with a variety of characteristics. For instance:
$ rm a
$ ln -s a b
$ symlinks .
dangling: /tmp/b -> a

qed ,Jul 27, 2014 at 20:32

Is this tool available for osx? – qed Jul 27 '14 at 20:32

qed ,Jul 27, 2014 at 20:51

Never mind, got it compiled. – qed Jul 27 '14 at 20:51

Daniel Jonsson ,Apr 11, 2015 at 22:11

Apparently symlinks is pre-installed on Fedora. – Daniel Jonsson Apr 11 '15 at 22:11

pooryorick ,Sep 29, 2012 at 14:02

As rozcietrzewiacz has already commented, find -L can have unexpected consequence of expanding the search into symlinked directories, so isn't the optimal approach. What no one has mentioned yet is that
find /path/to/search -xtype l

is the more concise, and logically identical command to

find /path/to/search -type l -xtype l

None of the solutions presented so far will detect cyclic symlinks, which is another type of breakage. This question addresses portability. To summarize, the portable way to find broken symbolic links, including cyclic links, is:

find /path/to/search -type l -exec test ! -e {} \; -print

For more details, see this question or ynform.org . Of course, the definitive source for all this is the findutils documentation .

Flimm ,Oct 7, 2014 at 13:00

Short, consice, and addresses the find -L pitfall as well as cyclical links. +1 – Flimm Oct 7 '14 at 13:00

neu242 ,Aug 1, 2016 at 10:03

Nice. The last one works on MacOSX as well, while @rozcietrzewiacz's answer didn't. – neu242 Aug 1 '16 at 10:03

kwarrick ,Mar 15, 2012 at 16:52

I believe adding the -L flag to your command will allow you do get rid of the grep:
$ find -L . -type l

http://www.commandlinefu.com/commands/view/8260/find-broken-symlinks

from the man:

 -L      Cause the file information and file type (see stat(2)) returned 
         for each symbolic link to be those of the file referenced by the
         link, not the link itself. If the referenced file does not exist,
         the file information and type will be for the link itself.

rozcietrzewiacz ,May 15, 2012 at 7:37

At first I've upvoted this, but then I've realised how dangerous it may be. Before you use it, please have a look at my answer ! – rozcietrzewiacz May 15 '12 at 7:37

andy ,Dec 26, 2012 at 6:56

If you need a different behavior whether the link is broken or cyclic you can also use %Y with find:
$ touch a
$ ln -s a b  # link to existing target
$ ln -s c d  # link to non-existing target
$ ln -s e e  # link to itself
$ find . -type l -exec test ! -e {} \; -printf '%Y %p\n' \
   | while read type link; do
         case "$type" in
         N) echo "do something with broken link $link" ;;
         L) echo "do something with cyclic link $link" ;;
         esac
      done
do something with broken link ./d
do something with cyclic link ./e

This example is copied from this post (site deleted) .

Reference

syntaxerror ,Jun 25, 2015 at 0:28

Yet another shorthand for those whose find command does not support xtype can be derived from this: find . -type l -printf "%Y %p\n" | grep -w '^N' . As andy beat me to it with the same (basic) idea in his script, I was reluctant to write it as a separate answer. :) – syntaxerror Jun 25 '15 at 0:28

Alex ,Apr 30, 2013 at 6:37

find -L . -type l | xargs symlinks will tell you, for each file found, whether the link target exists or not.

conradkdotcom ,Oct 24, 2014 at 14:33

This will print out the names of broken symlinks in the current directory.
for l in $(find . -type l); do cd $(dirname $l); if [ ! -e "$(readlink $(basename $l))" ]; then echo $l; fi; cd - > /dev/null; done

Works in Bash. Don't know about other shells.
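
Note that the for loop word-splits the output of find, so it can misbehave with file names containing spaces. A whitespace-safe sketch of the same idea using null-delimited output (my addition, not from the original answer):

find . -type l -print0 |
while IFS= read -r -d '' link; do
    # -e follows the link; if the target is missing, report the link
    if [ ! -e "$link" ]; then
        printf '%s\n' "$link"
    fi
done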

Iskren ,Aug 8, 2015 at 14:01

I use this for my case and it works quite well, as I know the directory to look for broken symlinks:
find -L $path -maxdepth 1 -type l

and my folder does include a link to /usr/share but it doesn't traverse it. Cross-device links and those that are valid for chroots, etc. are still a pitfall but for my use case it's sufficient.

,

Simple no-brainer answer, which is a variation on OP's version. Sometimes, you just want something easy to type or remember:
find . | xargs file | grep -i "broken symbolic link"

[Jul 02, 2018] Explanation of % directives in find -printf

Jul 02, 2018 | unix.stackexchange.com

san1512 ,Jul 11, 2015 at 6:24

find /tmp -printf '%s %p\n' |sort -n -r | head

This command works fine, but what are the %s and %p directives used here? Are there any others that can be used?

Cyrus ,Jul 11, 2015 at 6:41

Take a look at find's manpage. – Cyrus Jul 11 '15 at 6:41

phuclv ,Oct 9, 2017 at 3:13

possible duplicate of Where to find printf formatting reference?phuclv Oct 9 '17 at 3:13

Hennes ,Jul 11, 2015 at 6:34

What are the %s %p options used here?

From the man page :

%s File's size in bytes.

%p File's name.

Scroll down on that page beyond all the regular letters for printf and read the parts which come prefixed with a %.

%n Number of hard links to file.

%p File's name.

%P File's name with the name of the starting-point under which it was found removed.

%s File's size in bytes.

%t File's last modification time in the format returned by the C `ctime' function.

Are there any other options that can be used?

There are. See the link to the manpage.

Kusalananda ,Nov 17, 2017 at 9:53

@don_crissti I'll never understand why people prefer random web documentation to the documentation installed on their systems (which has the added benefit of actually being relevant to their system). – Kusalananda Nov 17 '17 at 9:53

don_crissti ,Nov 17, 2017 at 12:52

@Kusalananda - Well, I can think of one scenario in which people would include a link to a web page instead of a quote from the documentation installed on their system: they're not on a linux machine at the time of writing the post... However, the link should point (imo) to the official docs (hence my comment above, which, for some unknown reason, was deleted by the mods...). That aside, I fully agree with you: the OP should consult the manual page installed on their system. – don_crissti Nov 17 '17 at 12:52

runlevel0 ,Feb 15 at 12:10

@don_crissti Or they are on a server that has no manpages installed which is rather frequent. – runlevel0 Feb 15 at 12:10

Hennes ,Feb 16 at 16:16

My manual page tend to be from FreeBSD though. Unless I happen to have a Linux VM within reach. And I have the impression that most questions are GNU/Linux based. – Hennes Feb 16 at 16:16

[Jun 23, 2018] Bash script processing limited number of commands in parallel

Jun 23, 2018 | stackoverflow.com

AL-Kateb ,Oct 23, 2013 at 13:33

I have a bash script that looks like this:
#!/bin/bash
wget LINK1 >/dev/null 2>&1
wget LINK2 >/dev/null 2>&1
wget LINK3 >/dev/null 2>&1
wget LINK4 >/dev/null 2>&1
# ..
# ..
wget LINK4000 >/dev/null 2>&1

But processing each line until the command is finished and then moving to the next one is very time consuming. I want to process, for instance, 20 lines at once, and when they're finished, process the next 20 lines.

I thought of wget LINK1 >/dev/null 2>&1 & to send the command to the background and carry on, but there are 4000 lines here, which means I would have performance issues, not to mention being limited in how many processes I should start at the same time, so this is not a good idea.

One solution that I'm thinking of right now is checking whether one of the commands is still running or not, for instance after 20 lines I can add this loop:

while [  $(ps -ef | grep KEYWORD | grep -v grep | wc -l) -gt 0 ]; do
sleep 1
done

Of course in this case I will need to append & to the end of the line! But I feel this is not the right way to do it.

So how do I actually group each 20 lines together and wait for them to finish before going to the next 20 lines? This script is dynamically generated, so I can do whatever math I want on it while it's being generated. It does NOT have to use wget; that was just an example, so any wget-specific solution is not going to do me any good.

kojiro ,Oct 23, 2013 at 13:46

wait is the right answer here, but your while [ $(ps would be much better written while pkill -0 $KEYWORD – using proctools that is, for legitimate reasons to check if a process with a specific name is still running. – kojiro Oct 23 '13 at 13:46

VasyaNovikov ,Jan 11 at 19:01

I think this question should be re-opened. The "possible duplicate" QA is all about running a finite number of programs in parallel. Like 2-3 commands. This question, however, is focused on running commands in e.g. a loop. (see "but there are 4000 lines"). – VasyaNovikov Jan 11 at 19:01

robinCTS ,Jan 11 at 23:08

@VasyaNovikov Have you read all the answers to both this question and the duplicate? Every single answer to this question here, can also be found in the answers to the duplicate question. That is precisely the definition of a duplicate question. It makes absolutely no difference whether or not you are running the commands in a loop. – robinCTS Jan 11 at 23:08

VasyaNovikov ,Jan 12 at 4:09

@robinCTS there are intersections, but questions themselves are different. Also, 6 of the most popular answers on the linked QA deal with 2 processes only. – VasyaNovikov Jan 12 at 4:09

Dan Nissenbaum ,Apr 20 at 15:35

I recommend reopening this question because its answer is clearer, cleaner, better, and much more highly upvoted than the answer at the linked question, though it is three years more recent. – Dan Nissenbaum Apr 20 at 15:35

devnull ,Oct 23, 2013 at 13:35

Use the wait built-in:
process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait

For the above example, 4 processes process1 .. process4 would be started in the background, and the shell would wait until those are completed before starting the next set.

From the manual :

wait [jobspec or pid ...]

Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.

kojiro ,Oct 23, 2013 at 13:48

So basically i=0; waitevery=4; for link in "${links[@]}"; do wget "$link" & (( i++%waitevery==0 )) && wait; done >/dev/null 2>&1kojiro Oct 23 '13 at 13:48

rsaw ,Jul 18, 2014 at 17:26

Unless you're sure that each process will finish at the exact same time, this is a bad idea. You need to start up new jobs to keep the current total jobs at a certain cap .... parallel is the answer. – rsaw Jul 18 '14 at 17:26

DomainsFeatured ,Sep 13, 2016 at 22:55

Is there a way to do this in a loop? – DomainsFeatured Sep 13 '16 at 22:55

Bobby ,Apr 27, 2017 at 7:55

I've tried this but it seems that variable assignments done in one block are not available in the next block. Is this because they are separate processes? Is there a way to communicate the variables back to the main process? – Bobby Apr 27 '17 at 7:55

choroba ,Oct 23, 2013 at 13:38

See parallel . Its syntax is similar to xargs , but it runs the commands in parallel.

chepner ,Oct 23, 2013 at 14:35

This is better than using wait , since it takes care of starting new jobs as old ones complete, instead of waiting for an entire batch to finish before starting the next. – chepner Oct 23 '13 at 14:35

Mr. Llama ,Aug 13, 2015 at 19:30

For example, if you have the list of links in a file, you can do cat list_of_links.txt | parallel -j 4 wget {} which will keep four wget s running at a time. – Mr. Llama Aug 13 '15 at 19:30

0x004D44 ,Nov 2, 2015 at 21:42

There is a new kid in town called pexec which is a replacement for parallel . – 0x004D44 Nov 2 '15 at 21:42

mat ,Mar 1, 2016 at 21:04

Not to be picky, but xargs can also parallelize commands. – mat Mar 1 '16 at 21:04

Vader B ,Jun 27, 2016 at 6:41

In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs .

> ,

You can run 20 processes and use the command:
wait

Your script will wait and continue when all your background jobs are finished.
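
Applied to the original 4000-link case, a sketch of the same wait idea in a loop (assuming the URLs live one per line in a file called links.txt):

#!/bin/bash
batch=20
count=0
while IFS= read -r link; do
    wget "$link" >/dev/null 2>&1 &
    count=$((count + 1))
    if (( count % batch == 0 )); then
        wait   # block until the current batch of 20 has finished
    fi
done < links.txt
wait   # wait for the final, possibly partial, batch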

[Jun 23, 2018] parallelism - correct xargs parallel usage

Jun 23, 2018 | unix.stackexchange.com

Yan Zhu ,Apr 19, 2015 at 6:59

I am using xargs to call a python script to process about 30 million small files. I hope to use xargs to parallelize the process. The command I am using is:
find ./data -name "*.json" -print0 |
  xargs -0 -I{} -P 40 python Convert.py {} > log.txt

Basically, Convert.py will read in a small json file (4kb), do some processing and write to another 4kb file. I am running on a server with 40 CPU cores. And no other CPU-intense process is running on this server.

By monitoring htop (by the way, is there any other good way to monitor CPU performance?), I find that -P 40 is not as fast as expected. Sometimes all cores almost freeze, dropping nearly to zero for 3-4 seconds, then recover to 60-70%. Then I tried decreasing the number of parallel processes to -P 20-30 , but it's still not very fast. The ideal behavior should be a linear speed-up. Any suggestions for the parallel usage of xargs ?

Ole Tange ,Apr 19, 2015 at 8:45

You are most likely hit by I/O: The system cannot read the files fast enough. Try starting more than 40: This way it will be fine if some of the processes have to wait for I/O. – Ole Tange Apr 19 '15 at 8:45

Fox ,Apr 19, 2015 at 10:30

What kind of processing does the script do? Any database/network/io involved? How long does it run? – Fox Apr 19 '15 at 10:30

PSkocik ,Apr 19, 2015 at 11:41

I second @OleTange. That is the expected behavior if you run as many processes as you have cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep), then they will process, and then repeat. If you add more processes, then the additional processes that currently aren't running on a physical core will have kicked off parallel IO operations, which will, when finished, eliminate or at least reduce the sleep periods on your cores. – PSkocik Apr 19 '15 at 11:41

Bichoy ,Apr 20, 2015 at 3:32

1- Do you have hyperthreading enabled? 2- in what you have up there, log.txt is actually overwritten with each call to convert.py ... not sure if this is the intended behavior or not. – Bichoy Apr 20 '15 at 3:32

Ole Tange ,May 11, 2015 at 18:38

xargs -P and > is opening up for race conditions because of the half-line problem gnu.org/software/parallel/ Using GNU Parallel instead will not have that problem. – Ole Tange May 11 '15 at 18:38

James Scriven ,Apr 24, 2015 at 18:00

I'd be willing to bet that your problem is python . You didn't say what kind of processing is being done on each file, but assuming you are just doing in-memory processing of the data, the running time will be dominated by starting up 30 million python virtual machines (interpreters).

If you can restructure your python program to take a list of files, instead of just one, you will get a huge improvement in performance. You can then still use xargs to further improve performance. For example, 40 processes, each processing 1000 files:

find ./data -name "*.json" -print0 |
  xargs -0 -L1000 -P 40 python Convert.py

This isn't to say that python is a bad/slow language; it's just not optimized for startup time. You'll see this with any virtual machine-based or interpreted language. Java, for example, would be even worse. If your program was written in C, there would still be a cost of starting a separate operating system process to handle each file, but it would be much less.

From there you can fiddle with -P to see if you can squeeze out a bit more speed, perhaps by increasing the number of processes to take advantage of idle processors while data is being read/written.

Stephen ,Apr 24, 2015 at 13:03

So firstly, consider the constraints:

What is the constraint on each job? If it's I/O, you can probably get away with multiple jobs per CPU core until you hit the limit of I/O, but if it's CPU-intensive, it's worse than pointless to run more jobs concurrently than you have CPU cores.

My understanding of these things is that GNU Parallel would give you better control over the queue of jobs etc.

See GNU parallel vs & (I mean background) vs xargs -P for a more detailed explanation of how the two differ.

,

As others said, check whether you're I/O-bound. Also, xargs' man page suggests using -n with -P ; you don't mention how many Convert.py processes you see running in parallel.

As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try doing the processing in a tmpfs (of course, in this case you should check for enough memory, avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in the first place).

[Jun 23, 2018] Linux/Bash, how to schedule commands in a FIFO queue?

Jun 23, 2018 | superuser.com

Andrei ,Apr 10, 2013 at 14:26

I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be run at a specified time in the future as would be the case with the "at" command. I want them to start running now, but not simultaneously. The next scheduled command in the queue should be run only after the first command finishes executing. Alternatively, it would be nice if I could specify a maximum number of commands from the queue that could be run simultaneously; for example if the maximum number of simultaneous commands is 2, then only at most 2 commands scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the next command in the remaining queue being started only when one of the currently 2 running commands finishes.

I've heard task-spooler could do something like this, but this package doesn't appear to be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what I'm using). If that's the best alternative then let me know and I'll use task-spooler, otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free, canonical way to do such a thing with bash.

UPDATE:

Simple solutions like ; or && from bash do not work. I need to schedule these commands from an external program, when an event occurs. I just don't want to have hundreds of instances of my command running simultaneously, hence the need for a queue. There's an external program that will trigger events where I can run my own commands. I want to handle ALL triggered events, I don't want to miss any event, but I also don't want my system to crash, so that's why I want a queue to handle my commands triggered from the external program.

Andrei ,Apr 11, 2013 at 11:40

Task Spooler:

http://vicerveza.homeunix.net/~viric/soft/ts/

https://launchpad.net/ubuntu/+source/task-spooler/0.7.3-1

Does the trick very well. Hopefully it will be included in Ubuntu's package repos.

Hennes ,Apr 10, 2013 at 15:00

Use ;

For example:
ls ; touch test ; ls

That will list the directory. Only after ls has run it will run touch test which will create a file named test. And only after that has finished it will run the next command. (In this case another ls which will show the old contents and the newly created file).

Similar commands are || and && .

; will always run the next command.

&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"

|| will only run the next command if the first command returned a failure (non-zero) return value. Example: rm -rf *.mp3 || echo "Error! Some files could not be deleted! Check permissions!"

If you want to run a command in the background, append an ampersand ( & ).
Example:
make bzimage &
mp3blaster sound.mp3
make mytestsoftware ; ls ; firefox ; make clean

This will run two commands in the background (in this case a kernel build, which will take some time, and a program to play some music). In the foreground it runs another compile job and, once that is finished, ls, firefox and make clean (all sequentially).

For more details, see man bash


[Edit after comment]

in pseudo code, something like this?

Program run_queue:

While(true)
{
   Wait_for_a_signal();

   While( queue not empty )
   {
       run next command from the queue.
       remove this command from the queue.
       // If commands where added to the queue during execution then
       // the queue is not empty, keep processing them all.
   }
   // Queue is now empty, returning to wait_for_a_signal
}
// 
// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
//
program add_to_queue()
{
   While(true)
   {
       Wait_for_event();
       Append command to queue
       signal run_queue
   }    
}

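A minimal bash sketch of that pseudo code using an actual named pipe (mkfifo) as the queue: commands arrive one per line and are run strictly in arrival order. This is my own illustration, not a hardened tool (there is no locking of concurrent writers, and commands are executed via sh -c, so only feed it trusted input):

#!/bin/bash
queue=/tmp/cmdqueue
[ -p "$queue" ] || mkfifo "$queue"

# consumer: read one command per line and run each to completion
while true; do
    while IFS= read -r cmd; do
        sh -c "$cmd"        # the next queued command waits until this returns
    done < "$queue"         # opening the FIFO blocks until a writer appears
done

# producers elsewhere append commands to the queue, e.g.:
#   echo "tar czf /backup/home.tgz /home" > /tmp/cmdqueue
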
terdon ,Apr 10, 2013 at 15:03

The easiest way would be to simply run the commands sequentially:
cmd1; cmd2; cmd3; cmdN

If you want the next command to run only if the previous command exited successfully, use && :

cmd1 && cmd2 && cmd3 && cmdN

That is the only bash native way I know of doing what you want. If you need job control (setting a number of parallel jobs etc), you could try installing a queue manager such as TORQUE but that seems like overkill if all you want to do is launch jobs sequentially.

psusi ,Apr 10, 2013 at 15:24

You are looking for at 's twin brother: batch . It uses the same daemon but instead of scheduling a specific time, the jobs are queued and will be run whenever the system load average is low.
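
batch reads commands from standard input, just like at, so queueing a job could look like this (the command itself is just an illustration):

echo "ffmpeg -i in.avi out.mp4" | batch
# or interactively:
#   batch
#   at> ffmpeg -i in.avi out.mp4
#   at> <EOT>      (press Ctrl-D to finish)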

mpy ,Apr 10, 2013 at 14:59

Apart from dedicated queuing systems (like the Sun Grid Engine ) which you can also use locally on one machine and which offer dozens of possibilities, you can use something like
 command1 && command2 && command3

which is the other extreme -- a very simple approach. The latter neither provides multiple simultaneous processes nor allows gradually filling the "queue".

Bogdan Dumitru ,May 3, 2016 at 10:12

I went on the same route searching, trying out task-spooler and so on. The best of the best is this:

GNU Parallel with --semaphore --fg . It also has -j for the number of parallel jobs.
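
For example (a sketch; sem is GNU Parallel's shorthand for parallel --semaphore, and the commands are placeholders):

# at most 2 jobs run at once; each sem call starts its job in the
# background as soon as a slot is free, then returns to the caller
sem -j 2 wget -q http://example.com/file1
sem -j 2 wget -q http://example.com/file2
sem -j 2 wget -q http://example.com/file3
sem --wait   # block until everything queued above has finished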

[Jun 20, 2018] bash - sudo as another user with their environment

Using strace is an interesting debugging tip
Jun 20, 2018 | unix.stackexchange.com

user80551 ,Jan 2, 2015 at 4:29

$ whoami
admin
$ sudo -S -u otheruser whoami
otheruser
$ sudo -S -u otheruser /bin/bash -l -c 'echo $HOME'
/home/admin

Why isn't $HOME being set to /home/otheruser even though bash is invoked as a login shell?

Specifically, /home/otheruser/.bashrc isn't being sourced. Also, /home/otheruser/.profile isn't being sourced. - ( /home/otheruser/.bash_profile doesn't exist)

EDIT: The exact problem is actually https://stackoverflow.com/questions/27738224/mkvirtualenv-with-fabric-as-another-user-fails

Pavel Šimerda ,Jan 2, 2015 at 8:29

A solution to this question will solve the other question as well, you might want to delete the other question in this situation. – Pavel Šimerda Jan 2 '15 at 8:29

Pavel Šimerda ,Jan 2, 2015 at 8:27

To invoke a login shell using sudo just use -i . When command is not specified you'll get a login shell prompt, otherwise you'll get the output of your command.

Example (login shell):

sudo -i

Example (with a specified user):

sudo -i -u user

Example (with a command):

sudo -i -u user whoami

Example (print user's $HOME ):

sudo -i -u user echo \$HOME

Note: The backslash character ensures that the dollar sign reaches the target user's shell and is not interpreted in the calling user's shell.

I have just checked the last example with strace, which tells you exactly what's happening. The output below shows that the shell is being called with --login and with the specified command, just as in your explicit call to bash, but in addition sudo can do its own work, like setting $HOME .

# strace -f -e process sudo -S -i -u user echo \$HOME
execve("/usr/bin/sudo", ["sudo", "-S", "-i", "-u", "user", "echo", "$HOME"], [/* 42 vars */]) = 0
...
[pid 12270] execve("/bin/bash", ["-bash", "--login", "-c", "echo \\$HOME"], [/* 16 vars */]) = 0
...

I noticed that you are using -S and I don't think it is generally a good technique. If you want to run commands as a different user without performing authentication from the keyboard, you might want to use SSH instead. It works for localhost as well as for other hosts and provides public key authentication that works without any interactive input.

ssh user@localhost echo \$HOME

Note: You don't need any special options with SSH as the SSH server always creates a login shell to be accessed by the SSH client.

John_West ,Nov 23, 2015 at 11:12

sudo -i -u user echo \$HOME doesn't work for me. Output: $HOME . strace gives the same output as yours. What's the issue? – John_West Nov 23 '15 at 11:12

Pavel Šimerda ,Jan 20, 2016 at 19:02

No idea, it still works for me, I'd need to see it or maybe even touch the system. – Pavel Šimerda Jan 20 '16 at 19:02

Jeff Snider ,Jan 2, 2015 at 8:04

You're giving Bash too much credit. All "login shell" means to Bash is what files are sourced at startup and shutdown. The $HOME variable doesn't figure into it.

The Bash docs explain some more what login shell means: https://www.gnu.org/software/bash/manual/html_node/Bash-Startup-Files.html#Bash-Startup-Files

In fact, Bash doesn't do anything to set $HOME at all. $HOME is set by whatever invokes the shell (login, ssh, etc.), and the shell inherits it. Whatever started your shell as admin set $HOME and then exec-ed bash , sudo by design doesn't alter the environment unless asked or configured to do so, so bash as otheruser inherited it from your shell.

If you want sudo to handle more of the environment in the way you're expecting, look at the -i switch for sudo. Try:

sudo -S -u otheruser -i /bin/bash -l -c 'echo $HOME'

The man page for sudo describes it in more detail, though not really well, I think: http://linux.die.net/man/8/sudo

user80551 ,Jan 2, 2015 at 8:11

$HOME isn't set by bash - Thanks, I didn't know that. – user80551 Jan 2 '15 at 8:11

Pavel Šimerda ,Jan 2, 2015 at 9:46

Look for strace in my answer. It shows that you don't need to build the /bin/bash -l -c 'echo $HOME' command line yourself when using -i .

palswim ,Oct 13, 2016 at 20:21

That sudo syntax threw an error on my machine. ( su uses the -c option, but I don't think sudo does.) I had better luck with: HomeDir=$( sudo -u "$1" -H -s echo "\$HOME" )palswim Oct 13 '16 at 20:21

[Jun 20, 2018] What are the differences between su, sudo -s, sudo -i, sudo su

Notable quotes:
"... (which means "substitute user" or "switch user") ..."
"... (hmm... what's the mnemonic? Super-User-DO?) ..."
"... The official meaning of "su" is "substitute user" ..."
"... Interestingly, Ubuntu's manpage does not mention "substitute" at all. The manpage at gnu.org ( gnu.org/software/coreutils/manual/html_node/su-invocation.html ) does indeed say "su: Run a command with substitute user and group ID". ..."
"... sudo -s runs a [specified] shell with root privileges. sudo -i also acquires the root user's environment. ..."
"... To see the difference between su and sudo -s , do cd ~ and then pwd after each of them. In the first case, you'll be in root's home directory, because you're root. In the second case, you'll be in your own home directory, because you're yourself with root privileges. There's more discussion of this exact question here . ..."
"... I noticed sudo -s doesnt seem to process /etc/profile ..."
Jun 20, 2018 | askubuntu.com

Sergey ,Oct 22, 2011 at 7:21

The main difference between these commands is in the way they restrict access to their functions.

su (which means "substitute user" or "switch user") - does exactly that, it starts another shell instance with privileges of the target user. To ensure you have the rights to do that, it asks you for the password of the target user . So, to become root, you need to know root password. If there are several users on your machine who need to run commands as root, they all need to know root password - note that it'll be the same password. If you need to revoke admin permissions from one of the users, you need to change root password and tell it only to those people who need to keep access - messy.

sudo (hmm... what's the mnemonic? Super-User-DO?) is completely different. It uses a config file (/etc/sudoers) which lists which users have rights to specific actions (run commands as root, etc.) When invoked, it asks for the password of the user who started it - to ensure the person at the terminal is really the same "joe" who's listed in /etc/sudoers . To revoke admin privileges from a person, you just need to edit the config file (or remove the user from a group which is listed in that config). This results in much cleaner management of privileges.

As a result of this, in many Debian-based systems root user has no password set - i.e. it's not possible to login as root directly.

Also, /etc/sudoers allows you to specify some additional restrictions - e.g. user X is only able to run program Y, etc.

The often-used sudo su combination works as follows: first sudo asks you for your password and, if you're allowed to do so, invokes the next command ( su ) as the super-user. Because su is invoked by root, it does not ask for the root password - so, in effect, you enter your own password instead of root's.

So, sudo su allows you to open a shell as another user (including root), if you're allowed super-user access by the /etc/sudoers file.

dr jimbob ,Oct 22, 2011 at 13:47

I've never seen su as "switch user", but always as superuser; the default behavior without another's user name (though it makes sense). From wikipedia : "The su command, also referred to as super user[1] as early as 1974, has also been called "substitute user", "spoof user" or "set user" because it allows changing the account associated with the current terminal (window)."

Sergey ,Oct 22, 2011 at 20:33

@dr jimbob: you're right, but I find that "switch user" describes better what it does - though historically it stands for "super user". I'm also delighted to find that the wikipedia article is very similar to my answer - I never saw the article before :)

Angel O'Sphere ,Nov 26, 2013 at 13:02

The official meaning of "su" is "substitute user". See: "man su". – Angel O'Sphere Nov 26 '13 at 13:02

Sergey ,Nov 26, 2013 at 20:25

@AngelO'Sphere: Interestingly, Ubuntu's manpage does not mention "substitute" at all. The manpage at gnu.org ( gnu.org/software/coreutils/manual/html_node/su-invocation.html ) does indeed say "su: Run a command with substitute user and group ID". I think gnu.org is a canonical source :) – Sergey Nov 26 '13 at 20:25

Mike Scott ,Oct 22, 2011 at 6:28

sudo lets you run commands in your own user account with root privileges. su lets you switch user so that you're actually logged in as root.

sudo -s runs a [specified] shell with root privileges. sudo -i also acquires the root user's environment.

To see the difference between su and sudo -s , do cd ~ and then pwd after each of them. In the first case, you'll be in root's home directory, because you're root. In the second case, you'll be in your own home directory, because you're yourself with root privileges. There's more discussion of this exact question here .

Sergey ,Oct 22, 2011 at 7:28

"you're yourself with root privileges" is not what's actually happening :) Actually, it's not possible to be "yourself with root privileges" - either you're root or you're yourself. Try typing whoami in both cases. The fact that cd ~ results are different is a result of sudo -s not setting $HOME environment variable. – Sergey Oct 22 '11 at 7:28

Octopus ,Feb 6, 2015 at 22:15

@Sergey, whoami says 'root' because you are running the 'whoami' cmd as though you sudoed it, so temporarily (for the duration of that command) you appear to be the root user, but you might still not have full root access according to the sudoers file. – Octopus Feb 6 '15 at 22:15

Sergey ,Feb 6, 2015 at 22:24

@Octopus: what I was trying to say is that in Unix, a process can only have one UID, and that UID determines the permissions of the process. You can't be "yourself with root privileges", a program either runs with your UID or with root's UID (0). – Sergey Feb 6 '15 at 22:24

Sergey ,Feb 6, 2015 at 22:32

Regarding "you might still not have full root access according to the sudoers file": the sudoers file controls who can run which command as another user, but that happens before the command is executed. However, once you were allowed to start a process as, say, root -- the running process has root's UID and has a full access to the system, there's no way for sudo to restrict that.

Again, you're always either yourself or root, there's no "half-n-half". So, if sudoers file allows you to run shell as root -- permissions in that shell would be indistinguishable from a "normal" root shell. – Sergey Feb 6 '15 at 22:32

dotancohen ,Nov 8, 2014 at 14:07

This answer is a dupe of my answer on a dupe of this question , put here on the canonical answer so that people can find it!

The major difference between sudo -i and sudo -s is that sudo -i gives you root's environment (your own ~/.bashrc is ignored), whereas sudo -s runs a root shell in your own environment (your own ~/.bashrc is sourced).

Here is an example; you can see that I have an application lsl in my ~/.bin/ directory which is accessible via sudo -s but not accessible with sudo -i . Note also that the Bash prompt changes as well with sudo -i but not with sudo -s :

dotancohen@melancholy:~$ ls .bin
lsl

dotancohen@melancholy:~$ which lsl
/home/dotancohen/.bin/lsl

dotancohen@melancholy:~$ sudo -i

root@melancholy:~# which lsl

root@melancholy:~# exit
logout

dotancohen@melancholy:~$ sudo -s
Sourced .bashrc

dotancohen@melancholy:~$ which lsl
/home/dotancohen/.bin/lsl

dotancohen@melancholy:~$ exit
exit

Though sudo -s is convenient for giving you the environment that you are familiar with, I recommend the use of sudo -i for two reasons:

  1. The visual reminder that you are in a 'root' session.
  2. The root environment is far less likely to be poisoned with malware, such as a rogue line in .bashrc .

meffect ,Feb 23, 2017 at 5:21

I noticed sudo -s doesn't seem to process /etc/profile , or anything I have in /etc/profile.d/ .. any idea why? – meffect Feb 23 '17 at 5:21

Marius Gedminas ,Oct 22, 2011 at 19:38

su asks for the password of the user "root".

sudo asks for your own password (and also checks if you're allowed to run commands as root, which is configured through /etc/sudoers -- by default all user accounts that belong to the "admin" group are allowed to use sudo).

sudo -s launches a shell as root, but doesn't change your working directory. sudo -i simulates a login into the root account: your working directory will be /root , and root's .profile etc. will be sourced as if on login.
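
A quick way to see those differences for yourself (a rough interactive sketch; the exact $HOME behavior under sudo -s depends on your sudoers defaults):

sudo -s                    # root shell; $HOME and the working directory usually stay yours
whoami; echo $HOME; pwd
exit
sudo -i                    # root login shell; $HOME becomes /root and you are moved to /root
whoami; echo $HOME; pwd
exit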

DJCrashdummy ,Jul 29, 2017 at 0:58

To make the answer more complete: sudo -s is almost equal to su ($HOME is different) and sudo -i is equal to su - .

In Ubuntu or a related system, I don't find much use for su in the traditional, super-user sense. sudo handles that case much better. However, su is great for becoming another user in one-off situations where configuring sudoers would be silly.

For example, if I'm repairing my system from a live CD/USB, I'll often mount my hard drive and other necessary stuff and chroot into the system. In such a case, my first command is generally:

su - myuser  # Note the '-'. It means to act as if that user had just logged in.

That way, I'm operating not as root, but as my normal user, and I then use sudo as appropriate.

[Jun 20, 2018] permission - allow sudo to another user without password

Jun 20, 2018 | apple.stackexchange.com



zio ,Feb 17, 2013 at 13:12

I want to be able to 'su' to a specific user, allowing me to run any command without a password being entered.

For example:

If my login were user1 and the user I want to 'su' to is user2:

I would use the command:

su - user2

but then it prompts me with

Password:

Global nomad ,Feb 17, 2013 at 13:17

Ask the other user for the password. At least the other user knows what's been done under his/her id. – Global nomad Feb 17 '13 at 13:17

zio ,Feb 17, 2013 at 13:24

This is nothing to do with another physical user. Both ID's are mine. I know the password as I created the account. I just don't want to have to type the password every time. – zio Feb 17 '13 at 13:24

bmike ♦ ,Feb 17, 2013 at 15:32

Would it be ok to ssh to that user, or do you need to inherit one shell in particular and need su to work? – bmike ♦ Feb 17 '13 at 15:32

bmike ♦ ,Feb 17, 2013 at 23:59

@zio Great use case. Does open -na Skype not work for you? – bmike ♦ Feb 17 '13 at 23:59

user495470 ,Feb 18, 2013 at 4:50

You could also try copying the application bundle and changing CFBundleIdentifier . – user495470 Feb 18 '13 at 4:50

Huygens ,Feb 18, 2013 at 7:39

sudo can do just that for you :)

It needs a bit of configuration though, but once done you would only do this:

sudo -u user2 -s

And you would be logged in as user2 without entering a password.

Configuration

To configure sudo, you must edit its configuration file via: visudo . Note: this command will open the configuration using the vi text editor; if you are uncomfortable with that, you need to set another editor (using export EDITOR=<command> ) before executing the following line. Another command line editor sometimes regarded as easier is nano , so you would do export EDITOR=/usr/bin/nano . You usually need super user privileges for visudo :

sudo visudo

This file is structured in different sections: the aliases, then the defaults, and finally, at the end, the rules. This is where you need to add the new line. So navigate to the end of the file and add this:

user1    ALL=(user2) NOPASSWD: /bin/bash

You can also replace /bin/bash with ALL, and then you could launch any command as user2 without a password: sudo -u user2 <command> .

Update

I have just seen your comment regarding Skype. You could consider adding Skype directly to the sudo's configuration file. I assume you have Skype installed in your Applications folder:

user1    ALL=(user2) NOPASSWD: /Applications/Skype.app/Contents/MacOS/Skype

Then you would call from the terminal:

sudo -u user2 /Applications/Skype.app/Contents/MacOS/Skype

bmike ♦ ,May 28, 2014 at 16:04

This is far less complicated than the ssh keys idea, so use this unless you need the ssh keys for remote access as well. – bmike ♦ May 28 '14 at 16:04

Stan Kurdziel ,Oct 26, 2015 at 16:56

One thing to note from a security perspective is that specifying a specific command implies that it should be a read-only command for user1; otherwise, they can overwrite the command with something else and run that as user2. And if you don't care about that, then you might as well specify that user1 can run any command as user2 and therefore have a simpler sudo config. – Stan Kurdziel Oct 26 '15 at 16:56

Huygens ,Oct 26, 2015 at 19:24

@StanKurdziel good point! Although it is something to be aware of, it's really rare to have system executables writable by users unless you're root, but in this case you don't need sudo ;-) But you're right to add this comment because it's so seldom that I've probably overlooked it more than one time. – Huygens Oct 26 '15 at 19:24

Gert van den Berg ,Aug 10, 2016 at 14:24

To get it nearer to the behaviour su - user2 instead of su user2 , the commands should probably all involve sudo -u user2 -i , in order to simulate an initial login as user2 – Gert van den Berg Aug 10 '16 at 14:24

bmike ,Feb 18, 2013 at 0:05

I would set up public/private ssh keys for the second account and store the key in the first account.

Then you could run a command like:

 ssh user@localhost -n /Applications/Skype.app/Contents/MacOS/Skype &

You'd still have the issues where Skype gets confused since two instances are running on one user account and files read/written by that program might conflict. It also might work well enough for your needs and you'd not need an iPod touch to run your second Skype instance.
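
A minimal sketch of that key setup, assuming user2 is the second local account and Remote Login (sshd) is enabled; if ssh-copy-id is not available, append the .pub file to user2's ~/.ssh/authorized_keys by hand:

# as user1: create a dedicated key pair and install it for user2 (asks for user2's password once)
ssh-keygen -t rsa -f ~/.ssh/id_rsa_user2 -N ""
ssh-copy-id -i ~/.ssh/id_rsa_user2.pub user2@localhost
# afterwards this starts the program with no password prompt
ssh -i ~/.ssh/id_rsa_user2 user2@localhost -n /Applications/Skype.app/Contents/MacOS/Skype &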

calum_b ,Feb 18, 2013 at 9:54

This is a good secure solution for the general case of password-free login to any account on any host, but I'd say it's probably overkill when both accounts are on the same host and belong to the same user. – calum_b Feb 18 '13 at 9:54

bmike ♦ ,Feb 18, 2013 at 14:02

@scottishwildcat It's far more secure than the alternatives: scripting the password and feeding it in clear text, or storing the password in a variable or the keychain and using a tool like expect to script the interaction. I just use sudo su - blah and type my password. I think the other answer covers sudo well enough to keep this as a comment. – bmike ♦ Feb 18 '13 at 14:02

calum_b ,Feb 18, 2013 at 17:47

Oh, I certainly wasn't suggesting your answer should be removed; I didn't even down-vote, it's a perfectly good answer. – calum_b Feb 18 '13 at 17:47

bmike ♦ ,Feb 18, 2013 at 18:46

We appear to be in total agreement - thanks for the addition - feel free to edit it into the answer if you can improve on it. – bmike ♦ Feb 18 '13 at 18:46

Gert van den Berg ,Aug 10, 2016 at 14:20

The accepted solution ( sudo -u user2 <...> ) does have the advantage that it can't be used remotely, which might help for security - there is no private key for user1 that can be stolen. – Gert van den Berg Aug 10 '16 at 14:20

[Jun 20, 2018] linux - Automating the sudo su - user command

Jun 20, 2018 | superuser.com



sam ,Feb 9, 2011 at 11:11

I want to automate
sudo su - user

from a script. It should then ask for a password.

grawity ,Feb 9, 2011 at 12:07

Don't sudo su - user , use sudo -iu user instead. (Easier to manage through sudoers , by the way.) – grawity Feb 9 '11 at 12:07

Hello71 ,Feb 10, 2011 at 1:33

How are you able to run sudo su without being able to run sudo visudo ? – Hello71 Feb 10 '11 at 1:33

Torian ,Feb 9, 2011 at 11:37

I will try and guess what you asked.

If you want to use sudo su - user without a password, you should (if you have the privileges) do the following in your sudoers file:

<youruser>  ALL = NOPASSWD: /bin/su - <otheruser>

where <youruser> is your user name and <otheruser> is the user you want to switch to.

Then put into the script:

sudo /bin/su - <otheruser>

Doing just this won't get subsequent commands run as <otheruser> ; it will just spawn a new shell. If you want to run another command from within the script as this other user, you should use something like:

 sudo -u <otheruser> <command>

And in sudoers file:

<yourusername>  ALL = (<otheruser>) NOPASSWD: <command>

Obviously, a more generic line like:

<yourusername> ALL = (ALL) NOPASSWD: ALL

Will get things done, but would grant the permission to do anything as anyone.

sam ,Feb 9, 2011 at 11:43

When the sudo su - user command gets executed, it asks for a password. I want a solution in which the script automatically reads the password from somewhere. I don't have permission to do what you told me earlier. – sam Feb 9 '11 at 11:43

sam ,Feb 9, 2011 at 11:47

I have the permission to store the password in a file; the script should read the password from that file. – sam Feb 9 '11 at 11:47

Olli ,Feb 9, 2011 at 12:46

You can use command
 echo "your_password" | sudo -S [rest of your parameters for sudo]

(Of course without [ and ])

Please note that you should protect your script from read access from unauthorized users. If you want to read password from separate file, you can use

  sudo -S [rest of your parameters for sudo] < /etc/sudo_password_file

(Or whatever is the name of password file, containing password and single line break.)

From sudo man page:

   -S          The -S (stdin) option causes sudo to read the password from
               the standard input instead of the terminal device.  The
               password must be followed by a newline character.
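
Putting it together for the sudo su - user case from the question, a rough sketch (the password file name is hypothetical; it must be readable only by you, e.g. chmod 600, and keeping a password on disk is a security trade-off):

# read your sudo password from a protected file and run one command as otheruser
sudo -S -u otheruser -i whoami < /home/you/.sudo_pass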

AlexandruC ,Dec 6, 2014 at 8:10

This actually works for me. – AlexandruC Dec 6 '14 at 8:10

Oscar Foley ,Feb 8, 2016 at 16:36

This is brilliant – Oscar Foley Feb 8 '16 at 16:36

Mikel ,Feb 9, 2011 at 11:26

The easiest way is to make it so that user doesn't have to type a password at all.

You can do that by running visudo , then changing the line that looks like:

someuser  ALL=(ALL) ALL

to

someuser  ALL=(ALL) NOPASSWD: ALL

However if it's just for one script, it would be more secure to restrict passwordless access to only that script, and remove the (ALL) , so they can only run it as root, not any user , e.g.

Cmnd_Alias THESCRIPT = /usr/local/bin/scriptname

someuser  ALL=NOPASSWD: THESCRIPT

Run man 5 sudoers to see all the details in the sudoers man page .

sam ,Feb 9, 2011 at 11:34

I do not have permission to edit the sudoers file. Is there any other way, so that it reads the password from somewhere and this can be automated? – sam Feb 9 '11 at 11:34

Torian ,Feb 9, 2011 at 11:40

You are out of luck... you could do this with, let's say, expect , but that would leave the password for your user hardcoded somewhere where people could see it (granted that you set up permissions the right way, it could still be read by root). – Torian Feb 9 '11 at 11:40

Mikel ,Feb 9, 2011 at 11:40

Try using expect . man expect for details. – Mikel Feb 9 '11 at 11:40


[Jun 20, 2018] sudo - What does ALL ALL=(ALL) ALL mean in sudoers

Jun 20, 2018 | unix.stackexchange.com



LoukiosValentine79 ,May 6, 2015 at 19:29

If a server has the following in /etc/sudoers:
Defaults targetpw
ALL ALL=(ALL) ALL

Then what does this mean? all the users can sudo to all the commands, only their password is needed?

lcd047 ,May 6, 2015 at 20:51

It means "security Nirvana", that's what it means. ;) – lcd047 May 6 '15 at 20:51

poz2k4444 ,May 6, 2015 at 20:19

From the sudoers(5) man page:

The sudoers policy plugin determines a user's sudo privileges.

For the targetpw:

sudo will prompt for the password of the user specified by the -u option (defaults to root) instead of the password of the invoking user when running a command or editing a file.

sudo(8) allows you to execute commands as someone else

So, basically it says that any user can run any command on any host as any user and yes, the user just has to authenticate, but with the password of the other user, in order to run anything.

The first ALL is the users allowed
The second one is the hosts
The third one is the user you are allowed to run the command as
The last one is the commands allowed
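
Reading the rule field by field (annotation added for illustration; with targetpw set, the password asked for is that of the target user):

#  who   where = (run-as)  what
   ALL   ALL   = (ALL)     ALL     # any user, on any host, may run any command as any user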

LoukiosValentine79 ,May 7, 2015 at 16:37

Thanks! In the meantime I found the "Defaults targetpw" entry in sudoers.. updated the Q – LoukiosValentine79 May 7 '15 at 16:37

poz2k4444 ,May 7, 2015 at 18:24

@LoukiosValentine79 I just update the answer, does that answer your question? – poz2k4444 May 7 '15 at 18:24

evan54 ,Feb 28, 2016 at 20:24

wait he has to enter his own password not of the other user right? – evan54 Feb 28 '16 at 20:24

x-yuri ,May 19, 2017 at 12:20

with targetpw the one of the other (target) user – x-yuri May 19 '17 at 12:20

[Jun 20, 2018] sudo - What is ALL ALL=!SUDOSUDO for

Jun 20, 2018 | unix.stackexchange.com

gasko peter ,Dec 6, 2012 at 12:50

The last line of the /etc/sudoers file is:
grep -i sudosudo /etc/sudoers
Cmnd_Alias SUDOSUDO = /usr/bin/sudo
ALL ALL=!SUDOSUDO

why? What does it exactly do?

UPDATE#1: Now I know that it prevents users from running "/usr/bin/sudo" via sudo.

UPDATE#2: not allowing "root ALL=(ALL) ALL" is not a solution.

Updated Question: What is better besides this "SUDOSUDO"? (The problem with this is that the sudo binary could be copied..)

Chris Down ,Dec 6, 2012 at 12:53

SUDOSUDO is probably an alias. Does it exist elsewhere in the file? – Chris Down Dec 6 '12 at 12:53

gasko peter ,Dec 6, 2012 at 14:21

question updated :D - so what does it mean exactly? – gasko peter Dec 6 '12 at 14:21

gasko peter ,Dec 6, 2012 at 14:30

is "ALL ALL=!SUDOSUDO" as the last line is like when having DROP iptables POLICY and still using a -j DROP rule as last rule in ex.: INPUT chain? :D or does it has real effects? – gasko peter Dec 6 '12 at 14:30

Kevin ,Dec 6, 2012 at 14:48

I'm not 100% sure, but I believe it only prevents anyone from running sudo sudo ... . – Kevin Dec 6 '12 at 14:48

[Jun 18, 2018] Copy and paste text in midnight commander (MC) via putty in Linux

Notable quotes:
"... IF you're using putty in either Xorg or Windows (i.e terminal within a gui) , it's possible to use the "conventional" right-click copy/paste behavior while in mc. Hold the shift key while you mark/copy. ..."
"... Putty has ability to copy-paste. In mcedit, hold Shift and select by mouse ..."
Jun 18, 2018 | superuser.com

Den ,Mar 1, 2015 at 22:50

I use Midnight Commander (MC) editor over putty to edit files

I want to know how to copy text from one file, close it then open another file and paste it?

If it is not possible with Midnight Commander, is there another easy way to copy and paste specific text from different files?

szkj ,Mar 12, 2015 at 22:40

I would do it like this:
  1. switch to block selection mode by pressing F3
  2. select a block
  3. switch off block selection mode with F3
  4. press Ctrl+F which will open Save block dialog
  5. press Enter to save it to the default location
  6. open the other file in the editor, and navigate to the target location
  7. press Shift+F5 to open Insert file dialog
  8. press Enter to paste from the default file location (which is same as the one in Save block dialog)

NOTE: There are other environment related methods, that could be more conventional nowadays, but the above one does not depend on any desktop environment related clipboard, (terminal emulator features, putty, Xorg, etc.). This is a pure mcedit feature which works everywhere.

Andrejs ,Apr 28, 2016 at 8:13

To copy: (hold) Shift + Select with mouse (copies to clipboard)

To paste in windows: Ctrl+V

To paste in another file in PuTTY/MC: Shift + Ins

Piotr Dobrogost ,Mar 30, 2017 at 17:32

If you get unwanted indents in what was pasted then while editing file in Midnight Commander press F9 to show top menu and in Options/Generals menu uncheck Return does autoindent option. Yes, I was happy when I found it too :) – Piotr Dobrogost Mar 30 '17 at 17:32

mcii-1962 ,May 26, 2015 at 13:17

IF you're using putty in either Xorg or Windows (i.e terminal within a gui) , it's possible to use the "conventional" right-click copy/paste behavior while in mc. Hold the shift key while you mark/copy.

Eden ,Feb 15, 2017 at 4:09

  1. Hold down the Shift key, and drag the mouse through the text you want to copy. The text's background will become dark orange.
  2. Release the Shift key and press Shift + Ctrl + c . The text will be copied.
  3. Now you can paste the text to anywhere you want by pressing Shift + Ctrl + v , even to the new page in MC.

xoid ,Jun 6, 2016 at 6:37

Putty has ability to copy-paste. In mcedit, hold Shift and select by mouse

mcii-1962 ,Jun 20, 2016 at 23:01

LOL - did you actually read the other answers? And your answer is incomplete, you should include what to do with the mouse in order to "select by mouse".
According to help in MC:

Ctrl + Insert copies to the mcedit.clip, and Shift + Insert pastes from mcedit.clip.

It doesn't work for me for some reason, but by pressing F9 you get a menu, Edit > Copy to clipfile - that worked fine.

[Jun 13, 2018] reverse engineering - Is there a C++ decompiler

Jun 13, 2018 | stackoverflow.com

David Holm ,Oct 15, 2008 at 15:08

You can use IDA Pro by Hex-Rays . You will usually not get good C++ out of a binary unless you compiled in debugging information. Prepare to spend a lot of manual labor reversing the code.

If you didn't strip the binaries there is some hope as IDA Pro can produce C-alike code for you to work with. Usually it is very rough though, at least when I used it a couple of years ago.

davenpcj ,May 5, 2012 at

To clarify, IDA will only give the disassembly. There's an add-on to it called Hex-Rays that will decompile the rest of the way into C/C++ source, to the extent that's possible. – davenpcj May 5 '12 at

Dustin Getz ,Oct 15, 2008 at 15:15

information is discarded in the compiling process. Even if a decompiler could produce the logical equivalent code with classes and everything (it probably can't), the self-documenting part is gone in optimized release code. No variable names, no routine names, no class names - just addresses.

Darshan Chaudhary ,Aug 14, 2017 at 17:36

"the soul" of the program is gone, just an empty shell of it's former self..." – Darshan Chaudhary Aug 14 '17 at 17:36


Yes, but none of them will manage to produce code readable enough to be worth the effort. You will spend more time trying to read the decompiled source with assembler blocks inside than rewriting your old app from scratch.

[Jun 13, 2018] MC_HOME allows you to run mc with alternative mc.init

Notable quotes:
"... MC_HOME variable can be set to alternative path prior to starting mc. Man pages are not something you can find the answer right away =) ..."
"... A small drawback of this solution: if you set MC_HOME to a directory different from your usual HOME, mc will ignore the content of your usual ~/.bashrc so, for example, your custom aliases defined in that file won't work anymore. Workaround: add a symlink to your ~/.bashrc into the new MC_HOME directory ..."
"... at the same time ..."
Jun 13, 2018 | unix.stackexchange.com

Tagwint ,Dec 19, 2014 at 16:41

That turned out to be simpler than one might think. The MC_HOME variable can be set to an alternative path prior to starting mc. Man pages are not something where you find the answer right away =)

Here's how it works - the usual way:

[jsmith@wstation5 ~]$ mc -F
Root directory: /home/jsmith

[System data]
<skipped>

[User data]
    Config directory: /home/jsmith/.config/mc/
    Data directory:   /home/jsmith/.local/share/mc/
        skins:          /home/jsmith/.local/share/mc/skins/
        extfs.d:        /home/jsmith/.local/share/mc/extfs.d/
        fish:           /home/jsmith/.local/share/mc/fish/
        mcedit macros:  /home/jsmith/.local/share/mc/mc.macros
        mcedit external macros: /home/jsmith/.local/share/mc/mcedit/macros.d/macro.*
    Cache directory:  /home/jsmith/.cache/mc/

and the alternative way:

[jsmith@wstation5 ~]$ MC_HOME=/tmp/MCHOME mc -F
Root directory: /tmp/MCHOME

[System data]
<skipped>    

[User data]
    Config directory: /tmp/MCHOME/.config/mc/
    Data directory:   /tmp/MCHOME/.local/share/mc/
        skins:          /tmp/MCHOME/.local/share/mc/skins/
        extfs.d:        /tmp/MCHOME/.local/share/mc/extfs.d/
        fish:           /tmp/MCHOME/.local/share/mc/fish/
        mcedit macros:  /tmp/MCHOME/.local/share/mc/mc.macros
        mcedit external macros: /tmp/MCHOME/.local/share/mc/mcedit/macros.d/macro.*
    Cache directory:  /tmp/MCHOME/.cache/mc/

Use case of this feature:

You have to share the same user name on a remote server (access can be distinguished by rsa keys) and want to use your favorite mc configuration without overwriting it. Concurrent sessions do not interfere with each other.

This works well as a part of sshrc-approach described in https://github.com/Russell91/sshrc

Cri ,Sep 5, 2016 at 10:26

A small drawback of this solution: if you set MC_HOME to a directory different from your usual HOME, mc will ignore the content of your usual ~/.bashrc so, for example, your custom aliases defined in that file won't work anymore. Workaround: add a symlink to your ~/.bashrc into the new MC_HOME directoryCri Sep 5 '16 at 10:26

goldilocks ,Dec 18, 2014 at 16:03

If you mean, you want to be able to run two instances of mc as the same user at the same time with different config directories, as far as I can tell you can't. The path is hardcoded.

However, if you mean, you want to be able to switch which config directory is being used, here's an idea (tested, works). You probably want to do it without mc running:

Hopefully it's clear what's happening there -- this sets the config directory path as a symlink. Whatever configuration changes you now make and save will be in that one directory. You can then exit and switch_mc two , reverting to the old config, then start mc again, make changes and save them, etc.

You could get away with removing the killall mc and playing around; the configuration stuff is in the ini file, which is read at start-up (so you can't switch on the fly this way). It's then not touched until exit unless you "Save setup", but at exit it may be overwritten, so the danger here is that you erase something you did earlier or outside of the running instance.
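
The switch_mc helper referred to above is not shown in this excerpt; a minimal sketch of what such a function might look like, assuming the profiles are kept in ~/.config/mc.one and ~/.config/mc.two and that ~/.config/mc itself is (or can become) a symlink:

switch_mc () {
    killall mc 2>/dev/null            # make sure no running mc rewrites the ini on exit
    rm -f ~/.config/mc                # remove the old symlink (do NOT do this while ~/.config/mc is still a real directory)
    ln -s ~/.config/mc."$1" ~/.config/mc
}
# usage: switch_mc one    or    switch_mc two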

Tagwint ,Dec 18, 2014 at 16:52

That works indeed, your idea is pretty clear, thank you for your time. However, my idea was to be able to run differently configured mc's under the same account without them interfering with each other; I should have specified that in my question. The path to the config dir is in fact hardcoded, but it is hardcoded RELATIVE to the user's home dir, that is, the value of $HOME, thus changing it before mc starts DOES change the config dir location - I've checked that. The drawback is that $HOME stays changed as long as mc runs, which could be resolved if mc had a kind of startup hook to restore the original HOME. – Tagwint Dec 18 '14 at 16:52

Tagwint ,Dec 18, 2014 at 17:17

I've extended my original q with 'same time' condition - it did not fit in my prev comment size limitation – Tagwint Dec 18 '14 at 17:17

[Jun 13, 2018] MC (Midnight Commander) mc/ini settings file location

Jun 13, 2018 | unix.stackexchange.com

UVV ,Oct 13, 2014 at 7:51

It's in the following file: ~/.config/mc/ini .

obohovyk ,Oct 13, 2014 at 7:53

Unfortunately not... – obohovyk Oct 13 '14 at 7:53

UVV ,Oct 13, 2014 at 8:02

@alexkowalski then it's ~/.config/mc/iniUVV Oct 13 '14 at 8:02

obohovyk ,Oct 13, 2014 at 8:41

Yeah, thanks!!! – obohovyk Oct 13 '14 at 8:41


If you have not made any changes, the config file does not yet exist.

The easy way to change from the default skin:

  1. Start Midnight Commander
    sudo mc
    
  2. F9 , O for Options, or cursor to "Options" and press Enter
  3. A for Appearance, or cursor to Appearance and press Enter

    You will see that default is the current skin.

  4. Press Enter to see the other skin choices
  5. Cursor to the skin you want and select it by pressing Enter
  6. Click OK

After you do this, the ini file will exist and can be edited, but it is easier to change skins using the method I described.

[Jun 02, 2018] Parallelise rsync using GNU Parallel

Jun 02, 2018 | unix.stackexchange.com



Mandar Shinde ,Mar 13, 2015 at 6:51

I have been using a rsync script to synchronize data at one host with the data at another host. The data has numerous small-sized files that contribute to almost 1.2TB.

In order to sync those files, I have been using rsync command as follows:

rsync -avzm --stats --human-readable --include-from proj.lst /data/projects REMOTEHOST:/data/

The contents of proj.lst are as follows:

+ proj1
+ proj1/*
+ proj1/*/*
+ proj1/*/*/*.tar
+ proj1/*/*/*.pdf
+ proj2
+ proj2/*
+ proj2/*/*
+ proj2/*/*/*.tar
+ proj2/*/*/*.pdf
...
...
...
- *

As a test, I picked two of those projects (8.5GB of data) and executed the command above. Being a sequential process, it took 14 minutes 58 seconds to complete. So, for 1.2TB of data it would take several hours.

If I could run multiple rsync processes in parallel (using & , xargs or parallel ), it would save time.

I tried with below command with parallel (after cd ing to source directory) and it took 12 minutes 37 seconds to execute:

parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/ ::: .

This should have taken 5 times less time, but it didn't. I think, I'm going wrong somewhere.

How can I run multiple rsync processes in order to reduce the execution time?

Ole Tange ,Mar 13, 2015 at 7:25

Are you limited by network bandwidth? Disk iops? Disk bandwidth? – Ole Tange Mar 13 '15 at 7:25

Mandar Shinde ,Mar 13, 2015 at 7:32

If possible, we would want to use 50% of total bandwidth. But, parallelising multiple rsync s is our first priority. – Mandar Shinde Mar 13 '15 at 7:32

Ole Tange ,Mar 13, 2015 at 7:41

Can you let us know your: Network bandwidth, disk iops, disk bandwidth, and the bandwidth actually used? – Ole Tange Mar 13 '15 at 7:41

Mandar Shinde ,Mar 13, 2015 at 7:47

In fact, I do not know about above parameters. For the time being, we can neglect the optimization part. Multiple rsync s in parallel is the primary focus now. – Mandar Shinde Mar 13 '15 at 7:47

Mandar Shinde ,Apr 11, 2015 at 13:53

Following steps did the job for me:
  1. Run rsync --dry-run first in order to get the list of files that would be affected.

rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log

  2. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:

cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log

Here, the --relative option ( link ) ensured that the directory structure for the affected files, at the source and destination, remains the same (inside the /data/ directory), so the command must be run in the source folder (in the example, /data/projects ).

Sandip Bhattacharya ,Nov 17, 2016 at 21:22

That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them:

rm backups.*
split -l 3000 backup.list backups.
ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/

– Sandip Bhattacharya Nov 17 '16 at 21:22

Mike D ,Sep 19, 2017 at 16:42

How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/ . – Mike D Sep 19 '17 at 16:42

Cheetah ,Oct 12, 2017 at 5:31

On newer versions of rsync (3.1.0+), you can use --info=name in place of -v , and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them. – Cheetah Oct 12 '17 at 5:31

Mikhail ,Apr 10, 2017 at 3:28

I would strongly discourage anybody from using the accepted answer; a better solution is to crawl the top-level directory and launch a proportional number of rsync operations.

I have a large zfs volume and my source was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1 .

The source drive was mounted like:

mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0

Using a single rsync process:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod

the io meter reads:

StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.62K      0   130M

In synthetic benchmarks (CrystalDiskMark), performance for sequential writes approaches 900 MB/s, which means the link is saturated. 130 MB/s is not very good, and it is the difference between waiting a weekend and waiting two weeks.

So, I built the file list and tried to run the sync again (I have a 64 core machine):

cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log

and it had the same performance!

StoragePod  29.9T   144T      0  1.63K      0   130M
StoragePod  29.9T   144T      0  1.62K      0   130M
StoragePod  29.9T   144T      0  1.56K      0   129M

As an alternative I simply ran rsync on the root folders:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell

This actually boosted performance:

StoragePod  30.1T   144T     13  3.66K   112K   343M
StoragePod  30.1T   144T     24  5.11K   184K   469M
StoragePod  30.1T   144T     25  4.30K   196K   373M

In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.
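
A minimal sketch of that per-directory approach, reusing the paths from the example above (adjust -j to your hardware; one rsync is launched per top-level source directory):

cd /mnt/Datahoarder_Mount/Mikhail
ls -d */ | parallel -j 4 rsync -a {} /StoragePod/{}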

Julien Palard ,May 25, 2016 at 14:15

I personally use this simple one:
ls -1 | parallel rsync -a {} /destination/directory/

This is only useful when you have more than a few non-nearly-empty directories; otherwise almost every rsync will terminate early and the last one will do all the job alone.

Ole Tange ,Mar 13, 2015 at 7:25

A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.

The following will start one rsync per big file in src-dir to dest-dir on the server fooserver:

cd src-dir; find . -type f -size +100000 | \
parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
  rsync -s -Havessh {} fooserver:/dest-dir/{}

The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:

rsync -Havessh src-dir/ fooserver:/dest-dir/

If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

Mandar Shinde ,Mar 13, 2015 at 7:34

Any other alternative in order to avoid find ? – Mandar Shinde Mar 13 '15 at 7:34

Ole Tange ,Mar 17, 2015 at 9:20

Limit the -maxdepth of find. – Ole Tange Mar 17 '15 at 9:20

Mandar Shinde ,Apr 10, 2015 at 3:47

If I use --dry-run option in rsync , I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process? – Mandar Shinde Apr 10 '15 at 3:47

Ole Tange ,Apr 10, 2015 at 5:51

cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; rsync -s -Havessh {} fooserver:/dest-dir/{} – Ole Tange Apr 10 '15 at 5:51

Mandar Shinde ,Apr 10, 2015 at 9:49

Can you please explain the mkdir -p /dest-dir/{//}\; part? Especially the {//} thing is a bit confusing. – Mandar Shinde Apr 10 '15 at 9:49
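
For what it's worth, {//} is GNU parallel's replacement string for the input with its last path component removed (the dirname), which is why it is used above to pre-create the destination directories. A tiny demonstration:

echo ./a/b/file.bin | parallel echo {//}     # prints ./a/b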


For multi destination syncs, I am using
parallel rsync -avi /path/to/source ::: host1: host2: host3:

Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys

[May 13, 2018] What is the difference between FASTA, FASTQ, and SAM file formats

May 13, 2018 | bioinformatics.stackexchange.com

Konrad Rudolph ,Jun 2, 2017 at 12:16

Let's start with what they have in common: All three formats store
  1. sequence data, and
  2. sequence metadata.

Furthermore, all three formats are text-based.

However, beyond that all three formats are different and serve different purposes.

Let's start with the simplest format:

FASTA

FASTA stores a variable number of sequence records, and for each record it stores the sequence itself, and a sequence ID. Each record starts with a header line whose first character is > , followed by the sequence ID. The next lines of a record contain the actual sequence.

The Wikipedia article gives several examples for peptide sequences, but since FASTQ and SAM are used exclusively (?) for nucleotide sequences, here's a nucleotide example:

>Mus_musculus_tRNA-Ala-AGC-1-1 (chr13.trna34-AlaAGC)
GGGGGTGTAGCTCAGTGGTAGAGCGCGTGCTTAGCATGCACGAGGcCCTGGGTTCGATCC
CCAGCACCTCCA
>Mus_musculus_tRNA-Ala-AGC-10-1 (chr13.trna457-AlaAGC)
GGGGGATTAGCTCAAATGGTAGAGCGCTCGCTTAGCATGCAAGAGGtAGTGGGATCGATG
CCCACATCCTCCA

The ID can be in any arbitrary format, although several conventions exist .

In the context of nucleotide sequences, FASTA is mostly used to store reference data; that is, data extracted from a curated database; the above is adapted from GtRNAdb (a database of tRNA sequences).

FASTQ

FASTQ was conceived to solve a specific problem of FASTA files: when sequencing, the confidence in a given base call (that is, the identity of a nucleotide) varies. This is expressed in the Phred quality score . FASTA had no standardised way of encoding this. By contrast, a FASTQ record contains a sequence of quality scores for each nucleotide.

A FASTQ record has the following format:

  1. A line starting with @ , containing the sequence ID.
  2. One or more lines that contain the sequence.
  3. A new line starting with the character + , and being either empty or repeating the sequence ID.
  4. One or more lines that contain the quality scores.

Here's an example of a FASTQ file with two records:

@071112_SLXA-EAS1_s_7:5:1:817:345
GGGTGATGGCCGCTGCCGATGGCGTC
AAATCCCACC
+
IIIIIIIIIIIIIIIIIIIIIIIIII
IIII9IG9IC
@071112_SLXA-EAS1_s_7:5:1:801:338
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBI

FASTQ files are mostly used to store short-read data from high-throughput sequencing experiments. As a consequence, the sequence and quality scores are usually put into a single line each, and indeed many tools assume that each record in a FASTQ file is exactly four lines long, even though this isn't guaranteed.

As for FASTA, the format of the sequence ID isn't standardised, but different producers of FASTQ use fixed notations that follow strict conventions .

SAM

SAM files are so complex that a complete description [PDF] takes 15 pages. So here's the short version.

The original purpose of SAM files is to store mapping information for sequences from high-throughput sequencing. As a consequence, a SAM record needs to store more than just the sequence and its quality, it also needs to store information about where and how a sequence maps into the reference.

Unlike the previous formats, SAM is tab-based, and each record, consisting of either 11 or 12 fields, fills exactly one line. Here's an example (tabs replaced by fixed-width spacing):

r001  99  chr1  7 30  17M         =  37  39  TTAGATAAAGGATACTG   IIIIIIIIIIIIIIIII
r002  0   chrX  9 30  3S6M1P1I4M  *  0   0   AAAAGATAAGGATA      IIIIIIIIII6IBI    NM:i:1

For a description of the individual fields, refer to the documentation. The relevant bit is this: SAM can express exactly the same information as FASTQ, plus, as mentioned, the mapping information. However, SAM is also used to store read data without mapping information.

In addition to sequence records, SAM files can also contain a header , which stores information about the reference that the sequences were mapped to, and the tool used to create the SAM file. Header information precedes the sequence records and consists of lines starting with @ .

SAM itself is almost never used as a storage format; instead, files are stored in BAM format, which is a compact binary representation of SAM. It stores the same information, just more efficiently, and in conjunction with a search index , allows fast retrieval of individual records from the middle of the file (= fast random access ). BAM files are also much more compact than compressed FASTQ or FASTA files.
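
For illustration, a typical SAM-to-BAM conversion and indexing step looks like this (hypothetical file names; assumes a reasonably recent samtools):

samtools sort -O bam -o aln.sorted.bam aln.sam    # convert to BAM and sort by coordinate
samtools index aln.sorted.bam                     # build the index that enables fast random access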


The above implies a hierarchy in what the formats can store: FASTA ⊂ FASTQ ⊂ SAM.

In a typical high-throughput analysis workflow, you will encounter all three file types:

  1. FASTA to store the reference genome/transcriptome that the sequence fragments will be mapped to.
  2. FASTQ to store the sequence fragments before mapping.
  3. SAM/BAM to store the sequence fragments after mapping.
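
As a small illustration of the FASTA ⊂ FASTQ relationship, here is a common awk one-liner that strips a FASTQ file down to FASTA; it assumes the strict four-lines-per-record layout mentioned above, and the file names are hypothetical:

awk 'NR % 4 == 1 {print ">" substr($0, 2)} NR % 4 == 2 {print}' reads.fastq > reads.fasta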

Scott Gigante ,Aug 17, 2017 at 6:01

FASTQ is used for long-read sequencing as well, which could have a single record being thousands of 80-character lines long. Sometimes these are split by line breaks, sometimes not. – Scott Gigante Aug 17 '17 at 6:01

Konrad Rudolph ,Aug 17, 2017 at 10:03

@ScottGigante I alluded to this by saying that the sequence can take up several lines. – Konrad Rudolph Aug 17 '17 at 10:03

Scott Gigante ,Aug 17, 2017 at 13:22

Sorry, should have clarified: I was just referring to the line "FASTQ files are (almost?) exclusively used to store short-read data from high-throughput sequencing experiments." Definitely not exclusively. – Scott Gigante Aug 17 '17 at 13:22

Konrad Rudolph ,Feb 21 at 17:06

@charlesdarwin I have no idea. The line with the plus sign is completely redundant. The original developers of the FASTQ format probably intended it as a redundancy to simplify error checking (= to see if the record was complete) but it fails at that. In hindsight it shouldn't have been included. Unfortunately we're stuck with it for now. – Konrad Rudolph Feb 21 at 17:06

Wouter De Coster ,Feb 21 at 23:16

@KonradRudolph as far as I know fastq is a combination of fasta and qual files, see also ncbi.nlm.nih.gov/pmc/articles/PMC2847217 This explains the header of the quality part. It, however, doesn't make sense we're stuck with it... – Wouter De Coster Feb 21 at 23:16

eastafri ,May 16, 2017 at 18:57

In a nutshell,

FASTA file format is a DNA sequence format for specifying or representing DNA sequences and was first described by Pearson (Pearson,W.R. and Lipman,D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA, 85, 2444–2448)

FASTQ is another DNA sequence file format that extends the FASTA format with the ability to store the sequence quality. The quality scores are often represented as ASCII characters which correspond to a Phred score.

Both FASTA and FASTQ are common sequence representation formats and have emerged as key data interchange formats for molecular biology and bioinformatics.

SAM is format for representing sequence alignment information from a read aligner. It represents sequence information in respect to a given reference sequence. The information is stored in a series of tab delimited ascii columns. The full SAM format specification is available at http://samtools.sourceforge.net/SAM1.pdf

user172818 ♦ ,May 16, 2017 at 19:07

On a historical note, the Sanger Institute first used the FASTQ format. – user172818 ♦ May 16 '17 at 19:07

Konrad Rudolph ,Jun 2, 2017 at 10:43

SAM can also (and is increasingly used for it, see PacBio) store unaligned sequence information, and in this regard equivalent to FASTQ. – Konrad Rudolph Jun 2 '17 at 10:43

bli ,Jun 2, 2017 at 11:30

Note that fasta is also often use for protein data, not just DNA. – bli Jun 2 '17 at 11:30

BaCh ,May 16, 2017 at 18:53

Incidentally, the first part of your question is something you could have looked up yourself as the first hits on Google of "NAME format" point you to primers on Wikipedia, no less. In future, please do that before asking a question.
  1. FASTA
  2. FASTQ
  3. SAM

FASTA (officially) just stores the name of a sequence and the sequence; unofficially people also add comment fields after the name of the sequence. FASTQ was invented to store both sequence and associated quality values (e.g. from sequencing instruments). SAM was invented to store alignments of (small) sequences (e.g. generated from sequencing) with associated quality values and some further data onto larger sequences, called reference sequences, the latter being anything from a tiny virus sequence to ultra-large plant sequences.

Alon Gelber ,May 16, 2017 at 19:50

FASTA and FASTQ formats are both file formats that contain sequencing reads, while SAM files are these reads aligned to a reference sequence. In other words, FASTA and FASTQ are the "raw data" of sequencing, while SAM is the product of aligning the sequencing reads to a refseq.

A FASTA file contains a read name followed by the sequence. An example of one of these reads for RNASeq might be:

>Flow cell number: lane number: chip coordinates etc.
ATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTA

The FASTQ version of this read will have two more lines, one + as a space holder and then a line of quality scores for the base calls. The qualities are given as characters with '!' being the lowest and '~' being the highest, in increasing ASCII value. It would look something like this

@Flow cell number: lane number: chip coordinates etc.
ATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTA
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

A SAM file has many fields for each alignment, the header begins with the @ character. The alignment contains 11 mandatory fields and various optional ones. You can find the spec file here: https://samtools.github.io/hts-specs/SAMv1.pdf .

Often you'll see BAM files which are just compressed binary versions of SAM files. You can view these alignment files using various tools, such as SAMtools, IGV or USCS Genome browser.

As to the benefits, FASTA/FASTQ vs. SAM/BAM is comparing apples and oranges. I do a lot of RNASeq work, so generally we take the FASTQ files and align them to a refseq using an aligner such as STAR, which outputs SAM/BAM files. There's a lot you can do with just these alignment files, looking at expression, but usually I'll use a tool such as RSEM to "count" the reads from various genes to create an expression matrix, samples as columns and genes as rows. Whether you get FASTQ or FASTA files just depends on your sequencing platform. I've never heard of anybody really using the quality scores.

Konrad Rudolph ,Jun 2, 2017 at 10:47

Careful, the FASTQ format description is wrong: a FASTQ record can span more than four lines; also, + isn't a placeholder, it's a separator between the sequence and the quality score, with an optional repetition of the record ID following it. Finally, the quality score string has to be the same length as the sequence. – Konrad Rudolph Jun 2 '17 at 10:47

[May 09, 2018] How to read binary file in Perl - Stack Overflow

Notable quotes:
"... BTW: I don't think it's a good idea to read tons of binary files into memory at once. You can search them 1 by 1... ..."
May 09, 2018 | stackoverflow.com



Grace ,Jan 19, 2012 at 2:08

I'm having an issue with writing a Perl script to read a binary file.

My code is as the following whereby the $file are files in binary format. I tried to search through the web and apply in my code, tried to print it out, but it seems it doesn't work well.

Currently it only prints the "&&&&&&&&&&&" and "ppppppppppp" lines, but what I really want is for it to print out each $line , so that I can do some other post processing later. Also, I'm not quite sure what $data is; as far as I can see it is part of the code from a sample in an article, and it is supposed to be a scalar. I need somebody who can pinpoint where my code goes wrong. Below is what I did.

my $tmp = "$basedir/$key";
opendir (TEMP1, "$tmp");
my @dirs = readdir(TEMP1);
closedir(TEMP1);

foreach my $dirs (@dirs) {
    next if ($dirs eq "." || $dirs eq "..");
    print "---->$dirs\n";
    my $d = "$basedir/$key/$dirs";
    if (-d "$d") {
        opendir (TEMP2, $d) || die $!;
        my @files = readdir (TEMP2); # This should read binary files
        closedir (TEMP2);

        #my $buffer = "";
        #opendir (FILE, $d) || die $!;
        #binmode (FILE);
        #my @files =  readdir (FILE, $buffer, 169108570);
        #closedir (FILE);

        foreach my $file (@files) {
            next if ($file eq "." || $file eq "..");
            my $f = "$d/$file";
            print "==>$file\n";
            open FILE, $file || die $!;
            binmode FILE;
            foreach ($line = read (FILE, $data, 169108570)) {
                print "&&&&&&&&&&&$line\n";
                print "ppppppppppp$data\n";
            }
            close FILE;
        }
    }
}

I have altered my code so that it goes as below. Now I can read the $data; thanks, J-16 SDiZ, for pointing that out. I'm trying to push the info I got from the binary file into an array called "@array", thinking to grep the array for strings that match "p04", but it fails. Can someone point out where the error is?

my $tmp = "$basedir/$key";
opendir (TEMP1, "$tmp");
my @dirs = readdir (TEMP1);
closedir (TEMP1);

foreach my $dirs (@dirs) {
    next if ($dirs eq "." || $dirs eq "..");
    print "---->$dirs\n";
    my $d = "$basedir/$key/$dirs";
    if (-d "$d") {
        opendir (TEMP2, $d) || die $!;
        my @files = readdir (TEMP2); #This should read binary files
        closedir (TEMP2);

        foreach my $file (@files) {
            next if ($file eq "." || $file eq "..");
            my $f = "$d/$file";
            print "==>$file\n";
            open FILE, $file || die $!;
            binmode FILE;
            foreach ($line = read (FILE, $data, 169108570)) {
                print "&&&&&&&&&&&$line\n";
                print "ppppppppppp$data\n";
                push @array, $data;
            }
            close FILE;
        }
    }
}

foreach $item (@array) {
    #print "==>$item<==\n"; # It prints out content of binary file without the ==> and <== if I uncomment this.. weird!
    if ($item =~ /p04(.*)/) {
        print "=>$item<===============\n"; # It prints "=><===============" according to the number of binary file I have.  This is wrong that I aspect it to print the content of each binary file instead :(
        next if ($item !~ /^w+/);
        open (LOG, ">log") or die $!;
        #print LOG $item;
        close LOG;
    }
}

Again, I changed my code as follows, but it still doesn't work: judging by the "log" file, it is not able to grep "p04" correctly. It grabbed the whole file including binary data, like this: "@^@^@^@^G^D^@^@^@^^@p04bbhi06^@^^@^@^@^@^@^@^@^@hh^R^@^@^@^^@^@^@p04lohhj09^@^@^@^^@@". What I'm expecting is for it to grep only the entries containing p04, such as p04bbhi06 and p04lohhj09. Here is how my code goes:

foreach my $file (@files) {
    next if ($file eq "." || $file eq "..");
    my $f = "$d/$file";
    print "==>$file\n";
    open FILE, $f || die $!;
    binmode FILE;
    my @lines = <FILE>;
    close FILE;
    foreach $cell (@lines) {
        if ($cell =~ /b12/) {
            push @array, $cell;
        }
    }
}

#my @matches = grep /p04/, @lines;
#foreach $item (@matches) {
foreach $item (@array) {
    #print "-->$item<--";
    open (LOG, ">log") or die $!;
    print LOG $item;
    close LOG;
}

Brad Gilbert ,Jan 19, 2012 at 15:53

use autodie Brad Gilbert Jan 19 '12 at 15:53

reinierpost ,Jan 30, 2012 at 13:00

There is no such thing as 'binary format'. Please be more precise. What format are the files in? What characteristics do they have that cause you to call them 'in binary format'? – reinierpost Jan 30 '12 at 13:00

Grace ,Jan 31, 2012 at 6:56

It is in .gds format. This file can be read in Unix with the strings command. It was readable in my Perl script but I am not able to grep the data I wanted (p04* here in my code). – Grace Jan 31 '12 at 6:56

mivk ,Nov 19, 2013 at 13:16

As already suggested, use File::Find or something to get your list of files. For the rest, what do you really want? Output the whole file content if you found a match? Or just the parts that match? And what do you want to match? p04(.*) matches anything from "p04" up to the next newline. You then have that "anything" in $1 . Leave out all the clumsy directory stuff and concentrate first on what you want out of a single file. How big are the files? You are only reading the first 170MB. And you keep overwriting the "log" file, so it only contains the last item from the last file. – mivk Nov 19 '13 at 13:16

jm666 ,May 12, 2015 at 6:44

@reinierpost the OP under the "binary file" probably mean the opposite of the text files - e.g. same thing as is in the perldoc's -X documentation see the -B explanation. (cite: -B File is a "binary" file (opposite of -T).) – jm666 May 12 '15 at 6:44

J-16 SDiZ ,Jan 19, 2012 at 2:19

Use:
$line = read (FILE, $data, 169108570);

The data is in $data ; and $line is the number of bytes read.

       my $f = "$d/$file" ;
       print "==>$file\n" ;
       open FILE, $file || die $! ;

I guess the full path is in $f , but you are opening $file . (In my testing -- even $f is not the full path, but I guess you may have some other glue code...)

If you just want to walk all the files in a directory, try File::DirWalk or File::Find .

Grace ,Jan 19, 2012 at 2:34

Hi J-16 SDiZ, thanks for the reply. Each of the $file is in binary format, and what I want to do is to read each of the files to grep some information in readable format and dump it into another file (which I consider here as post processing). I want to perform something like "strings <filename> | grep <text syntax>" as in Unix, whereby the <filename> is the $file here in my code. My problem here is that I cannot read the binary file so that I can proceed with the other stuff. Thanks. – Grace Jan 19 '12 at 2:34
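
For reference, the shell pipeline she describes would look roughly like this ($basedir and $key are the same placeholders used in the Perl code above):

for f in "$basedir"/"$key"/*/*; do
    strings "$f" | grep 'p04' >> log
done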

Dimanoid ,Jan 20, 2012 at 8:51

I am not sure if I understood you right.

If you need to read a binary file, you can do the same as for a text file:

open F, "/bin/bash";
my $file = do { local $/; <F> };
close F;

Under Windows you may need to add binmode F; under *nix it works without it.

If you need to find which lines in an array contain some word, you can use the grep function:

my @matches = grep /something/, @array_to_grep;

You will get all matched lines in the new array @matches .

BTW: I don't think it's a good idea to read tons of binary files into memory at once. You can search them 1 by 1...

If you need to find where the match occurs you can use another standard function, index :

my $offset = index($file, 'myword');

Grace ,Jan 30, 2012 at 4:30

Hi Dimanoid, thanks for your answer. I tried it but it didn't work well for me. I tried to edit my code as above (my own code), and it didn't work. Also, I tried the code below as you suggested; it didn't work for me either. Can you point out where I went wrong? Thanks. – Grace Jan 30 '12 at 4:30

Peter Mortensen ,May 1, 2016 at 8:31

What will $file be assigned to? An array of characters? A string? Something else? – Peter Mortensen May 1 '16 at 8:31


I'm not sure I'll be able to answer the OP question exactly, but here are some notes that may be related. (edit: this is the same approach as answer by @Dimanoid, but with more detail)

Say you have a file, which is a mix of ASCII data, and binary. Here is an example in a bash terminal:

$ echo -e "aa aa\x00\x0abb bb" | tee tester.txt
aa aa
bb bb
$ du -b tester.txt 
13  tester.txt
$ hexdump -C tester.txt 
00000000  61 61 20 61 61 00 0a 62  62 20 62 62 0a           |aa aa..bb bb.|
0000000d

Note that byte 00 (specified as \x00 ) is a non-printable character (and in C it also means "end of string") - thereby, its presence makes tester.txt a binary file. The file has a size of 13 bytes as seen by du , because of the trailing \n added by echo (as can be seen from hexdump ).

Now, let's see what happens when we try to read it with perl 's <> diamond operator (see also What's the use of <> in perl? ):

$ perl -e '
open IN, "<./tester.txt";
binmode(IN);
$data = <IN>; # does this slurp entire file in one go?
close(IN);
print "length is: " . length($data) . "\n";
print "data is: --$data--\n";
'

length is: 7
data is: --aa aa
--

Clearly, the entire file didn't get slurped - it broke at the line end \n (and not at the binary \x00 ). That is because the diamond filehandle operator <FH> is actually a shortcut for readline (see Perl Cookbook: Chapter 8, File Contents )

The same link explains that one should undef the input record separator, $/ (which by default is set to \n ), in order to slurp the entire file. You may want this change to apply only locally, which is why braces and local are used instead of undef (see Perl Idioms Explained - my $string = do { local $/; <$fh> }; ); so we have:

$ perl -e '
open IN, "<./tester.txt";
print "_$/_\n"; # check if $/ is \n
binmode(IN);
{
local $/; # undef $/; is global
$data = <IN>; # this should slurp one go now
};
print "_$/_\n"; # check again if $/ is \n
close(IN);
print "length is: " . length($data) . "\n";
print "data is: --$data--\n";
'

_
_
_
_
length is: 13
data is: --aa aa
bb bb
--

... and now we can see the file is slurped in its entirety.

Since binary data implies unprintable characters, you may want to inspect the actual contents of $data by printing via sprintf or pack / unpack instead.

Hope this helps someone,
Cheers!
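
Since what the OP actually wants is the equivalent of strings <filename> | grep p04 from inside Perl, here is a minimal sketch combining the slurp idiom above with a scan for printable runs. The input file name is hypothetical, and the p04 pattern and minimum run length are just the OP's example values:

perl -e '
  open my $fh, "<", "input.gds" or die $!;   # hypothetical input file
  binmode $fh;
  my $data = do { local $/; <$fh> };         # slurp the whole binary file
  close $fh;
  # print every run of 4+ printable ASCII characters that contains "p04",
  # roughly what "strings input.gds | grep p04" would produce
  while ($data =~ /([\x20-\x7e]{4,})/g) {
      print "$1\n" if $1 =~ /p04/;
  }
' > log.txt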

[May 04, 2018] Why are tar archive formats switching to xz compression to replace bzip2 and what about gzip

May 04, 2018 | unix.stackexchange.com

のbるしtyぱんky ,Jan 6, 2014 at 18:39

More and more tar archives use the xz format based on LZMA2 for compression instead of the traditional bzip2 (bz2) compression. In fact kernel.org made a late " Good-bye bzip2 " announcement on 27 Dec 2013 , indicating kernel sources would from that point on be released in both tar.gz and tar.xz format - and on the main page of the website what's directly offered is tar.xz .

Are there any specific reasons explaining why this is happening and what is the relevance of gzip in this context?

> ,

For distributing archives over the Internet, the following things are generally a priority:
  1. Compression ratio (i.e., how small the compressor makes the data);
  2. Decompression time (CPU requirements);
  3. Decompression memory requirements; and
  4. Compatibility (how wide-spread the decompression program is)

Compression memory & CPU requirements aren't very important, because you can use a large fast machine for that, and you only have to do it once.

Compared to bzip2, xz has a better compression ratio and lower (better) decompression time. It, however -- at the compression settings typically used -- requires more memory to decompress [1] and is somewhat less widespread. Gzip uses less memory than either.

So, both gzip and xz format archives are posted, allowing you to pick whichever trade-off suits you.

There isn't really a realistic combination of factors that'd get you to pick bzip2, so it's being phased out.

I looked at compression comparisons in a blog post . I didn't attempt to replicate the results, and I suspect some of it has changed (mostly, I expect xz has improved, as it's the newest).

(There are some specific scenarios where a good bzip2 implementation may be preferable to xz: bzip2 can compress a file with lots of zeros and genome DNA sequences better than xz. Newer versions of xz now have an (optional) block mode which allows data recovery after the point of corruption and parallel compression and [in theory] decompression. Previously, only bzip2 offered these. [2] However, none of these is relevant for kernel distribution.)


1: In archive size, xz -3 is around bzip2 -9 . There, xz uses less memory to decompress. But xz -9 (as, e.g., used for Linux kernel tarballs) uses much more than bzip2 -9 . (And even xz -0 needs more than gzip -9 .)

2: F21 System Wide Change: lbzip2 as default bzip2 implementation

> ,

First of all, this question is not directly related to tar . Tar just creates an uncompressed archive; the compression is applied separately.

Gzip is known to be relatively fast when compared to LZMA2 and bzip2. If speed matters, gzip (especially the multithreaded implementation pigz ) is often a good compromise between compression speed and compression ratio, although there are alternatives if speed is the main concern (e.g. LZ4).

However, if a high compression ratio is desired LZMA2 beats bzip2 in almost every aspect. The compression speed is often slower, but it decompresses much faster and provides a much better compression ratio at the cost of higher memory usage.

There is not much reason to use bzip2 any more, except for backwards compatibility. Furthermore, LZMA2 was designed with multithreading in mind and many implementations by default make use of multicore CPUs (unfortunately xz on Linux does not do this, yet). This makes sense since clock speeds won't increase any more but the number of cores will.

There are multithreaded bzip2 implementations (e.g. pbzip ), but they are often not installed by default. Also note that multithreaded bzip2 only really pays off while compressing; decompression uses a single thread if the file was compressed with a single-threaded bzip2, in contrast to LZMA2. Parallel bzip2 variants can only leverage multicore CPUs if the file was compressed using a parallel bzip2 version, which is often not the case.
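
For a rough, hands-on feel for these trade-offs, a quick sketch (the tarball name is hypothetical; xz -T0 needs xz >= 5.2 for built-in multithreading, and actual sizes and timings depend heavily on the data):

time gzip  -9 -k linux.tar      # fastest, widest compatibility, lowest memory
time bzip2 -9 -k linux.tar      # legacy; largely superseded by xz
time xz    -9 -k -T0 linux.tar  # best ratio; -T0 uses all CPU cores
ls -l linux.tar.gz linux.tar.bz2 linux.tar.xz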

Slyx ,Jan 6, 2014 at 19:14

Short answer : xz is more efficient in terms of compression ratio, so it saves disk space and optimizes the transfer through the network.
You can see this Quick Benchmark to discover the difference through practical tests.

Mark Warburton ,Apr 14, 2016 at 14:15

LZMA2 is a block compression system whereas gzip is not. This means that LZMA2 lends itself to multi-threading. Also, if corruption occurs in an archive, you can generally recover data from subsequent blocks with LZMA2 but you cannot do this with gzip. In practice, you lose the entire archive with gzip subsequent to the corrupted block. With an LZMA2 archive, you only lose the file(s) affected by the corrupted block(s). This can be important in larger archives with multiple files.


[May 04, 2018] bit manipulation - Bit operations in Perl

May 04, 2018 | stackoverflow.com



Toren ,Jan 12, 2011 at 14:50

I have an attribute (32 bits long) in which each bit is responsible for a specific piece of functionality. The Perl script I'm writing should turn on the 4th bit, but preserve the previous values of the other bits.

I use in my program:

sub BitOperationOnAttr
{
    my $a = "";
    MyGetFunc( $a );
    $a |= 0x00000008;
    MySetFunc( $a );
}

** MyGetFunc / MySetFunc are my own functions that read/write the value.

Questions:

  1. Is the usage of $a |= 0x00000008; right?
  2. How do I extract the hex value with a regular expression from a string like this:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892 – toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. Is the usage of $a |= 0x00000008; right?

Yes, this is fine.

  2. How do I extract the hex value with a regular expression from a string like this:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Toren ,Jan 16, 2011 at 12:35

Thank you for quick answer. You show me the right way to solve the problem – Toren Jan 16 '11 at 12:35

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways of dealing with binary data: the bitwise operators, vec , and pack / unpack .

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.

Toren ,Jan 16, 2011 at 12:36

Thank you . Your answer is very informative – Toren Jan 16 '11 at 12:36

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string! If we use a number, say: $val=5; printf("b%08b",$val); (this gives b00000101 ) -- then one can see that the "check bit" syntax, say: for($ix=7;$ix>=0;$ix--) { print vec($val, $ix, 1); }; print "\n"; will not work (it gives 00110101 , which is not the same number). The correct way is to convert the number to an ASCII char, i.e. print vec(sprintf("%c", $val), $ix, 1); . – sdaau Jul 15 '14 at 5:01
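
Putting the two answers together, a minimal sketch of the OP's whole task (the attribute string is just the OP's example, and the MyGetFunc / MySetFunc calls are replaced by a plain variable):

perl -e '
  my $line  = "Attribute: Somestring: value (8 long (0x8))";
  my $value = ($line =~ /0x([0-9a-fA-F]+)/) ? hex($1) : 0;  # extract the hex value
  $value |= 0x00000008;                                     # set bit 3, preserve the other bits
  printf "0x%08x\n", $value;                                # prints 0x00000008 for this input
'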

[Apr 29, 2018] Clear unused space with zeros (ext3, ext4)

Notable quotes:
"... Purpose: I'd like to compress partition images, so filling unused space with zeros is highly recommended. ..."
"... Such an utility is zerofree . ..."
"... Be careful - I lost ext4 filesystem using zerofree on Astralinux (Debian based) ..."
"... If the "disk" your filesystem is on is thin provisioned (e.g. a modern SSD supporting TRIM, a VM file whose format supports sparseness etc.) and your kernel says the block device understands it, you can use e2fsck -E discard src_fs to discard unused space (requires e2fsprogs 1.42.2 or higher). ..."
"... If you have e2fsprogs 1.42.9, then you can use e2image to create the partition image without the free space in the first place, so you can skip the zeroing step. ..."
Apr 29, 2018 | unix.stackexchange.com

Grzegorz Wierzowiecki, Jul 29, 2012 at 10:02

How to clear unused space with zeros ? (ext3, ext4)

I'm looking for something smarter than

cat /dev/zero > /mnt/X/big_zero ; sync; rm /mnt/X/big_zero

Like FSArchiver is looking for "used space" and ignores unused, but opposite site.

Purpose: I'd like to compress partition images, so filling unused space with zeros is highly recommended.

Btw. For btrfs : Clear unused space with zeros (btrfs)

Mat, Jul 29, 2012 at 10:18

Check this out: superuser.com/questions/19326/Mat Jul 29 '12 at 10:18

Totor, Jan 5, 2014 at 2:57

Two different kinds of answers are possible. What are you trying to achieve? Either 1) security, by forbidding someone to read those data, or 2) optimizing compression of the whole partition or SSD performance ( en.wikipedia.org/wiki/Trim_(computing) )? – Totor Jan 5 '14 at 2:57

enzotib, Jul 29, 2012 at 11:45

Such a utility is zerofree .

From its description:

Zerofree finds the unallocated, non-zeroed blocks in an ext2 or ext3 file-system and fills them with zeroes. This is useful if the device on which this file-system resides is a disk image. In this case, depending on the type of disk image, a secondary utility may be able to reduce the size of the disk image after zerofree has been run. Zerofree requires the file-system to be unmounted or mounted read-only.

The usual way to achieve the same result (zeroing the unused blocks) is to run "dd" to create a file full of zeroes that takes up the entire free space on the drive, and then delete this file. This has many disadvantages, which zerofree alleviates:

  • it is slow
  • it makes the disk image (temporarily) grow to its maximal extent
  • it (temporarily) uses all free space on the disk, so other concurrent write actions may fail.

Zerofree has been written to be run from GNU/Linux systems installed as guest OSes inside a virtual machine. If this is not your case, you almost certainly don't need this package.

UPDATE #1

The description of the .deb package contains the following paragraph now which would imply this will work fine with ext4 too.

Description: zero free blocks from ext2, ext3 and ext4 file-systems Zerofree finds the unallocated blocks with non-zero value content in an ext2, ext3 or ext4 file-system and fills them with zeroes...

Grzegorz Wierzowiecki, Jul 29, 2012 at 14:08

Is this the official page of the tool: intgat.tigress.co.uk/rmy/uml/index.html ? Do you think it's safe to use with ext4? – Grzegorz Wierzowiecki Jul 29 '12 at 14:08

enzotib, Jul 29, 2012 at 14:12

@GrzegorzWierzowiecki: yes, that is the page, but for Debian and friends it is already in the repos. I used it on an ext4 partition on a virtual disk to successively shrink the disk image file, and had no problem. – enzotib Jul 29 '12 at 14:12

jlh, Mar 4, 2016 at 10:10

This isn't equivalent to the crude dd method in the original question, since it doesn't work on mounted file systems. – jlh Mar 4 '16 at 10:10

endolith, Oct 14, 2016 at 16:33

zerofree page talks about a patch that lets you do "filesystem is mounted with the zerofree option" so that it always zeros out deleted files continuously. does this require recompiling the kernel then? is there an easier way to accomplish the same thing? – endolith Oct 14 '16 at 16:33

Hubbitus, Nov 23, 2016 at 22:20

Be careful - I lost ext4 filesystem using zerofree on Astralinux (Debian based)Hubbitus Nov 23 '16 at 22:20

Anon, Dec 27, 2015 at 17:53

Summary of the methods (as mentioned in this question and elsewhere) to clear unused space on ext2/ext3/ext4. Zeroing unused space can be done either with the file system unmounted or with it mounted:

Having the filesystem unmounted will give better results than having it mounted. Discarding tends to be the fastest method when a lot of previously used space needs to be zeroed but using zerofree after the discard process can sometimes zero a little bit extra (depending on how discard is implemented on the "disk").

Making the image file smaller. If the image is in a dedicated VM format:

You will need to use an appropriate disk image tool (such as qemu-img convert src_image dst_image ) to enable the zeroed space to be reclaimed and to allow the file representing the image to become smaller.

If the image is a raw file:

One of the following techniques can be used to make the file sparse (so runs of zero stop taking up space):

These days it might be easier to use a tool like virt-sparsify to do these steps and more in one go.
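
A minimal sketch of the "zero, then shrink" workflow for a raw image file (file and device names are hypothetical; zerofree requires the file system to be unmounted or mounted read-only, losetup -P needs a reasonably recent util-linux, and qemu-img must be installed):

sudo losetup -Pf --show guest.img               # attach the raw image, prints e.g. /dev/loop0
sudo zerofree -v /dev/loop0p1                   # zero the unallocated ext2/3/4 blocks
sudo losetup -d /dev/loop0                      # detach
qemu-img convert -O qcow2 guest.img guest.qcow2 # rewrite so zeroed runs take no space on disk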


cas, Jul 29, 2012 at 11:45

sfill from secure-delete can do this and several other related jobs.

e.g.

sfill -l -l -z /mnt/X
UPDATE #1

There is a source tree on GitHub, apparently used by the Arch Linux project, that contains the source for sfill , a tool included in the package secure-delete.

Also a copy of sfill 's man page is here:

cas, Jul 29, 2012 at 12:04

that URL is obsolete. no idea where its home page is now (or even if it still has one), but it's packaged for debian and ubuntu. probably other distros too. if you need source code, that can be found in the debian archives if you can't find it anywhere else. – cas Jul 29 '12 at 12:04

mwfearnley, Jul 31, 2017 at 13:04

The obsolete manpage URL is fixed now. Looks like "Digipedia" is no longer a thing. – mwfearnley Jul 31 '17 at 13:04

psusi, Apr 2, 2014 at 15:27

If you have e2fsprogs 1.42.9, then you can use e2image to create the partition image without the free space in the first place, so you can skip the zeroing step.

mwfearnley, Mar 3, 2017 at 13:36

I couldn't (easily) find any info online about these parameters, but they are indeed given in the 1.42.9 release notes: e2fsprogs.sf.net/e2fsprogs-release.html#1.42.9mwfearnley Mar 3 '17 at 13:36

user64219, Apr 2, 2014 at 14:39

You can use sfill . It's a better solution for thin volumes.

Anthon, Apr 2, 2014 at 15:01

If you want to comment on cas answer, wait until you have enough reputation to do so. – Anthon Apr 2 '14 at 15:01

derobert, Apr 2, 2014 at 17:01

I think the answer is referring to manpages.ubuntu.com/manpages/lucid/man1/sfill.1.html ... which is at least an attempt at answering. ("online" in this case meaning "with the filesystem mounted", not "on the web"). – derobert Apr 2 '14 at 17:01

[Apr 28, 2018] tar exclude single files/directories, not patterns

The important detail about this is that the excluded file name must match exactly the notation reported by the tar listing.
Apr 28, 2018 | stackoverflow.com

Udo G ,May 9, 2012 at 7:13

I'm using tar to make daily backups of a server and want to avoid backup of /proc and /sys system directories, but without excluding any directories named "proc" or "sys" somewhere else in the file tree.

For, example having the following directory tree (" bla " being normal files):

# find
.
./sys
./sys/bla
./foo
./foo/sys
./foo/sys/bla

I would like to exclude ./sys but not ./foo/sys .

I can't seem to find an --exclude pattern that does that...

# tar cvf /dev/null * --exclude=sys
foo/

or...

# tar cvf /dev/null * --exclude=/sys
foo/
foo/sys/
foo/sys/bla
sys/
sys/bla

Any ideas? (Linux Debian 6)

drinchev ,May 9, 2012 at 7:19

Are you sure there is no exclude? If you are using MAC OS it is a different story! Look heredrinchev May 9 '12 at 7:19

Udo G ,May 9, 2012 at 7:21

Not sure I understand your question. There is a --exclude option, but I don't know how to match it for single, absolute file names (not any file by that name) - see my examples above. – Udo G May 9 '12 at 7:21

paulsm4 ,May 9, 2012 at 7:22

Look here: stackoverflow.com/questions/984204/paulsm4 May 9 '12 at 7:22

CharlesB ,May 9, 2012 at 7:29

You can specify absolute paths to the exclude pattern, this way other sys or proc directories will be archived:
tar --exclude=/sys --exclude=/proc /

Udo G ,May 9, 2012 at 7:34

True, but the important detail about this is that the excluded file name must match exactly the notation reported by the tar listing. For my example that would be ./sys - as I just found out now. – Udo G May 9 '12 at 7:34

pjv ,Apr 9, 2013 at 18:14

In this case you might want to use:
--anchored --exclude=sys/\*

because if your tar does not show the leading "/" you have a problem with the filter.

Savvas Radevic ,May 9, 2013 at 10:44

This did the trick for me, thank you! I wanted to exclude a specific directory, not all directories/subdirectories matching the pattern. bsdtar does not have "--anchored" option though, and with bsdtar we can use full paths to exclude specific folders. – Savvas Radevic May 9 '13 at 10:44

Savvas Radevic ,May 9, 2013 at 10:58

ah found it! in bsdtar the anchor is "^": bsdtar cvjf test.tar.bz2 --exclude myfile.avi --exclude "^myexcludedfolder" *Savvas Radevic May 9 '13 at 10:58
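
To tie the answers above together, a minimal sketch for the original /proc and /sys case (GNU tar assumed; the point, as noted in the comments, is that the exclude pattern has to match the member names exactly as tar would list them):

cd /
tar -cvpf /backup/root.tar --exclude=./sys --exclude=./proc .   # archiving "." => member names start with ./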

Stephen Donecker ,Nov 8, 2012 at 19:12

Using tar you can exclude directories by placing a tag file in any directory that should be skipped.

Create tag files,

touch /sys/.exclude_from_backup
touch /proc/.exclude_from_backup

Then,

tar -czf backup.tar.gz --exclude-tag-all=.exclude_from_backup *

pjv ,Apr 9, 2013 at 17:58

Good idea in theory but often /sys and /proc cannot be written to. – pjv Apr 9 '13 at 17:58

[Apr 27, 2018] Shell command to tar directory excluding certain files-folders

Highly recommended!
Notable quotes:
"... Trailing slashes at the end of excluded folders will cause tar to not exclude those folders at all ..."
"... I had to remove the single quotation marks in order to exclude sucessfully the directories ..."
"... Exclude files using tags by placing a tag file in any directory that should be skipped ..."
"... Nice and clear thank you. For me the issue was that other answers include absolute or relative paths. But all you have to do is add the name of the folder you want to exclude. ..."
"... Adding a wildcard after the excluded directory will exclude the files but preserve the directories: ..."
"... You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files the archive, pipe it into cpio to create the tar file: ..."
Apr 27, 2018 | stackoverflow.com

deepwell ,Jun 11, 2009 at 22:57

Is there a simple shell command/script that supports excluding certain files/folders from being archived?

I have a directory that need to be archived with a sub directory that has a number of very large files I do not need to backup.

Not quite solutions:

The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.

I could also use the find command to create a list of files and exclude the ones I don't want to archive and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

Can anybody think of a better/more efficient solution?

EDIT: cma 's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):

cd /folder_to_backup
tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

Rekhyt ,May 1, 2012 at 12:55

Another thing caught me out on that, might be worth a note:

Trailing slashes at the end of excluded folders will cause tar to not exclude those folders at all. – Rekhyt May 1 '12 at 12:55

Brice ,Jun 24, 2014 at 16:06

I had to remove the single quotation marks in order to successfully exclude the directories. ( tar -zcvf gatling-charts-highcharts-1.4.6.tar.gz /opt/gatling-charts-highcharts-1.4.6 --exclude=results --exclude=target ) – Brice Jun 24 '14 at 16:06

Charles Ma ,Jun 11, 2009 at 23:11

You can have multiple exclude options for tar so
$ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

etc will work. Make sure to put --exclude before the source and destination items.

shasi kanth ,Feb 27, 2015 at 10:49

As an example, if you are trying to backup your wordpress project folder, excluding the uploads folder, you can use this command:

tar -cvf wordpress_backup.tar wordpress --exclude=wp-content/uploads

Alfred Bez ,Jul 16, 2015 at 7:28

I came up with the following command: tar -zcv --exclude='file1' --exclude='pattern*' --exclude='file2' -f /backup/filename.tgz . Note that the -f flag needs to precede the tar file name; see:

flickerfly ,Aug 21, 2015 at 16:22

A "/" on the end of the exclude directory will cause it to fail. I guess tar thinks an ending / is part of the directory name to exclude. BAD: --exclude=mydir/ GOOD: --exclude=mydir – flickerfly Aug 21 '15 at 16:22

NightKnight on Cloudinsidr.com ,Nov 24, 2016 at 9:55

> Make sure to put --exclude before the source and destination items. OR use an absolute path for the exclude: tar -cvpzf backups/target.tar.gz --exclude='/home/username/backups' /home/username – NightKnight on Cloudinsidr.com Nov 24 '16 at 9:55

Johan Soderberg ,Jun 11, 2009 at 23:10

To clarify, you can use full path for --exclude. – Johan Soderberg Jun 11 '09 at 23:10

Stephen Donecker ,Nov 8, 2012 at 0:22

Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns

tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup

Exclude files using an exclude file filled with a list of patterns

tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup

Exclude files using tags by placing a tag file in any directory that should be skipped

tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup

Anish Ramaswamy ,May 16, 2015 at 0:11

This answer definitely helped me! The gotcha for me was that my command looked something like tar -czvf mysite.tar.gz mysite --exclude='./mysite/file3' --exclude='./mysite/folder3' , and this didn't exclude anything. – Anish Ramaswamy May 16 '15 at 0:11

Hubert ,Feb 22, 2017 at 7:38

Nice and clear thank you. For me the issue was that other answers include absolute or relative paths. But all you have to do is add the name of the folder you want to exclude.Hubert Feb 22 '17 at 7:38

GeertVc ,Dec 31, 2013 at 13:35

Just want to add to the above, that it is important that the directory to be excluded should NOT contain a final backslash. So, --exclude='/path/to/exclude/dir' is CORRECT , --exclude='/path/to/exclude/dir/' is WRONG . – GeertVc Dec 31 '13 at 13:35

Eric Manley ,May 14, 2015 at 14:10

You can use standard "ant notation" to exclude directories with relative patterns.
This works for me and excludes any .git or node_modules directories.
tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/*  -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt

myInputFile.txt Contains:

/dev2/java
/dev2/javascript

not2qubit ,Apr 4 at 3:24

I believe this require that the Bash shell option variable globstar has to be enabled. Check with shopt -s globstar . I think it off by default on most unix based OS's. From Bash manual: " globstar: If set, the pattern ** used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a '/', only directories and subdirectories match. " – not2qubit Apr 4 at 3:24

Benoit Duffez ,Jun 19, 2016 at 21:14

Don't forget COPYFILE_DISABLE=1 when using tar, otherwise you may get ._ files in your tarballBenoit Duffez Jun 19 '16 at 21:14

Scott Stensland ,Feb 12, 2015 at 20:55

This exclude pattern handles filename suffix like png or mp3 as well as directory names like .git and node_modules
tar --exclude={*.png,*.mp3,*.wav,.git,node_modules} -Jcf ${target_tarball}  ${source_dirname}

Alex B ,Jun 11, 2009 at 23:03

Use the find command in conjunction with the tar append (-r) option. This way you can add files to an existing tar in a single step, instead of a two pass solution (create list of files, create tar).
find /dir/dir -prune ... -o etc etc.... -exec tar rvf ~/tarfile.tar {} \;

carlo ,Mar 4, 2012 at 15:18

To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ... .
# archive a given directory, but exclude various files & directories 
# specified by their full file paths
find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
   -or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 | 
   gnutar --null --no-recursion -czf archive.tar.gz --files-from -
   #bsdtar --null -n -czf archive.tar.gz -T -

Znik ,Mar 4, 2014 at 12:20

you can quote the 'exclude' string, like this: 'somedir/filesdir/*' - then the shell isn't going to expand the asterisks and other glob characters.

Tuxdude ,Nov 15, 2014 at 5:12

xargs -n 1 is another option to avoid xargs: Argument list too long error ;) – Tuxdude Nov 15 '14 at 5:12

Aaron Votre ,Jul 15, 2016 at 15:56

I agree the --exclude flag is the right approach.
$ tar --exclude='./folder_or_file' --exclude='file_pattern' --exclude='fileA'

A word of warning for a side effect that I did not find immediately obvious: The exclusion of 'fileA' in this example will search for 'fileA' RECURSIVELY!

Example: A directory with a single subdirectory containing a file of the same name (data.txt)

data.txt
config.txt
--+dirA
  |  data.txt
  |  config.docx

Mike ,May 9, 2014 at 21:26

After reading this thread, I did a little testing on RHEL 5 and here are my results for tarring up the abc directory:

This will exclude the directories error and logs and all files under the directories:

tar cvpzf abc.tgz abc/ --exclude='abc/error' --exclude='abc/logs'

Adding a wildcard after the excluded directory will exclude the files but preserve the directories:

tar cvpzf abc.tgz --exclude='abc/error/*' --exclude='abc/logs/*' abc/

camh ,Jun 12, 2009 at 5:53

You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files to archive, pipe it into cpio to create the tar file:
find ... | cpio -o -H ustar | gzip -c > archive.tar.gz

frommelmak ,Sep 10, 2012 at 14:08

You can also use one of the "--exclude-tag" options depending on your needs:

The folder hosting the specified FILE will be excluded.
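
For reference, GNU tar has three tag-based exclude options; roughly (from my reading of the GNU tar manual, with a hypothetical tag file name):

tar -czf backup.tar.gz --exclude-tag=.nobackup       /path/to/backup  # skip the contents, keep the directory and the tag file
tar -czf backup.tar.gz --exclude-tag-under=.nobackup /path/to/backup  # skip everything under the tagged directory
tar -czf backup.tar.gz --exclude-tag-all=.nobackup   /path/to/backup  # skip the tagged directory entirely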

Joe ,Jun 11, 2009 at 23:04

Your best bet is to use find with tar, via xargs (to handle the large number of arguments). For example:
find / -print0 | xargs -0 tar cjf tarfile.tar.bz2

jørgensen ,Mar 4, 2012 at 15:23

That can cause tar to be invoked multiple times - and will also pack files repeatedly. Correct is: find / -print0 | tar -T- --null --no-recursion -cjf tarfile.tar.bz2 – jørgensen Mar 4 '12 at 15:23

Stphane ,Dec 19, 2015 at 11:10

I read somewhere that when using xargs , one should use the tar r option instead of c , because when find actually finds loads of results, xargs will split those results (based on the local command line argument limit) into chunks and invoke tar on each part. This will result in an archive containing the last chunk returned by xargs and not all results found by the find command. – Stphane Dec 19 '15 at 11:10

Andrew ,Apr 14, 2014 at 16:21

With GNU tar v1.26, the --exclude needs to come after the archive file and backup directory arguments, should have no leading or trailing slashes, and prefers no quotes (single or double). So, relative to the PARENT directory to be backed up, it's:

tar cvfz /path_to/mytar.tgz ./dir_to_backup --exclude=some_path/to_exclude

Ashwini Gupta ,Jan 12 at 10:30

tar -cvzf destination_folder source_folder -X /home/folder/excludes.txt

-X indicates a file which contains a list of filenames which must be excluded from the backup. For instance, you can specify *~ in this file to not include any filenames ending with ~ in the backup.

Georgios ,Sep 4, 2013 at 22:35

Possibly redundant answer, but since I found it useful, here it is:

While root on FreeBSD (i.e. using csh) I wanted to copy my whole root filesystem to /mnt, but without /usr and (obviously) /mnt. This is what worked (I am at / ):

tar --exclude ./usr --exclude ./mnt --create --file - . | (cd /mnt && tar xvf -)

My whole point is that it was necessary (by putting the ./ ) to specify to tar that the excluded directories were part of the greater directory being copied.

My €0.02

user2792605 ,Sep 30, 2013 at 20:07

I had no luck getting tar to exclude a 5 Gigabyte subdirectory a few levels deep. In the end, I just used the Unix zip command. It was a lot easier for me.

So for this particular example from the original post
(tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz . )

The equivalent would be:

zip -r /backup/filename.zip . -x upload/folder/**\* upload/folder2/**\*

(NOTE: Here is the post I originally used that helped me https://superuser.com/questions/312301/unix-zip-directory-but-excluded-specific-subdirectories-and-everything-within-t )

t0r0X ,Sep 29, 2014 at 20:25

Beware: zip does not pack empty directories, but tar does! – t0r0X Sep 29 '14 at 20:25

RohitPorwal ,Jul 21, 2016 at 9:56

Check it out
tar cvpzf zip_folder.tgz . --exclude=./public --exclude=./tmp --exclude=./log --exclude=fileName

James ,Oct 28, 2016 at 14:01

The following bash script should do the trick. It uses the answer given here by Marcus Sundman.
#!/bin/bash

echo -n "Please enter the name of the tar file you wish to create with out extension "
read nam

echo -n "Please enter the path to the directories to tar "
read pathin

echo tar -czvf $nam.tar.gz
excludes=`find $pathin -iname "*.CC" -exec echo "--exclude \'{}\'" \;|xargs`
echo $pathin

echo tar -czvf $nam.tar.gz $excludes $pathin

This will print out the command you need and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.

Just change *.CC for any other common extension, file name or regex you want to exclude and this should still work.

EDIT

Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC). This list is passed via xargs to the echo command. This prints --exclude 'one entry from the list' . The backslashes (\) are escape characters for the ' marks.

tripleee ,Sep 14, 2017 at 4:27

Requiring interactive input is a poor design choice for most shell scripts. Make it read command-line parameters instead and you get the benefit of the shell's tab completion, history completion, history editing, etc. – tripleee Sep 14 '17 at 4:27

tripleee ,Sep 14, 2017 at 4:38

Additionally, your script does not work for paths which contain whitespace or shell metacharacters. You should basically always put variables in double quotes unless you specifically require the shell to perform whitespace tokenization and wildcard expansion. For details, please see stackoverflow.com/questions/10067266/tripleee Sep 14 '17 at 4:38
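
As a sketch of a non-interactive, quoting-safe variant of the same idea (GNU tar and bash assumed; the directory should be given as a relative path so that the generated patterns line up with tar's member names):

#!/bin/bash
# usage: ./backup.sh <archive-name-without-extension> <directory-to-tar>
name=$1
pathin=$2
# feed the exclude list to tar via -X; process substitution avoids a temporary file
tar -czvf "$name.tar.gz" -X <(find "$pathin" -iname '*.CC') "$pathin"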

> ,Apr 18 at 0:31

For those who have issues with it, some versions of tar would only work properly without the './' in the exclude value.
$ tar --version

tar (GNU tar) 1.27.1

Command syntax that work:

tar -czvf ../allfiles-butsome.tar.gz * --exclude=acme/foo

These will not work:

$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=./acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='./acme/foo'
$ tar --exclude=./acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='./acme/foo' -czvf ../allfiles-butsome.tar.gz *
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=/full/path/acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='/full/path/acme/foo'
$ tar --exclude=/full/path/acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='/full/path/acme/foo' -czvf ../allfiles-butsome.tar.gz *

[Apr 04, 2018] gzip - How can I recover files from a corrupted .tar.gz archive

Apr 04, 2018 | stackoverflow.com



Tom Melluish ,Oct 14, 2008 at 14:30

I have a large number of files in a .tar.gz archive. Checking the file type with the command
file SMS.tar.gz

gives the response

gzip compressed data - deflate method , max compression

When I try to extract the archive with gunzip, after a delay I receive the message

gunzip: SMS.tar.gz: unexpected end of file

Is there any way to recover even part of the archive?

David Segonds ,Oct 14, 2008 at 14:32

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read the The gzip Recovery Toolkit page.

George ,Mar 1, 2014 at 18:57

gzrecover does not come installed on Mac OS. However, Liudvikas Bukys's method worked fine. Had tcpdump piped into gzip, killed it with Control-C, got unexpected EOF trying to decompress the piped file. – George Mar 1 '14 at 18:57

Nemo ,Jun 24, 2016 at 2:49

gzip Recovery Toolkit is tremendous. Thanks! – Nemo Jun 24 '16 at 2:49

Liudvikas Bukys ,Oct 21, 2008 at 18:29

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover, but it requires quite a bit of custom programming; it's really only worth it if you have absolutely no other recourse (no backups) and the data is worth the effort. (I have done it successfully.) I mentioned this scenario in a previous question .

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus, they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.
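
As a follow-up sketch for the simple truncation case: decompress what is readable and let tar extract as far as it can (the archive name is the OP's; gunzip will still print the "unexpected end of file" error, but the partial output is written anyway):

gunzip < SMS.tar.gz > SMS.tar.partial   # partial tar stream; the error at the end is expected
tar -xvf SMS.tar.partial                # extracts members up to the truncation point
# or in one pass:
gunzip < SMS.tar.gz | tar -xv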

JohnEye ,Oct 4, 2016 at 11:27

There will most likely be an unreadable file after this procedure. Fortunately, there is a tool to fix this and get the partial data from it too: riaschissl.bestsolution.at/2015/03/JohnEye Oct 4 '16 at 11:27

> ,

Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress, trying to unzip gave the error:
gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple using the unix dos2unix utility

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt 
....etc.

It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.

[Mar 15, 2018] Why Python Sucks Armin Ronacher's Thoughts and Writings

Mar 15, 2018 | lucumr.pocoo.org

Armin Ronacher 's Thoughts and Writings

Why Python Sucks

written on Monday, June 11, 2007

And of course also why it sucks a lot less than any other language. But it's not perfect. My personal problems with python:

Why it still sucks less? Good question. Probably because the meta programming capabilities are great, the libraries are awesome, indention based syntax is hip, first class functions , quite fast, many bindings (PyGTW FTW!) and the community is nice and friendly. And there is WSGI!

[Mar 15, 2018] programming languages - What are the drawbacks of Python - Software Engineering Stack Exchange

Mar 15, 2018 | softwareengineering.stackexchange.com


> ,

I use Python somewhat regularly, and overall I consider it to be a very good language. Nonetheless, no language is perfect. Here are the drawbacks in order of importance to me personally:
  1. It's slow. I mean really, really slow. A lot of times this doesn't matter, but it definitely means you'll need another language for those performance-critical bits.
  2. Nested functions kind of suck in that you can't modify variables in the outer scope. Edit: I still use Python 2 due to library support, and this design flaw irritates the heck out of me, but apparently it's fixed in Python 3 due to the nonlocal statement. Can't wait for the libs I use to be ported so this flaw can be sent to the ash heap of history for good.
  3. It's missing a few features that can be useful to library/generic code and IMHO are simplicity taken to unhealthy extremes. The most important ones I can think of are user-defined value types (I'm guessing these can be created with metaclass magic, but I've never tried), and ref function parameter.
  4. It's far from the metal. Need to write threading primitives or kernel code or something? Good luck.
  5. While I don't mind the lack of ability to catch semantic errors upfront as a tradeoff for the dynamism that Python offers, I wish there were a way to catch syntactic errors and silly things like mistyping variable names without having to actually run the code.
  6. The documentation isn't as good as languages like PHP and Java that have strong corporate backings.

[Mar 15, 2018] Why Python Sucks

Mar 15, 2018 | dannyman.toldme.com

[Mar 13, 2018] git log - View the change history of a file using Git versioning

Mar 13, 2018 | stackoverflow.com

Richard ,Nov 10, 2008 at 15:42

How can I view the change history of an individual file in Git, complete details with what has changed?

I have got as far as:

git log -- [filename]

which shows me the commit history of the file, but how do I get at the content of each of the file changes?

I'm trying to make the transition from MS SourceSafe and that used to be a simple right-click → show history .

chris ,May 10, 2010 at 8:58

The above link is no-longer valid. This link is working today: Git Community Bookchris May 10 '10 at 8:58

Claudio Acciaresi ,Aug 24, 2009 at 12:05

For this I'd use:
gitk [filename]

or to follow filename past renames

gitk --follow [filename]

Egon Willighagen ,Apr 6, 2010 at 15:50

But I'd rather have a tool that combines the above with 'git blame', allowing me to browse the source of a file as it changes over time... – Egon Willighagen Apr 6 '10 at 15:50

Dan Moulding ,Mar 30, 2011 at 23:17

Unfortunately, this doesn't follow the history of the file past renames. – Dan Moulding Mar 30 '11 at 23:17

Florian Gutmann ,Apr 26, 2011 at 9:05

I was also looking for the history of files that were previously renamed and found this thread first. The solution is to use "git log --follow <filename>" as Phil pointed out here . – Florian Gutmann Apr 26 '11 at 9:05

mikemaccana ,Jul 18, 2011 at 15:17

The author was looking for a command line tool. While gitk comes with GIT, it's neither a command line app nor a particularly good GUI. – mikemaccana Jul 18 '11 at 15:17

hdgarrood ,May 13, 2013 at 14:57

Was he looking for a command line tool? "right click -> show history" certainly doesn't imply it. – hdgarrood May 13 '13 at 14:57

VolkA ,Nov 10, 2008 at 15:56

You can use
git log -p filename

to let git generate the patches for each log entry.

See

git help log

for more options - it can actually do a lot of nice things :) To get just the diff for a specific commit you can

git show HEAD

or any other revision by identifier. Or use

gitk

to browse the changes visually.

Jonas Byström ,Feb 17, 2011 at 17:13

git show HEAD shows all files, do you know how to track an individual file (as Richard was asking for)? – Jonas Byström Feb 17 '11 at 17:13

Marcos Oliveira ,Feb 9, 2012 at 21:44

you use: git show <revision> -- filename ; that will show the diffs for that revision, in case one exists. – Marcos Oliveira Feb 9 '12 at 21:44

Raffi Khatchadourian ,May 9, 2012 at 22:29

--stat is also helpful. You can use it together with -p. – Raffi Khatchadourian May 9 '12 at 22:29

Paulo Casaretto ,Feb 27, 2013 at 18:05

This is great. gitk does not behave well when specifying paths that do not exist anymore. I used git log -p -- path . – Paulo Casaretto Feb 27 '13 at 18:05

ghayes ,Jul 21, 2013 at 19:28

Plus gitk looks like it was built by the boogie monster. This is a great answer and is best tailored to the original question. – ghayes Jul 21 '13 at 19:28

Dan Moulding ,Mar 30, 2011 at 23:25

git log --follow -p -- file

This will show the entire history of the file (including history beyond renames and with diffs for each change).

In other words, if the file named bar was once named foo , then git log -p bar (without the --follow option) will only show the file's history up to the point where it was renamed -- it won't show the file's history when it was known as foo . Using git log --follow -p bar will show the file's entire history, including any changes to the file when it was known as foo . The -p option ensures that diffs are included for each change.

Raffi Khatchadourian ,May 9, 2012 at 22:29

--stat is also helpful. You can use it together with -p. – Raffi Khatchadourian May 9 '12 at 22:29

zzeroo ,Sep 6, 2012 at 14:11

Dan's answer is the only real one! git log --follow -p filezzeroo Sep 6 '12 at 14:11

Trevor Boyd Smith ,Sep 11, 2012 at 18:54

I agree this is the REAL answer. (1.) --follow ensures that you see file renames (2.) -p ensures that you see how the file gets changed (3.) it is command line only. – Trevor Boyd Smith Sep 11 '12 at 18:54

Dan Moulding ,May 28, 2015 at 16:10

@Benjohn The -- option tells Git that it has reached the end of the options and that anything that follows -- should be treated as an argument. For git log this only makes any difference if you have a path name that begins with a dash . Say you wanted to know the history of a file that has the unfortunate name "--follow": git log --follow -p -- --followDan Moulding May 28 '15 at 16:10

NHDaly ,May 30, 2015 at 6:03

@Benjohn: Normally, the -- is useful because it can also guard against any revision names that match the filename you've entered, which can actually be scary. For example: If you had both a branch and a file named foo , git log -p foo would show the git log history up to foo , not the history for the file foo . But @DanMoulding is right that since the --follow command only takes a single filename as its argument, this is less necessary since it can't be a revision . I just learned that. Maybe you were right to leave it out of your answer then; I'm not sure. – NHDaly May 30 '15 at 6:03

Falken ,Jun 7, 2012 at 10:23

If you prefer to stay text-based, you may want to use tig .

Quick Install:

Use it to view history on a single file: tig [filename]
Or browse detailed repo history: tig

Similar to gitk but text based. Supports colors in terminal!

Tom McKenzie ,Oct 24, 2012 at 5:28

Excellent text-based tool, great answer. I freaked out when I saw the dependencies for gitk installing on my headless server. Would upvote again A+++ – Tom McKenzie Oct 24 '12 at 5:28

gloriphobia ,Oct 27, 2017 at 12:05

You can look at specific files with tig too, i.e. tig -- path/to/specific/filegloriphobia Oct 27 '17 at 12:05

farktronix ,Nov 11, 2008 at 6:12

git whatchanged -p filename is also equivalent to git log -p filename in this case.

You can also see when a specific line of code inside a file was changed with git blame filename . This will print out a short commit id, the author, timestamp, and complete line of code for every line in the file. This is very useful after you've found a bug and you want to know when it was introduced (or who's fault it was).

rockXrock ,Mar 8, 2013 at 9:45

+1, but filename is not optional in command git blame filename . – rockXrock Mar 8 '13 at 9:45

ciastek ,Mar 18, 2014 at 8:03

"New users are encouraged to use git-log instead. (...) The command is kept primarily for historical reasons;" – ciastek Mar 18 '14 at 8:03

Mark Fox ,Jul 30, 2013 at 18:55

SourceTree users

If you use SourceTree to visualize your repository (it's free and quite good) you can right click a file and select Log Selected

The display is much friendlier than gitk and most of the other options listed. Unfortunately (at this time) there is no easy way to launch this view from the command line -- SourceTree's CLI currently just opens repos.

Chris ,Mar 13, 2015 at 13:07

I particularly like the option "Follow renamed files", which allows you to see if a file was renamed or moved. – Chris Mar 13 '15 at 13:07

Sam Lewallen ,Jun 30, 2015 at 6:16

but unless i'm mistaken (please let me know!), one can only compare two versions at a time in the gui? Are there any clients which have an elegant interface for diffing several different versions at once? Possibly with a zoom-out view like in Sublime Text? That would be really useful I think. – Sam Lewallen Jun 30 '15 at 6:16

Mark Fox ,Jun 30, 2015 at 18:47

@SamLewallen If I understand correctly you want to compare three different commits? This sounds similar to a three-way merge (mine, yours, base) -- usually this strategy is used for resolving merge conflicts not necessarily comparing three arbitrary commits. There are many tools that support three way merges stackoverflow.com/questions/10998728/ but the trick is feeding these tools the specific revisions gitready.com/intermediate/2009/02/27/Mark Fox Jun 30 '15 at 18:47

Sam Lewallen ,Jun 30, 2015 at 19:02

Thanks Mark Fox, that's what I mean. Do you happen to know of any applications that will do that? – Sam Lewallen Jun 30 '15 at 19:02

AechoLiu ,Jan 25 at 6:58

You save my life. You can use gitk to find the SHA1 hash, and then open SourceTree to enter Log Selected.. based on the found SHA1 . – AechoLiu Jan 25 at 6:58

yllohy ,Aug 11, 2010 at 13:01

To show what revision and author last modified each line of a file:
git blame filename

or if you want to use the powerful blame GUI:

git gui blame filename

John Lawrence Aspden ,Dec 5, 2012 at 18:38

Summary of other answers after reading through them and playing a bit:

The usual command line command would be

git log --follow --all -p dir/file.c

But you can also use either gitk (gui) or tig (text-ui) to give much more human-readable ways of looking at it.

gitk --follow --all -p dir/file.c

tig --follow --all -p dir/file.c

Under debian/ubuntu, the install command for these lovely tools is as expected :

sudo apt-get install gitk tig

And I'm currently using:

alias gdf='gitk --follow --all -p'

so that I can just type gdf dir to get a focussed history of everything in subdirectory dir .

PopcornKing ,Feb 25, 2013 at 17:11

I think this is a great answer. Maybe you aren't getting voted up as much because your answer shows other (IMHO better) ways to see the changes, i.e. via gitk and tig in addition to git. – PopcornKing Feb 25 '13 at 17:11

parasrish ,Aug 16, 2016 at 10:04

Just to add to the answer: locate the path (in git space) up to the point that still exists in the repository, then use the command stated above, "git log --follow --all -p <folder_path/file_path>". The file/folder may have been removed somewhere in the history, so locate the longest path that still exists and try to fetch its history. Works! – parasrish Aug 16 '16 at 10:04

cregox ,Mar 18, 2017 at 9:44

--all is for all branches, the rest is explained in @Dan's answer – cregox Mar 18 '17 at 9:44

Palesz ,Jun 26, 2013 at 20:12

Add this alias to your .gitconfig:
[alias]
    lg = log --all --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset'\n--abbrev-commit --date=relative

And use the command like this:

> git lg
> git lg -- filename

The output will look almost exactly the same as the gitk output. Enjoy.

jmbeck ,Jul 22, 2013 at 14:40

After I ran that lg shortcut, I said (and I quote) "Beautiful!". However, note that the "\n" after "--graph" is an error. – jmbeck Jul 22 '13 at 14:40

Egel ,Mar 27, 2015 at 12:11

Also can be used git lg -p filename - it returns a beautiful diff of searched file. – Egel Mar 27 '15 at 12:11

Jian ,Nov 19, 2012 at 6:25

I wrote git-playback for this exact purpose
pip install git-playback
git playback [filename]

This has the benefit of both displaying the results in the command line (like git log -p ) while also letting you step through each commit using the arrow keys (like gitk ).

George Anderson ,Sep 17, 2010 at 16:50

Or:

gitx -- <path/to/filename>

if you're using gitx

Igor Ganapolsky ,Sep 4, 2011 at 16:19

For some reason my gitx opens up blank. – Igor Ganapolsky Sep 4 '11 at 16:19

zdsbs ,Jan 3, 2014 at 18:17

@IgorGanapolsky you have to make sure you're at the root of your git repository – zdsbs Jan 3 '14 at 18:17

Adi Shavit ,Aug 7, 2012 at 13:57

If you want to see the whole history of a file, including on all other branches use:
gitk --all <filename>

lang2 ,Nov 10, 2015 at 11:53

Lately I discovered tig and found it very useful. There are some cases where I wish it did A or B, but most of the time it's rather neat.

For your case, tig <filename> might be what you're looking for.

http://jonas.nitro.dk/tig/

PhiLho ,Nov 28, 2012 at 15:58

With the excellent Git Extensions , you go to a point in the history where the file still existed (if it has been deleted, otherwise just go to HEAD), switch to the File tree tab, right-click on the file and choose File history .

By default, it follows the file through the renames, and the Blame tab allows to see the name at a given revision.

It has some minor gotchas, like showing fatal: Not a valid object name in the View tab when clicking on the deletion revision, but I can live with that. :-)

Evan Hahn ,May 9, 2013 at 20:39

Worth noting that this is Windows-only. – Evan Hahn May 9 '13 at 20:39

Shmil The Cat ,Aug 9, 2015 at 17:42

@EvanHahn not accurate, via mono one can use GitExtension also on Linux, we use it on ubuntu and quite happy w/ it. see git-extensions-documentation.readthedocs.org/en/latest/Shmil The Cat Aug 9 '15 at 17:42

cori ,Nov 10, 2008 at 15:56

If you're using the git GUI (on Windows) under the Repository menu you can use "Visualize master's History". Highlight a commit in the top pane and a file in the lower right and you'll see the diff for that commit in the lower left.

jmbeck ,Jul 22, 2013 at 14:42

How does this answer the question? – jmbeck Jul 22 '13 at 14:42

cori ,Jul 22, 2013 at 15:34

Well, OP didn't specify command line, and moving from SourceSafe (which is a GUI) it seemed relevant to point out that you could do pretty much the same thing that you can do in VSS in the Git GUI on Windows. – cori Jul 22 '13 at 15:34

Malks ,Dec 1, 2011 at 5:24

The answer I was looking for that wasn't in this thread is to see changes in files that I'd staged for commit. i.e.
git diff --cached

ghayes ,Jul 21, 2013 at 19:47

If you want to include local (unstaged) changes, I often run git diff origin/master to show the complete differences between your local branch and the master branch (which can be updated from remote via git fetch ) – ghayes Jul 21 '13 at 19:47

Brad Koch ,Feb 22, 2014 at 0:04

-1, That's a diff, not a change history. – Brad Koch Feb 22 '14 at 0:04

user3885927 ,Aug 12, 2015 at 21:22

If you use TortoiseGit you should be able to right click on the file and do TortoiseGit --> Show Log . In the window that pops up, make sure:

Noam Manos ,Nov 30, 2015 at 10:54

TortoiseGit (and Eclipse Git as well) somehow misses revisions of the selected file, don't count on it! – Noam Manos Nov 30 '15 at 10:54

user3885927 ,Nov 30, 2015 at 23:20

@NoamManos, I haven't encountered that issue, so I cannot verify if your statement is correct. – user3885927 Nov 30 '15 at 23:20

Noam Manos ,Dec 1, 2015 at 12:06

My mistake, it only happens in Eclipse, but in TortoiseGit you can see all revisions of a file if unchecking "show all project" + checking "all branches" (in case the file was committed on another branch, before it was merged to main branch). I'll update your answer. – Noam Manos Dec 1 '15 at 12:06

Lukasz Czerwinski ,May 20, 2013 at 17:17

git diff -U <filename> gives you a unified diff.

It should be colored in red and green. If it's not, run git config color.ui auto first.

jitendrapurohit ,Aug 13, 2015 at 5:41

You can also try this, which lists the commits that have changed a specific part of a file (implemented in Git 1.8.4).

The result returned would be the list of commits that modified this particular part. The command goes like this:

git log --pretty=short -u -L <upperLimit>,<lowerLimit>:<path_to_filename>

where <upperLimit> is the starting line number and <lowerLimit> is the ending line number of the range in the file.
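
For example, a hypothetical invocation (the path and line range are made up; -L also accepts a function name instead of a line range):

git log --pretty=short -u -L 10,25:src/main.c   # commits that touched lines 10-25 of src/main.c
git log -L :main:src/main.c                     # commits that touched the function "main" in src/main.c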

Antonín Slejška ,Jun 1, 2016 at 10:44

SmartGit :
  1. In the menu enable to display unchanged files: View / Show unchanged files
  2. Right click the file and select 'Log' or press 'Ctrl-L'

AhHatem ,Jan 2, 2013 at 19:35

If you are using eclipse with the git plugin, it has an excellent comparison view with history. Right click the file and select "compare with"=> "history"

avgvstvs ,Sep 27, 2013 at 13:22

That won't allow you to find a deleted file however. – avgvstvs Sep 27 '13 at 13:22

golimar ,Oct 30, 2012 at 14:35

Comparing two versions of a file is different to Viewing the change history of a file – golimar May 7 '15 at 10:22

[Dec 06, 2017] Install R on RedHat errors on dependencies that don't exist

Highly recommended!
Dec 06, 2017 | stackoverflow.com

Jon ,Jul 11, 2014 at 23:55

I have installed R before on a machine running RedHat EL6.5, but I recently had a problem installing new packages (i.e. install.packages()). Since I couldn't find a solution to this, I tried reinstalling R using:
sudo yum remove R

and

sudo yum install R

But now I get:

....
---> Package R-core-devel.x86_64 0:3.1.0-5.el6 will be installed
--> Processing Dependency: blas-devel >= 3.0 for package: R-core-devel-3.1.0-5.el6.x86_64
--> Processing Dependency: libicu-devel for package: R-core-devel-3.1.0-5.el6.x86_64
--> Processing Dependency: lapack-devel for package: R-core-devel-3.1.0-5.el6.x86_64
---> Package xz-devel.x86_64 0:4.999.9-0.3.beta.20091007git.el6 will be installed
--> Finished Dependency Resolution
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
           Requires: blas-devel >= 3.0
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: lapack-devel
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: libicu-devel
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

I already checked, and blas-devel is installed, but the newest version is 0.2.8. Checked using:

yum info openblas-devel.x86_64

Any thoughts as to what is going wrong? Thanks.

Scott Ritchie ,Jul 12, 2014 at 0:31

A cursory search of blas-devel in google shows that the latest version is at least version 3.2. You probably used to have an older version of R installed, and the newer version depends on a version of BLAS not available in RedHat? – Scott Ritchie Jul 12 '14 at 0:31

bdemarest ,Jul 12, 2014 at 0:31

You can solve this by sudo yum install lapack-devel , etc., until the errors stop. – bdemarest Jul 12 '14 at 0:31

Jon ,Jul 14, 2014 at 4:08

sudo yum install lapack-devel does not work. Returns: No package lapack-devel available. Scott - you are right that blas-devel is not available in yum. What is the best way to fix this? – Jon Jul 14 '14 at 4:08

Owen ,Aug 27, 2014 at 18:33

I had the same issue. Not sure why these packages are missing from RHEL's repos, but they are in CentOS 6.5, so the following solution works, if you want to keep things in the package paradigm:
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/lapack-devel-3.2.1-4.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/blas-devel-3.2.1-4.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/texinfo-tex-4.13a-8.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/libicu-devel-4.2.1-9.1.el6_2.x86_64.rpm
sudo yum localinstall *.rpm

cheers


UPDATE: Leon's answer is better -- see below.

DavidJ ,Mar 23, 2015 at 19:50

When installing texinfo-tex-5.1-4.el7.x86_64, it complains about requiring tex(epsd.tex), but I've no idea which package supplies that. This is on RHEL7, obviously (and using CentOS7 packages). – DavidJ Mar 23 '15 at 19:50

Owen ,Mar 24, 2015 at 21:07

Are you trying to install using rpm or yum? yum should attempt to resolve dependencies. – Owen Mar 24 '15 at 21:07

DavidJ ,Mar 25, 2015 at 14:18

It was yum complaining. Adding the analogous CentOS repo to /etc/yum.repos.d temporarily, installing just the missing dependencies, then removing it and installing R fixed the issue. It is apparently an issue/bug with the RHEL package dependencies. I had to be careful to ensure that all other packages came from the RHEL repos, not CentOS, hence it is not a good idea to install R itself while the CentOS repo is active. – DavidJ Mar 25 '15 at 14:18

Owen ,Mar 26, 2015 at 4:49

Glad you figured it out. When I stumbled on this last year I was also surprised that the CentOS repos seemed more complete than RHEL's. – Owen Mar 26 '15 at 4:49

Dave X ,May 28, 2015 at 19:33

They are in the RHEL optional RPMs. See Leon's answer. – Dave X May 28 '15 at 19:33

Leon ,May 21, 2015 at 18:38

Do the following:
  1. vim /etc/yum.repos.d/redhat.repo
  2. Change enabled = 0 to enabled = 1 in the [rhel-6-server-optional-rpms] section of the file
  3. yum install R

DONE!

I think I should give a reference to the site with the solution:

https://bluehatrecord.wordpress.com/2014/10/13/installing-r-on-red-hat-enterprise-linux-6-5/
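
If you would rather not edit redhat.repo by hand, the same optional channel can usually be enabled from the command line. This is a sketch assuming a registered RHEL 6 system with yum-utils installed (on RHEL 7, substitute the rhel-7-server-optional-rpms id):

$ sudo subscription-manager repos --enable=rhel-6-server-optional-rpms   # via subscription-manager
$ sudo yum-config-manager --enable rhel-6-server-optional-rpms           # or via yum-utils
$ sudo yum install R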

Dave X ,May 28, 2015 at 19:31

Works for RHEL7 with [rhel-7-server-optional-rpms] change too. – Dave X May 28 '15 at 19:31

Jon ,Aug 4, 2014 at 4:49

The best solution I could come up with was to install from source. This worked and was not too bad. However, now it isn't in my package manager.

[Dec 06, 2017] Download RStudio Server -- RStudio

Dec 06, 2017 | www.rstudio.com
RStudio Server v0.99 requires RedHat or CentOS version 5.4 (or higher) as well as an installation of R. You can install R for RedHat and CentOS using the instructions on CRAN: https://cran.rstudio.com/bin/linux/redhat/README .

RedHat/CentOS 6 and 7

To download and install RStudio Server open a terminal window and execute the commands corresponding to the 32 or 64-bit version as appropriate.

64bit
Size: 43.5 MB MD5: 1e973cd9532d435d8a980bf84ec85c30 Version: 1.1.383 Released: 2017-10-09

$ wget https://download2.rstudio.org/rstudio-server-rhel-1.1.383-x86_64.rpm
$ sudo yum install --nogpgcheck rstudio-server-rhel-1.1.383-x86_64.rpm
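
After the package installs, it is worth checking that the service actually came up; the commands below are a sketch assuming the stock rstudio-server service shipped in the RPM (it listens on port 8787 by default):

$ sudo rstudio-server verify-installation    # sanity-check the installation and configuration
$ sudo rstudio-server status                 # confirm the server process is running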

See the Getting Started document for information on configuring and managing the server.

Read the RStudio Server Professional Admin Guide for more detailed instructions.

[Dec 06, 2017] The difficulties of moving from Python to R

Dec 06, 2017 | blog.danwin.com

This post is in response to: Python, Machine Learning, and Language Wars , by Sebastian Raschka

As someone who's switched from Ruby to Python (because the latter is far easier to teach, IMO) and who has also put significant time into learning R just to use ggplot2, I was really surprised at the lack of relevant Google results for "switching from python to r" – or similarly phrased queries. In fact, that particular query will bring up more results for R to Python , e.g. " Python Displacing R as The Programming Language For Data ". The use of R is so ubiquitous in academia (and in the wild, ggplot2 tends to wow nearly on the same level as D3) that I had just assumed there were a fair number of Python/Ruby developers who have tried jumping into R. But there aren't: minimaxir's guides are the most comprehensive (and only) how-to-do-R-as-written-by-an-outsider guides I've seen on the web.

By far, the most common shift seems to be Raschka's – going from R to Python:

Well, I guess it's no big secret that I was an R person once. I even wrote a book about it. So, how can I summarize my feelings about R? I am not exactly sure where this quote comes from – I picked it up from someone somewhere some time ago – but it is great for explaining the difference between R and Python: "R is a programming language developed by statisticians for statisticians; Python was developed by a computer scientist, and it can be used by programmers to apply statistical techniques." Part of the message is that both R and Python are similarly capable for "data science" tasks; however, the Python syntax simply feels more natural to me – it's a personal taste.

That said, one of the things I've appreciated about R is how it "just works": I usually install R through Homebrew, but installing RStudio via point and click is also straightforward. I can see why that's a huge appeal for both beginners and people who want to do computation but not necessarily become developers. Hell, I've been struggling for what feels like months to do just the most rudimentary GIS work in Python 3. But in just a couple of weeks of learning R – and leveraging however it manages to package GDAL and all its other geospatial dependencies with rgdal – I've been able to create some decent geospatial visualizations (and queries):

... ... ...

I'm actually enjoying plotting with Matplotlib and seaborn, but it's hard to beat the elegance of ggplot2 – it's worth learning R just to be able to read and better understand Wickham's ggplot2 book and its explanation of the "Grammar of Graphics" . And there's nothing else quite like ggmap in other languages.

Also, I used to hate how <- was used for assignment. Now, that's one of the things I miss most about using R. I've grown up with single-equals-sign assignment in every other language I've learned, but after having to teach some programming, I've seen that the difference between == and = is a common and often hugely stumping error for beginners. Not only that, they have trouble remembering how assignment even works, even for basic variable assignment. I've come to realize that I've programmed so long that I immediately recognize the pattern, but that can't possibly be the case for novices, who, if they've taken general math classes, have never seen the equals sign used that way. The <- operator makes a lot more sense, though I would never have thought that if I hadn't read Hadley Wickham's style guide.

Speaking of Wickham's style guide, one thing I wish I had done at the very early stages of learning R is to have read Wickham's Advanced R book – which is free online (and contains the style guide). Not only is it just a great read for any programmer, like everything Wickham writes, it is not at all an "advanced" book if you are coming from another language. It goes over the fundamentals of how the language is designed. For example, one major pain point for me was not realizing that R does not have scalars – things that appear to be scalars happen to be vectors of length one. This is something Wickham's book mentions in its Data structures chapter .
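
That point is easy to check from the shell; a minimal sketch, assuming Rscript is on the PATH:

$ Rscript -e 'x <- 5; cat(is.vector(x), length(x), "\n")'
# prints "TRUE 1" -- the apparent scalar is really a numeric vector of length one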

Another vital and easy-to-read chapter: Wickham's explanation of R's non-standard evaluation has totally illuminated for me why a programmer of Wickham's caliber enjoys building in R, but also why I would find it infuriating to teach R, versus Python, to beginners.

(Here's another negative take on non-standard evaluation , by an R-using statistician)

FWIW, Wickham has posted a repo attempting to chart and analyze various trends and metrics about R and Python usage . I won't be that methodical; on Reddit, r/Python seems to be by far the biggest programming subreddit. At the time of writing, it has 122,690 readers . By comparison, r/ruby and r/javascript have 31,200 and 82,825 subscribers, respectively. The R-focused subreddit, r/rstats , currently has 8,500 subscribers.

The Python community is so active on Reddit that it has its own learners subreddit – r/learnpython – with 54,300 subscribers .

From anecdotal observations, I don't think Python shows much sign of diminishing popularity on Hacker News, either. Not just because Python-language specific posts keep making the front page, but because of the general increased interest in artificial intelligence, coinciding with Google's recent release of TensorFlow , which they've even quickly ported to Python 3.x .

[Nov 16, 2017] The Decline of Perl - My Opinion

Rumors of Perl's demise were greatly exaggerated...
Notable quotes:
"... Secondly I am personally not overly concerned with what the popular language of the day is. As I commented ages ago at RE (tilly) 1: Java vs. Perl from the CB , the dream that is sold to PHBs of programmers as interchangable monkeys doesn't appeal to me, and is a proven recipe for IT disasters. See Choose the most powerful language for further discussion, and a link to an excellent article by Paul Graham. As long as I have freedom to be productive, I will make the best choice for me. Often that is Perl. ..."
"... Indeed languages like Python and Ruby borrow some of Perl's good ideas, and make them conveniently available to people who want some of the power of Perl, but who didn't happen to click with Perl. I think this is a good thing in the end. Trying different approaches allows people to figure out what they like and why they like it, leading to better languages later. ..."
"... Well, to start off, I think you're equating "popularity" with "success". It's true that Perl awareness is not as widespread as the other languages you mention. (By the way, I notice that two of them, C# and Java, are products of corporations who would like to run away with the market regardless of the technical merit of their solution). But when I first read the title of your note, I thought a lot of other things. To me, "decline" means disuse or death, neither of which I think apply to Perl today. ..."
"... a real program ..."
"... Perl is declining ..."
Feb 02, 2002 | www.perlmonks.org
trs80 (Priest) on Feb 02, 2002 at 18:54 UTC
I love Perl and have been using it since 1996 at my work for administrative tasks as well as web based products. I use Perl on Unix and Windows based machines for numerous tasks.

Before I go in depth about the decline, let me give a little background about myself. I got my first computer in 1981, a TRS-80 Model III: 4 MHz Z80 processor, 16K RAM, no HD, no FD, just a cassette tape sequential read/write for storage and retrieval. The TRS-80 line allowed for assembler or BASIC programs to be run on it. I programmed in both BASIC and assembler, but mostly BASIC since I had limited memory and using the tape became very annoying. Let's time warp forward to 1987, when Perl was first released.

The introduction of Perl was not household knowledge; the use of computers in the home was still considerably low. Those that did have computers most likely did very specific tasks, such as bringing work home from the office. So it is fairly safe to say that Perl was not targeted at inexperienced computer users, but more at system administrators, and boy, did system administrators love it. Now let's time warp ahead to 1994.

1994 marked what I consider the start of the rush to the WWW (not the internet), and it was the birth year of Perl 5 and DBI. The WWW brought us the ability to easily link to any other site/document/page via hypertext markup language or, as we like to say, HTML. This "new" idea created a stir in the non-tech world for the first time. The WWW, as HTML progressed, started to make using and knowing about computers a little less geeky. Most servers were UNIX based, and as the needs for dynamic content or form handling grew, what language was there to assist? Perl. So Perl became, in a way, the default web language for people that hadn't been entrenched in programming another CGI-capable language and just wanted to process a form, create a flat file data store, etc.; that is, non-techies.

Perl served us all well, but the competitors were on the horizon. The web had proven itself not to be a flash in the pan, but a tool through which commerce and social interaction could take new form and give people who had never considered using a computer before a reason to purchase one. So the big software and hardware giants looked for ways to gain control over the web and the Internet in general. There were even smaller players that were Open Source and freeware, just like Perl.

So by 2000 there were several mature choices for Internet development, and Perl was adrift in the sea of choice. I also see 2000 as the year the tide went out for Internet development in general. The "rush" had subsided; companies and investors started to really analyze what they had done and what the new media really offered them. Along with analysis come consultants. Consultants have an interest in researching or developing the best product possible for a company. The company is interested in being able to maintain what it bought once it terminates the contract with the consultant. This brings us to the rub on Perl. How can a consultant convince a company that his application language of choice is free and isn't backed by a company? ActiveState, I believe, backs Perl to some extent, but one company generally isn't enough to put a CTO at ease.

So the decline of Perl use can be summarized with these facts:

  1. Perl is thought of as a UNIX administrative tool
  2. There are not enough professional organizations or guilds to bolster confidence with corporations that investing in a Perl solution is a good long-term plan.
  3. Perl doesn't have large-scale advertising and full-time advocates that keep Perl in major computing publications and remind companies that when they choose, they should choose Perl.
  4. There is no official certification. I have seen Larry's comments on this and I agree with him, but lack of certification hurts in the corporate world.
  5. Lack of college or university Perl classes, or maybe better stated, lack of Perl promotion by colleges.
I suppose all of this really only matters to people that don't make their living extending Perl or using it for system admin work that isn't approved by a board or committee. People that make final products based on Perl for the Internet and as standalone applications are affected by the myths and facts of Perl.

Last year a possible opportunity I had to produce a complete package for a large telecommunications firm failed in part due to lack of confidence in Perl as the language of choice, despite the fact that two districts had been successfully using the prototype and had increased efficiency.

Another factor is overseas development services. My most recent employer had a subsidiary in India with 30 developers. Training for Perl was unheard of. There were signs literally everywhere for C++, C# and Java, but no mention of Perl. It seems Perl is used for down-and-dirty utilities, not full-scale applications.

Maybe Perl isn't "supposed" to be for large-scale applications, but I think it can be and I think it's more then mature enough and supported to provide a corporation with a robust and wise long-term solution.

I am very interested in your opinions about why you feel Perl is or isn't gaining ground.

tilly (Archbishop) on Feb 02, 2002 at 23:30 UTC

Re (tilly) 1: The Decline of Perl - My Opinion

I have many opinions about your points.

First of all I don't know whether Perl is declining. Certainly I know that some of the Perl 6 effort has done exactly what it was intended to do, and attracted effort and interest in Perl. I know that at my job we have replaced the vast majority of our work with Perl, and the directions we are considering away from Perl are not exactly popularly publicized ones.

Secondly I am personally not overly concerned with what the popular language of the day is. As I commented ages ago at RE (tilly) 1: Java vs. Perl from the CB , the dream that is sold to PHBs of programmers as interchangable monkeys doesn't appeal to me, and is a proven recipe for IT disasters. See Choose the most powerful language for further discussion, and a link to an excellent article by Paul Graham. As long as I have freedom to be productive, I will make the best choice for me. Often that is Perl.

Third I don't see it as a huge disaster if Perl at some point falls by the wayside. Perl is not magically great to push just because it is Perl. Perl is good because it does things very well. But other languages can adopt some of Perl's good ideas and do what Perl already does. Indeed languages like Python and Ruby borrow some of Perl's good ideas, and make them conveniently available to people who want some of the power of Perl, but who didn't happen to click with Perl. I think this is a good thing in the end. Trying different approaches allows people to figure out what they like and why they like it, leading to better languages later.

Perhaps I am being narrow minded in focusing so much on what makes for good personal productivity, but I don't think so. Lacking excellent marketing, Perl can't win in the hype game. It has to win by actually being better for solving problems. Sure, you don't see Perl advertised everywhere. But smart management understands that something is up when a small team of Perl programmers in 3 months manages to match what a fair sized team of Java programmers had done in 2 years. And when the Perl programmers come back again a year later and in a similar time frame do what the Java programmers had planned to do over the next 5 years...

VSarkiss (Monsignor) on Feb 02, 2002 at 22:36 UTC

Re: The Decline of Perl - My Opinion

Well, to start off, I think you're equating "popularity" with "success". It's true that Perl awareness is not as widespread as the other languages you mention. (By the way, I notice that two of them, C# and Java, are products of corporations who would like to run away with the market regardless of the technical merit of their solution). But when I first read the title of your note, I thought a lot of other things. To me, "decline" means disuse or death, neither of which I think apply to Perl today.

The fact that there isn't a company with a lot of money standing behind Perl is probably the cause of the phenomena you observe. The same applies to other software that is produced and maintained by enthusiasts rather than companies. Linux is a very good example. Until recently, Linux was perceived as just a hobbyist's toy. Now there are small glimmers of its acceptance in the corporate world, mainly from IBM stepping up and putting together a marketing package that non-technical people can understand. (I'm thinking of recent TV ad campaigns.) But does that make Linux "better"? Now there's a way to start an argument.

I agree that except in certain cases (look at The Top Perl Shops for some examples), most companies don't "get" Perl. Last year at my current client, I suggested that a new application be prototyped using a Perl backend. My suggestion was met with something between ridicule and disbelief ("We don't want a bunch of scripts , we want a real program ". That one stung.) To give them credit -- and this lends credence to one of your points -- they had very few people that could program Perl. And none of them were very good at it.

So has this lack of knowledge among some people made Perl any worse, or any less useful? No, definitely not. I think the ongoing work with Perl 6 is some of the most interesting and exciting stuff around. I think the language continues to be used in fascinating leading-edge application areas, such as bioinformatics. The state of Perl definitely doesn't fit my definition of "decline".

Nonetheless, I think your point of "corporate acceptance" is well-taken. It's not that the language is declining, it's that it's not making inroads in the average boardroom. How do you get past that barrier? For my part, I think the p5ee project is a step in the right direction. We need to simplify Perl training, which is one of the goals of the standardization, and provide something for a corporate executive to hold on to -- which is a topic of discussion in the mailing list right now.

And the nice part is that the standardized framework doesn't stop all the wonderful and creative Cool uses for Perl that we've become accustomed to. If the lack of corporate acceptance is of concern to you, then join the group. "Don't curse the darkness, light a candle" is an old Chinese proverb .

trs80 (Priest) on Feb 02, 2002 at 22:48 UTC

Re: Re: The Decline of Perl - My Opinion

There was a post on the mod_perl list later the same day I wrote this that shows a steady rise in the use of mod_perl based servers. The graph .
Maybe a better title would have been: Top Hindrances in Selling Perl Solutions

derby (Abbot) on Feb 02, 2002 at 23:00 UTC

Re: The Decline of Perl - My Opinion

trs80,

I agree with your timeline and general ideas about what brought perl into focus. I also agree with perl being adrift in a sea of choices and with the rush subsiding in 2000. I wholeheartedly disagree with your arguments about why perl will decline. Let's not look at your summary but at the pillars of your summary.

1. Perl is thought of as a UNIX administrative tool.

You really don't have much support for this summary. You state that system administrators love it but so do a whole lot of developers!

2. There are not enough professional organizations or guilds to bolster confidence with corporations that investing in a Perl solution is a good long-term plan.

Well, you're at one of the most professional guilds right now. I don't see a ColdFusion monks site, do I? What do you want? Certification? I think more of my BS and BA than I do of any certification. As for "good long-term plans" - very few businesses see past the quarter. While I think this is generally bad , I think it's going to work wonders for open software. Where can you trim the budget to ensure profitability? Cut those huge $$$$ software licenses down to nothing.

3. Perl doesn't have large-scale advertising and full-time advocates that keep Perl in major computing publications and remind companies that when they choose, they should choose Perl.

Hmmm ... not sure I want to base my IT selection on what the mags have to say -- I've seen a whole lot of shelf-ware that was bought due to what some wags say in the latest issue of some Ziff Davis trash.

4. There is no official certification. I have seen Larry's comments on this and I agree with him, but lack of certification hurts in the corporate world.

There are only two certifications that count - one is years of experience and the other is a sheepskin. Anything else is pure garbage. As long as you have the fundamentals of algorithm design down - then who cares what the cert is.

5. Lack of college or university Perl classes, or maybe better stated, lack of Perl promotion by colleges.

I wouldn't expect anyone right out of college to be productive in any language. I would expect them to know what makes a good algorithm - and that my friend is language agnostic. Be it VB, C, C++, or perl - you have to know big-o.

It sounds like you're a little worried about a corporation's perception of our language of choice. I wouldn't be. Given the current perception of corporate management (a la Enron), I think the people who make the (ehh) long-range plans may be around a lot less than us tech weenies. Bring it in through the back door if you want. Rename it if you have to - a SOAP-enabled back-end XML processor may be more appealing than an apache/mod_perl engine (that's where the BA comes in).

It also sounds like you're worried about overseas chop shops. Ed Yourdon rang that bell about ten years ago with "The Decline and Fall of the American Programmer." I must say I lost a lot of respect for Ed on that one. Farming out development to India has proven to be more of a lead egg than a golden hen - time zone headaches and culture clashes have proved very hard to overcome.

Perl is moving on ... it seemed static because everyone was catching up to it. That being said, some day other OSS languages may overtake it, but Python and Ruby are still in perl's rear-view mirror.

Just like loved ones, we tend to ignore those who are around us every day. For this Valentine's Day, do us all a favor and buy some chocolates and a few flowers for our hard-working and beloved partner - we love you, Perl.

-derby

dws (Chancellor) on Feb 02, 2002 at 23:53 UTC

Re: The Decline of Perl - My Opinion

A few thoughts and data points: Perl may have gained ground initially in system administration, and since the Web came along, Perl is now thought of more as the language of CGIs.

My previous employer also had a subsidiary in India, and my distributed project included a team there, working on a large (80K line) web application in Perl. Folks coming out of University in India are more likely to have been trained in C++ or Java, but Perl isn't unknown.

On Certification: Be very careful with this. HR departments might like to see certifications on resumes, but in the development teams I've worked in, a Very Big Flag is raised by someone claiming to have a certification. The implication, fair or not, is "this person is a Bozo."

random (Monk) on Feb 03, 2002 at 02:44 UTC

Re: The Decline of Perl - My Opinion

Here's the thing: to some extent, you're right, but at the same time, you're still way off base. Now let's say, just for a second, that the only thing Perl can / should be used for is system administration and web programming (please don't flame me, guys. I know that is by no means the extent of Perl's capabilities, but I don't have time to write a post covering all the bases right now.) Even assuming that's true, you're still wrong.

Considering the web side of it, yes, Perl is being used in far fewer places now. There are two major reasons for this: one is the abundance of other languages (PHP and ::shudder:: ASP, for example). Another is the fact that a lot of sites backed by Perl programming crashed and burned when the dot-coms did. You know what? I don't see this as a big deal. The techies who wrote those sites are still around, likely still using Perl, and hopefully not hurting its reputation by using it to support companies that will never, ever, make any money or do anything useful...ever. (Not to say that all dot-coms were this way, but c'mon, there were quite a few useless companies out there.) These sites were thrown together (in many cases) to make a quick buck. Granted, Perl can be great for quick-and-dirty code, but do we really want it making up the majority of the volume of the code out there?

System administration: I still think Perl is one of the finest languages ever created for system administration, especially cross-platform system admin, for those of us who don't want to learn the ins and outs of shell scripting 100x over. I really don't think there'll be much argument there, so I'll move on.

The community: when was the last time Perl had organizations as devoted as the Perl Mongers and Yet Another Society? Do you see a website as popular and helpful as Perlmonks for PHP? Before last year, I don't remember ever having programmers whose sole job is to improve and evangelize Perl. Do you? I can't remember ever before having an argument on the Linux kernel mailing list on whether Linux should switch to a quasi-Perl method of patch management. (This past week, for those who haven't been reading the kernel mailing list or Slashdot.)

To be honest, I have no idea what you're talking about. As far as I'm concerned, Perl is as strong as it's ever been. If not, then it's purely a matter of evangelization. I know I've impressed more than one boss by getting what would've been a several-day-long job done in an hour with Perl. Have you?

- Mike -

Biker (Priest) on Feb 03, 2002 at 14:56 UTC

Re: The Decline of Perl - My Opinion

I'm not sure how you back up the statement that Perl is declining , but anyway.

There's a huge difference between writing sysadmin tools and writing business oriented applications. (Unless your business is to provide sysadmin tools. ;-)

In my shop, the sysadmins are free to use almost any tools, languages, etc. to get their work done. OTOH, when it comes to business supporting applications, with end user interface, the situation is very different. This is when our middle level management starts to worry.

My personal experience is that Perl does have a very short and low learning curve when it comes to writing different 'tools' to be used by IT folks.

The learning curve may quickly become long and steep when you want to create a business oriented application, very often combining techniques like CGI, DBI (or SybPerl), using non standard libraries written as OO or not plus adding your own modules to cover your shop-specific topics. Especially if you want to create some shared business objects to be reused by other Perl applications. Add to this that the CGI application shall create JavaScript to help the user orient through the application, sometimes even JS that eval() 's more JS and it becomes tricky. (Did someone mention 'security' as well?)

Furthermore, there is (at least in Central Europe) a huge lack of good training. There are commercial training courses, but those that I've found around where I live are all beginners courses covering the first chapters in the llama. Which is good, but not enough.

Because after the introduction course, my colleagues ask me how to proceed. When I tell them to go on-line, read books and otherwise participate, they are unhappy. Yes, many of them still haven't really learnt how to take advantage of the 'Net. And yes again, not enough people (fewer and fewer?) care to RTFM. They all want quick solutions. And no, they don't want to spend their evenings reading the Cookbook or the Camel. (Some odd colleagues, I admit. ;-)

You can repeatedly have quick solutions using Perl, but that requires efforts to learn. And this learning stage (where I'm still at) is not quick if you need to do something big .

Too many people (around where I make my living) want quick solutions to everything with no or little investment.
(Do I sound old? Yeah, I guess I am. That's how you become after 20+ years in this business. ;-)

Conclusion: In my shop, the middle management are worried what will happen if I leave.

Questions like: " Who will take over all this Perl stuff? " and " How can we get a new colleague up to speed on this Perl thingy within a week or two? " are commonplace here. Which in a way creates a resistance.

I'd say: Momentum! When there is a horde of Perl programmers available, the middle management will sleep well during the nights. (At least concerning "The Perl Problem". ;-)

I'm working very hard to create the momentum that is necessary in our shop and I hope you do the same . ( Biker points a finger at the reader.)

"Livet är hårt" sa bonden.
"Grymt" sa grisen...

beebware (Pilgrim) on Feb 03, 2002 at 19:48 UTC

Re: The Decline of Perl - My Opinion

I agree with most of this, but a major problem that I have noticed is that with the recent 'dot.gone burn', the (job) market is literally flooded with experienced Perl programmers, but there are no positions for them. Mostly this is due to Perl being free. To learn C++, C#, Java you've got to spend a lot of money. Courses do not come cheap, the software doesn't come cheap and certification (where it is offered) isn't cheap.

However, if anybody with just programming knowledge in anything from BASIC upwards and the willingness to learn can get hold of software that is free to use, free to download, free to 'learn by example' (the quickest and easiest way to learn IMHO), you can probably have a Perl programmer that can make a basic script in next to no time. He runs his own website off it and other scripts he wrote, learns, looks at other scripts and, before you know it, he's writing a complete Intranet-based Management Information System in it... If that person had to get C++ and learn by reading books and going to courses (and the associated costs - there isn't that much code available to read and learn by), it would have taken so much longer, and if you are on a budget, it isn't an option.

Compare Linux+MySQL+Apache+Perl to (shudder) Windows 2000 Advanced Server+MS SQL Server+IIS+ASP. Which costs a lot more in setup costs, staff costs and maintenance, let alone security holes? But which do big corporations go for (even though every techy knows which is the best one to go for)? Why? Because 'oh, things are free for a reason - if we've got to pay lots of money for it, it has got to be damn good - just look at all those ASP programmers asking 60,000UKP upwards, it must be good if they are charging that much'.

All in all, if Perl6+ had a license fee charge for it and a 'certification' was made available AND Perl programmers put up their 'minimum wage', Perl would take off again big time. Of course, it's all IMHO, but you did ask for my opinion :-)

tilly (Archbishop) on Feb 03, 2002 at 20:11 UTC

Re (tilly) 2: The Decline of Perl - My Opinion

If you really think that by charging money for Perl and locking it up you would make it more popular and profitable, then I challenge you to go out and try to do it.

If you read the licensing terms, you can take Perl, take advantage of the artistic license, rename it slightly, and make your own version which can be proprietary if you want. (See oraperl and sybperl for examples where this was done with Perl 4.)

My prediction, based on both theory and observation of past examples (particularly examples of what people in the Lisp world do wrong time and time again), is that you will put in a lot of energy, lose money, and never achieve popularity. For some of the theory, the usual starting place is The Cathedral and the Bazaar .

Of course, if you want to charge money for something and can get away with it, go ahead. No less than Larry Wall has said, "It's almost like we're doing Windows users a favor by charging them money for something they could get for free, because they get confused otherwise." But I think that as time goes by it is becoming more mainstream to accept that it is possible for software to be both free and good at the same time.

beebware (Pilgrim) on Feb 04, 2002 at 01:23 UTC

Re: Re (tilly) 2: The Decline of Perl - My Opinion

I know, but that's how heads of departments and corporate management think: these are the people that believe the FUD that Microsoft puts out ('XP is more secure - we best get it then', never mind that Linux is more secure and the Windows 2000 machines are also secure). Sometimes it's just the brand name which also helps - I know of a certain sausage manufacturer who makes sausages for two major supermarkets. People say 'Oh, that's from Supermarket X' upon tasting, although it is just the same sausage.

All in all - it comes down to packaging. 'Tart' something up with packaging, brand names and high prices and, despite the rival product being better in every respect, the 'well packaged' product will win.

[Nov 16, 2017] Which is better, Perl or Python? Which one is more robust? How do they compare with each other?

Notable quotes:
"... Python is relatively constraining - in the sense that it does not give the same amount of freedom as PERL in implementing something (note I said 'same amount' - there is still some freedom to do things). But I also see that as Python's strength - by clamping down the parts of the language that could lead to chaos we can live without, I think it makes for a neat and tidy language. ..."
"... Perl 5, Python and Ruby are nearly the same, because all have copied code from each other and describe the same programming level. Perl 5 is the most backward compatible language of all. ..."
"... Python is notably slower than Perl when using regex or data manipulation. ..."
"... you could use PyCharm with Perl too ..."
May 30, 2015 | www.quora.com

Joe Pepersack , Just Another Perl Hacker Answered May 30 2015

Perl is better. Perl has almost no constraints. Its philosophy is that there is more than one way to do it (TIMTOWTDI, pronounced Tim Toady). Python artificially restricts what you can do as a programmer. Its philosophy is that there should be one way to do it. If you don't agree with Guido's way of doing it, you're sh*t out of luck.

Basically, Python is Perl with training wheels. Training wheels are a great thing for a beginner, but eventually you should outgrow them. Yes, riding without training wheels is less safe. You can wreck and make a bloody mess of yourself. But you can also do things that you can't do if you have training wheels. You can go faster and do interesting and useful tricks that aren't possible otherwise. Perl gives you great power, but with great power comes great responsibility.

A big thing that Pythonistas tout as their superiority is that Python forces you to write clean code. That's true, it does... at the point of a gun, sometimes to the detriment of simplicity or brevity. Perl merely gives you the tools to write clean code (perltidy, perlcritic, use strict, /x option for commenting regexes) and gently encourages you to use them.

Perl gives you more than enough rope to hang yourself (and not just rope, Perl gives you bungee cords, wire, chain, string, and just about any other thing you can possibly hang yourself with). This can be a problem. Python was a reaction to this, and their idea of "solving" the problem was to only give you one piece of rope and make it so short you can't possibly hurt yourself with it. If you want to tie a bungee cord around your waist and jump off a bridge, Python says "no way, bungee cords aren't allowed". Perl says "Here you go, hope you know what you are doing... and by the way here are some things that you can optionally use if you want to be safer"

Some clear advantages of Perl:

Advantages of Python
Dan Lenski , I do all my best thinking in Python, and I've written a lot of it Updated Jun 1 2015
Though I may get flamed for it, I will put it even more bluntly than others have: Python is better than Perl . Python's syntax is cleaner, its object-oriented type system is more modern and consistent, and its libraries are more consistent. ( EDIT: As Christian Walde points out in the comments, my criticism of Perl OOP is out-of-date with respect to the current de facto standard of Moo/se. I do believe that Perl's utility is still encumbered by historical baggage in this area and others.)

I have used both languages extensively for both professional work and personal projects (Perl mainly in 1999-2007, Python mainly since), in domains ranging from number crunching (both PDL and NumPy are excellent) to web-based programming (mainly with Embperl and Flask ) to good ol' munging of text files and database CRUD.

Both Python and Perl have large user communities including many programmers who are far more skilled and experienced than I could ever hope to be. One of the best things about Python is that the community generally espouses this aspect of "The Zen of Python":

Python's philosophy rejects the Perl " there is more than one way to do it " approach to language design in favor of "there should be one -- and preferably only one -- obvious way to do it".
... while this principle might seem stultifying or constraining at first, in practice it means that most good Python programmers think about the principle of least surprise and make it easier for others to read and interface with their code.

In part as a consequence of this discipline, and definitely because of Python's strong typing , and arguably because of its "cleaner" syntax, Python code is considerably easier to read than Perl code. One of the main events that motivated me to switch was the experience of writing code to automate lab equipment in grad school. I realized I couldn't read Perl code I'd written several months prior, despite the fact that I consider myself a careful and consistent programmer when it comes to coding style ( Dunning–Kruger effect , perhaps? :-P).

One of the only things I still use Perl for occasionally is writing quick-and-dirty code to manipulate text strings with regular expressions; Sai Janani Ganesan pointed out Perl's value for this as well . Perl has a lot of nice syntactic sugar and command line options for munging text quickly*, and in fact there's one regex idiom I used a lot in Perl for which I've never found a totally satisfying substitute in Python.


* For example, the one-liner perl -i.bak -pe 's/foo/bar/g' *.txt will go through a bunch of text files and replace foo with bar everywhere, while making backup files with the .bak extension.
David Roth , Linux sysadmin, developer, vim enthusiast Answered Jun 9, 2015
Neither language is objectively better. If you get enthused about a language - and by all means, get enthused about a language! - you're going to find aspects of it that just strike you as beautiful. Other people who don't share your point of view may find that same aspect pointless. Opinions will vary widely. But the sentence in your question that attracted my attention, and which forms the basis of my answer, is "But, most of my teammates use Perl."

If you have enough charisma to sway your team to Python, great. But in your situation, Perl is probably better . Why? Because you're part of a team, and a team is more effective when it can communicate easily. If your team has a significant code base written in Perl, you need to be fluent in it. And you need to contribute to the code base in that same language. Life will be easier for you and your team.

Now, it may be that you've got some natural lines of demarcation in your areas of interest where it makes sense to write code for one problem domain in Perl and for another in Python - I've seen this kind of thing before, and as long as the whole team is on board, it works nicely. But if your teammates universally prefer Perl, then that is what you should focus on.

It's not about language features, it's about team cohesion.

Debra Klopfenstein , Used C++/VHDL/Verilog as an EE. now its Python for BMEng Answered Jul 22, 2015
I used Perl from about 1998 and almost no Python until 2013. I have now (almost) completely switched over to Python (I've written tens of thousands of lines already for my project) and now only use Perl one-liners on the Linux command line when I want to use regexes and format the printed results when grepping through multiple files in my project. Perl's one-liners are great.
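
A hypothetical example of the kind of one-liner she describes (the file glob and the pattern are invented for illustration):

$ perl -lne 'print "$ARGV:$.: $1" if /TODO\s*(.+)/; close ARGV if eof' *.py
# print file name, line number and the text after each TODO marker; "close ARGV if eof" resets $. for every file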

This surprisingly easy transition to fully adopt Python and pretty much drop Perl simply would not have been possible without the Python modules collections and re (the regex package). Python's numpy, matplotlib, and scipy help seal the deal.

The collections package makes complex variables even easier to create than in Perl. It was created in 2004, I think. The regex package, re , works great, but I wish it was built into the Python language like it is in Perl, because regex usage is smooth in Perl and clunky(er) in Python.

OOP is super easy and not as verbose as in C++, so I find it faster to write, if needed in Python.

I've drank the Kool-Aid and there is no going back. Python is it. (for now)

Sai Janani Ganesan , Postdoctoral Scholar at UCSF Updated May 25, 2015
I like and use both; a few quick points: I can't think of anything that you can do with Perl that you can't with Python. If I were you, I'd stick with Python.
Reese Currie , Professional programmer for over 25 years Answered May 26, 2015
It's kind of funny; back in the early 1970's, C won out over Pascal despite being much more cryptic. They actually have the same level of control over the machine, Pascal is just more verbose about it, and so it's quicker to write little things in C. Today, with Python and Perl, the situation is reversed; Python, the far less cryptic of the two, has won out over Perl. It shows how values change over time, I suppose.

One of the values of Python is the readability of the code. It's certainly a better language for receiving someone else's work and being able to comprehend it. I haven't had the problem where I can't read my own Perl scripts years later, but that's a matter of fairly strict discipline on my part. I've certainly received some puzzling and unreadable code from other Perl developers. I rather hate PowerShell, in part because the messy way it looks on-screen and its clumsy parameter passing reminds me of Perl.

For collaboration, the whole team would do better on Python than on Perl because of the inherent code readability. Python is an extreme pain in the neck to write without a syntax-aware editor to help with all the whitespace, and that could create a barrier for some of your co-workers. Also Python isn't as good for a really quick-and-dirty script, because nothing does quick (or dirty) better than Perl. There are a number of things you can do in Perl more quickly and I've done some things with Perl I probably wouldn't be interested in even trying in Python. But if I'm doing a scripting task today, I'll consider Python and even Scala in scripting mode before I'll consider Perl.

I'll take a guess that your co-workers are on average older than you, and probably started Perl before Python came along. There's a lot of value in both. Don't hate Perl because of your unfamiliarity with it, and if it's a better fit for the task, maybe you will switch to Perl. It's great to have the choice to use something else, but on the other hand, you may pick up an awful lot of maintenance burden if you work principally in a language nobody else in your company uses.

Abhishek Ghose , works at [24]7 Ai. Answered Jun 11, 2015
I have used Perl and Python both. Perl from 2004-2007, and Python, early 2009 onward. Both are great, malleable languages to work with. I would refrain from making any comments on the OOP model of PERL since my understanding is most likely out-of-date now.

Library-wise PERL and Python both have a fantastic number of user-contributed libraries. In the initial days of Python, PERL definitely had an edge in this regard - you could find almost anything in its library repository CPAN, but I am not sure whether this edge exists anymore; I think not.

Python is relatively constraining - in the sense that it does not give the same amount of freedom as PERL in implementing something (note I said 'same amount' - there is still some freedom to do things). But I also see that as Python's strength - by clamping down the parts of the language that could lead to chaos we can live without, I think it makes for a neat and tidy language.

I loved PERL to death in the years I used it. I was an ardent proselytizer. My biggest revelation/disappointment around the language came when I was asked to revisit a huge chunk of production code I had written a mere 6 months ago. I had a hard time understanding various parts of the code; most of it my own code! I realized that with the freedom that PERL offers, you and your team would probably work better (i.e. write code that's maintainable ) if you also had some coding discipline to go with it. So although PERL provides you a lot of freedom, it is difficult to make productive use of it unless you bring in your own discipline (why not Python then?) or you are so good/mature that any of the many ways to do something in the language is instantly comprehensible to you (i.e. a steeper learning curve if you are not looking forward to brain-teasers during code-maintenance).

The above is not a one-sided argument; it so happened that some years later I was in a similar situation again, only this time the language was Python. This time around I was easily able to understand the codebase. The consistency of doing things in Python helps. Emerson said consistency is the hobgoblin of little minds , but maybe, faced with the daunting task of understanding huge legacy codebases, we are relatively little minds. Or maybe it's just me (actually not, speaking from experience :) ).

All said and done, I am still a bit miffed at the space/tab duality in Python :)

Jay Jarman , BS, MS, and PhD in Computer Science. Currently an Asst Prof in CS Answered Sep 5, 2015
You're asking the wrong question, because the answer to your question is: it depends. There is no best language. It depends on what you're going to use it for. About 25 years ago, some bureaucrat decided that every system written by the DoD was to be written in Ada. The purpose was that we'd only have to train software developers in one language. The DoD could painlessly transfer developers from project to project. The only problem was that Ada wasn't the best language for all situations. So, what problem do you wish to solve by programming? That will help to determine what language is the best.
Steffen Winkler , software developer Answered Nov 24, 2015
Perl 5, Python and Ruby are nearly the same, because all have copied code from each other and describe the same programming level. Perl 5 is the most backward compatible language of all.

... ... ...

Mauro Lacy , Software Developer Answered Jun 8, 2015
Python is much better, if only because it's much more readable. Python programs are therefore easily modifiable, and then more maintainable, extensible, etc.

Perl is a powerful language, and a Perl script can be fun to write. But it is a pain to read, even for the one who wrote it, after just a couple of months, or even weeks. Perl code usually looks like (and in many cases just is) a hack to get something done quickly and "easily".

What is still confusing and annoying in Python are the many different frameworks employed to build and install modules, the fact that much code and many modules are still in beta stage, unavailable on certain platforms, or with difficult or almost impossible to satisfy binary (and also non-binary, i.e. due to different/incompatible versions) dependencies. Of course, this is related to the fact that Python is still a relatively young language. Python 3 vs. 2 incompatibilities, though justifiable and logical, also don't help in this regard.

Jim Abraham , worked at Takeda Pharmaceuticals Answered Jun 16

Perl is the unix of programming languages. Those who don't know it are condemned to re-implement it (and here I'm looking at you, Guido), badly. I've read a lot of Python books, and many of the things they crow about as being awesome in Python have existed in Perl since the beginning. And most of those Perl stole from sed or awk or lisp or shell. Sometimes I think Pythonistas are simply kids who don't know any other languages, so of course they think Python is awesome.

Eran Zimbler , works at Cloud of Things (2016-present) Answered May 30, 2015
As someone who worked professionally with Perl and moved to Python only two years ago, the answer sounds simple: work with whatever feels more comfortable; however, if your teammates use Perl, it would be better to learn Perl in order to share code and refrain from creating code that cannot be reused.

In terms of comparison:

1. Python is object-oriented by design; Perl can be object-oriented, but it is not a must
2. Python has a very good standard library; Perl has CPAN with almost everything.
3. Perl is everywhere, and in most cases you won't need a newer Perl for most CPAN modules; Python is a bit more problematic in this regard
4. Python is more readable after a month than Perl

There are other reasons, but those are the first that come to mind.

Alain Debecker , Carbon based biped Answered May 22, 2015
Your question sounds a bit like "Which is better, a boat or a car?". You simply do not do the same things with them. Or, more precisely, some things are easier to do with Perl and others are easier with Python. Neither of them is robust (compare with Eiffel or Oberon, and if you have never heard of these it is because robustness is not so important for you). So learn both, and choose for yourself. And also pick a nice one from http://en.wikipedia.org/wiki/Tim... (why not Julia?): a language that none of your friends knows about, so you can stick out your tongue at them.
Jelani Jenkins , studied at ITT Technical Institute Answered Jul 16

Python is notably slower than Perl when using regex or data manipulation. I think if you're worried about the appeal of Perl, you could use PyCharm with Perl too. Furthermore, I believe the primary reason why someone would use an interpreted language on the job is to manipulate or analyze data. Therefore, I would use Perl in order to get the job done quickly and worry about appearances another time.

Oscar Philips , BSEE Electrical Engineering Answered Jun 13

I had to learn Perl in an old position at work, where system build scripts were written in Perl and we are talking system builds that would take 6 to 8 hours to run, and had to run in both Unix and Windows environments.

That said, I have continued to use Perl for a number of reasons

Yes, I now need to use Python at work, but I have actually found it more difficult to learn than Perl, especially for pulling in a web page and extracting data out of the page.

Franky Yang , Bioinformatics Ph.D. Answered Dec 23, 2015

For a beginner, I suggest Python. Perl 6 has been released, and Perl 5 will be replaced by Perl 6 or Ruby. Python is easier for a beginner and more useful in your future life, no matter whether you want to be a scientist or a programmer. Perl is also a powerful language, but it is only popular in the U.S., Japan, etc.

Anyway, select the one you like, and learn the other as a second-language.

Zach Phillips-Gary , College of Wooster '17, CS & Philosophy Double Major and Web Developer Answered May 20, 2015
Perl is an older language and isn't really in vogue. Python has a wider array of applications and, as a "trendy" language, greater value on the job market.
James McInnes Answered May 29, 2015
You don't give your criteria for "better". They are tools, and each suited for various tasks - sometimes interchangeable.

Python is fairly simple to learn and easy to read. The object model is simple to understand and use.

Perl allows you to be terse and more naturally handles regular expressions. The object model is complex, but has more features.

I tend to write one-off and quick-and-dirty processing tasks in Perl, and anything that might be used by someone else in Python.

Garry Taylor , Been programming since 8 bit computers Answered Dec 20, 2015

First, I'll say it's almost impossible to say if any one language is 'better' than another, especially without knowing what you're using them for, but...

Short answer: it's Python. To be honest, I don't know what you mean by 'robust'; they are both established languages that have been around a long time, and they are both, to all intents and purposes, bug free, especially if you're just automating a few tasks. I've not used Perl in quite some time, except when I delve into a very old project I wrote. Python has terrible syntax, and Perl has worse, in my opinion. Just to make my answer a bit more incendiary: Perl and Python both suck, Python just sucks significantly less.

[Nov 04, 2017] Which is the best book for learning python for absolute beginners on their own?

Nov 04, 2017 | www.quora.com

Robert Love Software Engineer at Google

Mark Lutz's Learning Python is a favorite of many. It is a good book for novice programmers. The new fifth edition is updated to both Python 2.7 and 3.3.

Aditi Sharma , i love coding Answered Jul 10 2016

Originally Answered: Which is the best book for learning Python from beginners to advanced level?

Instead of a book, I would advise you to start learning Python from CodesDope, a site aimed at absolute beginners. Its content explains everything step by step, with clear typography that makes learning fun and much easier. It also provides practice questions for each topic, so you can reinforce what you just read by solving the questions immediately, without hunting around for exercises. Moreover, it has a discussion forum that is very responsive in clearing up doubts.

Alex Forsyth , Computer science major at MIT Answered Dec 28 2015 Originally Answered: What is the best way to learn to code? Specifically Python.

There are many good websites for learning the basics, but for going a bit deeper, I'd suggest MIT OCW 6.00SC. This is how I learned Python back in 2012 and what ultimately led me to MIT and to major in CS. 6.00 teaches Python syntax but also teaches some basic computer science concepts. There are lectures from John Guttag, which are generally well done and easy to follow. It also provides access to some of the assignments from that semester, which I found extremely useful in actually learning Python.

After completing that, you'd probably have a better idea of what direction you wanted to go. Some examples could be completing further OCW courses or completing projects in Python.

[Nov 04, 2017] Should I learn Python or Ruby

Notable quotes:
"... Depends on you use. If you are into number crunching, system related scripts, Go to Python by all means. ..."
Nov 04, 2017 | www.quora.com

Joshua Inkenbrandt , Engineer @ Pinterest Updated Nov 18, 2010 · Upvoted by Daniel Roy Greenfeld , Python and Django Developer

Originally Answered: If I only have time to learn Python or Ruby, which should I choose and why? First, it should be said that both languages are great at solving a vast array of problems. You're asking which one you should choose, so I'll tell you my opinion.

Choose Python. It has a bigger and more diverse community of developers. Ruby has a great community as well, but much of the community is far too focused on Rails. The fact that a lot of people use Rails and Ruby interchangeably is pretty telling.

As far as the differences in their syntax go, they both have their high and low points. Python has some very nice functional features like list comprehensions and generators. Ruby gives you more freedom in how you write your code: parentheses are optional and blocks aren't whitespace-delimited like they are in Python. From a syntactical point of view, they're both in a league of their own.
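A minimal sketch of the two Python features named above, list comprehensions and generators:

# List comprehension: builds the whole list eagerly.
even_squares = [n * n for n in range(10) if n % 2 == 0]
print(even_squares)            # [0, 4, 16, 36, 64]

# Generator expression: values are produced lazily, one at a time,
# so even a huge range costs almost nothing up front.
lazy_squares = (n * n for n in range(10 ** 9))
print(next(lazy_squares))      # 0
print(next(lazy_squares))      # 1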

My reason for choosing Python (as my preferred language) though, had everything to do with the actual language itself. It actually has a real module system. It's clean and very easy to read. It has an enormous standard library and an even larger community contributed one.

One last thing I'll add, so you know I'm not a complete homer. Some of my favorite tools are written in Ruby, like homebrew [ https://github.com/mxcl/homebrew ], Puppet [ http://www.puppetlabs.com/puppet... and capistrano [ https://github.com/halorgium/cap... . I think Ruby has some really awesome features, like real closures and anonymous functions. And if you could, it wouldn't hurt to learn both. Just my $0.02.

Seyram Komla Sapaty , from __future__ import braces Updated Jul 22, 2012

Originally Answered: If I only have time to learn Python or Ruby, which should I choose and why? Learn Python.

1. Python is widely used across several domains. It's installed on almost all linux distros. Perfect for system administration. It's the de facto scripting language in the animation/visual effect industry. The following industry standard programs use python as their scripting language:
Autodesk Maya, Softimage, Toxik, Houdini, 3DSMAX, Modo, MotionBuilder, Nuke, Blender, Cinema4D, RealFlow and a lot more....
If you love visual-effects-intensive movies and video games, chances are Python helped in making them. Python is used extensively by Industrial Light & Magic, Sony Imageworks, Weta Digital, Luma Pictures, Valve, etc.

2. Researchers use it heavily, resulting in free, high-quality libraries: NumPy, SciPy, Matplotlib, etc.
3. Python is backed by a giant like Google.
4. Both Python and Ruby are slow, but Ruby is slower!

And oh, the One Laptop per Child project uses python a lot

Ruby is more or less a DSL. It is used widely for web development, so much so that the name Ruby is interchangeable with Rails.

Dave Aronson , T. Rex at Codosaurus, LLC (1990-present) Answered Feb 27


This was apparently asked in 2010, but I've just been A2A'ed in 2017, so maybe an updated perspective will help. Unfortunately I haven't done much Python since about 2008 (just a little bit in 2014), but I have done a ton of Ruby in the meantime.

From what I hear, the main changes in Python are that on the downside it is no longer the main darling of Google, but on the upside, more and more and better and faster libraries have come out for assorted types of scientific computation, Django has continued to climb in popularity, and there are apparently some new and much easier techniques for integrating C libraries into Python. Ruby has gotten much faster, and more popular in settings other than Rails, and meanwhile Rails itself has gotten faster and more powerful. So it's still very much not a no-brainer.

There are some good libraries for both for games, and other uses of GUI interfaces (from ordinary dialog boxes and such, to custom graphics with motion). If you mean a highly visual video game, needing fine resolution and a high frame rate, forget 'em both, that's usually done in C++.

For websites, it depends what kind. If you want to just show some info, and have the ability to change what's there pretty easily, from what I've heard Django is excellent at making that kind of system, i.e., a CMS (Content Management System). If you want it to do some storage, processing, and retrieval of user-supplied data, then Ruby is a better choice, via frameworks such as Rails (excellent for making something quickly, even if the result isn't lightning-fast, though it can be with additional work) or Sinatra (better for fast stuff, but it does less for you). If you don't want to do either, then you don't need either language, just raw HTML, CSS (including maybe some libraries like Bootstrap or Foundation), and maybe a bit of JavaScript (including maybe some libraries like jQuery) -- and you'll have to learn all that anyway!

"Apps" is a bit broad of a term. These days it usually means mobile apps. I haven't heard of ways to do them in Python, but for Ruby there is RubyMotion, which used to be iOS-only but now supports Android too. Still, though, you'd be better off with Swift (or Objective-C) for iOS and Java for Android. On the other claw, if you just mean any kind of application, either one will do very nicely for a wide range.

Vikash Vikram Software Architect Worked at SAP Studied at Indian Institute of Technology, Delhi Lived in New Delhi

Depends on your use. If you are into number crunching or system-related scripts, go to Python by all means.

If you are interested in speed, go somewhere else, as both languages are slow. Compared to each other they are more or less the same (you are not going to see an order-of-magnitude difference in performance). Only if you are into web application development, want to create a backend for your mobile apps, or are writing scripts do you have to dig deeper. The major difference is in the mindset of the community: Python is conservative and Ruby is adventurous. Rails is the shining example of that. The Rails folks are just crazy about trying new ideas and getting them integrated. They have been among the first to have REST, CoffeeScript, Sass and many other things by default. With Rails 5 they will have ES6 integrated as well. So if you want to bet on a framework that lets you try all the new stuff, then Rails, and hence Ruby, is the way.

Personally, I like Ruby because of Rails, and I like Rails because I like the philosophy of its creators. The syntax of Ruby is icing on the cake. I am not fond of Python's syntax, and as a web developer the versatility of Python's libraries does not cut it for me. In the end you won't go wrong with either of these, but you will have to find out for yourself which one you love to code in (if that is important to you).

Max Mautner , I <3 python, a lot Answered Jan 5, 2015

PyGame would be a fine place to start: http://pygame.org/news.html

Or more generally, Ludum Dare: http://ludumdare.com/compo/

Ash Agrawal , Principal Engineer at Slideshare Updated Jan 17, 2013

Following may help you decide better:

Ruby and Python. Two languages. Two communities. Both have a similar target: to make software development better. Better than Java, better than PHP and better for everyone. But where is the difference? And what language is "better"? For the last question I can say: none is better. Both camps are awesome and do tons of great stuff. But for the first question the answer is longer. And I hope to provide that in this little article.

Is the difference in the toolset around the language? No, I don't think so. Both have good package managers, tons of libraries for all kind of stuff and a few decent web frameworks. Both promote test driven development. On the language side one is whitespace sensitive, the other isn't. Is that so important? Maybe a little, but I think there is something else that is way more important. The culture.

It all started with a stupid Python troll at SIGINT who wanted to troll our cologne.rb booth. To be prepared for the next troll attack, I started to investigate Python. I talked with a lot of Python guys and wrote a few little things in Python to get a feel for the language and the ecosystem. Luckily, at FrOSCon our Ruby booth was right next to the pycologne folks and we talked a lot about the differences. During that time I got the feeling that I knew what is different in the culture of the two parties. Last month the cologne.rb and Django Cologne folks did a joint meetup, and I took the opportunity to test my theory in real life. It got confirmed by lots of the Python people.

Okay, now what is the difference in the culture? It is pretty easy. Python folks are really conservative and afraid of change; Ruby folks love the new shiny stuff even if it breaks older things. It's that simple, but it has huge consequences. One you can see, for example, in the adoption of Ruby 1.9 vs Python 3. Both new versions made tons of breaking changes, and a lot of code needed changes to run on the new platform. In the Ruby world the transition went pretty quickly. In the Python world it has been very troublesome. Some Python people even say that Python 3 is broken and all energy should be focused on the 2.x branch of the language. The Ruby community saw the opportunities; the Python community only saw the update problems. Yes, there were update problems in the Ruby world, but we found an easy way to deal with them: isitruby19.com, a simple platform that showed whether a gem was ready for 1.9. And if it wasn't and the gem was important, it got fixed with pull requests or something similar. The problems went away fast.

Both models of thinking have pros and cons. The Python world is more stable: you can update your Django installation without much trouble. But that also means new technology is added only very slowly. The Ruby world loves change -- so much that most of the "new stuff" in the Python world was tested in the Ruby world first. We love change so much that the Rails core is built around that idea. You can easily change nearly everything and extend everything. Most of the new stuff the Rails Core Team is testing right now for version 4 is available as a plugin for Rails 3. This is pretty interesting if you love new things, love change, and love playing around with stuff. If you don't, and hate the idea of breaking changes, you may be better suited to the Python way. But don't be afraid of breaking changes. They are all pretty well documented in the release guides. It's not voodoo.

I myself love the Ruby mindset. Something like Rails or asset pipelines, or all the other things, would not be possible if we were stuck with "no, don't change that, it works pretty well that way". Someone has to be the leader. Someone has to play around with new ideas. Yes, some ideas won't fly, and some are removed pretty quickly. But at least we tried them. I know that some people prefer the conservative way; if you consider yourself to be like that, you should at least try Python. I stay with Ruby.


Source: http://bitboxer.de/2012/10/03/ru...

Andrew Korzhuev , I'm interested in everything Answered Apr 11, 2012

Should I learn Python or Ruby next? Why not try both and choose which one you like more? [1] [2]

[1] http://hackety.com/
[2] http://www.trypython.org/

[Nov 04, 2017] The top 20 programming languages the GitHut and Tiobe rankings - JAXenter

Nov 04, 2017 | jaxenter.com

GitHut builds its ranking from the following characteristics: active repositories, the number of pushes and pushes per repository, as well as new forks per repository, open issues per repository and new watchers per repository.

GitHut's top 20 ranking currently looks like this:

  1. JavaScript
  2. Java
  3. Python
  4. Ruby
  5. CSS
  6. PHP
  7. C++
  8. Shell
  9. C#
  10. Objective-C
  11. VimL
  12. Go
  13. Perl

[Nov 04, 2017] GitHub - EbookFoundation-free-programming-books Freely available programming books

Nov 04, 2017 | github.com

View the English list

This list was originally a clone of stackoverflow - List of Freely Available Programming Books by George Stocker.

The list was moved to GitHub by Victor Felder for collaborative updating and maintenance. It grew to become one of the most popular repositories on Github , with over 80,000 stars, over 4000 commits, over 800 contributors, and over 20,000 forks.

The repo is now administered by the Free Ebook Foundation , a not-for-profit organization devoted to promoting the creation, distribution, archiving and sustainability of free ebooks. Donations to the Free Ebook Foundation are tax-deductible in the US.

[Sep 18, 2017] The Fall Of Perl, The Webs Most Promising Language by Conor Myhrvold

The author pays outsize attention to superficial things like popularity with particular groups of users. For sysadmins this matters less than the level of integration with the underlying OS and the quality of the debugger.
The real story is that Python has a less steep initial learning curve, and that helped to entrench it in universities. Students then brought it to large companies like Red Hat. The rest is history. Google's support was also a positive factor, and Python basked in the OO hype. So it is now the more widespread language, much like Microsoft Basic once was. That does not automatically make it a better language in the sysadmin domain.
The phrase "Perl's quirky stylistic conventions, such as using $ in front to declare variables, are in contrast for the other declarative symbol $ for practical programmers today–the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby" smacks of a "syntax junkie" mentality. What is wrong with dereferencing using the $ symbol? Yes, it creates problems if you are simultaneously using other languages like C or Python, but for an experienced programmer this is a minor thing. Yes, Perl has some questionable syntax choices, but so does every other language in existence. While syntax can be painful, it is the semantics and the "programming environment" that matter most.
My impression is that Perl returned to its roots -- migrated back to being an excellent sysadmin tool -- as there is strong synergy between Perl and Unix shells. The fact that Perl 5 is reasonably stable is a huge plus in this area.
Notable quotes:
"... By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics ) but it was also the most proclaimed popular language , talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement. ..."
"... Others point out that Perl is left out of the languages to learn first –in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails ), followed by the Django framework in Python (PHP has remained stable as the simplest option as well). ..."
"... In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendent of S , also developed in the 1980s). ..."
"... By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, who helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl ) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed. ..."
"... from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users like Python successfully has ..."
"... The fact that you have to import a library, or put up with some extra syntax, is significantly easier than the transactional cost of learning a new language and switching to it. ..."
"... MIT Python replaced Scheme as the first language of instruction for all incoming freshman, in the mid-2000s ..."
Jan 13, 2014 | www.fastcompany.com

And the rise of Python. Does Perl have a future?

I first heard of Perl when I was in middle school in the early 2000s. It was one of the world's most versatile programming languages, dubbed the Swiss army knife of the Internet. But compared to its rival Python, Perl has faded from popularity. What happened to the web's most promising language? Perl's low entry barrier compared to compiled, lower level language alternatives (namely, C) meant that Perl attracted users without a formal CS background (read: script kiddies and beginners who wrote poor code). It also boasted a small group of power users ("hardcore hackers") who could quickly and flexibly write powerful, dense programs that fueled Perl's popularity to a new generation of programmers.

A central repository (the Comprehensive Perl Archive Network, or CPAN ) meant that for every person who wrote code, many more in the Perl community (the Programming Republic of Perl ) could employ it. This, along with the witty evangelism by eclectic creator Larry Wall , whose interest in language ensured that Perl led in text parsing, was a formula for success during a time in which lots of text information was spreading over the Internet.

As the 21st century approached, many pearls of wisdom were wrought to move and analyze information on the web. Perl did have a learning curve–often meaning that it was the third or fourth language learned by adopters–but it sat at the top of the stack.

"In the race to the millennium, it looks like C++ will win, Java will place, and Perl will show," Wall said in the third State of Perl address in 1999. "Some of you no doubt will wish we could erase those top two lines, but I don't think you should be unduly concerned. Note that both C++ and Java are systems programming languages. They're the two sports cars out in front of the race. Meanwhile, Perl is the fastest SUV, coming up in front of all the other SUVs. It's the best in its class. Of course, we all know Perl is in a class of its own."

Then came the upset.

The Perl vs. Python Grudge Match

Then Python came along. Compared to Perl's straight-jacketed scripting, Python was a lopsided affair. It even took after its namesake, Monty Python's Flying Circus. Fittingly, most of Wall's early references to Python were lighthearted jokes at its expense. Well, the millennium passed, computers survived Y2K , and my teenage years came and went. I studied math, science, and humanities but kept myself an arm's distance away from typing computer code. My knowledge of Perl remained like the start of a new text file: cursory , followed by a lot of blank space to fill up.

In college, CS friends at Princeton raved about Python as their favorite language (in spite of popular professor Brian Kernighan on campus, who helped popularize C). I thought Python was new, but I later learned it was around when I grew up as well, just not visible on the charts.

By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics ) but it was also the most proclaimed popular language , talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement.

Side By Side Comparison: Binary Search

Despite Python and Perl's well documented rivalry and design decision differences–which persist to this day–they occupy a similar niche in the programming ecosystem. Both are frequently referred to as "scripting languages," even though later versions are retro-fitted with object oriented programming (OOP) capabilities.

Stylistically, Perl and Python have different philosophies. Perl's best-known motto is " There's More Than One Way to Do It ". Python is designed to have one obvious way to do it. Python's construction gave an advantage to beginners: A syntax with more rules and stylistic conventions (for example, requiring whitespace indentations for functions) ensured newcomers would see a more consistent set of programming practices; code that accomplished the same task would look more or less the same. Perl's construction favors experienced programmers: a more compact, less verbose language with built-in shortcuts which made programming for the expert a breeze.
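A minimal, hypothetical illustration of the point about enforced conventions: in Python the indentation below is the block syntax itself, so two programmers writing this function will produce essentially the same shape of code.

def count_words(path):
    # The indentation is mandatory; it marks the blocks, so there is
    # no brace-placement style to argue about.
    totals = {}
    with open(path) as handle:
        for line in handle:
            for word in line.split():
                totals[word] = totals.get(word, 0) + 1
    return totals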

During the dotcom era and the tech recovery of the mid to late 2000s, high-profile websites and companies such as Dropbox (Python) and Amazon and Craigslist (Perl), in addition to some of the world's largest news organizations ( BBC , Perl ), used the languages to accomplish tasks integral to doing business on the Internet. But over the course of the last 15 years, not only has the way companies do business changed and grown, but so have the tools they use, unequally to the detriment of Perl. (A growing trend that was identified in the last comparison of the languages, " A Perl Hacker in the Land of Python ," as well as, from the Python side, a Pythonista's evangelism aggregator , also done in the year 2000.)

Perl's Slow Decline

Today, Perl's growth has stagnated. At the Orlando Perl Workshop in 2013, one of the talks was titled " Perl is not Dead, It is a Dead End ," and claimed that Perl now existed on an island. Once Perl programmers checked out, they always left for good, never to return. Others point out that Perl is left out of the languages to learn first –in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails ), followed by the Django framework in Python (PHP has remained stable as the simplest option as well).

In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendent of S , also developed in the 1980s).

In scientific computing, my present field, Python, not Perl, is the open source overlord, even expanding at Matlab's expense (also a child of the 1980s , and similarly retrofitted with OOP abilities ). And upstart PHP grew in size to the point where it is now arguably the most common language for web development (although its position is dynamic, as Ruby and Python have quelled PHP's dominance and are now entrenched as legitimate alternatives.)

While Perl is not in danger of disappearing altogether, it is in danger of losing cultural relevance , an ironic fate given Wall's love of language. How has Perl become the underdog, and can this trend be reversed? (And, perhaps more importantly, will Perl 6 be released!?)

How I Grew To Love Python

Why Python , and not Perl? Perhaps an illustrative example of what happened to Perl is my own experience with the language. In college, I still stuck to the contained environments of Matlab and Mathematica, but my programming perspective changed dramatically in 2012. I realized lacking knowledge of structured computer code outside the "walled garden" of a desktop application prevented me from fully simulating hypotheses about the natural world, let alone analyzing data sets using the web, which was also becoming an increasingly intellectual and financially lucrative skill set.

One year after college, I resolved to learn a "real" programming language in a serious manner: An all-in immersion taking me over the hump of knowledge so that, even if I took a break, I would still retain enough to pick up where I left off. An older alum from my college who shared similar interests–and an experienced programmer since the late 1990s–convinced me of his favorite language to sift and sort through text in just a few lines of code, and "get things done": Perl. Python, he dismissed, was what "what academics used to think." I was about to be acquainted formally.

Before making a definitive decision on which language to learn, I took stock of online resources, lurked on PerlMonks , and acquired several used O'Reilly books, the Camel Book and the Llama Book , in addition to other beginner books. Yet once again, Python reared its head , and even Perl forums and sites dedicated to the language were lamenting the digital siege their language was succumbing to . What happened to Perl? I wondered. Ultimately undeterred, I found enough to get started (quality over quantity, I figured!), and began studying the syntax and working through examples.

But it was not to be. In trying to overcome the engineered flexibility of Perl's syntax choices, I hit a wall. I had adopted Perl for text analysis, but upon accepting an engineering graduate program offer, switched to Python to prepare.

By this point, CPAN's enormous advantage had been whittled away by ad hoc, hodgepodge efforts from uncoordinated but overwhelming groups of Pythonistas that now assemble in Meetups , at startups, and on college and corporate campuses to evangelize the Zen of Python . This has created a lot of issues with importing ( pointed out by Wall ), and package download synchronizations to get scientific computing libraries (as I found), but has also resulted in distributions of Python such as Anaconda that incorporate the most important libraries besides the standard library to ease the time tariff on imports.

As if to capitalize on the zeitgiest, technical book publisher O'Reilly ran this ad , inflaming Perl devotees.


By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, who helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl ) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed.

So after six months of Perl-making effort, this straw of reality broke the Perl camel's back and caused a coup that overthrew the programming Republic which had established itself on my laptop. I sheepishly abandoned the llama . Several weeks later, the tantalizing promise of a new MIT edX course teaching general CS principles in Python, in addition to numerous n00b examples , made Perl's syntax all too easy to forget instead of regret.

Measurements of the popularity of programming languages, in addition to friends and fellow programming enthusiasts I have met in the development community in the past year and a half, have confirmed this trend, along with the rise of Ruby in the mid-2000s, which has also eaten away at Perl's ubiquity in stitching together programs written in different languages.

While historically many arguments could explain away any one of these studies–perhaps Perl programmers do not cheerlead their language as much, since they are too busy productively programming. Job listings or search engine hits could mean that a programming language has many errors and issues with it, or that there is simply a large temporary gap between supply and demand.

The concomitant picture, and one that many in the Perl community now acknowledge, is that Perl is now essentially a second-tier language, one that has its place but will not be the first several languages known outside of the Computer Science domain such as Java, C, or now Python.

The Future Of Perl (Yes, It Has One)

I believe Perl has a future , but it could be one for a limited audience. Present-day Perl is more suitable to users who have worked with the language from its early days , already dressed to impress . Perl's quirky stylistic conventions, such as using $ in front to declare variables, are in contrast for the other declarative symbol $ for practical programmers today–the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby. And the high activation cost of learning Perl, instead of implementing a Python solution. Ironically, much in the same way that Perl jested at other languages, Perl now finds itself at the receiving end .

What's wrong with Perl , from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users like Python successfully has, it runs the risk of become like Children of Men , dwindling away to a standstill; vast repositories of hieroglyphic code looming in sections of the Internet and in data center partitions like the halls of the Mines of Moria . (Awe-inspiring and historical? Yes. Lively? No.)

Perl 6 has been an ongoing development since 2000. Yet after 14 years it is not officially done , making it the equivalent of Chinese Democracy for Guns N' Roses. In Larry Wall's words : "We're not trying to make Perl a better language than C++, or Python, or Java, or JavaScript. We're trying to make Perl a better language than Perl. That's all." Perl may be on the same self-inflicted path to perfection as Axl Rose, underestimating not others but itself. "All" might still be too much.

Absent a game-changing Perl release (which still could be "too little, too late") people who learn to program in Python have no need to switch if Python can fulfill their needs, even if it is widely regarded as second or third best in some areas. The fact that you have to import a library, or put up with some extra syntax, is significantly easier than the transactional cost of learning a new language and switching to it. So over time, Python's audience stays young through its gateway strategy that van Rossum himself pioneered, Computer Programming for Everybody . (This effort has been a complete success. For example, at MIT Python replaced Scheme as the first language of instruction for all incoming freshman, in the mid-2000s.)

Python Plows Forward

Python continues to gain footholds one by one in areas of interest, such as visualization (where Python still lags behind other language graphics, like Matlab, Mathematica, or the recent d3.js ), website creation (the Django framework is now a mainstream choice), scientific computing (including NumPy/SciPy), parallel programming (mpi4py with CUDA), machine learning, and natural language processing (scikit-learn and NLTK) and the list continues.

While none of these efforts are centrally coordinated by van Rossum himself, a continually expanding user base, and getting to CS students first before other languages (such as even Java or C), increases the odds that collaborations in disciplines will emerge to build a Python library for themselves, in the same open source spirit that made Perl a success in the 1990s.

As for me? I'm open to returning to Perl if it can offer me a significantly different experience from Python (but "being frustrating" doesn't count!). Perhaps Perl 6 will be that release. However, in the interim, I have heeded the advice of many others with a similar dilemma on the web. I'll just wait and C .

[Sep 03, 2017] When To Choose Python Over Perl (And Vice Versa)

www.quora.com
When to use perl:

When to use python:

Edit after 5 years: Istvan Albert (University Park, USA) wrote:

I think very few people are completely agnostic about programming languages especially when it comes to languages with very similar strengths and weaknesses like: perl/python/ruby - therefore there is no general reason for using one language vs the other.

It is more common to find someone equally proficient in C and perl, than say equally proficient in perl and python. My guess would be that complementary languages require complementary skill sets that occupy different parts of the brain, whereas similar concepts will clash more easily.

Michael Dondrup (Bergen, Norway) wrote:

You are totally correct with your initial assumption. This question is similar to choosing between Spanish and English: which language to choose? Well, if you go to Spain,...

All (programming) languages are equal, in the sense of that you can solve the same class of problems with them. Once you know one language, you can easily learn all imperative languages. Use the language that you already master or that suits your style. Both (Perl & Python) are interpreted languages and have their merits. Both have extensive Bio-libraries, and both have large archives of contributed packages.

An important criterion to decide is the availability of rich, stable, and well maintained libraries. Choose the language that provides the library you need. For example, if you want to program web-services (using SOAP not web-sites), you better use Java or maybe C#.

Conclusion: it does no harm to learn new languages. And no flame wars please.

Sep 03, 2017 | www.biostars.org

[Sep 03, 2017] Which is better, Perl or Python Which one is more robuts How do they compare with each other - Quora

www.danvk.org
Dan Lenski , I do all my best thinking in Python, and I've written a lot of it Updated Jun 1 2015

Though I may get flamed for it, I will put it even more bluntly than others have: Python is better than Perl . Python's syntax is cleaner, its object-oriented type system is more modern and consistent, and its libraries are more consistent. ( EDIT: As Christian Walde points out in the comments, my criticism of Perl OOP is out-of-date with respect to the current de facto standard of Moo/se. I do believe that Perl's utility is still encumbered by historical baggage in this area and others.)

I have used both languages extensively for both professional work and personal projects (Perl mainly in 1999-2007, Python mainly since), in domains ranging from number crunching (both PDL and NumPy are excellent) to web-based programming (mainly with Embperl and Flask ) to good ol' munging text files and database CRUD

Both Python and Perl have large user communities including many programmers who are far more skilled and experienced than I could ever hope to be. One of the best things about Python is that the community generally espouses this aspect of "The Zen of Python"

Python's philosophy rejects the Perl " there is more than one way to do it " approach to language design in favor of "there should be one -- and preferably only one -- obvious way to do it".

... while this principle might seem stultifying or constraining at first, in practice it means that most good Python programmers think about the principle of least surprise and make it easier for others to read and interface with their code.
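The full text being quoted from ships with the interpreter itself; the two-line script below prints Tim Peters' "The Zen of Python", including the "one obvious way" line.

# Prints the Zen of Python aphorisms referenced above.
import this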

In part as a consequence of this discipline, and definitely because of Python's strong typing , and arguably because of its "cleaner" syntax, Python code is considerably easier to read than Perl code. One of the main events that motivated me to switch was the experience of writing code to automate lab equipment in grad school. I realized I couldn't read Perl code I'd written several months prior, despite the fact that I consider myself a careful and consistent programmer when it comes to coding style ( Dunning–Kruger effect , perhaps? :-P).

One of the only things I still use Perl for occasionally is writing quick-and-dirty code to manipulate text strings with regular expressions; Sai Janani Ganesan's answer pointed out Perl's value for this as well. Perl has a lot of nice syntactic sugar and command line options for munging text quickly*, and in fact there's one regex idiom I used a lot in Perl for which I've never found a totally satisfying substitute in Python.


* For example, the one-liner perl -i.bak -pe 's/foo/bar/g' *.txt will go through a bunch of text files and replace foo with bar everywhere while making backup files with the .bak extension.
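There is no exact drop-in for that idiom in Python; a hedged sketch of a roughly equivalent script (backup copy plus in-place substitution over *.txt) might look like this:

# Rough Python counterpart of: perl -i.bak -pe 's/foo/bar/g' *.txt
import glob
import re
import shutil

for path in glob.glob("*.txt"):
    shutil.copy2(path, path + ".bak")      # keep a .bak backup
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(re.sub(r"foo", "bar", text))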
Sep 03, 2017 | www.quora.com

[Sep 03, 2017] Perl-Python-Ruby Comparison

www.fastcompany.com
What is the Josephus problem? To quote from Concepts, Techniques, and Models of Computer Programming (a daunting title if ever there was one):

Flavius Josephus was a roman historian of Jewish origin. During the Jewish-Roman wars of the first century AD, he was in a cave with fellow soldiers, 40 men in all, surrounded by enemy Roman troops. They decided to commit suicide by standing in a ring and counting off each third man. Each man so designated was to commit suicide...Josephus, not wanting to die, managed to place himself in the position of the last survivor.

In the general version of the problem, there are n soldiers numbered from 1 to n and each k -th soldier will be eliminated. The count starts from the first soldier. What is the number of the last survivor?

I decided to model this situation using objects in three different scripting languages, Perl, Ruby, and Python. The solution in each of the languages is similar. A Person class is defined, which knows whether it is alive or dead, who the next person in the circle is, and what position number it is in. There are methods to pass along a kill signal, and to create a chain of people. Either of these could have been implemented using iteration, but I wanted to give recursion a whirl, since it's tougher on the languages. Here are my results.
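The three original solutions are not reproduced here, so the following is a hedged Python sketch in the same spirit: a Person class that knows its neighbour, and a kill signal passed around the ring (n soldiers, every k-th eliminated, as defined above; assumes k >= 2).

class Person:
    """One soldier in the ring: a position and a link to the next person."""

    def __init__(self, position):
        self.position = position
        self.next = None   # set when the ring is linked together

def josephus(n, k):
    # Build the circular chain of n people.
    people = [Person(i) for i in range(1, n + 1)]
    for current, following in zip(people, people[1:] + people[:1]):
        current.next = following

    survivor = people[0]
    while survivor.next is not survivor:
        # Step forward k - 2 places, remove the k-th person counted
        # from the current position, then restart counting after them.
        for _ in range(k - 2):
            survivor = survivor.next
        survivor.next = survivor.next.next
        survivor = survivor.next
    return survivor.position

print(josephus(40, 3))   # last survivor for the scenario quoted above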
Sep 03, 2017 | www.danvk.org

[Jul 10, 2017] Crowdsourcing, Open Data and Precarious Labour by Allana Mayer Model View Culture

Notable quotes:
"... Photo CC-BY Mace Ojala. ..."
"... Photo CC-BY Samantha Marx. ..."
Jul 10, 2017 | modelviewculture.com
Crowdsourcing, Open Data and Precarious Labour

Crowdsourcing and microtransactions are two halves of the same coin: they both mark new stages in the continuing devaluation of labour.

by Allana Mayer on February 24th, 2016

The cultural heritage industries (libraries, archives, museums, and galleries, often collectively called GLAMs) like to consider themselves the tech industry's little siblings. We're working to develop things like Linked Open Data, a decentralized network of collaboratively-improved descriptive metadata; we're building our own open-source tech to make our catalogues and collections more useful; we're pushing scholarly publishing out from behind paywalls and into open-access platforms; we're driving innovations in accessible tech.

We're only different in a few ways. One, we're a distinctly feminized set of professions , which comes with a large set of internally- and externally-imposed assumptions. Two, we rely very heavily on volunteer labour, and not just in the internship-and-exposure vein: often retirees and non-primary wage-earners are the people we "couldn't do without." Three, the underlying narrative of a "helping" profession -- essentially a social service -- can push us to ignore the first two distinctions, while driving ourselves to perform more and expect less.

I suppose the major way we're different is that tech doesn't acknowledge us, treat us with respect, build things for us, or partner with us, unless they need a philanthropic opportunity. Although, when some ingenue autodidact bootstraps himself up to a billion-dollar IPO, there's a good chance he's been educating himself using our free resources. Regardless, I imagine a few of the issues true in GLAMs are also true in tech culture, especially in regards to labour and how it's compensated.

Crowdsourcing

Notecards in a filing drawer: old-fashioned means of recording metadata.

Photo CC-BY Mace Ojala.

Here's an example. One of the latest trends is crowdsourcing: admitting we don't have all the answers, and letting users suggest some metadata for our records. (Not to be confused with crowdfunding.) The biggest example of this is Flickr Commons: the Library of Congress partnered with Yahoo! to publish thousands of images that had somehow ended up in the LOC's collection without identifying information. Flickr users were invited to tag pictures with their own keywords or suggest descriptions using comments.

Many orphaned works (content whose copyright status is unclear) found their way conclusively out into the public domain (or back into copyright) this way. Other popular crowdsourcing models include gamification , transcription of handwritten documents (which can't be done with Optical Character Recognition), or proofreading OCR output on digitized texts. The most-discussed side benefits of such projects include the PR campaign that raises general awareness about the organization, and a "lifting of the curtain" on our descriptive mechanisms.

The problem with crowdsourcing is that it's been conclusively proven not to function in the way we imagine it does: a handful of users end up contributing massive amounts of labour, while the majority of those signed up might do a few tasks and then disappear. Seven users in the "Transcribe Bentham" project contributed to 70% of the manuscripts completed; 10 "power-taggers" did the lion's share of the Flickr Commons' image-identification work. The function of the distributed digital model of volunteerism is that those users won't be compensated, even though many came to regard their accomplishments as full-time jobs .

It's not what you're thinking: many of these contributors already had full-time jobs , likely ones that allowed them time to mess around on the Internet during working hours. Many were subject-matter experts, such as the vintage-machinery hobbyist who created entire datasets of machine-specific terminology in the form of image tags. (By the way, we have a cute name for this: "folksonomy," a user-built taxonomy. Nothing like reducing unpaid labour to a deeply colonial ascription of communalism.) In this way, we don't have precisely the free-labour-for-exposure/project-experience problem the tech industry has ; it's not our internships that are the problem. We've moved past that, treating even our volunteer labour as a series of microtransactions. Nobody's getting even the dubious benefit of job-shadowing, first-hand looks at business practices, or networking. We've completely obfuscated our own means of production. People who submit metadata or transcriptions don't even have a means of seeing how the institution reviews and ingests their work, and often, to see how their work ultimately benefits the public.

All this really says to me is: we could've hired subject experts to consult, and given them a living wage to do so, instead of building platforms to dehumanize labour. It also means our systems rely on privilege , and will undoubtedly contain and promote content with a privileged bias, as Wikipedia does. (And hey, even Wikipedia contributions can sometimes result in paid Wikipedian-in-Residence jobs.)

For example, the Library of Congress's classification and subject headings have long collected books about the genocide of First Nations peoples during the colonization of North America under terms such as "first contact," "discovery and exploration," "race relations," and "government relations." No "subjugation," "cultural genocide," "extermination," "abuse," or even "racism" in sight. Also, the term "homosexuality" redirected people to "sexual perversion" up until the 1970s. Our patrons are disrespected and marginalized in the very organization of our knowledge.

If libraries continue on with their veneer of passive and objective authorities that offer free access to all knowledge, this underlying bias will continue to propagate subconsciously. As in Mechanical Turk , being "slightly more diverse than we used to be" doesn't get us any points, nor does it assure anyone that our labour isn't coming from countries with long-exploited workers.

Labor and Compensation

Rows and rows of books in a library, on vast curving shelves.

Photo CC-BY Samantha Marx.

I also want to draw parallels between the free labour of crowdsourcing and the free labour offered in civic hackathons or open-data contests. Specifically, I'd argue that open-data projects are less ( but still definitely ) abusive to their volunteers, because at least those volunteers have a portfolio object or other deliverable to show for their work. They often work in groups and get to network, whereas heritage crowdsourcers work in isolation.

There's also the potential for converting open-data projects to something monetizable: for example, a Toronto-specific bike-route app can easily be reconfigured for other cities and sold; while the Toronto version stays free under the terms of the civic initiative, freemium options can be added. The volunteers who supply thousands of transcriptions or tags can't usually download their own datasets and convert them into something portfolio-worthy, let alone sellable. Those data are useless without their digital objects, and those digital objects still belong to the museum or library.

Crowdsourcing and microtransactions are two halves of the same coin: they both mark new stages in the continuing devaluation of labour, and they both enable misuse and abuse of people who increasingly find themselves with few alternatives. If we're not offering these people jobs, reference letters, training, performance reviews, a "foot in the door" (cronyist as that is), or even acknowledgement by name, what impetus do they have to contribute? As with Wikipedia, I think the intrinsic motivation for many people to supply us with free labour is one of two things: either they love being right, or they've been convinced by the feel-good rhetoric that they're adding to the net good of the world. Of course, trained librarians, archivists, and museum workers have fallen sway to the conflation of labour and identity , too, but we expect to be paid for it.

As in tech, stereotypes and PR obfuscate labour in cultural heritage. For tech, an entrepreneurial spirit and a tendency to buck traditional thinking; for GLAMs, a passion for public service and opening up access to treasures ancient and modern. Of course, tech celebrates the autodidactic dropout; in GLAMs, you need a masters. Period. Maybe two. And entry-level jobs in GLAMs require one or more years of experience, across the board.

When library and archives students go into massive student debt, they're rarely apprised of the constant shortfall of funding for government-agency positions, nor do they get told how much work is done by volunteers (and, consequently, how much of the job is monitoring and babysitting said volunteers). And they're not trained with enough technological competency to sysadmin anything , let alone build a platform that pulls crowdsourced data into an authoritative record. The costs of commissioning these platforms aren't yet being made public, but I bet paying subject experts for their hourly labour would be cheaper.

Solutions

I've tried my hand at many of the crowdsourcing and gamifying interfaces I'm here to critique. I've never been caught up in the "passion" ascribed to those super-volunteers who deliver huge amounts of work. But I can tally up other ways I contribute to this problem: I volunteer for scholarly tasks such as peer-reviewing, committee work, and travelling on my own dime to present. I did an unpaid internship without receiving class credit. I've put my research behind a paywall. I'm complicit in the established practices of the industry, which sits uneasily between academic and social work: neither of those spheres have ever been profit-generators, and have always used their codified altruism as ways to finagle more labour for less money.

It's easy to suggest that we outlaw crowdsourced volunteer work, and outlaw microtransactions on Fiverr and MTurk, just as the easy answer would be to outlaw Uber and Lyft for divorcing administration from labour standards. Ideally, we'd make it illegal for technology to wade between workers and fair compensation.

But that's not going to happen, so we need alternatives. Just as unpaid internships are being eliminated ad-hoc through corporate pledges, rather than being prohibited region-by-region, we need pledges from cultural-heritage institutions that they will pay for labour where possible, and offer concrete incentives to volunteer or intern otherwise. Budgets may be shrinking, but that's no reason not to compensate people at least through resume and portfolio entries. The best template we've got so far is the Society of American Archivists' volunteer best practices , which includes "adequate training and supervision" provisions, which I interpret to mean outlawing microtransactions entirely. The Citizen Science Alliance , similarly, insists on "concrete outcomes" for its crowdsourcing projects, to " never waste the time of volunteers ." It's vague, but it's something.

We can boycott and publicly shame those organizations that promote these projects as fun ways to volunteer, and lobby them to instead seek out subject experts for more significant collaboration. We've seen a few efforts to shame job-posters for unicorn requirements and pathetic salaries, but they've flagged without productive alternatives to blind rage.

There are plenty more band-aid solutions. Groups like Shatter The Ceiling offer cash to women of colour who take unpaid internships. GLAM-specific internship awards are relatively common , but could: be bigger, focus on diverse applicants who need extra support, and have eligibility requirements that don't exclude people who most need them (such as part-time students, who are often working full-time to put themselves through school). Better yet, we can build a tech platform that enables paid work, or at least meaningful volunteer projects. We need nationalized or non-profit recruiting systems (a digital "volunteer bureau") that matches subject experts with the institutions that need their help. One that doesn't take a cut from every transaction, or reinforce power imbalances, the way Uber does. GLAMs might even find ways to combine projects, so that one person's work can benefit multiple institutions.

GLAMs could use plenty of other help, too: feedback from UX designers on our catalogue interfaces, helpful tools , customization of our vendor platforms, even turning libraries into Tor relays or exits . The open-source community seems to be looking for ways to contribute meaningful volunteer labour to grateful non-profits; this would be a good start.

What's most important is that cultural heritage preserves the ostensible benefits of crowdsourcing – opening our collections and processes up for scrutiny, and admitting the limits of our knowledge – without the exploitative labour practices. Just like in tech, a few more glimpses behind the curtain wouldn't go astray. But it would require deeper cultural shifts, not least in the self-perceptions of GLAM workers: away from overprotective stewards of information, constantly threatened by dwindling budgets and unfamiliar technologies, and towards facilitators, participants in the communities whose histories we hold.


[Nov 11, 2016] pdxdoughnut

Notable quotes:
"... Show the command line for a PID, converting nulls to spaces and a newline ..."
Nov 11, 2016 | www.commandlinefu.com


Terminal - Commands by pdxdoughnut - 26 results

whatinstalled() { which "$@" | xargs -r readlink -f | xargs -r dpkg -S ;}

2016-11-08 20:59:25

Tags: which grep dpkg function packages realpath


command ${MYVAR:+--someoption=$MYVAR}

2015-11-04 19:47:24

Use a var with more text only if it exists. See "Parameter Expansion" in the bash manpage. They refer to this as "Use Alternate Value", but here we're including the var itself in that alternative.


[ -n "$REMOTE_USER" ] || read -p "Remote User: " -er -i "$LOGNAME" REMOTE_USER

2015-10-30 17:08:17

If (and only if) the variable is not set, prompt users and give them a default option already filled in. The read command reads input and puts it into a variable. With -i you set an initial value. In this case I used a known environment variable.


debsecan --format detail

2015-10-22 18:46:41

List known debian vulnerabilities on your system -- many of which may not yet be patched.

You can search for CVEs at https://security-tracker.debian.org/tracker/ or use --report to get full links. This can be added to cron, but unless you're going to do manual patches, you'd just be torturing yourself.


tr '\0' ' ' </proc/21679/cmdline ; echo

2015-09-25 22:08:31

Show the command line for a PID, converting nulls to spaces and a newline


mv -iv $FILENAME{,.$(stat -c %y $FILENAME | awk '{print $1}')}

2014-12-01 22:41:38

Rename file to same name plus datestamp of last modification.

Note that the -i will not help in a script. Proper error checking is required.


echo "I am $BASH_SUBSHELL levels nested";

2014-06-20 20:33:43


diff -qr /dirA /dirB

2014-04-01 21:42:19

Shows which files differ in two directories.


find ./ -type l -ls

2014-03-21 17:13:39

Show all symlinks


ls | xargs WHATEVER_COMMAND

xargs will automatically determine how many args are too many and only pass a reasonable number of them at a time. In the example, 500,002 file names were split across 26 instantiations of the command "echo".

killall conky

Kill a process (e.g. conky) by its name; useful when debugging conky. :)

nmap -sn 192.168.1.0/24

2014-01-28 23:32:18

Ping all hosts on 192.168.1.0/24

find . -name "pattern" -type f -printf "%s\n" | awk '{total += $1} END {print total}'

2014-01-16 01:16:18

Find files and calculate size of result in shell. Using find's internal stat to get the file size is about 50 times faster than using -exec stat.

mkdir Epub ; mv -v --target-directory=Epub $(fgrep -lr epub *)

2014-01-16 01:07:29

Move all epub keyword containing files to Epub folder


CMD=chrome ; ps h -o pmem -C $CMD | awk '{sum+=$1} END {print sum}'

Show total cumulative memory usage of a process that spawns multiple instances of itself

mussh -h 192.168.100.{1..50} -m -t 10 -c uptime

2013-11-27 18:01:12

This will run them all at the same time and time out for each host after ten seconds. Also, mussh will prepend the IP address to the output so you know which host responded with which time.

The use of the sequence expression {1..50} is not specific to mussh. The `seq ...` form also works, but is less efficient.


du -Sh | sort -h | tail

2013-11-27 17:50:11

Which files/dirs waste my disk space

I added -S to du so that you don't count /foo/bar/baz.iso against /foo, and changed sort's -n to -h so that it can properly sort the human-readable sizes.

base64 /dev/urandom | head -c 33554432 | split -b 8192 -da 4 - dummy.

2013-11-12 17:56:23

Create a bunch of dummy text files

Avoiding a for loop brought this down to less than 3 seconds on my old machine. And just to be clear, 33554432 = 8192 * 4096.


sudo lsof -iTCP:25 -sTCP:LISTEN

2013-11-12 17:32:34

Check whether anything is listening on TCP port 25.

for i in {1..4096}; do base64 /dev/urandom | head -c 8192 > dummy$i.rnd ; done

2013-11-12 00:36:10

Create a bunch of dummy text files

Using the 'time' command, running this with 'tr' took 28 seconds (and change) each time, but using base64 took only 8 seconds (and change). If the file doesn't have to be viewable, pulling straight from urandom with head took only 6 seconds (and change).


find -name .git -prune -o -type f -exec md5sum {} \; | sort -k2 | md5sum

2013-10-28 22:14:08

Create md5sum of a directory


logger -t MyProgramName "Whatever you're logging"

2013-10-22 16:34:49

Create your own logging function for your script. You could also pipe to logger.

find garbage/ -type f -delete

2013-10-21 23:26:51

rm filenames with spaces

I _think_ you were trying to delete files whether or not they had spaces. This would do that. You should probably be more specific though.


lsof -iTCP:8080 -sTCP:LISTEN

2013-10-07 18:22:32

check to see what is running on a specific port number

find . -size +100M

2013-10-07 18:16:14

Recursively search the current directory for files larger than 100MB.


[Sep 18, 2016] R Weekly

Sep 18, 2016 | tm.durusau.net
September 12th, 2016 R Weekly

A new weekly publication of R resources that began on 21 May 2016 with Issue 0.

Mostly titles of posts and news articles, which is useful, but not as useful as short summaries that include the author's name would be.

[Nov 08, 2015] Abstraction

nickgeoghegan.net

Filed in Programming No Comments

A few things can confuse programming students, or people new to programming. One of these is abstraction.

Wikipedia says:

In computer science, abstraction is the process by which data and programs are defined with a representation similar to its meaning (semantics), while hiding away the implementation details. Abstraction tries to reduce and factor out details so that the programmer can focus on a few concepts at a time. A system can have several abstraction layers whereby different meanings and amounts of detail are exposed to the programmer. For example, low-level abstraction layers expose details of the hardware where the program is run, while high-level layers deal with the business logic of the program.

That might be a bit too wordy for some people, and not at all clear. Here's my analogy of abstraction.

Abstraction is like a car

A car has a few features that make it unique.

If someone can drive a manual transmission car, they can drive any manual transmission car. Automatic drivers, sadly, cannot drive a manual transmission car without "relearning" the car. That is an aside; we'll assume that all cars are manual transmission cars – as is the case in Ireland for most cars.

Since I can drive my car, which is a Mitsubishi Pajero, that means that I can drive your car – a Honda Civic, Toyota Yaris, or Volkswagen Passat.

All I need to know, in order to drive a car – any car – is how to use the brakes, accelerator, steering wheel, clutch and transmission. Since I already know this in my car, I can abstract away your car and its controls.

I do not need to know the inner workings of your car in order to drive it, just the controls. I don't need to know how exactly the brakes work in your car, only that they work. I don't need to know that your car has a turbocharger, only that when I push the accelerator, the car moves. I also don't need to know the exact revs at which I should gear up or gear down (although knowing that would be better for the engine!).

Virtually all controls are the same. Standardization means that the clutch, brake and accelerator are all in the same place, regardless of the car. This means that I do not need to relearn how a car works. To me, a car is just a car, and is interchangeable with any other car.

Abstraction means not caring

As a programmer, or as someone using a third-party API (for example), abstraction means not caring how the inner workings of some function operate – the linked-list data structure, the variable names inside the function, the sorting algorithm used, etc. – just that I have a standard (preferably unchanging) interface to do whatever I need to do.

Abstraction can be thought of as a black box. For input, you get output. That shouldn't be the whole story, but often is. We need abstraction so that, as programmers, we can concentrate on other aspects of the program – this is the cornerstone of large-scale, multi-developer software projects.
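To make the black-box idea concrete in Perl, here is a minimal sketch; the module name Sorter and its internals are invented purely for illustration. The caller sees only sort_numbers(); the algorithm behind it could be replaced without the caller ever noticing.

package Sorter;                        # the "black box"
use strict; use warnings;

# Public interface: takes a list of numbers, returns them sorted ascending.
# Callers neither know nor care which algorithm is used inside.
sub sort_numbers {
    my @numbers = @_;
    return sort { $a <=> $b } @numbers;   # could be swapped for any other sort
}

package main;

my @sorted = Sorter::sort_numbers(5, 3, 9, 1);
print "@sorted\n";                     # prints: 1 3 5 9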

[Feb 05, 2015] JavaScript, PHP Top Most Popular Languages, With Apple's Swift Rising Fast

It's not about the language, it's about the ecosystem

Feb 04, 2015 | slashdot.org

Posted by samzenpus
from the king-of-the-hill dept.

Nerval's Lobster writes Developers assume that Swift, Apple's newish programming language for iOS and Mac OS X apps, will become extremely popular over the next few years. According to new data from RedMonk, a tech-industry analyst firm, Swift could reach that apex of popularity sooner rather than later. While the usual stalwarts-including JavaScript, Java, PHP, Python, C#, C++, and Ruby-top RedMonk's list of the most-used languages, Swift has, well, swiftly ascended 46 spots in the six months since the firm's last update, from 68th to 22nd. RedMonk pulls data from GitHub and Stack Overflow to create its rankings, due to those sites' respective sizes and the public nature of their data. While its top-ranked languages don't trade positions much between reports, there's a fair amount of churn at the lower end of the rankings. Among those "smaller" languages, R has enjoyed stable popularity over the past six months, Rust and Julia continue to climb, and Go has exploded upwards-although CoffeeScript, often cited as a language to watch, has seen its support crumble a bit.


Dutch Gun (899105) on Wednesday February 04, 2015 @09:45PM (#48985989)

Re:not really the whole story (Score:5, Insightful)

More critically, the question I always ask about this is: "Used for what?"

Without that context, why does popularity even matter? For example, I'm a game developer, so my programming life revolves around C++, at least for game-side or engine-level code - period. Nothing else is even on the radar when you're talking about highly-optimized, AAA games. For scripting, Lua is a popular contender. For internal tools, C# seems to be quite popular. I've also seen Python used for tool extensions, or for smaller tools in their own right. Javascript is generally only used for web-based games, or by the web development teams for peripheral stuff.

I'll bet everyone in their own particular industry has their own languages which are dominant. For instance, if you're working on the Linux kernel, you're obviously working in C. It doesn't matter what the hell everyone else does. If you're working in scientific computing, are you really looking seriously at Swift? Of course not. Fortran, F#, or C++ are probably more appropriate, or perhaps others I'm not aware of. A new lightweight iOS app? Swift it is!

Languages are not all equal. The popularity of Javascript is not the measure of merit of that particular language. It's a measure of how popular web-based development is (mostly). C/C++ is largely a measure of how many native, high-performance-required applications there are (games, OS development, large native applications). Etc, etc.

Raw popularity numbers probably only have one practical use, and that's finding a programming job without concern for the particular industry. Or I suppose if you're so emotionally invested in a particular language, it's nice to know where it stands among them all.

unrtst (777550) on Wednesday February 04, 2015 @10:34PM (#48986283)

... And not sure public github or stack overflow are really as representative as they want to believe

Yeah.. why is this any better than:

TIOBE index: http://www.tiobe.com/index.php... [tiobe.com]

This story about python surpassing java as top learning language:

http://developers.slashdot.org... [slashdot.org]

Or this about 5 languages you'll need to learn for the next year and on:

http://news.dice.com/2014/07/2... [dice.com]

... those are all from the past year on slashdot, and there's loads more.

Next "top languages" post I see, I hope it just combines all the other existing stats to provide a weightable index (allow you to tweak what's most important). Maybe BH can address that :-)

gavron (1300111) on Wednesday February 04, 2015 @08:21PM (#48985495)

68th to 22nd and there are many to go (Score:5, Insightful)

All new languages start out at the bottom, as Swift did. In time, the ones that don't get used fall down.

Swift has gotten up to 22nd, but the rest of the climb past the stragglers won't ever happen.

However, to be "the most popular language" is clearly no contest worth winning. Paris Hilton and Kim Kardashian are most popular compared to Steven Hawking and Isaac Asimov.

Being popular doesn't mean better, useful, or even of any value whatsoever. It just means someone has a better marketing-of-crap department.

There's a time to have popularity contests. It's called high school.

E

coop247 (974899) on Wednesday February 04, 2015 @08:54PM (#48985695)

Being popular doesn't mean better, useful, or even of any value whatsoever

PHP runs facebook, yahoo, wordpress, and wikipedia. Javascript runs everything on the internet. Yup, no value there.

UnknownSoldier (67820) on Wednesday February 04, 2015 @08:39PM (#48985617)

Popularity != Quality (Score:5, Insightful)

McDonalds may serve billions, but no one is trying to pass it off as gourmet food.

Kind of like PHP and Javascript. The most fucked up languages are the most popular ... Go figure.

* http://dorey.github.io/JavaScr... [github.io]

Xest (935314) on Thursday February 05, 2015 @04:47AM (#48987365)

Re:Popularity != Quality (Score:2)

I think this is the SO distortion effect.

Effectively the more warts a language has and/or the more poorly documented it is, the more questions that are bound to be asked about it, hence the more apparent popularity if you use SO as a metric.

So if companies like Microsoft and Oracle produce masses of great documentation for their respective technologies and provide entire sites of resources for them (such as www.asp.net or the MSDN developer forums) then they'll inherently see reduced "popularity" on SO.

Similarly, some languages have a higher bar to entry. PHP and Javascript are both repeatedly sold as languages that beginners can start with, so it should be unsurprising that more questions are asked about them than about enterprise languages like C#, Java, and C++ used by people who have moved up the chain.

But I shouldn't complain too much, SO popularity whilst still blatantly flawed is still a far better metric than TIOBE whose methodology is just outright broken (they explain their methodology on their site, and even without high school statistics knowledge it shouldn't take more than 5 seconds to spot gaping holes in their methodology).

I'm still amazed no one's done an actual useful study on popularity and simply scraped data from job sites each month. It'd be nice to know what companies are actually asking for, and what they're paying. That is after all the only thing anyone really wants to know when they talk about popularity - how likely is it to get me a job, and how well is it likely to pay? Popularity doesn't matter beyond that as you just choose the best tool for the job regardless of how popular it is.

Shados (741919) on Wednesday February 04, 2015 @11:08PM (#48986457)

Re:Just learn C and Scala (Score:2)

It's not about the language, it's about the ecosystem. I.e., .NET may be somewhere in between Java and Scala, and the basics of the framework are the same, but if you do high-end stuff, JVM languages and CLR languages are totally different: different in how you debug it in production, different in what the standards are, different in what patterns people expect you to use when they build a library, different gotchas. And while you can pick up the basics in an afternoon, it can take years to really push it.

Doesn't fucking matter if you're doing yet-another-e-commerce-site (and if you are, why?). Really fucking big deal if you do something that's never been done before with a ridiculous amount of users.

[Oct 18, 2013] Tom Clancy, Best-Selling Master of Military Thrillers, Dies at 66

Fully applicable to programming...
NYTimes.com

"I tell them you learn to write the same way you learn to play golf," he once said. "You do it, and keep doing it until you get it right. A lot of people think something mystical happens to you, that maybe the muse kisses you on the ear. But writing isn't divinely inspired - it's hard work."

[Oct 12, 2013] YaST Developers Explain Move to Ruby by Susan Linton

Oct. 10, 2013 | OStatic

Last summer Lukas Ocilka mentioned the completion of the basic conversion of YaST from YCP to Ruby. At the time it was said the change was needed to encourage contributions from a wider set of developers, and Ruby is said to be simpler and more flexible. Well, today Jos Poortvliet posted an interview with two YaST developers explaining the move in more detail.

In a discussion with Josef Reidinger and David Majda, Poortvliet discovered the reason for the move was because all the original YCP developers had moved on to other things and everyone else felt YCP slowed them down. "It didn't support many useful concepts like OOP or exception handling, code written in it was hard to test, there were some annoying features (like a tendency to be "robust", which really means hiding errors)."

Ruby was chosen because it is a well-known language over at the openSUSE camp and was already being used on other SUSE projects (such as WebYaST). "The internal knowledge and standardization was the decisive factor." The translation went smoothly according to the developers because they "automated the whole process and did testing builds months in advance. We even did our custom builds of openSUSE 13.1 Milestones 2 and 3 with pre-release versions of YaST in Ruby."

For now, performance of the Ruby code is comparable to the YCP version because developers concentrated on getting it working well during these first few phases, and users will notice very few, if any, visual changes to the YaST interface. No more major changes are planned for this development cycle, but the new YaST will be used in 13.1, due out November 19.

See the full interview for lots more detail.

[Oct 19, 2012] Google's Engineers Are Well Paid, Not Just Well Fed

October 18, 2012 | Slashdot

Anonymous Coward

Re:$128,000?

writes: on Thursday @12:38PM (#41694241)

I make more than $40k as a software developer, but it wasn't too long ago that I was making right around that amount.

I have an AAS (not a fancy degree, if you didn't already know), my GPA was 2.8, and I assure you that neither of those things has EVER come up in a job interview. I'm also old enough that my transcripts are gone. (Schools only keep them for about 10 years. After that, nobody's looking anyway.)

The factors that kept me from making more are:

So when I did finally land a programming job, it was as a code monkey in a PHP sweatshop. The headhunter wanted a decent payout, so I started at $40k. No raises. Got laid off after a year and a half due to it being a sweatshop and I had outstayed my welcome. (Basically, I wanted more money and they didn't want to give me any more money.)

Next job was a startup. Still $40k. Over 2.5 years, I got a couple of small raises. I topped out at $45k-ish before I got laid off during the early days of the recession.

Next job was through a headhunter again. I asked for $50k, but the employer could only go $40k. After 3 years and a few raises, I'm finally at $50k.

I could probably go to the larger employers in this city and make $70k, but that's really the limit in this area. Nobody in this line of work makes more than about $80k here.

aralin

Not accurate, smaller companies pay more

This survey must be talking only about companies above a certain size. Our Silicon Valley startup has about 50 employees, and the average engineering salaries are north of $150,000. Large companies like Google actually don't have to pay that much, because the hours are more reasonable. I know there are other companies in the area, too, that pay more than Google.


Re:Not accurate, smaller companies pay more (Score:4, Interesting)
by MisterSquid (231834) writes: on Thursday October 18, @11:16AM (#41693121)
Our Silicon Valley startup has about 50 employees and the average engineering salaries are north of $150,000.

I suppose there are some start-ups that do pay developers the value of the labor, but my own experience is a bit different in that it was more stereotypical of Silicon-Valley startup compensation packages. That is, my salary was shamefully low (I was new to the profession), just about unlivable for the Bay Area, and was offset with a very accelerated stock options plan.

[Mar 14, 2012] Perl vs Python: Why the debate is meaningless (The ByteBaker)

Arvind Padmanabhan:

I don't know Python but I can comment on Perl. I have written many elegant scripts for complex problems and I still love it. I often come across comments about how a programmer went back to his program six months later and had difficulty understanding it. For my part, I haven't had this problem primarily because I use consistently a single syntax. If perl provides more than one way to do things, I choose and use only one. Secondly, I do agree that objects in perl are clunky and make for difficult writing/reading. I have never used them. This makes it difficult for me to write perl scripts for large projects. Perhaps this is where Python succeeds.

shane o mac

I was forced to learn Python in order to write scripts within Blender (Open Source 3D Modeler).

White Hat:
1. dir( object )
This is nice as it shows functions and constants.

Black Hat:
1. Indention to denote code blocks. (!caca?)

PERL was more of an experiment instead of necessity. Much of what I know about regular expressions probably came from reading about PERL. I never even wrote much code in PERL. You see CPAN (Repository for modules) alone makes up for all the drawbacks I can't think of at the moment.

White Hat:
FreeBSD used to use PERL extensively for installation routines (4.7; I keep a copy of it in my music case, although I don't know why, as I feel it's a good luck charm of sorts). Then I read that in 5.0 they started removing it in favor of shell script (bash). Why?

Black Hat:
I'm drawing a blank here.

With freedom there are costs, you are allowed to do as you please. Place variables as you must and toss code a-muck.

You can discipline yourself to write great code in any language. VB-Script I write has the appearance of a standard practices C application.

I think it's a waste of time to sit and debate over which language is suited for which project. Pick one and master it.

So you can't read PERL? Break the modules into several files. It's more about information management than artisan ability. Divide and conquer.

Peter

I disagree with mastering one language. Often there will be trade-offs in a program that match very nicely with a particular language.

For example, you could code everything in C++\C\assembler, but that only makes sense when you really need speed or memory compactness. After all, I find it difficult to write basic file processing applications in C in under 10 minutes.

Perl examples use a lot of default variables and various ways to approach problems, but this is really a nightmare when you have to maintain someone else's code, especially if you don't know Perl. I think it's hard to understand without a background in Perl.

Juls

I'm a perl guy not a py programmer so I won't detract from python [except for the braces, Guido should at least let the language compile with them].

Note: Perl is like English; it's a reflective language. So I can make nouns into adjectives and use the power of reflection. For example … 'The book on my living room table' vs [Spanish] 'The book on the table of my living room'.

And this makes sense … because Larry Wall was a linguist; and was very influenced by the fact that reflective languages can say more with less because much is implied based on usage. These languages can also say the same thing many different ways. [Perl makes me pull my hair out. | Perl makes me pull out my hair.] And being that we have chromosomes that wire us for human language … these difficulties are soon mastered by even children. But … but we don't have the same affinity for programming languages (well most of us) so yes Perl can be a struggle in the beginning. But once you achieve a strong familiarity and stop trying to turn Perl into C or Python and allow Perl just to be Perl you really really start to enjoy it for those reasons you didn't like it before.

The biggest failure of Perl has been its users enjoying the higher end values of the language and failing to publish and document simple examples to help non-monks get there. You shouldn't have to be a monk to seek wisdom at the monastic gates.

Example … Perl classes. Obtuse and hard to understand, you say? They don't have to be … I think that most programmers will understand and be able to write their own after just looking at this simple example. Keep in mind, we just use 'package' instead of 'class'. Bless tells the interpreter your intentions and is used explicitly because you can bless all kinds of references into a class (package).

my $calc = Calc->new;    # or "new Calc" (indirect object syntax)
print $calc->Add(1);
print $calc->Add(9);
print $calc->Pie(67);

package Calc;

sub new
{
    my $class = shift;               # the package name, 'Calc'
    my $self =
    {
        _undue      => undef,        # holds the previous value, for a possible undo
        _currentVal => 0,
        _pie        => 3.14
    };
    bless $self, $class;             # now $self is an object of class 'Calc'
    return $self;
}

sub Add
{
    my ($self, $val) = @_;

    $self->{_undue} = $self->{_currentVal};                # save off the last value
    $self->{_currentVal} = $self->{_currentVal} + $val;    # add the scalars
    return $self->{_currentVal};                           # return the new value
}

sub Pie
{
    my ($self, $val) = @_;

    $self->{_undue} = $self->{_currentVal};                # save off the last value
    $self->{_currentVal} = $self->{_pie} * $val;           # multiply by pi
    return $self->{_currentVal};                           # return the new value
}

1;    # a package must return a true value when loaded from its own file

[Mar 14, 2012] How To Contribute To Open Source Without Being a Programming Rock Star

Esther Schindler writes "Plenty of people want to get involved in open source, but don't know where to start. In this article, Andy Lester lists several ways to help out even if you lack confidence in your technical chops. Here are a couple of his suggestions: 'Maintenance of code and the systems surrounding the code often are neglected in the rush to create new features and to fix bugs. Look to these areas as an easy way to get your foot into a project. Most projects have a publicly visible trouble ticket system, linked from the front page of the project's website and included in the documentation. It's the primary conduit of communication between the users and the developers. Keeping it current is a great way to help the project. You may need to get special permissions in the ticketing system, which most project leaders will be glad to give you when you say you want to help clean up the tickets.'" What's your favorite low-profile way to contribute?

[Dec 25, 2010] [PDF] Brian Kernighan - Random thoughts on scripting languages

I would expect a better, higher level of thinking from Brian...

Other scripting languages

[Dec 20, 2010] What hurts you the most in Perl

LinkedIn

Steve Carrobis

Perl is a far better applications-type language than Java/C/C#. Each has its niche. Threads were always an issue in Perl, and as with OO, if you don't need it or don't know it, don't use it.

My issue with Perl is when people get overly obfuscated with their code because they think that fewer characters and a few pointers make the code faster.

Unless you do some really smart OO-esque building, all you are doing is making it harder to figure out what you were thinking about. And please, Perl programmers, don't buy into "self-documenting code". I am an old mainframer, and self-documenting code meant that as you wrote you added comments to the core parts of the code ... I can call my subroutine "apple" to describe it, but is it really an apple? Or is it a tomato or a pomegranate? If written properly, Perl is very efficient code, and like all the other languages, if written incorrectly it's HORRIBLE. I have been writing Perl since almost before 3.0 ;-)

That's my 3 cents. Have a HAPPY and a MERRY!

Nikolai Bezroukov

@Steve: Thanks for a valuable comment about the threat of overcomplexity junkies in Perl. That's a very important threat that can undermine the language's future.

@Gabor: A well-known fact is that PHP, which is a horrible language both in its general design and in the implementation of most features you mentioned, is very successful and is widely used for large Web applications with a database backend (MediaWiki is one example). Also, if we think about all the dull, stupid and unreliable Java coding of large business applications that we see in the marketplace, the question arises whether we want this type of success ;-)

@Douglas: Mastering Perl requires a slightly higher level of qualification from developers than "Basic-style" development in PHP or commercial Java development (where Java typically plays the role of Cobol), which is mainstream these days. Also, many important factors are outside the technical domain: the ecosystem for Java is tremendous and is supported by players with deep pockets. The same is true for Python. Still, Perl has unique advantages, is universally deployed on Unix, and as such is and always will be attractive for thinking developers.

I think that for many large business applications, which these days often means a Web application with a database backend, one can use the virtual appliance model and rely on OS facilities for multitasking. Nothing is wrong with this approach on modern hardware. Here Perl provides important advantages due to its good integration with Unix.

Also, structuring a large application into modules that use pipes and sockets as the communication mechanism often provides very good maintainability. Pointers are also very helpful and fairly unique to Perl: typically scripting languages do not provide pointers; Perl does, and as such gives the developer unique power and flexibility (with additional risks as an inevitable side effect).
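For readers who have not run into them, Perl's "pointers" are references. Here is a minimal sketch (the variable names are just for illustration) of taking a reference, modifying the original through it, and building a nested structure:

use strict; use warnings;

my @queue = (1, 2, 3);
my $qref  = \@queue;                  # a reference ("pointer") to the array
push @$qref, 4;                       # modifies the original array through the reference

my %config = ( retries => 3, hosts => [ 'hostA', 'hostB' ] );   # nested data via an anonymous array ref

print "last queued item: $qref->[-1]\n";        # prints: last queued item: 4
print "first host: $config{hosts}[0]\n";        # prints: first host: hostA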

Another important advantage of Perl is that it is a higher-level language than Python (to say nothing of Java) and stimulates the use of prototyping, which is tremendously important for large projects, as the initial specification is usually incomplete and incorrect. Also, despite the proliferation of overcomplexity junkies in the Perl community, some aspects of Perl prevent an excessive number of layers/classes, a common trap that undermines large projects in Java. Look at the IBM fiasco with Lotus Notes 8.5.

I think that Perl is great in the way it integrates with Unix and promotes thinking of complex applications as virtual appliances. BTW, this approach also permits the use of a second language for those parts of the system for which Perl does not present clear advantages.

Also, Perl provides an important bridge to system administrators, who often know the language and can use a subset of it productively. That makes it preferable for large systems which depend on customization, such as monitoring systems.

The absence of a bytecode compiler hurts development of commercial applications in Perl in more ways than one, but that's just a question of money. I wonder why ActiveState missed this opportunity to increase its revenue stream. I also agree that the quality of many CPAN modules could be improved, but abuse of CPAN, along with fixation on OO, is a typical trait of overcomplexity junkies, so this has some positive aspect too :-).

I don't think that OO is a problem for Perl, if you use it where it belongs: in GUI interfaces. In many cases OO is used when hierarchical namespaces are sufficient. Perl provides a clean implementation of the concept of namespaces. The problem is that many people are trained in the Java/C++ style of OO, and as we know, to a hammer everything looks like a nail. ;-)
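As a minimal sketch of "hierarchical namespaces instead of classes" (the package and subroutine names are invented for illustration): a package can simply act as a named container for related functions, with no objects and no bless involved.

use strict; use warnings;

package Report::Format;               # a hierarchical namespace, not a class

sub header { return "==== $_[0] ====\n"; }
sub row    { return join("\t", @_) . "\n"; }

package main;

print Report::Format::header('Disk usage');
print Report::Format::row('/home', '120G');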

Allan Bowhill:

I think the original question Gabor posed implies there is a problem 'selling' Perl to companies for large projects. Maybe it's a question of narrowing its role.

It seems to me that if you want an angle to sell Perl on, it would make sense to cast it (in a marketing sense) into a narrower role that doesn't pretend to be everything to everyone. Because, despite what some hard-core Perl programmers might say, the language is somewhat dated. It hasn't really changed all that much since the 1990s.

Perl isn't particularly specialized so it has been used historically for almost every kind of application imaginable. Since it was (for a long time in the dot-com era) a mainstay of IT development (remember the 'duct tape' of the internet?) it gained high status among people who were developing new systems in short time-frames. This may in fact be one of the problems in selling it to people nowadays.

The FreeBSD OS even included Perl as part of their main (full) distribution for some time and if I remember correctly, Perl scripts were included to manage the ports/packaging system for all the 3rd party software. It was taken out of the OS shortly after the bust and committee reorganization at FreeBSD, where it was moved into third-party software. The package-management scripts were re-written in C. Other package management utilities were effectively displaced by a Ruby package.

A lot of technologies have come along since the 90s which are more appealing platforms than Perl for web development, which is mainly what it's about now.

If you are going to build modern web sites these days, you'll more than likely use some framework that utilizes object-oriented languages. I suppose the Moose augmentation of Perl would have some appeal with that, but CPAN modules and addons like Moose are not REALLY the Perl language itself. So if we are talking about selling the Perl language alone to potential adopters, you have to be honest in discussing the merits of the language itself without all the extras.

Along those lines I could see Perl having special appeal being cast in a narrower role, as a kind of advanced systems batching language - more capable and effective than say, NT scripting/batch files or UNIX shell scripts, but less suitable than object-oriented languages, which pretty much own the market for web and console utilities development now.

But there is a substantial role for high-level batching languages, particularly in systems that build data for consumption by other systems. These are traditionally implemented in the highest-level batching language possible. Such systems build things like help files, structured (non-relational) databases (often used on high-volume commercial web services), and software. Not to mention automating many systems administration tasks.

There are not many features or advantages of Perl that are still unique among scripting languages, as they were in the 90s. The simplicity of built-in Perl data structures and regular expression capabilities are reflected almost identically in Ruby, and are at least accessible in other strongly-typed languages like Java and C#.

That Perl is easy to learn, stays consistent with the idea that "everything is a string", and does not force you to formalize things into an object-oriented model are a few of its selling points. If it is cast as an advanced batching language, there are almost no other languages that could compete with it in that role.

Dean Hamstead:

@Pascal: bytecode is nasty for the poor Sysadmin/Devop who has to run your code. She/he can never fix it when bugs arise. There is no advantage to bytecode over interpreted.

Which infact leads me to a good point.

The 'selling points' of Java have all failed to be of any real substance.

In truth, Java is popular because it is popular.

Lots of people don't like Perl because it's not popular any more, similar to how lots of people hate Macs but have no logical reason for doing so.

Douglas is almost certainly right, that Python is rapidly becoming the new fad language.

I'm not sure how Perl OO is a 'hack'. When you bless a reference into an object, it becomes an object... I can see that some people are confused by Perl's honesty about what an object is. Other languages attempt to hide away how they have implemented objects in their compiler - who cares? Ultimately the objects are all converted into machine code and executed.

In general perl objects are more object oriented than java objects. They are certainly more polymorphic.

Perl objects can fully hide their internals if that's something you want to do. It's not even hard, and you don't need to use Moose. But does it afford any real benefit? Not really.

At the end of the day, if you want good software you need to hire good programmers; it has nothing to do with the language. Even though some languages try to force the code to be neat (Python) and try to force certain behaviours (Java?), you can write complete garbage in any of them, then curse that language for allowing the author to do so.

A syntactic argument is pointless. As is something oriented around OO. What benefits perl brings to a business are...

- massive centralised website of libraries (CPAN)
- MVC's
- DBI
- POE
- Other frameworks etc
- automated code review (perlcritic)
- automated code formatting and tidying (perltidy)
- document as you code (POD)
- natural test driven development (Test::More etc)
- platform independence
- perl environments on more platforms than java
- perl comes out of the box on every unix
- excellent canon of printed literature, from beginner to expert
- a common language between sysadmin/devops and traditional developer roles (with source code always available to *fix* the problem quickly, rather than having to set up an ant environment and roll a new WAR file)
- rolled up perl applications (PAR files)
- Perl can use more than 3.6gig of ram (try that in java)

Brian Martin

Well said Dean.

Personally, I don't really care if a system is written in Perl or Python or some other high level language, I don't get religious about which high level language is used.

There are many [very] high level languages, any one of them is vastly more productive & consequently less buggy than developing in a low level language like C or Java. Believe me, I have written more vanilla C code in my career than Perl or Python, by a factor of thousands, yet I still prefer Python or Perl as quite simply a more succinct expression of the intended algorithm.

If anyone wants to argue the meaning of "high level", well basically APL wins ok. In APL, to invert a matrix is a single operator. If you've never had to implement a matrix inversion from scratch, then you've never done serious programming. Meanwhile, Python or Perl are pretty convenient.

What I mean by a "[very] high level language" is basically: how many pages of code does it take to play a decent game of draughts (checkers), or chess?

[Nov 18, 2010] The Law of Software Development By James Kwak

November 17, 2010 | The Baseline Scenario

I recently read a frightening 2008 post by David Pogue about the breakdown of homemade DVDs. This inspired me to back up my old DVDs of my dog to my computer (now that hard drives are so much bigger than they used to be), which led me to install HandBrake. The Handbrake web site includes this gem:

"The Law of Software Development and Envelopment at MIT:

Every program in development at MIT expands until it can read mail."
I thought of that when I heard that Facebook is launching a (beyond) email service.

(The side benefit of this project is that now I get to watch videos of my dog sleeping whenever I want to.)

Nemo

Pogue should have mentioned whether he was talking about DVD-R or DVD-RW.

The rewritable variants are vastly more perishable than the write-once variants.

http://www.thexlab.com/faqs/opticalmedialongevity.html

Jason

That law is originally due to Jamie Zawinski (rather famous software developer known for his work on Netscape Navigator and contributions to Mozilla and XEmacs). In its original form:

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can. – Jamie Zawinski

Ted K

James, you remind me of myself when I'm drinking and thinking of past things. I mean, I'm not criticizing you, please don't take it as criticism. But watching videos of the dog that passed away… Don't torture yourself, man. Wait some time and, when you and the family are ready, get a new dog. Should probably be a different breed unless you're super-sold on that breed. Let the kids choose one from a set of breeds you like.

We got Pogue's book on the i-Pod because Apple's manual is so crappy. He is "da man".

You know if Apple gave a damn about customers they would include a charger cord with that damned thing to hook into the wall instead of making you shop for the charger cord separately, but Mr. "I'm an assh*le" Steve Jobs couldn't bother to show he's customer service inclined. He's slowly but surely going the way of Bill Gates.

Ted K

Mr. Kwak,
There is a super good story they're running on PBS NewsHour today with the former "NOW" host David Brancaccio on a show called "Fixing the Future". James you need to try to download that or catch the show. That looks really good and shows some promise for the future. Catch this link people. Nov 18 (Thursday)
http://www.pbs.org/now/fixing-the-future/index.html

David Petraitis

@Ted K
It looks like programs need to expand until they can delete uploaded Youtube videos which are seriously off topic.

As for James' original point, most applications today are Internet-aware and use the Internet in their base functionality (which is what was meant by the original email capability). The next level is for them to be mobile and location aware, and it is already happening.

Bruce E. Woych

facebook launches beyond e-mail…

I was browsing through some tech books at B&N's and came across some works assessing the questionable fields of cyber space tech wars and the current trends in development. The book has no axe to grind and was written well before the facebook attempt to dominate the media with multimedia dependency. Here's what was so interesting in the text that applies:

The two greatest "vulnerabilities of the future" involve what is categorized as consolidation and convergence.

Now it first occurred to me that this is much like micro and macro economics…but then I realized that it is precisely (in the field) like too big to fail!

So are we on another monopoly trail down the primrose path of self destructive dependencies?

Isn't this just another brand media Octopus looking to knock out variations and dominate our choices with their market offerings? And is this going to set us up for I.T. crisis of authorization for the systemic network and future of "ownership" wars in essential services?

3-D

Facebook is recreating AOL. A gigantic walled garden that becomes "the internet" for most of the people with computers. Look how AOL ended up.

And Handbrake is a great little program. I've been using it to rip my DVD collection to the 2TB of network storage I now have on my home network. A very convenient way to watch movies.

Anonymous

Shrub already handed them out to all his war + torture buddies, as well as Greenspan – and Daddy Shrub gave one to the teabaggers' favorite faux-economist (Hayek) and to Darth Cheney, so I'd say the reputation of the medal is pretty much already in the sewer.

[Aug 25, 2010] Sometimes the Old Ways Are Best by Brian Kernighan

IEEE Software Nov/Dec 2008, pp.18-19

As I write this column, I'm in the middle of two summer projects; with luck, they'll both be finished by the time you read it. One involves a forensic analysis of over 100,000 lines of old C and assembly code from about 1990, and I have to work on Windows XP. The other is a hack to translate code written in weird language L1 into weird language L2 with a program written in scripting language L3, where none of the L's even existed in 1990; this one uses Linux. Thus it's perhaps a bit surprising that I find myself relying on much the same toolset for these very different tasks.

... ... ...

There has surely been much progress in tools over the 25 years that IEEE Software has been around, and I wouldn't want to go back in time. But the tools I use today are mostly the same old ones: grep, diff, sort, awk, and friends.

This might well mean that I'm a dinosaur stuck in the past. On the other hand, when it comes to doing simple things quickly, I can often have the job done while experts are still waiting for their IDE to start up. Sometimes the old ways are best, and they're certainly worth knowing well.

Embed Lua for scriptable apps

The Lua programming language is a small scripting language specifically designed to be embedded in other programs.

Lua's C API allows exceptionally clean and simple code both to call Lua from C, and to call C from Lua.

This allows developers who want a convenient runtime scripting language to easily implement the basic API elements needed by the scripting language, then use Lua code from their applications.

This article introduces the Lua language as a possible tool for simplifying common development tasks, and discusses some of the reasons to embed a scripting language in the first place.

Hope For Multi-Language Programming

chthonicdaemon is pretty naive and does not understand that the combination of a scripting language with a compiled language like C (or a semi-compiled language like Java) is a more productive environment than almost any other known... You need a common runtime, like on Windows, to make it a smooth approach (IronPython). Scripting helps to avoid the OO trap that is pushed by "a hoard of practically illiterate researchers publishing crap papers in junk conferences."
"I have been using Linux as my primary environment for more than ten years. In this time, I have absorbed all the lore surrounding the Unix Way - small programs doing one thing well, communicating via text and all that. I have found the command line a productive environment for doing many of the things I often do, and I find myself writing lots of small scripts that do one thing, then piping them together to do other things. While I was spending the time learning grep, sed, awk, python and many other more esoteric languages, the world moved on to application-based programming, where the paradigm seems to be to add features to one program written in one language. I have traditionally associated this with Windows or MacOS, but it is happening with Linux as well. Environments have little or no support for multi-language projects - you choose a language, open a project and get it done. Recent trends in more targeted build environments like cmake or ant are understandably focusing on automatic dependency generation and cross-platform support, unfortunately making it more difficult to grow a custom build process for a multi-language project organically. All this is a bit painful for me, as I know how much is gained by using a targeted language for a particular problem. Now the question: Should I suck it up and learn to do all my programming in C++/Java/(insert other well-supported, popular language here) and unlearn ten years of philosophy, or is there hope for the multi-language development process?"

[Dec 22, 2008] 13 reasons why Ruby, Python and the gang will push Java to die… of old age

Very questionable logic. The cost of programming, and especially the cost of maintenance of an application, depends on the level of the language. It is lower with Ruby/Python than with Java.
Lately I seem to find lots of articles everywhere about the imminent dismissal of Java and its replacement with the scripting language of the day, or sometimes with other compiled languages.

No, that is not gonna happen. Java is gonna die eventually of old age many many years from now. I will share the reasoning behind my statement. Let's first look at some metrics.

Language popularity status as of May 2008

For this I am gonna use the TIOBE index (tiobe.com) and the nice graphs at langpop.com. I know lots of people don't like them because their statistics are based on search engine results, but I think they are a reasonably fair indicator of popularity.

Facts from the TIOBE index:

TIOBE Index Top 20

What I find significant here is the huge share the "C like syntax" languages have.

C (15.292) + C++ (10.484) + Java (20.176) + C# (3.963) = 49.915%

This means 4 languages get half of all the attention on the web.

If we add PHP (10.637) here (somehow uses a similar syntax) we get 60.552%

As a result we can extract:

Reason number 1: Syntax is very important because it builds on previous knowledge. Also similar syntax means similar concepts. Programmers have to make less effort to learn the new syntax, can reuse the old concepts and thus they can concentrate on understanding the new concepts.

Let's look at a group of 10 challengers:

He forgot to add Perl

Python (4.613) + Ruby (2.851) + Lisp/Scheme (0.449) + Lua (0.393) + SmallTalk (0.138) +
Haskell (0.137) + Groovy (0.131) + Erlang (0.110) + Caml (0.090) + Scala (0.073) = 8.985%

This is less than the attention Visual Basic gets: 10.782% and leads us to…

TIOBE Index Top 21-50

Reason number 2: Too much noise is distracting. Programmers are busy and learning 10 languages to the level where they can evaluate them and make an educated decision is too much effort. The fact that most of these languages have a different syntax and introduce different (sometimes radically different) concepts doesn't help either.

Looking at the trend for the last 7 years we can see a pretty flat evolution in popularity for most of the languages. There are a few exceptions like the decline of Perl but nothing really is earth shattering. There are seasonal variations but in long term nothing seems to change.

TIOBE Trend

This shows that while various languages catch the mind of the programmer for a short time, they are put back on the shelf pretty fast. This might be caused by the lack of opportunity to use them in real life projects. Most of the programmers in the world work on ongoing projects.

Reason number 3: Lack of pressure on the programmers to switch. The market is pretty stable, the existing languages work pretty well and the management doesn't push programmers to learn new languages.

Number of new projects started

Looking at another site that does language popularity analysis, langpop.com, we see a slightly different view but the end result is almost the same from the point of view of challenger languages.

What I found interesting here was the analysis regarding new projects started in various languages. The sources for information are Freshmeat.net and Google Code. The results show a clear preference for C/C++/Java with Python getting some attention.

Reason number 4: Challenger languages don't seem to catch momentum in order to create an avalanche of new projects started with them. This can be again due to the fact that they spread thin when they are evaluated. They are too many.

Other interesting charts at langpop.com are those about books on programming languages at amazon.com and about language discussion statistics. Book writers write about subjects that have a chance to sell. On the other hand, a lot of discussion about all these new languages takes place online. One thing I noticed in these discussions is the attitude the supporters of certain languages have. There is a lot of elitism and concentration on what is wrong with Java instead of pointing out what their language of choice brings that is useful, and on creating good tutorials for people wanting to attempt a switch.

Reason number 5: Challenger language communities don't do a good job of attracting programmers from established languages. Telling somebody why she is wrong will most likely create a counter-reaction, not interest.

Let's look now at what is happening on the job market. I used the tools offered by indeed.com and I compared a bunch of languages to produce this graph:


Java, C, C++, C#, Python, Ruby, PHP, Scala Job Trends graph

Reason number 6: There is no great incentive to switch to one of the challenger languages since gaining this skill is not likely to translate into income in the near future.

Well, I looked at all these statistics and I extracted some reasons, but what are the qualities a language needs and what are the external conditions that will make a programming language popular?

How and when does a language become popular

So we can draw more reasons:

For one's curiosity here is a list of talked about languages with their birth date:
Ruby (mid 1990s), Python (1991), Lisp (1958), Scheme (1970s), Lua (1993), Smalltalk (1969-1980), Haskell (1990), Erlang (1987), Caml (1985), OCaml (1996), Groovy (2003), Scala (2003)

Compare this with older successful languages:
C (1972), C++ (1983), Java (1995), C# (2001), BASIC (1964), Pascal (1970), FORTRAN (1957), Ada (1983), COBOL (1959)

It is pretty obvious most of these "new" languages missed the train to success.

Why many of the new languages will never be popular

Reason number 11: "Features" that look and are dangerous for big projects. Since there are not a lot of big projects written in any of these languages it is hard to make an unbiased evaluation. But bias is in the end a real obstacle for their adoption.

Reason number 12: Unnatural concepts (for the majority of programmers) raise the entry level. Functional languages make you write code like mathematical equations. But how many people actually love math so much as to write everything in it? Object-oriented languages provide a great advantage: they let programmers think about the domain they want to model, not about the language or the machine.

Reason number 13: Lack of advanced tools for development and refactoring cripple the programmer and the development teams when faced with big amounts of lines of code.

How would a real Java challenger look like

Pick me, pick mee, pick meeee!!!

Looking at all those (smart) languages and all the heated discussions that surround them makes me think about the donkey from Shrek yelling "Pick me! Pick mee!! Pick meeee!!!". In the end only one can be the real winner even if in a limited part of the market.

The danger for Java doesn't come from outside. None of these new (actually, most of them are pretty old) languages has the potential to displace Java. The danger for Java comes from inside, and it is caused by too many "features" making their way into the language, transforming it from a language that wanted to keep only the essential features of C++ into a trash box for features and concepts from all languages.

In the end I want to make it clear that I am not advocating against any of those languages. There is TAO in all of them. I actually find them interesting, cool and useful as exercise for my brain, when I have time. I recommend to every programmer to look around from time to time and try to understand what is going on the language market.

This article is part of a series of opinions and rants.

[Dec 12, 2008] The A-Z of Programming Languages Perl

What new elements does Perl 5.10.0 bring to the language? In what way is it preparing for Perl 6?

Perl 5.10.0 involves backporting some ideas from Perl 6, like switch statements and named pattern matches.

One of the most popular things is the use of "say" instead of "print".

This is an explicit programming design in Perl - easy things should be easy and hard things should be possible. It's optimised for the common case. Similar things should look similar but similar things should also look different, and how you trade those things off is an interesting design principle.

Huffman Coding is one of those principles that makes similar things look different.
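For the curious, a minimal sketch of the two 5.10 features mentioned above, say and named pattern matches (the sample string is only for illustration):

use v5.10;                            # enables the 5.10 feature bundle, including say

say 'Hello, world';                   # like print, but with an implicit newline

# Named pattern match: the captured groups land in the %+ hash.
if ('2008-12-12' =~ /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/) {
    say "year=$+{year} month=$+{month} day=$+{day}";
}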

In your opinion, what lasting legacy has Perl brought to computer development?

An increased awareness of the interplay between technology and culture. Ruby has borrowed a few ideas from Perl and so has PHP. I don't think PHP understands the use of signals, but all languages borrow from other languages, otherwise they risk being single-purpose languages. Competition is good.

It's interesting to see PHP follow along with the same mistakes Perl made over time and recover from them. But Perl 6 also borrows back from other languages too, like Ruby. My ego may be big, but it's not that big.

Where do you envisage Perl's future lying?

My vision of Perl's future is that I hope I don't recognize it in 20 years.

Where do you see computer programming languages heading in the future, particularly in the next 5 to 20 years?

Don't design everything you will need in the next 100 years, but design the ability to create things we will need in 20 or 100 years. The heart of the Perl 6 effort is the extensibility we have built into the parser, introducing language changes as non-destructively as possible.

Linux Today's comments

> Given the horrible mess that is Perl (and, BTW, I derive 90% of my income from programming in Perl),
.
Did the thought that 'horrible mess' you produce with $language 'for an income' could be YOUR horrible mess already cross your mind? The language itself doesn't write any code.

> You just said something against his beloved
> Perl and compounded your heinous crime by
> saying something nice about Python...in his
> narrow view you are the antithesis of all that is
> right in the world. He will respond with his many
> years of Perl == good and everything else == bad
> but just let it go...
.
That's a pretty pointless insult. Languages don't write code. People do. A statement like 'I think that code written in Perl looks very ugly because of the large amount of non-alphanumeric characters' would make sense. Trying to elevate entirely subjective, aesthetic preferences into 'general principles' doesn't. 'a mess' is something inherently chaotic, hence, this is not a sensible description for a regularly structured program of any kind. It is obviously possible to write (or not write) regularly structured programs in any language providing the necessary abstractions for that. This set includes Perl.
.
I had the displeasure of having to deal with messes created by people both in Perl and Python (and a couple of other languages) in the past. You've probably heard the saying that "real programmers can write FORTRAN in any language" already.

It is even true that the most horrible code mess I have seen so far had been written in Perl. But this just means that a fairly chaotic person happened to use this particular programming language.

[Dec 9, 2008] What Programming Language For Linux Development

Slashdot

by dkf (304284) <donal.k.fellows@man.ac.uk> on Saturday December 06, @07:08PM (#26016101) Homepage

C/C++ are the languages you'd want to go for. They can do *everything*, have great support, are fast etc.

Let's be honest here. C and C++ are very fast indeed if you use them well (very little can touch them; most other languages are actually implemented in terms of them), but they're also very easy to use really badly. They're genuine professional power tools: they'll do what you ask them to really quickly, even if that is just to spin on the spot chopping people's legs off. Care required!

If you use a higher-level language (I prefer Tcl, but you might prefer Python, Perl, Ruby, Lua, Rexx, awk, bash, etc. - the list is huge) then you probably won't go as fast. But unless you're very good at C/C++ you'll go acceptably fast at a much earlier calendar date. It's just easier for most people to be productive in higher-level languages. Well, unless you're doing something where you have to be incredibly close to the metal like a device driver, but even then it's best to keep the amount of low-level code small and to try to get to use high-level things as soon as you can.

One technique that is used quite a bit, especially by really experienced developers, is to split the program up into components that are then glued together. You can then write the components in a low-level language if necessary, but use the far superior gluing capabilities of a high-level language effectively. I know many people are very productive doing this.

[Apr 25, 2008] Interview with Donald Knuth by Donald E. Knuth & Andrew Binstock


Andrew Binstock and Donald Knuth converse on the success of open source, the problem with multicore architecture, the disappointing lack of interest in literate programming, the menace of reusable code, and that urban legend about winning a programming contest with a single compilation.

Andrew Binstock: You are one of the fathers of the open-source revolution, even if you aren't widely heralded as such. You previously have stated that you released TeX as open source because of the problem of proprietary implementations at the time, and to invite corrections to the code-both of which are key drivers for open-source projects today. Have you been surprised by the success of open source since that time?

Donald Knuth:

The success of open source code is perhaps the only thing in the computer field that hasn't surprised me during the past several decades. But it still hasn't reached its full potential; I believe that open-source programs will begin to be completely dominant as the economy moves more and more from products towards services, and as more and more volunteers arise to improve the code.

For example, open-source code can produce thousands of binaries, tuned perfectly to the configurations of individual users, whereas commercial software usually will exist in only a few versions. A generic binary executable file must include things like inefficient "sync" instructions that are totally inappropriate for many installations; such wastage goes away when the source code is highly configurable. This should be a huge win for open source.

Yet I think that a few programs, such as Adobe Photoshop, will always be superior to competitors like the Gimp-for some reason, I really don't know why! I'm quite willing to pay good money for really good software, if I believe that it has been produced by the best programmers.

Remember, though, that my opinion on economic questions is highly suspect, since I'm just an educator and scientist. I understand almost nothing about the marketplace.

Andrew: A story states that you once entered a programming contest at Stanford (I believe) and you submitted the winning entry, which worked correctly after a single compilation. Is this story true? In that vein, today's developers frequently build programs writing small code increments followed by immediate compilation and the creation and running of unit tests. What are your thoughts on this approach to software development?

Donald:

The story you heard is typical of legends that are based on only a small kernel of truth. Here's what actually happened: John McCarthy decided in 1971 to have a Memorial Day Programming Race. All of the contestants except me worked at his AI Lab up in the hills above Stanford, using the WAITS time-sharing system; I was down on the main campus, where the only computer available to me was a mainframe for which I had to punch cards and submit them for processing in batch mode. I used Wirth's ALGOL W system (the predecessor of Pascal). My program didn't work the first time, but fortunately I could use Ed Satterthwaite's excellent offline debugging system for ALGOL W, so I needed only two runs. Meanwhile, the folks using WAITS couldn't get enough machine cycles because their machine was so overloaded. (I think that the second-place finisher, using that "modern" approach, came in about an hour after I had submitted the winning entry with old-fangled methods.) It wasn't a fair contest.

As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I'm feeling my way in a totally unknown environment and need feedback about what works and what doesn't. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."

Andrew: One of the emerging problems for developers, especially client-side developers, is changing their thinking to write programs in terms of threads. This concern, driven by the advent of inexpensive multicore PCs, surely will require that many algorithms be recast for multithreading, or at least to be thread-safe. So far, much of the work you've published for Volume 4 of The Art of Computer Programming (TAOCP) doesn't seem to touch on this dimension. Do you expect to enter into problems of concurrency and parallel programming in upcoming work, especially since it would seem to be a natural fit with the combinatorial topics you're currently working on?

Donald:

The field of combinatorial algorithms is so vast that I'll be lucky to pack its sequential aspects into three or four physical volumes, and I don't think the sequential methods are ever going to be unimportant. Conversely, the half-life of parallel techniques is very short, because hardware changes rapidly and each new machine needs a somewhat different approach. So I decided long ago to stick to what I know best. Other people understand parallel machines much better than I do; programmers should listen to them, not me, for guidance on how to deal with simultaneity.

Andrew: Vendors of multicore processors have expressed frustration at the difficulty of moving developers to this model. As a former professor, what thoughts do you have on this transition and how to make it happen? Is it a question of proper tools, such as better native support for concurrency in languages, or of execution frameworks? Or are there other solutions?

Donald:

I don't want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they're trying to pass the blame for the future demise of Moore's Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won't be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Titanium" approach that was supposed to be so terrific - until it turned out that the wished-for compilers were basically impossible to write.

Let me put it this way: During the past 50 years, I've written well over a thousand programs, many of which have substantial size. I can't think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX.[1]

How many programmers do you know who are enthusiastic about these promised machines of the future? I hear almost nothing but grief from software people, although the hardware folks in our department assure me that I'm wrong.

I know that important applications for parallelism exist-rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.

Even if I knew enough about such methods to write about them in TAOCP, my time would be largely wasted, because soon there would be little reason for anybody to read those parts. (Similarly, when I prepare the third edition of Volume 3 I plan to rip out much of the material about how to sort on magnetic tapes. That stuff was once one of the hottest topics in the whole software field, but now it largely wastes paper when the book is printed.)

The machine I use today has dual processors. I get to use them both only when I'm running two independent jobs at the same time; that's nice, but it happens only a few minutes every week. If I had four processors, or eight, or more, I still wouldn't be any better off, considering the kind of work I do-even though I'm using my computer almost every day during most of the day. So why should I be so happy about the future that hardware vendors promise? They think a magic bullet will come along to make multicores speed up my kind of work; I think it's a pipe dream. (No-that's the wrong metaphor! "Pipelines" actually work for me, but threads don't. Maybe the word I want is "bubble.")

From the opposite point of view, I do grant that web browsing probably will get better with multicores. I've been talking about my technical work, however, not recreation. I also admit that I haven't got many bright ideas about what I wish hardware designers would provide instead of multicores, now that they've begun to hit a wall with respect to sequential computation. (But my MMIX design contains several ideas that would substantially improve the current performance of the kinds of programs that concern me most-at the cost of incompatibility with legacy x86 programs.)

Andrew: One of the few projects of yours that hasn't been embraced by a widespread community is literate programming. What are your thoughts about why literate programming didn't catch on? And is there anything you'd have done differently in retrospect regarding literate programming?

Donald:

Literate programming is a very personal thing. I think it's terrific, but that might well be because I'm a very strange person. It has tens of thousands of fans, but not millions.

In my experience, software created with literate programming has turned out to be significantly better than software developed in more traditional ways. Yet ordinary software is usually okay-I'd give it a grade of C (or maybe C++), but not F; hence, the traditional methods stay with us. Since they're understood by a vast community of programmers, most people have no big incentive to change, just as I'm not motivated to learn Esperanto even though it might be preferable to English and German and French and Russian (if everybody switched).

Jon Bentley probably hit the nail on the head when he once was asked why literate programming hasn't taken the whole world by storm. He observed that a small percentage of the world's population is good at programming, and a small percentage is good at writing; apparently I am asking everybody to be in both subsets.

Yet to me, literate programming is certainly the most important thing that came out of the TeX project. Not only has it enabled me to write and maintain programs faster and more reliably than ever before, and been one of my greatest sources of joy since the 1980s-it has actually been indispensable at times. Some of my major programs, such as the MMIX meta-simulator, could not have been written with any other methodology that I've ever heard of. The complexity was simply too daunting for my limited brain to handle; without literate programming, the whole enterprise would have flopped miserably.

If people do discover nice ways to use the newfangled multithreaded machines, I would expect the discovery to come from people who routinely use literate programming. Literate programming is what you need to rise above the ordinary level of achievement. But I don't believe in forcing ideas on anybody. If literate programming isn't your style, please forget it and do what you like. If nobody likes it but me, let it die.

On a positive note, I've been pleased to discover that the conventions of CWEB are already standard equipment within preinstalled software such as Makefiles, when I get off-the-shelf Linux these days.

Andrew: In Fascicle 1 of Volume 1, you reintroduced the MMIX computer, which is the 64-bit upgrade to the venerable MIX machine comp-sci students have come to know over many years. You previously described MMIX in great detail in MMIXware. I've read portions of both books, but can't tell whether the Fascicle updates or changes anything that appeared in MMIXware, or whether it's a pure synopsis. Could you clarify?

Donald:

Volume 1 Fascicle 1 is a programmer's introduction, which includes instructive exercises and such things. The MMIXware book is a detailed reference manual, somewhat terse and dry, plus a bunch of literate programs that describe prototype software for people to build upon. Both books define the same computer (once the errata to MMIXware are incorporated from my website). For most readers of TAOCP, the first fascicle contains everything about MMIX that they'll ever need or want to know.

I should point out, however, that MMIX isn't a single machine; it's an architecture with almost unlimited varieties of implementations, depending on different choices of functional units, different pipeline configurations, different approaches to multiple-instruction-issue, different ways to do branch prediction, different cache sizes, different strategies for cache replacement, different bus speeds, etc. Some instructions and/or registers can be emulated with software on "cheaper" versions of the hardware. And so on. It's a test bed, all simulatable with my meta-simulator, even though advanced versions would be impossible to build effectively until another five years go by (and then we could ask for even further advances just by advancing the meta-simulator specs another notch).

Suppose you want to know if five separate multiplier units and/or three-way instruction issuing would speed up a given MMIX program. Or maybe the instruction and/or data cache could be made larger or smaller or more associative. Just fire up the meta-simulator and see what happens.

Andrew: As I suspect you don't use unit testing with MMIXAL, could you step me through how you go about making sure that your code works correctly under a wide variety of conditions and inputs? If you have a specific work routine around verification, could you describe it?

Donald:

Most examples of machine language code in TAOCP appear in Volumes 1-3; by the time we get to Volume 4, such low-level detail is largely unnecessary and we can work safely at a higher level of abstraction. Thus, I've needed to write only a dozen or so MMIX programs while preparing the opening parts of Volume 4, and they're all pretty much toy programs-nothing substantial. For little things like that, I just use informal verification methods, based on the theory that I've written up for the book, together with the MMIXAL assembler and MMIX simulator that are readily available on the Net (and described in full detail in the MMIXware book).

That simulator includes debugging features like the ones I found so useful in Ed Satterthwaite's system for ALGOL W, mentioned earlier. I always feel quite confident after checking a program with those tools.

Andrew: Despite its formulation many years ago, TeX is still thriving, primarily as the foundation for LaTeX. While TeX has been effectively frozen at your request, are there features that you would want to change or add to it, if you had the time and bandwidth? If so, what are the major items you add/change?

Donald:

I believe changes to TeX would cause much more harm than good. Other people who want other features are creating their own systems, and I've always encouraged further development-except that nobody should give their program the same name as mine. I want to take permanent responsibility for TeX and Metafont, and for all the nitty-gritty things that affect existing documents that rely on my work, such as the precise dimensions of characters in the Computer Modern fonts.

Andrew: One of the little-discussed aspects of software development is how to do design work on software in a completely new domain. You were faced with this issue when you undertook TeX: No prior art was available to you as source code, and it was a domain in which you weren't an expert. How did you approach the design, and how long did it take before you were comfortable entering into the coding portion?

Donald:

That's another good question! I've discussed the answer in great detail in Chapter 10 of my book Literate Programming, together with Chapters 1 and 2 of my book Digital Typography. I think that anybody who is really interested in this topic will enjoy reading those chapters. (See also Digital Typography Chapters 24 and 25 for the complete first and second drafts of my initial design of TeX in 1977.)

Andrew: The books on TeX and the program itself show a clear concern for limiting memory usage-an important problem for systems of that era. Today, the concern for memory usage in programs has more to do with cache sizes. As someone who has designed a processor in software, the issues of cache-aware and cache-oblivious algorithms surely must have crossed your radar screen. Is the role of processor caches on algorithm design something that you expect to cover, even if indirectly, in your upcoming work?

Donald:

I mentioned earlier that MMIX provides a test bed for many varieties of cache. And it's a software-implemented machine, so we can perform experiments that will be repeatable even a hundred years from now. Certainly the next editions of Volumes 1-3 will discuss the behavior of various basic algorithms with respect to different cache parameters.

In Volume 4 so far, I count about a dozen references to cache memory and cache-friendly approaches (not to mention a "memo cache," which is a different but related idea in software).

Andrew: What set of tools do you use today for writing TAOCP? Do you use TeX? LaTeX? CWEB? Word processor? And what do you use for the coding?

Donald:

My general working style is to write everything first with pencil and paper, sitting beside a big wastebasket. Then I use Emacs to enter the text into my machine, using the conventions of TeX. I use tex, dvips, and gv to see the results, which appear on my screen almost instantaneously these days. I check my math with Mathematica.

I program every algorithm that's discussed (so that I can thoroughly understand it) using CWEB, which works splendidly with the GDB debugger. I make the illustrations with MetaPost (or, in rare cases, on a Mac with Adobe Photoshop or Illustrator). I have some homemade tools, like my own spell-checker for TeX and CWEB within Emacs. I designed my own bitmap font for use with Emacs, because I hate the way the ASCII apostrophe and the left open quote have morphed into independent symbols that no longer match each other visually. I have special Emacs modes to help me classify all the tens of thousands of papers and notes in my files, and special Emacs keyboard shortcuts that make bookwriting a little bit like playing an organ. I prefer rxvt to xterm for terminal input. Since last December, I've been using a file backup system called backupfs, which meets my need beautifully to archive the daily state of every file.

According to the current directories on my machine, I've written 68 different CWEB programs so far this year. There were about 100 in 2007, 90 in 2006, 100 in 2005, 90 in 2004, etc. Furthermore, CWEB has an extremely convenient "change file" mechanism, with which I can rapidly create multiple versions and variations on a theme; so far in 2008 I've made 73 variations on those 68 themes. (Some of the variations are quite short, only a few bytes; others are 5KB or more. Some of the CWEB programs are quite substantial, like the 55-page BDD package that I completed in January.) Thus, you can see how important literate programming is in my life.

I currently use Ubuntu Linux, on a standalone laptop-it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux. Incidentally, with Linux I much prefer the keyboard focus that I can get with classic FVWM to the GNOME and KDE environments that other people seem to like better. To each his own.

Andrew: You state in the preface of Fascicle 0 of Volume 4 of TAOCP that Volume 4 surely will comprise three volumes and possibly more. It's clear from the text that you're really enjoying writing on this topic. Given that, what is your confidence in the note posted on the TAOCP website that Volume 5 will see light of day by 2015?

Donald:

If you check the Wayback Machine for previous incarnations of that web page, you will see that the number 2015 has not been constant.

You're certainly correct that I'm having a ball writing up this material, because I keep running into fascinating facts that simply can't be left out-even though more than half of my notes don't make the final cut.

Precise time estimates are impossible, because I can't tell until getting deep into each section how much of the stuff in my files is going to be really fundamental and how much of it is going to be irrelevant to my book or too advanced. A lot of the recent literature is academic one-upmanship of limited interest to me; authors these days often introduce arcane methods that outperform the simpler techniques only when the problem size exceeds the number of protons in the universe. Such algorithms could never be important in a real computer application. I read hundreds of such papers to see if they might contain nuggets for programmers, but most of them wind up getting short shrift.

From a scheduling standpoint, all I know at present is that I must someday digest a huge amount of material that I've been collecting and filing for 45 years. I gain important time by working in batch mode: I don't read a paper in depth until I can deal with dozens of others on the same topic during the same week. When I finally am ready to read what has been collected about a topic, I might find out that I can zoom ahead because most of it is eminently forgettable for my purposes. On the other hand, I might discover that it's fundamental and deserves weeks of study; then I'd have to edit my website and push that number 2015 closer to infinity.

Andrew: In late 2006, you were diagnosed with prostate cancer. How is your health today?

Donald:

Naturally, the cancer will be a serious concern. I have superb doctors. At the moment I feel as healthy as ever, modulo being 70 years old. Words flow freely as I write TAOCP and as I write the literate programs that precede drafts of TAOCP. I wake up in the morning with ideas that please me, and some of those ideas actually please me also later in the day when I've entered them into my computer.

On the other hand, I willingly put myself in God's hands with respect to how much more I'll be able to do before cancer or heart disease or senility or whatever strikes. If I should unexpectedly die tomorrow, I'll have no reason to complain, because my life has been incredibly blessed. Conversely, as long as I'm able to write about computer science, I intend to do my best to organize and expound upon the tens of thousands of technical papers that I've collected and made notes on since 1962.

Andrew: On your website, you mention that the Peoples Archive recently made a series of videos in which you reflect on your past life. In segment 93, "Advice to Young People," you advise that people shouldn't do something simply because it's trendy. As we know all too well, software development is as subject to fads as any other discipline. Can you give some examples that are currently in vogue, which developers shouldn't adopt simply because they're currently popular or because that's the way they're currently done? Would you care to identify important examples of this outside of software development?

Donald:

Hmm. That question is almost contradictory, because I'm basically advising young people to listen to themselves rather than to others, and I'm one of the others. Almost every biography of every person whom you would like to emulate will say that he or she did many things against the "conventional wisdom" of the day.

Still, I hate to duck your questions even though I also hate to offend other people's sensibilities-given that software methodology has always been akin to religion. With the caveat that there's no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I've ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other's code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.

I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you're totally convinced that reusable code is wonderful, I probably won't be able to sway you anyway, but you'll never convince me that reusable code isn't mostly a menace.

Here's a question that you may well have meant to ask: Why is the new book called Volume 4 Fascicle 0, instead of Volume 4 Fascicle 1? The answer is that computer programmers will understand that I wasn't ready to begin writing Volume 4 of TAOCP at its true beginning point, because we know that the initialization of a program can't be written until the program itself takes shape. So I started in 2005 with Volume 4 Fascicle 2, after which came Fascicles 3 and 4. (Think of Star Wars, which began with Episode 4.)

[Apr 23, 2008] Dr. Dobb's Programming Languages Everyone Has a Favorite One by Deirdre Blake

I think the article misstates the position of Perl (according to the TIOBE index it is No. 6, just above C#, Python and Ruby). The author definitely does not understand the value and staying power of C. There is also an inherent problem with their methodology (as with any metric based on web presence). This is visible in the position of C#, which now definitely looks stronger than Python and Perl (and maybe even PHP), as well as in the positions of bash, awk and PowerShell. That means all other statements should be taken with a grain of salt...
April 23, 2008 | DDJ

From what Paul Jansen has seen, everyone has a favorite programming language.

Paul Jansen is managing director of TIOBE Software (www.tiobe.com). He can be contacted at Paul.Jansen@tiobe.com.


DDJ: Paul, can you tell us about the TIOBE Programming Community Index?

PJ: The TIOBE index tries to measure the popularity of programming languages by monitoring their web presence. The most popular search engines Google, Yahoo!, Microsoft, and YouTube are used to calculate these figures. YouTube has been added recently as an experiment (and only counts for 4 percent of the total). Since the TIOBE index has been published now for more than 6 years, it gives an interesting picture about trends in the area of programming languages. I started the index because I was curious to know whether my programming skills were still up to date and to know for which programming languages our company should create development tools. It is amazing to see that programming languages are something very personal. Every day we receive e-mails from people that are unhappy with the position of "their" specific language in the index. I am also a bit overwhelmed about the vast and constant traffic this index generates.

DDJ: Which language has moved to the top of the heap, so to speak, in terms of popularity, and why do you think this is the case?

PJ: If we take a look at the top 10 programming languages, not much has happened the last five years. Only Python entered the top 10, replacing COBOL. This comes as a surprise because the IT world is moving so fast that in most areas, the market is usually completely changed in five years time. Python managed to reach the top 10 because it is the truly object-oriented successor of Perl. Other winners of the last couple of years are Visual Basic, Ruby, JavaScript, C#, and D (a successor of C++). I expect in five years time there will be two main languages: Java and C#, closely followed by good-old Visual Basic. There is no new paradigm foreseen.

DDJ: Which languages seem to be losing ground?

PJ: C and C++ are definitely losing ground. There is a simple explanation for this. Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection. Another language that has had its day is Perl. It was once the standard language for every system administrator and build manager, but now everyone has been waiting on a new major release for more than seven years. That is considered far too long.

DDJ: On the flip side, what other languages seem to be moving into the limelight?

PJ: It is interesting to observe that dynamically typed object-oriented (scripting) languages are evolving the most. Hardly has a new language arrived on the scene before it is replaced by another new emerging language. I think this has to do with the increase in web programming. The web programming area demands a language that is easy to learn, powerful, and secure. New languages pop up every day, trying to be leaner and meaner than their predecessors. A couple of years ago, Ruby was rediscovered (thanks to Rails). Recently, Lua was the hype, but now other scripting languages such as ActionScript, Groovy, and Factor are about to claim a top 20 position. There is quite some talk on the Internet about the NBL (next big language). But although those web-programming languages generate a lot of attention, there is never a real breakthrough.

DDJ: What are the benefits of introducing coding standards into an organization? And how does an organization choose a standard that is a "right fit" for its development goals?

PJ: Coding standards help to improve the general quality of software. A good coding standard focuses on best programming practices (avoiding known language pitfalls), not only on style and naming conventions. Every language has its constructions that are perfectly legitimate according to its language definition but will lead to reliability, security, or maintainability problems. Coding standards help engineers to stick to a subset of a programming language to make sure that these problems do not occur. The advantage of introducing coding standards as a means to improve quality is that-once it is in place-it does not change too often. This is in contrast with dynamic testing. Every change in your program calls for a change in your dynamic tests. In short, dynamic tests are far more labor intensive than coding standards. On the other hand, coding standards can only take care of nonfunctional defects. Bugs concerning incorrectly implemented requirements remain undetected. The best way to start with coding standards is to download a code checker and tweak it to your needs. It is our experience that if you do not check the rules of your coding standard automatically, the coding standard will soon end as a dusty document on some shelf.

[Apr 18, 2008] CoScripter

A useful Firefox plug-in

CoScripter is a system for recording, automating, and sharing processes performed in a web browser such as printing photos online, requesting a vacation hold for postal mail, or checking flight arrival times. Instructions for processes are recorded and stored in easy-to-read text here on the CoScripter web site, so anyone can make use of them. If you are having trouble with a web-based process, check to see if someone has written a CoScript for it!

[Mar 12, 2008] Tiny Eclipse by Eric Suen

About: Tiny Eclipse is a distribution of Eclipse for development with dynamic languages for the Web, such as JSP, PHP, Ruby, TCL, and Web Services. It features a small download size, the ability to choose the features you want to install, and GUI installers for Win32 and Linux GTK x86.

[Jan 2, 2008] Java is becoming the new Cobol by Bill Snyder

"Simply put, developers are saying that Java slows them down"
Dec 28, 2007 | infoworld.com

Simply put, developers are saying that Java slows them down. "There were big promises that Java would solve incompatibility problems [across platforms]. But now there are different versions and different downloads, creating complications," says Peter Thoneny, CEO of Twiki.net, which produces a certified version of the open source Twiki wiki-platform software. "It has not gotten easier. It's more complicated," concurs Ofer Ronen, CEO of Sendori, which routes domain traffic to online advertisers and ad networks. Sendori has moved to Ruby on Rails. Ronen says Ruby offers pre-built structures - say, a shopping cart for an e-commerce site - that you'd have to code from the ground up using Java.

Another area of weakness is the development of mobile applications. Java's UI capabilities and its memory footprint simply don't measure up, says Samir Shah, CEO of software testing provider Zephyr. No wonder the mobile edition of Java has all but disappeared, and no wonder Google is creating its own version (Android).

These weaknesses are having a real effect. Late last month, Info-Tech Research Group said its survey of 1,850 businesses found .Net the choice over Java among businesses of all sizes and industries, thanks to its promotion via Visual Studio and SharePoint. "Microsoft is driving uptake of the .Net platform at the expense of Java," says George Goodall, a senior research analyst at Info-Tech.

One bit of good news: developers and analysts agree that Java is alive and well for internally developed enterprise apps. "On the back end, there is still a substantial amount of infrastructure available that makes Java a very strong contender," says Zephyr's Shah.

The Bottom Line: Now that Java is no longer the unchallenged champ for Internet-delivered apps, it makes sense for companies to find programmers who are skilled in the new languages. If you're a Java developer, now's the time to invest in new skills.

[Jan 2, 2008] Java is Becoming the New Cobol

I think that the author of this comment is deeply mistaken: the length of the code has a tremendous influence on the cost of maintenance and the number of errors, and here Java sucks.
Linux Today

It's true: putting together an Enterprise-scale Java application takes a considerable amount of planning, design, and co-ordination. Scripted languages like Python are easier - just hack something out and you have a working webapp by the end of the day.

But then you get called in at midnight, because a lot of the extra front-end work in Java has to do with the fact that the compiler is doing major datatype validation. You're a lot less likely to have something blow up after it went into production, since a whole raft of potential screw-ups get caught at build time.

Scripting systems like Python, Perl, PHP, etc. not only have late binding, but frequently have late compiling as well, so until the offending code is invoked, it's merely a coiled-up snake.
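
As a minimal sketch of that point (illustrative only, not part of the original comment), here is a bash script in which a broken call is not detected when the script is read, only when the offending function is actually invoked:

#!/bin/bash
# Sketch only: the misspelled command below lurks silently until
# cleanup() is actually called at run time.

cleanup() {
    remove_temp_fles /tmp/myapp.*    # no such command -- nothing complains yet
}

echo "Everything looks fine until the offending code runs..."
cleanup    # only now does bash report: remove_temp_fles: command not found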

In fact, after many years and many languages, I'm just about convinced that the amount of time and effort for producing a debugged major app in just about any high-level language is about the same.

Myself, I prefer an environment that keeps me from having to wear a pager. For those who need less sleep and more Instant Gratification, they're welcome to embrace the other end of the spectrum.

[Dec 28, 2007] My Programming Language History by Keith Waclena

It's pretty strange for a system admin and CGI programmer to prefer Python to Perl... It goes without saying that in any case such evaluations should be taken with a grain of salt. What makes this comparison interesting is that the author claims to have substantial programming experience in Perl 4, Tcl and Python.

Before going any further, read the disclaimer!


Some languages I've used and how I've felt about them. This may help you figure out where I'm coming from. I'm only listing the highlights here, and not including the exotica (Trac, SAM76, Setl, Rec, Convert, J...) and languages I've only toyed with or programmed in my head (Algol 68, BCPL, APL, S-Algol, Pop-2 / Pop-11, Refal, Prolog...).
Pascal.
My first language. Very nearly turned me off programming before I got started. I hated it. Still do.
Sail.
The first language I loved, Sail was Algol 60 with zillions of extensions, from dynamically allocated strings to Leap, a weird production system / logic programming / database language that I never understood (and now can barely recall), and access to every Twenex JSYS and TOPS 10 UUO! Pretty much limited to PDP-10s; supposedly reincarnated as MAINSAIL, but I never saw that.
Teco.
The first language I actually became fluent in; also the first language I ever got paid to program in.
Snobol4.
The first language I actually wrote halfway-decent sizable code in, developed a personal subroutine library for, wrote multi-platform code in, and used on an IBM mainframe (Spitbol -- but I did all my development under Twenex with Sitbol, thank goodness). I loved Snobol: I used to dream in it.
PL/I.
The first language I ever thought was great at first and then grew to loathe. Subset G only, but that was enough.
Forth.
The first language I ever ran on my own computer; also the first language I ever wrote useful assembler in -- serial I/O routines for the Z/80 (my first assembly language was a handful of toy programs in IBM 360 assembly language, using the aptly-named SPASM assembler), and the first language I thought was really wonderful but had a really difficult time writing useful programs in. Also the first language whose implementation I actually understood, the first language that really taught me about hardware, and the first language implementation I installed myself. Oh and the first language I taught.
C.
What can I say here? It's unpleasant, but it works, it's everywhere, and you have to use it.
Lisp.
The first language I thought was truly brilliant and still think so to this day. I programmed in Maclisp and Muddle (MDL) at first, on the PDP-10; Franz and then Common Lisp later (but not much).
Scheme.
How could you improve on Lisp? Scheme is how.
Perl.
The first language I wrote at least hundreds of useful programs in (Perl 4 (and earlier) only). Probably the second language I thought was great and grew to loathe (for many of the same reasons I grew to loathe PL/I, interestingly enough -- but it took longer).
Lazy Functional Languages.
How could you improve on Scheme? Lazy functional languages is how, but can you actually do anything with them (except compile lazy functional languages, of course)?
Tcl.
My previous standard, daily language. It's got a lot of problems, and it's amazing that Tcl programs ever get around to terminating, but they do, and astonishingly quickly (given the execution model...). I've developed a large library of Tcl procs that allow me to whip up substantial programs really quickly, the mark of a decent language. And it's willing to dress up as Lisp to fulfill my kinky desires.
Python
My current standard, daily language. Faster than Tcl, about as fast as Perl and with nearly as large a standard library, but with a reasonable syntax and real data structures. It's by no means perfect -- still kind of slow, not enough of an expression language to suit me, dynamically typed, no macro system -- but I'm really glad I found it.

[Oct 5, 2007] Turn Vim into a bash IDE By Joe 'Zonker' Brockmeier

June 11, 2007 | Linux.com

By itself, Vim is one of the best editors for shell scripting. With a little tweaking, however, you can turn Vim into a full-fledged IDE for writing scripts. You could do it yourself, or you can just install Fritz Mehner's Bash Support plugin.

To install Bash Support, download the zip archive, copy it to your ~/.vim directory, and unzip the archive. You'll also want to edit your ~/.vimrc to include a few personal details; open the file and add these three lines:

let g:BASH_AuthorName   = 'Your Name'
let g:BASH_Email        = 'my@email.com'
let g:BASH_Company      = 'Company Name'

These variables will be used to fill in some headers for your projects, as we'll see below.

The Bash Support plugin works in the Vim GUI (gVim) and text mode Vim. It's a little easier to use in the GUI, and Bash Support doesn't implement most of its menu functions in Vim's text mode, so you might want to stick with gVim when scripting.

When Bash Support is installed, gVim will include a new menu, appropriately titled Bash. This puts all of the Bash Support functions right at your fingertips (or mouse button, if you prefer). Let's walk through some of the features, and see how Bash Support can make Bash scripting a breeze.

Header and comments

If you believe in using extensive comments in your scripts, and I hope you do, you'll really enjoy using Bash Support. Bash Support provides a number of functions that make it easy to add comments to your bash scripts and programs automatically or with just a mouse click or a few keystrokes.

When you start a non-trivial script that will be used and maintained by others, it's a good idea to include a header with basic information -- the name of the script, usage, description, notes, author information, copyright, and any other info that might be useful to the next person who has to maintain the script. Bash Support makes it a breeze to provide this information. Go to Bash -> Comments -> File Header, and gVim will insert a header like this in your script:

#!/bin/bash
#===============================================================================
#
#          FILE:  test.sh
#
#         USAGE:  ./test.sh
#
#   DESCRIPTION:
#
#       OPTIONS:  ---
#  REQUIREMENTS:  ---
#          BUGS:  ---
#         NOTES:  ---
#        AUTHOR:  Joe Brockmeier, jzb@zonker.net
#       COMPANY:  Dissociated Press
#       VERSION:  1.0
#       CREATED:  05/25/2007 10:31:01 PM MDT
#      REVISION:  ---
#===============================================================================

You'll need to fill in some of the information, but Bash Support grabs the author, company name, and email address from your ~/.vimrc, and fills in the file name and created date automatically. To make life even easier, if you start Vim or gVim with a new file that ends with an .sh extension, it will insert the header automatically.

As you're writing your script, you might want to add comment blocks for your functions as well. To do this, go to Bash -> Comment -> Function Description to insert a block of text like this:

#===  FUNCTION  ================================================================
#          NAME:
#   DESCRIPTION:
#    PARAMETERS:
#       RETURNS:
#===============================================================================

Just fill in the relevant information and carry on coding.
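
For example, a filled-in block might end up looking something like this (the function name, description, and body are illustrative, not something the plugin generates for you):

#===  FUNCTION  ================================================================
#          NAME:  backup_dir
#   DESCRIPTION:  Create a dated tar archive of the given directory.
#    PARAMETERS:  $1 -- directory to archive
#       RETURNS:  0 on success, non-zero on failure
#===============================================================================
backup_dir ()
{
    tar czf "${1%/}-$(date +%Y%m%d).tar.gz" "$1"
}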

The Comment menu allows you to insert other types of comments, insert the current date and time, and turn selected code into a comment, and vice versa.

Statements and snippets

Let's say you want to add an if-else statement to your script. You could type out the statement, or you could just use Bash Support's handy selection of pre-made statements. Go to Bash -> Statements and you'll see a long list of pre-made statements that you can just plug in and fill in the blanks. For instance, if you want to add a while statement, you can go to Bash -> Statements -> while, and you'll get the following:

while _; do
done

The cursor will be positioned where the underscore (_) is above. All you need to do is add the test statement and the actual code you want to run in the while statement. Sure, it'd be nice if Bash Support could do all that too, but there's only so far an IDE can help you.
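
For instance, once the blanks are filled in, the completed statement might read as follows (the loop body here is just an illustration, not something Bash Support supplies):

counter=0
while [ "$counter" -lt 5 ]; do
    echo "pass $counter"            # whatever work the loop should do
    counter=$((counter + 1))
done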

However, you can help yourself. When you do a lot of bash scripting, you might have functions or code snippets that you reuse in new scripts. Bash Support allows you to add your snippets and functions by highlighting the code you want to save, then going to Bash -> Statements -> write code snippet. When you want to grab a piece of prewritten code, go to Bash -> Statements -> read code snippet. Bash Support ships with a few included code fragments.

Another way to add snippets to the statement collection is to just place a text file with the snippet under the ~/.vim/bash-support/codesnippets directory.
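
For example, you could drop a frequently used argument check into that directory straight from the shell (the snippet name check-args is just an example):

mkdir -p ~/.vim/bash-support/codesnippets
cat > ~/.vim/bash-support/codesnippets/check-args << 'EOF'
# abort unless at least one argument was given
if [ $# -eq 0 ]; then
    echo "Usage: $0 <argument>" >&2
    exit 1
fi
EOF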

Running and debugging scripts

Once you have a script ready to go, it's time for testing and debugging. You could exit Vim, make the script executable, run it and see if it has any bugs, and then go back to Vim to edit it, but that's tedious. Bash Support lets you stay in Vim while doing your testing.

When you're ready to make the script executable, just choose Bash -> Run -> make script executable. To save and run the script, press Ctrl-F9, or go to Bash -> Run -> save + run script.

Bash Support also lets you call the bash debugger (bashdb) directly from within Vim. On Ubuntu, it's not installed by default, but that's easily remedied with apt-get install bashdb. Once it's installed, you can debug the script you're working on with F9 or Bash -> Run -> start debugger.

If you want a "hard copy" -- a PostScript printout -- of your script, you can generate one by going to Bash -> Run -> hardcopy to FILENAME.ps. This is where Bash Support comes in handy for any type of file, not just bash scripts. You can use this function within any file to generate a PostScript printout.

Bash Support has several other functions to help run and test scripts from within Vim. One useful feature is syntax checking, which you can access with Alt-F9. If you have no syntax errors, you'll get a quick OK. If there are problems, you'll see a small window at the bottom of the Vim screen with a list of syntax errors. From that window you can highlight the error and press Enter, and you'll be taken to the line with the error.

Put away the reference book...

Don't you hate it when you need to include a regular expression or a test in a script, but can't quite remember the syntax? That's no problem when you're using Bash Support, because you have Regex and Tests menus with all you'll need. For example, if you need to verify that a file exists and is owned by the correct user ID (UID), go to Bash -> Tests -> file exists and is owned by the effective UID. Bash Support will insert the appropriate test ([ -O _]) with your cursor in the spot where you have to fill in the file name.
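
Filled in, that inserted test might be used like this (the file name is hypothetical):

config="$HOME/.myapp.conf"    # hypothetical file to check
if [ -O "$config" ]; then
    echo "$config exists and is owned by the effective UID"
else
    echo "$config is missing or not owned by us" >&2
fi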

To build regular expressions quickly, go to the Bash menu, select Regex, then pick the appropriate expression from the list. It's fairly useful when you can't remember exactly how to express "zero or one" or other regular expressions.

Bash Support also includes menus for environment variables, bash builtins, shell options, and a lot more.

Hotkey support

Vim users can access many of Bash Support's features using hotkeys. While not as simple as clicking the menu, the hotkeys do follow a logical scheme that makes them easy to remember. For example, all of the comment functions are accessed with \c, so if you want to insert a file header, you use \ch; if you want a date inserted, type \cd; and for a line end comment, use \cl.

Statements can be accessed with \a. Use \ac for a case statement, \aie for an "if then else" statement, \af for a "for in..." statement, and so on. Note that the online docs are incorrect here, and indicate that statements begin with \s, but Bash Support ships with a PDF reference card (under .vim/bash-support/doc/bash-hot-keys.pdf) that gets it right.

Run commands are accessed with \r. For example, to save the file and run a script, use \rr; to make a script executable, use \re; and to start the debugger, type \rd. I won't try to detail all of the shortcuts, but you can pull up a reference using :help bashsupport-usage-vim when in Vim, or use the PDF. The full Bash Support reference is available within Vim by running :help bashsupport, or you can read it online.

Of course, we've covered only a small part of Bash Support's functionality. The next time you need to whip up a shell script, try it using Vim with Bash Support. This plugin makes scripting in bash a lot easier.

[Sep 26, 2007] Beware Exotic tools can kill you!

Killersites.com

Once in a while you may come across what would seem to be a 'killer' piece of software, or maybe a cool new programming language - something that would appear to give you some advantage.

That MAY be the case, but many times it isn't really so - think twice before you leap!

Consider these points:

Do you notice a pattern here?

Yes, it's all about time. All this junk (software, programming languages, markup languages etc…) have one purpose in the end: to save you time.

Keep that in mind when you approach things - ask yourself:

'Will using this save me time?'

[Sep 26, 2007] Will Ruby kill PHP

KILLERPHP.COM

OO is definitely overkill for a lot of web projects. It seems to me that so many people use OO frameworks like Ruby and Zope because "it's enterprise level". But using an 'enterprise' framework for small to medium sized web applications just adds so much overhead and frustration at having to learn the framework that it just doesn't seem worth it to me.

Having said all this I must point out that I'm distrustful of large corporations and hate their dehumanising hierarchical structure. Therefore I am naturally drawn towards open source and away from the whole OO/enterprise/hierarchy paradigm. Maybe people want to push open source to the enterprise level in the hope that they will adopt the technology and therefore they will have more job security. Get over it - go and learn Java and .NET if you want job security, and preserve open source software as an oasis of freedom away from the corporate world. Just my 2c

===

OOP has its place, but the diversity of frameworks is just as challenging to figure out as a new class you didn't write, if not more. None of them work the same or keep a standard convention between them that makes learning them easier. Frameworks are great, but sometimes I think maybe they don't all have to be OO. I keep a small personal library of functions I've (and others have) written procedurally and include them just like I would a class.

Beyond the overhead issues is complexity. OOP has you chasing declarations over many files to figure out what's happening. If you're trying to learn how that unique class you need works, it can be time consuming to read through it and see how the class is structured. By the time you're done you may as well have written the class yourself, at least by then you'd have a solid understanding. Encapsulation and polymorphism have their advantages, but the cost is complexity which can equal time. And for smaller projects that will likely never expand, that time and energy can be a waste.

Not trying to bash OOP, just to defend procedural style. They each have their place.

===

Sorry, but I don't like your text, because you mix up Ruby and Ruby on Rails a lot. Ruby is in my opinion easier to use than PHP, because PHP has no design principle beside "make it work, somehow easy to use". Ruby has some really cool stuff I miss quite often when I have to program in PHP again (blocks, for example), and has a clearer and more logical syntax.

Ruby on Rails is of course not that easy to use, at least when speaking about small-scale projects. This is because it does a lot more than PHP does. Of course, there are other good reasons to prefer PHP over Rails (like the better support by providers, more modules, more documentation), but in my opinion most projects done in PHP that reach the complexity of a blog could profit from being programmed in Rails, from the purely technical point of view. At least I won't program in PHP again unless a customer asks me to.

===

I have a reasonable level of experience with PHP and Python but unfortunately haven't touched Ruby yet. They both seem to be a good choice for low-complexity projects. I can even say that I like Python a lot. But I would never consider it again for projects where design is an issue. They also say it is good for (rapid) prototyping. My experience is that as long as you can't afford a proper IDE, Python is maybe the best place to go. But a properly "equipped" environment can formidably boost your productivity with a statically typed language like Java. In that case Python's advantage shrinks to the benefits of quick tests accessible through its command line.

Another problem with Python is that it wants to be everything: simple and complete, flexible and structured, high-level while allowing for low-level programming. The result is a series of obscure features.

Having said all that, I must give Python all the credit due to a good language. It's just not perfect. Maybe Ruby is :-)
My apologies for not sticking too closely to the subject of the article.

===

The one thing I hate is OOP geeks trying to prove that they can write code that does nothing useful and that nobody understands.

"You don't have to use OOP in ruby! You can do it PHP way! So you better do your homework before making such statements!"

Then why use ruby in the first place?

"What is really OVERKILL to me, is to know the hundrets of functions, PHP provides out of the box, and available in ANY scope! So I have to be extra careful whether I can use some name. And the more functions - the bigger the MESS."

"On the other hand, in Ruby you use only the functions available for the particular object you use.

I would rather say "some text".length than strlen("some text"), which is much more meaningful! The Ruby language itself is much more descriptive. I remember myself, from my old PHP days, always having to look up php.net for the appropriate function, but now I can just guess!"

Yeah, you must have a weak memory and can't remember whether strlen() is for strings or for numbers….

Doesn't Ruby have the same number of functions, just stored in objects?

Look, if you can't remember strlen then invent your own classes - you can make a whole useless OOP framework for PHP in a day……

[Sep 26, 2007] Will Ruby on Rails kill .net and Java

www dot james mckay dot net

Ruby on Rails 1.1 has been released.

Dion Hinchcliffe has posted a blog entry at the end of which he asks the question, will it be a nail in the coffin for .net and Java?

Rails certainly looks beautiful. It is fully object oriented, with built in O/R mapping, powerful AJAX support, an elegant syntax, a proper implementation of the Model-View-Controller design pattern, and even a Ruby to Javascript converter which lets you write client side web code in Ruby.

However, I don't think it's the end of the line for C# and Java by a long shot. Even if it does draw a lot of fire, there is a heck of a lot of code knocking around in these languages, and there likely still will be for a very long time to come. Even throwaway code and hacked together interim solutions have a habit of living a lot longer than anyone ever expects. Look at how much code is still out there in Fortran, COBOL and Lisp, for instance.

Like most scripting languages such as Perl, Python, PHP and so on, Ruby is still a dynamically typed language. For this reason it will be slower than statically typed languages such as C#, C++ and Java. So it won't be used so much in places where you need lots of raw power. However, most web applications don't need such raw power in the business layer. The main bottleneck in web development is database access and network latency in communicating with the browser, so using C# rather than Rails would have only a very minor impact on performance. But some of them do, and in such cases the solutions often have different parts of the application written in different languages and even running on different servers. One of the solutions that we have developed, for instance, has a web front end in PHP running on a Linux box, with a back end application server running a combination of Python and C++ on a Windows server.

Rails certainly knocks the spots off PHP though…

[May 7, 2007] The Hundred-Year Language by Paul Graham

April 2003

(Keynote from PyCon2003)

...I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores. The more of a language you can write in itself, the better.

...Languages evolve slowly because they're not really technologies. Languages are notation. A program is a formal description of the problem you want a computer to solve for you. So the rate of evolution in programming languages is more like the rate of evolution in mathematical notation than, say, transportation or communications. Mathematical notation does evolve, but not with the giant leaps you see in technology.

...I learned to program when computer power was scarce. I can remember taking all the spaces out of my Basic programs so they would fit into the memory of a 4K TRS-80. The thought of all this stupendously inefficient software burning up cycles doing the same thing over and over seems kind of gross to me. But I think my intuitions here are wrong. I'm like someone who grew up poor, and can't bear to spend money even for something important, like going to the doctor.

Some kinds of waste really are disgusting. SUVs, for example, would arguably be gross even if they ran on a fuel which would never run out and generated no pollution. SUVs are gross because they're the solution to a gross problem. (How to make minivans look more masculine.) But not all waste is bad. Now that we have the infrastructure to support it, counting the minutes of your long-distance calls starts to seem niggling. If you have the resources, it's more elegant to think of all phone calls as one kind of thing, no matter where the other person is.

There's good waste, and bad waste. I'm interested in good waste-- the kind where, by spending more, we can get simpler designs. How will we take advantage of the opportunities to waste cycles that we'll get from new, faster hardware?

The desire for speed is so deeply engrained in us, with our puny computers, that it will take a conscious effort to overcome it. In language design, we should be consciously seeking out situations where we can trade efficiency for even the smallest increase in convenience.

Most data structures exist because of speed. For example, many languages today have both strings and lists. Semantically, strings are more or less a subset of lists in which the elements are characters. So why do you need a separate data type? You don't, really. Strings only exist for efficiency. But it's lame to clutter up the semantics of the language with hacks to make programs run faster. Having strings in a language seems to be a case of premature optimization.

... Inefficient software isn't gross. What's gross is a language that makes programmers do needless work. Wasting programmer time is the true inefficiency, not wasting machine time. This will become ever more clear as computers get faster

... Somehow the idea of reusability got attached to object-oriented programming in the 1980s, and no amount of evidence to the contrary seems to be able to shake it free. But although some object-oriented software is reusable, what makes it reusable is its bottom-upness, not its object-orientedness. Consider libraries: they're reusable because they're language, whether they're written in an object-oriented style or not.

I don't predict the demise of object-oriented programming, by the way. Though I don't think it has much to offer good programmers, except in certain specialized domains, it is irresistible to large organizations. Object-oriented programming offers a sustainable way to write spaghetti code. It lets you accrete programs as a series of patches. Large organizations always tend to develop software this way, and I expect this to be as true in a hundred years as it is today.

... As this gap widens, profilers will become increasingly important. Little attention is paid to profiling now. Many people still seem to believe tha